Quasi-ASICs: Trading Area for Energy by Exploiting Similarity in Synthesized Cores for Irregular Code

Ganesh Venkatesh, Jack Sampson, Nathan Goulding, Steve Swanson and Michael Taylor
CS2011-0964
March 8, 2011

Transistor density continues to increase exponentially, but power dissipation per transistor improves only slightly with each generation of Moore’s law. Given constant chip-level power budgets, the fraction of transistors that can be active simultaneously decreases exponentially with each technology generation. Hence, while the area budget continues to grow exponentially, the power budget has become a first-order design constraint in current processors. In this regime, spending transistors on specialized cores that optimize energy per computation becomes an effective way to improve system performance. To trade transistors for energy efficiency in a scalable manner, we propose quasi application-specific integrated circuits, or QASICs: specialized processors capable of executing multiple general-purpose applications while providing an order of magnitude better energy efficiency than a general-purpose processor. The QASIC design flow is based on the insight that similar code patterns exist across applications. Our approach exploits these similar code patterns to design specialized cores that support many widely used computations. Our results demonstrate that relatively few QASICs can support the operator functions of multiple commonly used data structures, and that these QASICs provide 13.5× energy savings over a general-purpose processor. On a more diverse workload consisting of twelve applications selected from different application domains (including SPECINT, a SAT solver, vision, and EEMBC, among others), QASICs reduce the required number of application-specific circuits by over 50% and the area requirement by 23% compared to fully specialized logic, while providing energy efficiency within 1.27× of that of fully specialized logic. At the system level, our approach reduces the application energy-delay metric by 46% compared to conventional processors.


The authors of these documents have submitted their reports to this technical report series for the purpose of non-commercial dissemination of scientific work. The reports are copyrighted by the authors, and their existence in electronic format does not imply that the authors have relinquished any rights. You may copy a report for scholarly, non-commercial purposes, such as research or instruction, provided that you agree to respect the author's copyright. For information concerning the use of this document for other than research or instructional purposes, contact the authors. Other information concerning this technical report series can be obtained from the Computer Science and Engineering Department at the University of California at San Diego, techreports@cs.ucsd.edu.

