Optimized Trace Binaries for Architectural Evaluation

Suleyman Sair, Yuanfang Hu, Timothy Sherwood and Brad Calder
CS2002-0711
June 23, 2002

The increasing demand for high performance forces computer architects to employ a wide range of hardware optimizations. These optimizations and new architectural features are often evaluated with past-generation compilers that do not take the new features into account. Using a compiler that is unaware of a set of architectural optimizations may lead to an incorrect estimate of their impact. This problem is exacerbated for in-order architectures, which rely on the compiler to assist in scheduling. Our research focuses on efficient techniques for generating a highly optimized and scheduled binary for VLIW and in-order architectures. We propose to use a modified out-of-order simulator to generate a trace-scheduled binary for in-order execution. Our constrained out-of-order machine resolves dependencies and allows independent instructions to move above stalled instructions, exposing the available parallelism within a program and performing the appropriate level of loop unrolling and inlining for the architecture. This new trace binary can then be used to guide architectural research, even when no optimizing compiler yet exists for the architecture being evaluated. In this paper we examine the merits of the simulator-based trace optimizer and show how it performs compared to unscheduled code on an in-order machine.
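
The core idea of letting independent instructions move above a stalled one can be illustrated with a toy model. The sketch below is not the authors' simulator; it is a minimal illustration under several assumptions of ours: a single-issue machine, register dependences only (no memory dependences and no register renaming), and a fixed reordering window. The names `Instr`, `inorder_cycles`, `reschedule`, and the `window` parameter are hypothetical and do not come from the report. Loop unrolling and inlining are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dst: str          # destination register
    srcs: tuple       # source registers
    latency: int = 1  # cycles until dst is available


def inorder_cycles(trace):
    """Issue cycles (including stalls) on a single-issue in-order machine."""
    ready = {}  # register -> cycle at which its value becomes available
    cycle = 0
    for ins in trace:
        # An in-order machine stalls until every source operand is ready.
        start = max([cycle] + [ready.get(r, 0) for r in ins.srcs])
        ready[ins.dst] = start + ins.latency
        cycle = start + 1
    return cycle


def reschedule(trace, window=4):
    """Reorder a trace by hoisting independent instructions above stalled ones.

    Each cycle, the oldest instruction inside the window that is ready and has
    no RAW, WAW, or WAR conflict with an older pending instruction is issued.
    The issue order is returned as the new schedule.
    """
    pending = list(trace)
    ready = {}
    cycle = 0
    out = []
    while pending:
        for j, cand in enumerate(pending[:window]):
            older = pending[:j]
            conflict = any(
                o.dst in cand.srcs       # RAW: cand reads what o writes
                or o.dst == cand.dst     # WAW: both write the same register
                or cand.dst in o.srcs    # WAR: cand overwrites an input of o
                for o in older
            )
            if conflict:
                continue
            if all(ready.get(r, 0) <= cycle for r in cand.srcs):
                ready[cand.dst] = cycle + cand.latency
                out.append(cand)
                del pending[j]
                break
        cycle += 1  # one issue slot per cycle, whether used or stalled
    return out


trace = [
    Instr("load", "r1", ("r0",), latency=3),
    Instr("add", "r2", ("r1",)),   # depends on the load, so it stalls
    Instr("mul", "r3", ("r4",)),   # independent of the load
    Instr("sub", "r5", ("r6",)),   # independent of the load
]
scheduled = reschedule(trace)
print([i.name for i in scheduled])
print(inorder_cycles(trace), inorder_cycles(scheduled))
```

In this toy trace the two independent instructions fill the load's latency, so the reordered sequence needs fewer issue cycles on the same in-order model. A smaller `window` restricts how far an instruction can be hoisted, which is one simple way to constrain the out-of-order behavior.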


The authors of these documents have submitted their reports to this technical report series for the purpose of non-commercial dissemination of scientific work. The reports are copyrighted by the authors, and their existence in electronic format does not imply that the authors have relinquished any rights. You may copy a report for scholarly, non-commercial purposes, such as research or instruction, provided that you agree to respect the author's copyright. For information concerning the use of this document for other than research or instructional purposes, contact the authors. Other information concerning this technical report series can be obtained from the Computer Science and Engineering Department at the University of California at San Diego, techreports@cs.ucsd.edu.



