Dynamic optimization has been proposed to overcome many limitations of static optimization, such as inaccurate assumptions about the underlying processor architecture and lack of adaption to the program’s runtime behavior. However, existing dynamic optimization systems often impose high runtime overhead (software systems) or great hardware complexity (hardware systems), and only have limited runtime adaptability. This thesis proposes a new model of optimization, where optimization is triggered by hardware optimization events and is performed concurrently on an application while it is running. We introduce an event-driven multithreaded dynamic optimization architecture, called Trident. Trident is a software/hardware solution which strives to reduce software runtime overhead as well as reduce hardware complexity and inflexibility. Trident takes advantage of two key features in modern processors, abundant chip-level parallelism (through Simultaneous Multithreading, Chip Multithreading, or a combination) and increasing hardware support for runtime performance monitoring. Trident proposes generic, lightweight extensions of the hardware monitoring mechanisms to profile the program’s execution behavior. Hardware triggers an event for optimization upon detection of any interesting behavior. Lightweight helper threads are spawned to process these events, in parallel with the main thread. The combination of event-driven and concurrent optimization makes Trident extremely low overhead in both profiling and optimization. This enables Trident to perform more expensive optimizations than existing dynamic systems, and perform continuous recurrent optimizations without fear of performance loss. The power and flexibility of Trident enable many types of optimizations. In this thesis, we demonstrate it with an aggressive optimization, called speculative dynamic value specialization. We also demonstrate Trident’s power of continuous, gradual optimization by improving traditional software prefetching to better attack the classical memory wall problem. These approaches include adaptive dynamic software prefetching via self-repairing and accelerating precomputation based prefetching.
The authors of these documents have submitted their reports to this technical report series for the purpose of non-commercial dissemination of scientific work. The reports are copyrighted by the authors, and their existence in electronic format does not imply that the authors have relinquished any rights. You may copy a report for scholarly, non-commercial purposes, such as research or instruction, provided that you agree to respect the author's copyright. For information concerning the use of this document for other than research or instructional purposes, contact the authors. Other information concerning this technical report series can be obtained from the Computer Science and Engineering Department at the University of California at San Diego, firstname.lastname@example.org.
[ Search ]