COCO: CONTINUOUS COMPILATION

Compilers, Programming Languages, and Architecture Group
Department of Computer Science, University of Pittsburgh


Listed below are publications from the Continuous Compilation project.

  
Profit-driven Optimization
Min Zhao, Bruce R. Childers, Mary Lou Soffa
University of Pittsburgh, Computer Science, Technical Report 05-129
December 2005
  
 Abstract
  
  Although optimizations have been applied for a number of years to improve the performance of software, problems with respect to the application of optimizations have not been adequately addressed. For example, optimizations may degrade performance in certain circumstances. However, there is no efficient way to know when a degradation will happen. In this research, we investigate the profitability of optimizations, which is useful for determining the benefit of applying optimizations. We develop a framework that enables us to predict profitability using analytic models. The profitability of an optimization depends on code context, the particular optimization and machine resources. Thus, our framework has analytic models for each of these components. As part of the framework, there is also a profitability engine that uses models to predict the profit. In this paper, we target scalar optimizations, and in particular, describe the models for partial redundancy elimination (PRE), loop invariant code motion (LICM) and value numbering (VN). We implemented the framework for predicting the profitability of these optimizations. Based on the predictions, we can selectively apply profitable optimizations. We compared the profit-driven approach with an approach that uses a heuristic in deciding when optimizations should be applied. Our experiments demonstrate that the profitability of scalar optimizations can be accurately predicted by using models. That is, without actually applying a scalar optimization, we can determine if an optimization is beneficial and should be applied.
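
The sketch below is a minimal illustration of the kind of profitability engine the abstract describes: it combines a code model, an optimization model (here PRE), and a resource model to predict profit before the optimization is applied. All names, costs, and the profit formula are illustrative assumptions, not the paper's actual models.

    # Hypothetical sketch of a profitability engine: combine a code model
    # (what the optimization would change), an optimization model (the code
    # transformation's effect), and a resource model (e.g., register pressure)
    # to predict profit before actually applying the optimization.

    from dataclasses import dataclass

    @dataclass
    class CodeModel:
        redundant_loads: int      # loads PRE could remove on this path
        path_frequency: int       # estimated executions of the path
        live_ranges: int          # live ranges in the affected region

    @dataclass
    class ResourceModel:
        available_registers: int
        load_cost: int = 3        # cycles per load (illustrative)
        spill_cost: int = 6       # cycles per spill (illustrative)

    def predict_profit_pre(code: CodeModel, res: ResourceModel) -> int:
        """Predicted cycles saved by PRE on this region (negative = degradation)."""
        benefit = code.redundant_loads * code.path_frequency * res.load_cost
        # PRE introduces a temporary, which can raise register pressure and
        # cause spills; charge that cost against the benefit.
        extra_ranges = 1
        spills = max(0, code.live_ranges + extra_ranges - res.available_registers)
        cost = spills * code.path_frequency * res.spill_cost
        return benefit - cost

    if __name__ == "__main__":
        region = CodeModel(redundant_loads=2, path_frequency=10_000, live_ranges=30)
        machine = ResourceModel(available_registers=31)
        profit = predict_profit_pre(region, machine)
        print("apply PRE" if profit > 0 else "skip PRE",
              f"(predicted profit: {profit} cycles)")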
 
TDB: A Source-Level Debugger for Dynamically Translated Programs
Naveen Kumar, Bruce R. Childers, Mary Lou Soffa
ACM SIGSOFT/SIGPLAN Sixth International Symposium on Automated Analysis-Driven Debugging (AADEBUG'05)
September 2005
  
 Abstract
  
  Debugging techniques have evolved over the years in response to changes in programming languages, implementation techniques, and user needs. A new type of implementation vehicle for software has emerged that, once again, requires new debugging techniques. Software dynamic translation (SDT) has received much attention due to compelling applications of the technology, including software security checking, binary translation, and dynamic optimization. Using SDT, program code changes dynamically, and thus, debugging techniques developed for statically generated code cannot be used to debug these applications. In this paper, we describe a new debug architecture for applications executing with SDT systems. The architecture provides features that create the illusion that the source program is being debugged, while allowing the SDT system to modify the executing code. We incorporated this architecture in a new tool, called tdb, that integrates an SDT system, Strata, with a widely used debugger, gdb. We evaluated tdb in the context of a code security checker. The results show that a dynamically translated program can be debugged at the source level and that the approach does not unduly increase the run-time overhead or memory demands of the debugger.
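
As a rough illustration of the bookkeeping such a debug architecture needs, the sketch below maps original instruction addresses to their copies in the translator's code cache so a source-level breakpoint can be planted in the code that actually runs. The structure and names are assumptions for illustration, not tdb's design.

    # Illustrative sketch: map each original instruction address to the
    # address(es) of its copies in the translator's code cache, so a
    # source-level breakpoint can be planted where the code actually executes.

    class DebugMap:
        def __init__(self):
            self.orig_to_translated = {}   # original PC -> list of translated PCs

        def record(self, orig_pc: int, translated_pc: int) -> None:
            """Called by the translator each time it emits a copy of an instruction."""
            self.orig_to_translated.setdefault(orig_pc, []).append(translated_pc)

        def breakpoint_sites(self, orig_pc: int) -> list:
            """All code-cache locations where a breakpoint on orig_pc must be set."""
            return self.orig_to_translated.get(orig_pc, [])

    if __name__ == "__main__":
        dmap = DebugMap()
        dmap.record(0x400520, 0x7F0000001000)   # first translation of the instruction
        dmap.record(0x400520, 0x7F0000004230)   # retranslated into a second fragment
        print([hex(pc) for pc in dmap.breakpoint_sites(0x400520)])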
 
Low Overhead Program Monitoring and Profiling
Naveen Kumar, Bruce R. Childers, Mary Lou Soffa
ACM SIGSOFT/SIGPLAN Workshop on Program Analysis for Software Tools and Engineering (PASTE'05)
September 2005
  
 Abstract
  
  Program instrumentation, inserted either before or during execution, is rapidly becoming a necessary component of many systems. Instrumentation is commonly used to collect information for many diverse analysis applications, such as detecting program invariants, dynamic slicing and alias analysis, software security checking, and computer architecture modeling. Because instrumentation typically has a high run-time overhead, techniques are needed to mitigate the overheads. This paper describes "instrumentation optimizations" that reduce the overhead of profiling for program analysis. Our approach applies transformations to the instrumentation code that reduce the (1) number of instrumentation points executed, (2) cost of instrumentation probes, and (3) cost of instrumentation payload, while maintaining the semantics of the original instrumentation. We present the transformations and apply them for program profiling and computer architecture modeling. We evaluate the optimizations and show that the optimizations improve profiling performance by 1.26-2.63x and architecture modeling performance by 2-3.3x.
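
A toy sketch of the first of the three reductions named above: shrinking the number of instrumentation points by coalescing per-instruction counting probes into one per-block probe whose payload adds the block's instruction count. The example is an assumption-level illustration, not the paper's transformations.

    # Hypothetical illustration of reducing instrumentation points: instead of
    # one counting probe per instruction, place one probe per basic block and
    # let its payload add the block's instruction count in a single update.

    def naive_probe_count(blocks):
        """One probe per instruction: probe count equals total instructions."""
        return sum(len(insns) for insns in blocks.values())

    def coalesced_probe_count(blocks):
        """One probe per basic block; payload records len(insns) at once."""
        return len(blocks)

    if __name__ == "__main__":
        cfg = {"B0": ["ld", "add", "st"], "B1": ["cmp", "br"], "B2": ["ret"]}
        print("probes before:", naive_probe_count(cfg))      # 6
        print("probes after: ", coalesced_probe_count(cfg))  # 3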
 
Planning for Code Buffer Management in Distributed Virtual Execution Environments
Shukang Zhou, Bruce R. Childers, Mary Lou Soffa
ACM/USENIX Conference on Virtual Execution Environments (VEE'05)
June 2005
  
 Abstract
  
  Virtual execution environments have become increasingly useful in system implementation, with dynamic translation techniques being an important component for performance-critical systems. Many devices have exceptionally tight performance and memory constraints (e.g., smart cards and sensors in distributed systems), which require effective resource management. One approach to manage code memory is to download code partitions on-demand from a server and to cache the partitions in the resource-constrained device (client). However, due to the high cost of downloading code and re-translation, it is critical to intelligently manage the code buffer to minimize the overhead of code buffer misses. Yet, intelligent buffer management on the tightly constrained client can be too expensive. In this paper, we propose to move code buffer management to the server, where sophisticated schemes can be employed. We describe two schemes that use profiling information to direct the client in caching code partitions. One scheme is designed for workloads with stable run-time behavior, while the other scheme adapts its decisions for workloads with unstable behaviors. We evaluate and compare our schemes and show they perform well, compared to other approaches, with the adaptive scheme having the best performance overall.
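
The sketch below illustrates the general idea of server-directed buffer management under a profile-based policy: on a client miss, evict the resident partition with the lowest profiled execution frequency. The policy and names are illustrative assumptions; the paper's two schemes are more elaborate.

    # Illustrative sketch of server-directed code-buffer management: the server
    # keeps profile-derived execution frequencies for each code partition and,
    # on a client buffer miss, directs the client to evict the least frequently
    # executed resident partition.

    def choose_victim(resident, profile_freq):
        """Pick the resident partition with the lowest profiled frequency."""
        return min(resident, key=lambda p: profile_freq.get(p, 0))

    def handle_miss(requested, resident, buffer_slots, profile_freq):
        if len(resident) >= buffer_slots:
            victim = choose_victim(resident, profile_freq)
            resident.remove(victim)
        resident.add(requested)
        return resident

    if __name__ == "__main__":
        freq = {"init": 5, "parse": 900, "encode": 1200, "report": 40}
        cache = {"init", "parse", "encode"}
        handle_miss("report", cache, buffer_slots=3, profile_freq=freq)
        print(sorted(cache))   # 'init' evicted: it is the coldest partition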
 
Compile-time Planning for Overhead Reduction in Software Dynamic Translation
Naveen Kumar, Bruce R. Childers, Daniel Williams, Jack W. Davidson and Mary Lou Soffa
International Journal of Parallel Programming
Accepted December 2004, to appear.
  
 Abstract
  
  Software dynamic translation (SDT) is a technology for modifying programs as they are running. The overhead of monitoring and modifying a running program's instructions is often substantial in SDT systems. As a result, SDT can be impractically slow, especially in SDT systems that do not or can not employ dynamic optimization to offset overhead. This is unfortunate since SDT has many advantages in modern computing environments and interesting uses of SDT continue to emerge. In this paper, we describe techniques to reduce the overhead of SDT. In particular, we present a compile-time planning technique to reduce the overhead due to indirect branch handling. Our results show that this technique is very effective and can improve SDT performance by up to 36%, with an average of 20%.
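
As a conceptual illustration of why indirect branches are expensive in SDT and how caching helps, the sketch below memoizes the mapping from an application-level indirect-branch target to its translated fragment so repeated targets skip the translator. It sketches the general technique, not the paper's compile-time planning mechanism.

    # Conceptual sketch: an indirect branch's target is known only at run time,
    # so the translator must map the application target address to a translated
    # fragment. Caching that mapping lets most lookups avoid the expensive
    # translator entry.

    class IndirectBranchCache:
        def __init__(self, translate):
            self.translate = translate        # slow path: translator lookup/build
            self.cache = {}                   # app target PC -> fragment PC
            self.hits = self.misses = 0

        def lookup(self, target_pc):
            frag = self.cache.get(target_pc)
            if frag is None:
                self.misses += 1
                frag = self.translate(target_pc)   # expensive: enter translator
                self.cache[target_pc] = frag
            else:
                self.hits += 1
            return frag

    if __name__ == "__main__":
        ibc = IndirectBranchCache(translate=lambda pc: pc + 0x10000000)
        for pc in [0x400100, 0x400200, 0x400100, 0x400100]:
            ibc.lookup(pc)
        print(f"hits={ibc.hits} misses={ibc.misses}")   # hits=2 misses=2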
 
Jazz: A Tool for Demand-Driven Structural Testing
J. Misurda, J. Clause, J. L. Reed, P. Gandra, B. R. Childers, and M. L. Soffa
14th ETAPS International Conference on Compiler Construction
Edinburgh, Scotland, April 2005
  
 Abstract
  
  Software testing to produce reliable and robust software has become vitally important in recent years. Testing is a process by which software quality can be assured through the collection of information about software. While testing can improve software reliability, current tools typically are inflexible and have high overheads, making it a challenge to test large software projects. In this paper, we describe a new scalable and flexible tool, called Jazz, for testing Java programs. Jazz uses a novel demand-driven approach for node, branch, and def-use structural testing. It has a graphical user interface for specifying and running tests, a test planner to determine the most efficient way to test the program, and a dynamic instrumenter to carry out a test. Jazz has a low average overhead of 17.6% for branch testing, while in comparison, a traditional tool has an overhead of 89.9%.
 
Demand-Driven Structural Testing with Dynamic Instrumentation
Jonathan Misurda, James Clause, Juliya L. Reed, Bruce R. Childers, and Mary Lou Soffa
ACM SIGSOFT Int'l. Conference on Software Engineering (ICSE'05)
St. Louis, Missouri, May 2005
  
 Abstract
  
  Producing reliable and robust software has become one of the most important software development concerns in recent years. Testing is a process by which software quality can be assured through the collection of information about software. While testing can improve software reliability, current tools typically are inflexible and have high overheads, making it challenging to test large software projects. In this paper, we describe a new scalable and flexible framework for testing programs with a novel demand-driven approach based on execution paths to implement testing coverage. This technique uses dynamic instrumentation on the binary code that can be inserted and removed on-the-fly to keep performance and memory overheads low. We describe and evaluate implementations of the framework for branch, node and def-use testing of Java programs. Experimental results for branch testing show that our approach has, on average, a 1.6x speedup over static instrumentation and also uses less memory.
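
A minimal model of the demand-driven idea: a coverage probe records its branch the first time it fires and is then removed, so later executions of that branch pay no instrumentation cost. Real systems patch binary code on the fly; this sketch, with assumed names, models only the bookkeeping.

    # Illustrative sketch of demand-driven coverage: each probe reports once
    # and then removes itself, so subsequent executions run uninstrumented.

    class BranchCoverage:
        def __init__(self, branch_ids):
            self.pending = set(branch_ids)   # branches still carrying a probe
            self.covered = set()

        def probe_fired(self, branch_id):
            """Called when an instrumented branch executes for the first time."""
            self.covered.add(branch_id)
            self.pending.discard(branch_id)  # remove the probe: no further cost

    if __name__ == "__main__":
        cov = BranchCoverage(["b1", "b2", "b3"])
        for executed in ["b1", "b2", "b1", "b1"]:
            if executed in cov.pending:
                cov.probe_fired(executed)
        print("covered:", sorted(cov.covered),
              "still instrumented:", sorted(cov.pending))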
 
A Model-Based Framework: An Approach for Profit-driven Optimization
Min Zhao, Bruce R. Childers, and Mary Lou Soffa
ACM SIGMICRO Int'l. Conference on Code Generation and Optimization (CGO'05)
San Jose, California, March 2005
  
 Abstract
  
  Although optimizations have been applied for a number of years to improve the performance of software, long-standing problems remain. To systematically tackle these problems, we need to understand the properties of optimizations. In our current research, we are investigating the profitability property, which is useful for determining the benefit of applying an optimization. Because of the high cost of applying optimizations and then experimentally evaluating their profitability, we use an analytic approach and develop a model-based framework for predicting the profitability of optimizations. In this paper, we target scalar optimizations, and in particular, describe framework instances for Partial Redundancy Elimination (PRE) and Loop Invariant Code Motion (LICM). We implemented the framework for both optimizations and compare profit-driven PRE and LICM with a heuristic-driven approach. Our experiments demonstrate that a model-based approach is effective and efficient in that it can accurately predict the profitability of optimizations without much overhead. By predicting profitability with models, we can selectively apply optimizations. The model-based approach does not require tuning of the parameters used in the heuristic approach and works well across different code contexts and optimizations.
 
Instrumentation in Software Dynamic Translators for Self-Managed Systems
Naveen Kumar, Jonathan Misurda, Bruce R. Childers and Mary Lou Soffa
ACM SIGSOFT Workshop on Self-Managing Systems, during ACM SIGSOFT Foundations of Software Engineering
Newport Beach, California, October 2004
  
 Abstract
  
  Self-managed software requires monitoring and code changes to an executing program. One technology that enables such self management is software dynamic translation (SDT), which allows a program's execution to be intercepted and controlled by a separate software layer. SDT has been used to build many useful applications, including software security checkers that check for code vulnerabilities, dynamic code optimizers, and program introspection tools. While these systems use program instrumentation, the instrumentation is usually tailored to a specific application and infrastructure. What is missing is a single scalable and flexible instrumentation framework that can be used in different self-managed SDT infrastructures. In this paper, we describe such a framework, called "FIST," that can be used and targeted by different algorithms and tools to enable instrumentation applications that are portable across SDTs and machine platforms. Our interface supports multiple levels of granularity, from source-level constructs to the instruction and machine level. We describe and evaluate FIST's capabilities in the Strata system for the SPARC and the Jikes Research Virtual Machine for Java on the Intel x86.
 
Overhead Reduction Techniques for Software Dynamic Translation
Kevin Scott, Naveen Kumar, Bruce R. Childers, Jack W. Davidson, and Mary Lou Soffa
NSF Workshop on Next Generation Software, during the Int'l. Parallel and Distributed Processing Symposium
Santa Fe, New Mexico, April 2004
  
 Abstract
  
  Software dynamic translation (SDT) is a technology that allows programs to be modified as they are running. The overhead of monitoring and modifying a running program's instructions is often substantial in SDT systems. As a result, SDT can be impractically slow, especially in SDT systems that do not or can not employ dynamic optimization to offset overhead. This is unfortunate since SDT has obvious advantages in modern computing environments and interesting applications of SDT continue to emerge. In this paper, we investigate several overhead reduction techniques, including indirect branch translation caching, fast returns, and static trace formation, that can improve SDT performance significantly.
 
Compact Binaries with Code Compression in a Software Dynamic Translator
Stacey Shogan and Bruce R. Childers
Conference on Design, Automation and Test in Europe (DATE'04)
Paris, France, February 2004.
  
 Abstract
  
  Embedded software is becoming more flexible and adaptable, which presents new challenges for management of highly constrained system resources. Software dynamic translation is a technology that has been used to enable software malleability at the instruction level for dynamic code optimizers, security checkers, and binary translators. In this paper, we study the feasibility of using software dynamic translation to manage program code storage in embedded systems. We explore to what extent code compression can be incorporated in a purely software infrastructure to reduce program storage requirements, while minimally impacting run-time performance and memory resources. We describe two approaches for code compression, called full and partial image compression, and evaluate their compression ratios and performance in a software dynamic translation system. We demonstrate that code decompression is indeed feasible in a software dynamic translator.
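
The sketch below illustrates the decompress-on-demand idea behind partial image compression: code regions are stored compressed, and a region is decompressed only when the translator first fetches from it. zlib stands in for whatever codec a real system would use; the interface is an assumption.

    # Conceptual sketch of code decompression inside a software dynamic
    # translator: regions are stored compressed and decompressed on first use,
    # trading a one-time decompression cost for a smaller stored image.

    import zlib

    class CompressedImage:
        def __init__(self, regions):
            # regions: name -> raw machine-code bytes; stored compressed
            self.stored = {n: zlib.compress(code) for n, code in regions.items()}
            self.decompressed = {}

        def fetch(self, region):
            """Return bytes for a region, decompressing on first use."""
            if region not in self.decompressed:
                self.decompressed[region] = zlib.decompress(self.stored[region])
            return self.decompressed[region]

    if __name__ == "__main__":
        image = CompressedImage({"main": b"\x90" * 4096, "helper": b"\xcc" * 2048})
        stored = sum(len(b) for b in image.stored.values())
        print("stored bytes:", stored, "fetched main:", len(image.fetch("main")))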
 
Profile-Guided Management of Code Partitions for Embedded Systems
Shukang Zhou, Bruce R. Childers, and Naveen Kumar
Conference on Design, Automation and Test in Europe (DATE'04)
Paris, France, February 2004.
  
 Abstract
  
  Researchers have proposed to divide embedded applications into code partitions and to download partitions on demand from a wireless code server to enable a diverse set of applications for very tightly constrained embedded systems. This paper describes a new approach for managing the request and storage of code partitions, and we explore the benefits of our scheme.
 
Instrumentation for Software Dynamic Translation
Naveen Kumar and Bruce R. Childers
Workshop on Exploring the Trace Space for Dynamic Optimization Techniques, ACM Int'l. Conference on Supercomputing
San Francisco, California, June 2003.
  
 Abstract
  
  Software dynamic translators have been used for many purposes, such as dynamic code optimization, profiling, and security. Many of these applications need to instrument a program's binary code to gather run-time information about the program. Such instrumentation is varied, with different software dynamic translator applications requiring different kinds of information. Hence, there is a need for a flexible mechanism for information gathering and instrumentation in software dynamic translators. In this paper, we describe our approach to providing flexible instrumentation. We also experimentally evaluate our approach, investigating its overhead and demonstrating its flexibility for different software dynamic translation systems.
 
Predicting the Impact of Optimizations for Embedded Systems
Min Zhao, Bruce R. Childers, and Mary Lou Soffa
ACM Conference on Languages, Compilers, and Tools for Embedded Systems
San Diego, California, June 2003.
  
 Abstract
  
  When applying optimizations, a number of decisions are made using fixed strategies, such as always applying an optimization if it is applicable, applying optimizations in a fixed order, and assuming a fixed configuration for optimizations such as tile size and loop unrolling factor. While it is widely recognized that these fixed strategies may not be the most appropriate for producing high quality code, especially for embedded systems, there are no general and automatic strategies that do otherwise. In this paper, we present a framework that enables these decisions to be made based on predicting the impact of an optimization, taking into account resources and code context. The framework consists of optimization models, code models and resource models, which are integrated for predicting the impact of applying optimizations. Because data cache performance is important to embedded codes, we focus on cache performance and present an instance of the framework for cache performance in this paper. Since most opportunities for cache improvement come from loop optimizations, we describe code, optimization and cache models tailored to predict the impact of applying loop optimizations for data locality. Experimentally, we demonstrate the need to selectively apply optimizations and show the performance benefit of using our framework to predict when to apply an optimization. We also show that our framework can be used to choose the most beneficial optimization when several optimizations can be applied to a loop nest. Lastly, we show that the framework can be used to combine optimizations on a loop nest.
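
As a toy example of model-based prediction for a loop optimization, the sketch below uses a deliberately simple miss model (one miss per newly touched cache line) to decide whether loop interchange on a row-major array would be profitable before applying it. The model and its parameters are assumptions, not the paper's cache model.

    # Toy illustration of predicting cache impact before applying a loop
    # optimization: estimate misses for a 2-D row-major array under two loop
    # orders and "apply" loop interchange only if the model predicts a gain.

    def predicted_misses(rows, cols, line_elems, row_major_inner):
        if row_major_inner:
            # inner loop walks a row: one miss per cache line of each row
            return rows * -(-cols // line_elems)     # ceiling division
        # inner loop walks a column: each access touches a new line (large arrays)
        return rows * cols

    if __name__ == "__main__":
        before = predicted_misses(1024, 1024, line_elems=8, row_major_inner=False)
        after = predicted_misses(1024, 1024, line_elems=8, row_major_inner=True)
        verdict = "profitable" if after < before else "not profitable"
        print(f"interchange is {verdict} (predicted misses {before} -> {after})")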
 
Continuous Compilation: A New Approach to Aggressive and Adaptive Code Transformation
Bruce R. Childers, Jack W. Davidson, and Mary Lou Soffa
Workshop on Next Generation Software, Int'l. Parallel and Distributed Processing Symposium
Nice, France, April 2003.
  
 Abstract
  
  Over the past several decades, the compiler research community has developed a number of sophisticated and powerful algorithms for a variety of code improvements. While there are still promising directions for particular optimizations, research on new or improved optimizations is reaching the point of diminishing returns, and new approaches are needed to achieve significant performance improvements beyond traditional optimizations. In this paper, we describe a new strategy based on a continuous compilation system that constantly improves application code by applying aggressive and adaptive code optimizations at all times, from static optimization to online dynamic optimization. We describe our general approach and process for continuous compilation of application code and present initial results from our research. These initial results include a new prediction framework that can estimate the benefit of applying code transformations without actually performing the transformations. We also describe results that demonstrate the benefit of adaptively changing application code for embedded systems to make trade-offs between code size, performance, and power consumption.
 
Retargetable and Reconfigurable Software Dynamic Translation
K. Scott, N. Kumar, S. Velusamy, B. Childers, J. Davidson, and M. L. Soffa
ACM SIGMICRO Int'l. Conf. on Code Generation and Optimization
San Francisco, March 2003.
  
 Abstract
  
  Software dynamic translation (SDT) is a technology that permits the modification of an executing program's instructions. In recent years, SDT has received increased attention, from both industry and academia, as a feasible and effective approach to solving a variety of significant problems. Despite this increased attention, the task of initiating a new project in software dynamic translation remains a difficult one. To address this concern, and in particular, to promote the adoption of SDT technology into an even wider range of applications, we have implemented Strata, a cross-platform infrastructure for building software dynamic translators. This paper describes Strata's architecture, our experience retargeting it to three different processors, and our use of Strata to build two novel SDT systems--one for safe execution of untrusted binaries and one for fast prototyping of architectural simulators.
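
For readers unfamiliar with SDT, the sketch below shows the generic fetch-translate-execute loop over basic blocks with a fragment cache that infrastructures like Strata provide; it is a schematic illustration, not Strata's code.

    # Generic sketch of the control loop at the heart of a software dynamic
    # translator: translate the application one basic block at a time into a
    # fragment cache, execute the cached fragment, and follow its exit to the
    # next block.

    def run(program, entry):
        """program: block_id -> (payload, next_block_id or None)."""
        fragment_cache = {}
        pc = entry
        trace = []
        while pc is not None:
            if pc not in fragment_cache:                   # translator entry (miss)
                payload, successor = program[pc]
                fragment_cache[pc] = (payload, successor)  # "emit" translated block
            payload, successor = fragment_cache[pc]        # execute cached fragment
            trace.append(payload)
            pc = successor
        return trace

    if __name__ == "__main__":
        prog = {"entry": ("setup", "loop"),
                "loop": ("work", "exit"),
                "exit": ("done", None)}
        print(run(prog, "entry"))   # ['setup', 'work', 'done']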
 

Last modified: December 22, 2004. This page is maintained by Bruce Childers