Abstract
We present a new approach that uses data flow techniques to solve compile-time analysis problems for languages with general-purpose pointer usage. As representative problems, we solve def-use associations for C and type determination for C++. A study of the close interaction of aliasing with these problems has led us to develop a unified approach that solves them simultaneously with aliasing, rather than a factored approach, which may lose significant precision. These problems are fundamental because they provide semantic information whose precision can greatly affect the quality and utility of almost all other compile-time analyses. Our polynomial-time algorithms are approximate, as is to be expected, since we have shown that the problems we are solving are NP-hard. Robust empirical results on actual programs validate our analysis approach and demonstrate its utility.
Def-use analysis links the statements that may set the value of a variable (i.e., definitions) to the statements that may fetch that value (i.e., uses). Def-use associations are thus data dependences that can be calculated at compile time; they are necessary for software engineering applications such as data flow testing coverage, static slicing, and the integration of non-interfering versions of programs. Ours is the first interprocedural def-use associations algorithm that accounts for pointer usage and yields reasonable accuracy.
Type determination calculates, at compile time, the possible types of the objects to which a pointer may point during some execution of the program. In C++, the type of the object pointed to by the receiver at a virtual call site dynamically determines the function to be invoked. In suitable circumstances, type determination enables us to replace this late binding with a direct call to the appropriate function or with inlined code, thereby eliminating the late binding overhead and improving execution efficiency.
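As a minimal illustration of what type determination makes possible, the following C++ sketch contrasts a virtual call whose receiver may point to two types with one whose receiver can point to only one. The classes Shape, Circle and Square and all function names are hypothetical, chosen only for this example. The code does not implement the analysis; it only shows, by hand, the kind of direct call that could replace late binding once the set of possible receiver types is known to contain a single type.

```cpp
#include <cassert>
#include <string>

// Hypothetical class hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};
struct Square : Shape {
    std::string name() const override { return "square"; }
};

// The receiver p may point to a Circle or a Square, depending on the
// path taken, so the possible-type set is {Circle, Square} and the
// call must stay late bound.
std::string call_ambiguous(bool pick_circle) {
    Circle c;
    Square s;
    Shape* p = &s;
    if (pick_circle) p = &c;
    return p->name();  // virtual dispatch required
}

// Only a Circle can reach p, so the possible-type set is {Circle}.
// As written, the call is still dispatched virtually.
std::string call_unique() {
    Circle c;
    Shape* p = &c;
    return p->name();
}

// Hand-written version of what a compiler could emit for call_unique:
// a qualified call, which bypasses virtual dispatch.
std::string call_unique_resolved() {
    Circle c;
    Shape* p = &c;
    // The downcast is valid only because p is known to point to a Circle.
    assert(dynamic_cast<Circle*>(p) != nullptr);
    return static_cast<Circle*>(p)->Circle::name();  // direct call
}
```

Calling call_unique() and call_unique_resolved() yields the same string; the difference is only in how the callee is selected. In call_ambiguous, no single direct call is correct for every execution, which is why the precision of the computed type sets matters.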
We show how our work particularly benefits architectures that use deep pipelining and branch prediction. We also show its utility in building a more precise call graph and in improving the efficacy of subsequent analyses for C++. We present a combined algorithm for aliasing and type determination for C++, the first data flow technique for this problem in an object-oriented language.
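The dependence of def-use associations on aliasing, mentioned earlier in the abstract, can be made concrete with a small sketch. The function and variable names are hypothetical, and the labels d1, d2 and u1 are annotations added for this example rather than notation from the work itself. Because the pointer p may refer to either x or y, the assignment through p is a possible definition of x, so a def-use analysis that ignored aliasing would miss the association (d2, u1).

```cpp
#include <cassert>

// Hypothetical example: which definitions of x reach the use u1
// depends on what p may point to.
int def_use_with_alias(bool take_alias) {
    int x = 1;                      // d1: definition of x
    int y = 2;                      // definition of y
    int* p = take_alias ? &x : &y;  // p may alias x or y
    assert(p == &x || p == &y);     // the may-alias set of *p is {x, y}
    *p = 10;                        // d2: defines x when p == &x, else y
    return x;                       // u1: use of x
    // Associations for x: (d1, u1) when p == &y, (d2, u1) when p == &x.
}
```

At run time, def_use_with_alias(true) returns the value written through p, while def_use_with_alias(false) returns the initial value of x. A static analysis has to report both associations, since either path may execute.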