Refactor++: Automated Support for Program Enhancements

Period of Performance: 01/01/2012 - 12/31/2012


Phase 2 SBIR

Recipient Firm

Semantic Designs
13171 Pond Springs Road
Austin, TX 78729
Principal Investigator, Firm POC


Science and engineering depend on evolving ever more complex software, typically coded in C++, to investigate phenomena or design sophisticated products. The rate at which such software delivers value is often limited by its organization, and scientists and engineers spend significant effort understanding that organization and restructuring the code so that the next interesting concept or experiment can be integrated in a trustworthy fashion. This project will build Refactor++, a tool that helps engineers restructure their C++ code through highly automated, interactive support.

Reliably modifying computer software is hard. We leverage a commercial program transformation system, DMS, to handle the additional difficulties of C++, namely its complex syntax and semantics (significantly extended by the recent C++11 standard), and DMS's underlying parallel language, PARLANSE, to handle the problems caused by the growing size (millions of lines) of C++ applications. We invent code parsing algorithms to handle the pervasive problem of capturing preprocessor directives. We encode significant knowledge about C++ into program analysis tools. We build specialized refactorings for common code restructuring tasks, using the analysis tools to ensure that automated code changes made by Refactor++ are reliable, so users can apply them with confidence. The many-core parallelism of burgeoning workstations will be used to reduce response times for difficult analyses of large applications, by scaling PARLANSE from 32 to 64 bits and 64 cores. To enable widespread adoption, and therefore benefit, we leverage existing IDEs as front ends for Refactor++.

We found a novel means to parse and retain C++ preprocessor conditionals, enabling analysis tools to process the code as the user sees it; a patent is being filed. We enhanced the existing tool's C++98 front end to parse all of C++11, as well as the OpenMP directives used in scientific computing. A control flow analyzer was built for C++98 to gauge the effort of building a full flow analyzer; that analyzer was harnessed to find certain types of inefficient C++ operations that are invisible to programmers. An initial robust Renaming refactoring was implemented for a wide variety of C++ entities. An architecture for a full, generic, client-server-based Refactor++ tool was designed. A plan for engineering a 64-bit PARLANSE to run on both Windows and Linux systems was devised.

Symbol table, control flow, and data flow analysis algorithms that account for preprocessor conditionals will be implemented for full C++11. A variety of useful analyzers (locate precise definition/use, find dead/duplicate code, expose potential parallelism) and refactoring actions (rename, move code entity, restructure #include nests, remove duplicated code) will be implemented on top of a structure editor embedded in a Refactoring server. A widely available client IDE (Eclipse) will be integrated with the Refactoring server. A 64-bit PARLANSE compiler ecosystem will be built using DMS's present transformation capability as the compiler foundation.

Commercial Applications and Other Benefits: Modern science, ever more dependent on supercomputing, will produce results faster, improving US society and economic prowess. Already-ubiquitous embedded systems can provide more capability in shorter engineering windows. Parallel computing will become somewhat easier to accomplish because engineers will gain insight into the operation of their software and can restructure it to avoid misunderstandings and coding flaws. The infrastructure supporting Refactor++ will be applicable to other computer languages, eventually benefiting the entire software development community.