An Integrated Suite of Text and Data Mining Tools for Program Managers

Period of Performance: 02/23/2001 - 08/23/2001


Phase I SBIR

Recipient Firm

Search Technology, Inc.
4960 Peachtree Industrial Blvd.
Norcross, GA 30071
Principal Investigator


This proposal describes an effort to build an integrated suite of tools for R&D Program Managers, incorporating text mining and data mining tools for information extraction and knowledge discovery from requirements sources and bibliographic databases of R&D literature. Successful program management depends in part on identifying and understanding requirements, discerning linkages among requirements (e.g., commonality, dependency, and priority), and recognizing correspondence between program requirements and the capabilities of available resources. Requirements take several forms, but of particular interest are large written documents, such as Strategic Plans and R&D Master Plans. Requirements may also originate from databases of operating experience and maintenance information. Whether in database form or in the resulting documents, these information sources are daunting to master. The technologies of text and data mining have great potential for assisting Program Managers in defining and understanding requirements from these very large data sources by identifying relationships among requirements and discovering connections between the requirements and other R&D activities reported in bibliographic databases. In Phase I, we will 1) analyze requirements sources, 2) prepare a report on text and data mining techniques, 3) develop a software specification, and 4) demonstrate feasibility by developing a prototype.
Successful completion of all three phases of this program will result in a powerful suite of tools for text mining. Program Managers in large organizations (government and commercial) will be able to use these tools to extract knowledge from databases of operational and maintenance experience. This knowledge will assist the Program Manager in defining, articulating, and defending programmatic requirements.
The suite of tools will also allow the manager to mine clusters of requirements from free-text documents such as Requirements Documents, Science and Technology Master Plans, and Strategic Plans. These requirements clusters can then be used to mine open-literature S&T bibliographic databases to identify centers of excellence and to assess the qualifications of individuals and organizations submitting proposals. By cross-mining requirements documents and S&T literature, the manager can also find new relationships among technologies and applications that may provide leverage points for investment of R&D resources. By mining internal research plans against patent databases, managers can better protect an organization's intellectual property by assessing how its research agenda and product development plans compare with competitors' patent strategies.