Natural Language Retrieval for Medical Texts

Period of Performance: 09/30/1993 - 08/31/1994


Phase 2 SBIR

Recipient Firm

Conquest Software, Inc.
Columbia, MD 21046

Principal Investigator


An integrated natural language processing (NLP) and information retrieval (IR) system will be developed for text retrieval and browsing in large medical text databases. The project will focus on medical text collections useful to NIMH, particularly the Mind-Brain project. The test database will include MEDLINE, research and grant proposals, and research papers or textbooks. The integrated NLP/IR system, successfully proven in Phase I, uses natural language processing techniques, word meaning processing, and concept-based information retrieval. Several machine-readable dictionaries will be used, including Merriam-Webster, Princeton's WordNet, the UMLS Metathesaurus and MeSH terms, and a library of terms focused on mental health and the Mind-Brain project. This integrated resource will serve as a knowledge base for processing natural language retrieval requests from researchers and practitioners searching medical texts. Retrieval experiments will be conducted on at least 10 Gbytes of medical text to estimate the precision, recall, speed, and utility of various query methods. The integrated system will be designed and tested to operate on databases over the Internet. This project has significant commercial application for information retrieval and has already attracted customers in the medical field.
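
The abstract names precision and recall as evaluation measures but does not say how they will be computed. As a minimal sketch, assuming the standard set-based definitions (precision is the share of retrieved documents that are relevant; recall is the share of relevant documents that are retrieved), the calculation for a single query could look like the following. The function name and document IDs are hypothetical and are not taken from the project.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set-based definitions.

    retrieved: document IDs returned by the system for the query
    relevant:  document IDs judged relevant to the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant documents that were retrieved

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc7"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```

In an experiment over many queries, these per-query values would typically be averaged; how the project aggregated its results is not stated in the abstract.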