Optical Scanner Input of Graphical Chemical Structure

Period of Performance: 05/01/1991 - 11/30/1992


Phase 2 SBIR

Recipient Firm

Fein-Marquart Associates
Baltimore, MD 21212
Principal Investigator


For many years, the National Cancer Institute (NCI) has surveyed the chemical literature to identify new chemical compounds of interest for testing against cancer. The chemical structure of each such compound must be entered into a computer system so that the compound can be registered against previously tested compounds, and so that records of the chemical structure can be searched and displayed as necessary by NCI, NCI contractor, and NCI supplier personnel. This chemical structure information is presently entered manually by chemists using interactive graphics terminals. Given recent developments in optical scanning and optical character recognition (OCR), it is believed that a system can be developed to read chemical structure information directly from the pages of a technical journal (or other publication) and to translate that information automatically into a connection table for use in a chemical-structure-oriented computer system. The proposed effort is intended to assess in detail the feasibility and practicality of developing such a system. Optical scanners, together with OCR software, will be evaluated for this application. Algorithms and procedures for recognizing various structural features, and for translating those features into a connection table, will be developed. The effectiveness of the system on selected printed chemical structures will be assessed, and recommendations for full prototype development will be given.
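
For readers unfamiliar with the term, a connection table records a structure as a list of atoms plus a list of bonds between them. The following is a minimal Python sketch of that idea only. The class name `ConnectionTable`, its methods, and the `ethanol` helper are hypothetical names chosen for illustration; they do not describe the format used by the NCI system or any output of the proposed scanner software.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ConnectionTable:
    """Illustrative connection table: atoms by index, bonds as index pairs.

    This layout is an assumption made for the example, not the NCI format.
    """
    atoms: List[str] = field(default_factory=list)                  # element symbol per atom
    bonds: List[Tuple[int, int, int]] = field(default_factory=list)  # (atom_i, atom_j, bond order)

    def add_atom(self, symbol: str) -> int:
        """Append an atom and return its index."""
        self.atoms.append(symbol)
        return len(self.atoms) - 1

    def add_bond(self, i: int, j: int, order: int = 1) -> None:
        """Record a bond between two distinct, existing atoms."""
        n = len(self.atoms)
        if i == j or not (0 <= i < n) or not (0 <= j < n):
            raise ValueError("a bond must join two distinct existing atoms")
        # Store the lower index first so each bond has one canonical form.
        self.bonds.append((min(i, j), max(i, j), order))

    def neighbors(self, i: int) -> List[int]:
        """Return the sorted indices of atoms bonded to atom i."""
        found = []
        for a, b, _order in self.bonds:
            if a == i:
                found.append(b)
            elif b == i:
                found.append(a)
        return sorted(found)


def ethanol() -> ConnectionTable:
    """Build the heavy-atom skeleton of ethanol (C-C-O), hydrogens implicit."""
    table = ConnectionTable()
    c1 = table.add_atom("C")
    c2 = table.add_atom("C")
    oxygen = table.add_atom("O")
    table.add_bond(c1, c2)
    table.add_bond(c2, oxygen)
    return table
```

In a scanning pipeline of the kind proposed, recognized atom labels would presumably map to `add_atom` calls and recognized line segments to `add_bond` calls, though the abstract does not specify how the translation step is organized.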