An Inverse Inference Engine for High Precision Web Search

Period of Performance: 07/25/2000 - 09/29/2002

$750K

Phase 2 SBIR

Recipient Firm

Mathsoft, Inc.
1700 Westlake Avenue N., Suite 500
Seattle, WA 98109

Principal Investigator

Abstract

The Phase I work demonstrated the precision and scalability of the inverse inference algorithm and its ability to perform latent semantic analysis. In Phase II, we will extend the algorithm to cross-language document retrieval, tracking of document clusters over time, and fast hierarchical clustering of large document databases. The indexing structure will evolve from an information matrix to an information tensor, which will accommodate multidimensional term attributes such as word position, part of speech, and taxonomic and syntactic tags. We will embed this richer indexing structure and all search functionality in the Oracle interMedia cartridge. New query operators will support word n-grams, ordered phrases, term broadening, cross-document entity tracking, and extraction of entity relationships. We will also improve the performance of the soft hyperlink navigation tool. We will validate the precision of our search technology by participating regularly in the TREC and CLEF competitions for the duration of the contract.
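
As an illustration only, the step from an information matrix to an information tensor can be sketched as below. This is a minimal toy example, not the inverse inference algorithm or the project's actual indexing code. The corpus, the function names (build_matrix, build_tensor, collapse), and the sparse Counter representation are all assumptions made for the sketch. It shows a term-by-document count matrix gaining a third axis for one term attribute (part of speech), and that summing over that axis recovers the original matrix.

```python
from collections import Counter

# Hypothetical toy corpus: each document is a list of (word, part_of_speech) pairs.
DOCS = [
    [("search", "NOUN"), ("engines", "NOUN"), ("search", "VERB"), ("documents", "NOUN")],
    [("users", "NOUN"), ("search", "VERB"), ("documents", "NOUN")],
]


def build_matrix(docs):
    """Sparse term-by-document counts, keyed by (term, doc_index)."""
    counts = Counter()
    for d, doc in enumerate(docs):
        for word, _tag in doc:
            counts[(word, d)] += 1
    return counts


def build_tensor(docs):
    """Sparse term-by-document-by-attribute counts, keyed by (term, doc_index, tag)."""
    counts = Counter()
    for d, doc in enumerate(docs):
        for word, tag in doc:
            counts[(word, d, tag)] += 1
    return counts


def collapse(tensor):
    """Sum over the attribute axis, giving back term-by-document counts."""
    counts = Counter()
    for (word, d, _tag), n in tensor.items():
        counts[(word, d)] += n
    return counts


matrix = build_matrix(DOCS)
tensor = build_tensor(DOCS)
```

In the matrix, the two occurrences of "search" in the first document are a single count of 2. In the tensor they are kept apart as one noun and one verb occurrence, which is the kind of distinction an attribute-aware query operator could use.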