Efficient Search over Heterogeneous XML Databases in Unstructured Peer-To-Peer Networks

Period of Performance: 03/08/2006 - 09/09/2006

$100K

Phase 1 SBIR

Recipient Firm

Stottler Henke Associates
1650 South Amphlett Boulevard, Suite 300
San Mateo, CA 94402

Principal Investigator

Abstract

We propose a new peer-to-peer system, Emerge, that will support expressive, structured queries over peers sharing XML data in heterogeneous formats. The system uses a novel combination of features to address the query routing and semantic interoperability problems that plague existing systems. Emerge dynamically builds a set of shortcuts that exploit the content and interest locality exhibited by search requests, so that query routing performance improves over time. These shortcuts are designed to be modular: they can be easily combined with other optimization approaches and remain functional regardless of the query language used. To combat the inevitable information loss that occurs as data is passed between peers with heterogeneous schemas, the system uses information that can be observed passively, such as query cycles and returned results, to identify and refine faulty schema mappings. Phase I research and development of a limited proof-of-concept prototype will demonstrate the feasibility and utility of Emerge's query routing optimizations and bottom-up schema refinement, and will lay the groundwork for its Phase II implementation and eventual commercialization.
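
The shortcut idea described in the abstract can be illustrated with a minimal sketch. This is not Emerge's actual design, which the abstract does not specify. It assumes a generic interest-based shortcut scheme: a peer remembers which peers answered its earlier queries, tries those first, and falls back to asking its ordinary neighbors. The names `ShortcutList`, `route_query`, and `has_answer` are hypothetical. The query is treated as an opaque value, which is one way the routing could stay independent of the query language.

```python
from collections import OrderedDict


class ShortcutList:
    """Bounded set of peers that answered earlier queries (hypothetical)."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self._peers = OrderedDict()  # peer id -> count of successful answers

    def record_success(self, peer_id):
        # Re-insert the peer at the "most recent" end and bump its count.
        hits = self._peers.pop(peer_id, 0) + 1
        self._peers[peer_id] = hits
        # Evict the least recently successful peer when over capacity.
        while len(self._peers) > self.capacity:
            self._peers.popitem(last=False)

    def candidates(self):
        # Most recently successful peers are tried first.
        return list(reversed(self._peers))


def route_query(query, shortcuts, neighbors, has_answer):
    """Try shortcut peers first, then fall back to the plain neighbor list.

    `has_answer(peer, query)` is an assumed callback standing in for
    sending the query to a peer; the query itself is never inspected here.
    """
    for peer in shortcuts.candidates():
        if has_answer(peer, query):
            shortcuts.record_success(peer)
            return peer, "shortcut"
    for peer in neighbors:
        if has_answer(peer, query):
            shortcuts.record_success(peer)
            return peer, "flood"
    return None, "miss"


# Toy network: each peer shares a set of topics (made-up data).
content = {"A": {"jazz"}, "B": {"rock"}, "C": {"jazz", "blues"}}
has_answer = lambda peer, query: query in content[peer]

shortcuts = ShortcutList(capacity=2)
first = route_query("jazz", shortcuts, ["B", "A", "C"], has_answer)
second = route_query("jazz", shortcuts, ["B", "A", "C"], has_answer)
print(first, second)
```

In this toy run the first "jazz" query is resolved by asking neighbors, and the repeat query is resolved through the shortcut learned from it, which is the locality effect the abstract relies on. The capacity limit and the recency-based eviction are illustrative choices, not details taken from the proposal.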