Petaminer: Efficient Navigation to Petascale Data Using Event-Level Metadata

Period of Performance: 01/01/2008 - 12/31/2008


Phase 1 SBIR

Recipient Firm

Tech-X Corporation
5621 Arapahoe Ave Suite A
Boulder, CO 80303
Principal Investigator
Firm POC


High energy physics experiments store petabytes of data in ROOT files, described by event-level TAG metadata. The experiments' goals for efficient access to this data are demanding and have not yet been met: physicists need to compose a TAG metadata query and rapidly retrieve the set of matching events. To address this problem, we propose creating a new storage engine that accesses TAG metadata directly. The ability to read column-oriented ROOT data directly, and to create indexes that optimize data access, offers the potential to speed up TAG metadata retrieval. In addition, a lightweight Web-based interface for querying TAG databases will be created to improve usability and provide global accessibility.

Commercial Applications and other Benefits as described by the awardee: Column-oriented databases are an emerging technique for achieving higher performance than traditional row-oriented databases, especially in large-scale data-mining scenarios. There is a large market potential for building optimized application-specific databases that vastly outperform traditional database engines.
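
The column-oriented access and indexing idea in the abstract can be illustrated with a minimal sketch. It is not the proposed storage engine and does not read ROOT files: it assumes TAG attributes are already held in memory as one list per attribute, and the class name `ColumnStore`, its methods, and the attribute names (`run_number`, `n_electrons`, `missing_et`) are hypothetical, chosen only for illustration. The sketch shows how a sorted index on one column lets a range query use binary search instead of scanning every event.

```python
from bisect import bisect_left, bisect_right


class ColumnStore:
    """Toy column-oriented table (hypothetical): one list per TAG attribute."""

    def __init__(self, columns):
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of events")
        self.columns = columns
        self.indexes = {}

    def create_index(self, name):
        # Sorted (value, event_row) pairs allow binary search on this column.
        self.indexes[name] = sorted(
            (value, row) for row, value in enumerate(self.columns[name])
        )

    def select_range(self, name, low, high):
        """Return sorted event rows with low <= column value <= high."""
        if name in self.indexes:
            index = self.indexes[name]
            start = bisect_left(index, (low, -1))
            stop = bisect_right(index, (high, float("inf")))
            return sorted(row for _, row in index[start:stop])
        # No index: fall back to scanning only the one column that is needed.
        return [
            row
            for row, value in enumerate(self.columns[name])
            if low <= value <= high
        ]


# Made-up TAG attributes for four events (illustrative values only).
tags = ColumnStore(
    {
        "run_number": [1001, 1001, 1002, 1003],
        "n_electrons": [0, 2, 1, 3],
        "missing_et": [12.5, 48.0, 33.1, 71.9],
    }
)
tags.create_index("missing_et")

# Query: missing_et between 30 and 75 AND at least two electrons.
by_met = tags.select_range("missing_et", 30.0, 75.0)   # uses the index
by_electrons = tags.select_range("n_electrons", 2, 10)  # column scan
matching_events = sorted(set(by_met) & set(by_electrons))
print(matching_events)
```

Each predicate touches only the column it names, which is the property that makes a column-oriented layout attractive for TAG-style queries over a few attributes out of many. A production engine would also need on-disk storage, compression, and a query planner, none of which this sketch attempts.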