Ghost Writer: Finding the Man behind the Pen

Period of Performance: 05/07/2012 - 03/08/2013

$79.9K

Phase 1 SBIR

Recipient Firm

Language Computer Corp.
2435 N. Central Expressway
Richardson, TX 75080
Principal Investigator

Abstract

Our proposed work addresses three significant challenges faced by systems that seek to enhance analysts' awareness and understanding of the authors and groups behind online documents. First, we will develop a novel feature, grounded in cognitive psychology, that is intended to identify authors by their area of expertise. Second, we will address the problem of identifying predictive features by exploring novel clustering techniques that jointly learn the relative importance of features for predicting the author and group responsible for producing a document. Third, we will draw on our significant experience in parsing foreign languages to extend the available set of authorship identification tools to Chinese, Spanish, and Russian. Lastly, as with all software products at Language Computer, we will follow a tiered design in development that allows for feature extraction, processing, and classification on real-time streaming text sources.