Expertise Location using Automatically Generated Network Models

Period of Performance: 08/28/2000 - 08/28/2001


Phase 2 STTR

Recipient Firm

154 Whitetail Dr
Ithaca, NY 14850
Principal Investigator
Firm POC

Research Institution

Cornell University
426 Phillips Hall
Ithaca, NY 14853
Institution POC


Many information search tasks involve finding people rather than simply a set of documents. Examples include finding experts on technical topics in large military, governmental, and professional organizations; finding trusted vendors for electronic commerce; and finding key trend-setting individuals for targeted marketing applications. To solve such tasks efficiently, it is necessary to understand the social context of people within an organization; that is, to know both the expertise, preferences, and habits of different individuals (e.g., who is an expert on air campaign planning) and the personal relationships between individuals (e.g., who are colleagues, who has worked for whom in the past). The key notion is that a network of personal relationships defines the sphere of trust and influence of an individual. For example, I have more reason to trust the expertise and opinions of an individual if we are connected by a short chain of closely related people. Because this kind of highly structured information is rarely directly available, tasks such as expertise location are traditionally performed manually and highly sub-optimally. We are building tools to support automation of these tasks using techniques from data mining, artificial intelligence, and graph analysis. Components include: (1) tools for gathering unstructured data about people, projects, and organizations from multiple sources; (2) data mining algorithms for automatically structuring this information, both as profiles of individuals and as a social network model of the relationships between individuals; and (3) search and visualization tools for efficiently finding people with various characteristics based on this structured data.
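
The "short chain" idea can be illustrated with a small sketch. This is not the project's algorithm, only a minimal example under stated assumptions: the social network is an unweighted adjacency map, each profile is a set of topic strings, and chain length is counted in hops using breadth-first search. The function name `find_experts`, the `max_hops` parameter, and all sample names and data are hypothetical.

```python
from collections import deque


def find_experts(network, profiles, seeker, topic, max_hops=3):
    """Return (person, hops) pairs for people whose profile lists `topic`,
    ordered by chain length from `seeker`, using breadth-first search."""
    hops = {seeker: 0}          # shortest known chain length to each person
    queue = deque([seeker])
    results = []
    while queue:
        person = queue.popleft()
        if person != seeker and topic in profiles.get(person, set()):
            results.append((person, hops[person]))
        if hops[person] == max_hops:
            continue            # do not follow chains longer than max_hops
        for neighbor in network.get(person, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[person] + 1
                queue.append(neighbor)
    return results


# Hypothetical sample data: who knows whom, and who is profiled for what.
network = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "erin"],
    "dave": ["bob", "frank"],
    "erin": ["carol"],
    "frank": ["dave"],
}
profiles = {
    "bob": {"logistics"},
    "dave": {"air campaign planning"},
    "erin": {"logistics"},
    "frank": {"air campaign planning"},
}

experts = find_experts(network, profiles, "alice", "air campaign planning")
print(experts)
```

Because breadth-first search visits people in order of increasing distance, the closest matches come first in the result. A fuller model would presumably weight edges by the strength of each relationship rather than counting every link equally.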