Efficient Statistical Algorithms for Dropout Data

Period of Performance: 09/29/2000 - 12/28/2001


Phase 1 SBIR

Recipient Firm

Insightful Corporation
Seattle, WA 98109
Principal Investigator


Missing and dropout data are common in longitudinal studies. In many cases, the dropout process is related to the outcome process, which makes such data very difficult to analyze. No commercial software currently accounts for the dropout mechanism when dropout is non-random, so analyses that ignore it can yield biased and misleading results. The ultimate objective of our research is to develop S+DROPOUT, a software package for handling various dropout mechanisms. The proposed research will consider the dropout and response processes simultaneously. We will develop model-based approaches and hierarchical structures for testing dropout mechanisms, along with efficient EM algorithms and Gibbs samplers for fitting the various models. Because these algorithms rely heavily on modeling assumptions about the uncollected data, the validity of those assumptions must be verified in any data analysis. To support this, we will provide a suite of analytic and graphical tools for sensitivity analysis. S+DROPOUT will be implemented in the S-Plus language. A comprehensive guidebook of case studies, based on real problems involving dropout data, will also be developed.

PROPOSED COMMERCIAL APPLICATION: S+DROPOUT will be a module in the S-Plus software system. It will be attractive both to the existing S-Plus user base and to the broader community of biomedical researchers and data analysts. This research will also lead to the development of short courses, books, and other educational materials.
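
The abstract's central claim, that ignoring outcome-related dropout biases results, can be illustrated with a small simulation. The sketch below is written in Python and is not part of S+DROPOUT or of the proposed methods. The function name, the logistic dropout model, and its coefficients are hypothetical choices made only for illustration. It compares the mean outcome over all subjects with the mean over completers only, when the probability of dropout increases with the outcome.

```python
import math
import random


def simulate_dropout_bias(n_subjects=20000, seed=0):
    """Compare the full-data mean with the completers-only mean.

    Assumption (illustrative only): the outcome is standard normal and the
    probability of dropout follows a logistic model in the outcome itself,
    so dropout is non-random in the sense used in the abstract.
    """
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n_subjects):
        y = rng.gauss(0.0, 1.0)  # outcome with true mean 0
        # Hypothetical dropout model: logit P(dropout) = -0.5 + 1.5 * y
        p_drop = 1.0 / (1.0 + math.exp(-(-0.5 + 1.5 * y)))
        full.append(y)
        if rng.random() >= p_drop:  # subject stays in the study
            observed.append(y)
    mean_full = sum(full) / len(full)
    mean_observed = sum(observed) / len(observed)
    fraction_observed = len(observed) / n_subjects
    return mean_full, mean_observed, fraction_observed


if __name__ == "__main__":
    mean_full, mean_observed, fraction_observed = simulate_dropout_bias()
    print(f"mean over all subjects:  {mean_full:.3f}")
    print(f"mean over completers:    {mean_observed:.3f}")
    print(f"fraction who completed:  {fraction_observed:.3f}")
```

Because subjects with high outcomes are more likely to drop out in this setup, the completers-only mean falls below the full-data mean. A complete-case analysis would therefore underestimate the true mean unless the dropout mechanism is modeled.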