SAFETYLIT WEEKLY UPDATE


Journal Article

Citation

Gorelick MH. J. Clin. Epidemiol. 2006; 59(10): 1115-1123.

Affiliation

Department of Pediatrics, Section of Emergency Medicine, Medical College of Wisconsin, Milwaukee, WI, USA. mgorelic@mcw.edu

Comment In:

J Clin Epidemiol 2007;60(9):979

Copyright

(Copyright © 2006, Elsevier Publishing)

DOI

10.1016/j.jclinepi.2004.11.029

PMID

16980153

Abstract

OBJECTIVE: To determine the effect of three common approaches to handling missing data on the results of a predictive model.

STUDY DESIGN AND SETTING: A Monte Carlo simulation study was performed. A baseline logistic regression using complete data was fitted to predict hospital admission from three predictors: white blood cell count (WBC, dichotomized as normal or high), presence of fever, and procedures performed (PROC). A series of simulations was then performed in which WBC data were deleted for varying proportions (15-85%) of patients under various patterns of missingness. Three analytic approaches were compared: analysis restricted to cases with complete data (CC), missing data assumed to be normal (MAN), and use of imputed values.

RESULTS: In the baseline analysis, all three predictors were significantly associated with admission. Using either the MAN approach or imputation, the odds ratio (OR) for WBC was substantially over- or underestimated depending on the missingness pattern, and there was considerable bias toward the null in the OR estimates for fever. In the CC analyses, the OR for WBC was consistently biased toward the null, the OR for PROC was biased away from the null, and the OR for fever was biased either toward or away from the null. Estimates of overall model discrimination were substantially biased under all analytic approaches.

CONCLUSIONS: All three methods of handling large amounts of missing data can lead to biased estimates of the OR and of model performance in predictive models. Predictor variables that are measured inconsistently can affect the validity of such models.
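The simulation design described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code. The predictor prevalences, the true coefficients, the missingness pattern (WBC missing more often in patients without fever), and the imputation rule (a random draw from the observed WBC prevalence within each fever stratum) are all assumptions chosen for illustration; the abstract does not specify any of them. The function names `fit_logistic_or` and `simulate` are hypothetical.

```python
import numpy as np


def fit_logistic_or(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson and return odds ratios
    for the columns of X (the intercept is fitted but not returned)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X1 @ beta)))
        grad = X1.T @ (y - p)
        hess = X1.T @ (X1 * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return np.exp(beta[1:])


def simulate(n=20000, frac_missing=0.5, seed=0):
    """Return ORs (WBC, fever, PROC) for the baseline model and for the
    three approaches to missing WBC data named in the abstract."""
    rng = np.random.default_rng(seed)

    # Assumed prevalences of the three binary predictors.
    wbc = rng.binomial(1, 0.3, n)
    fever = rng.binomial(1, 0.4, n)
    proc = rng.binomial(1, 0.2, n)

    # Assumed true model for hospital admission (log-odds scale).
    logit = -2.0 + 1.0 * wbc + 0.7 * fever + 1.2 * proc
    admit = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

    # Assumed missingness pattern: WBC is missing more often without fever.
    # The weights give roughly frac_missing overall; the cap keeps some
    # observed values in every stratum at high missingness.
    p_miss = np.clip(frac_missing * np.where(fever == 1, 0.5, 4.0 / 3.0), 0.0, 0.95)
    missing = rng.random(n) < p_miss

    def ors(w, rows=slice(None)):
        X = np.column_stack([w, fever, proc])[rows]
        return fit_logistic_or(X, admit[rows])

    # Missing assumed normal (MAN): treat every missing WBC as 0 (normal).
    wbc_man = np.where(missing, 0, wbc)

    # Simple stochastic imputation within fever strata (an assumption).
    wbc_imp = wbc.copy()
    for f in (0, 1):
        observed = (fever == f) & ~missing
        to_fill = (fever == f) & missing
        prevalence = wbc[observed].mean()
        wbc_imp[to_fill] = rng.binomial(1, prevalence, to_fill.sum())

    return {
        "baseline": ors(wbc),                   # complete data, nothing deleted
        "complete_case": ors(wbc, ~missing),    # CC: drop rows with missing WBC
        "man": ors(wbc_man),
        "imputed": ors(wbc_imp),
    }


if __name__ == "__main__":
    for frac in (0.15, 0.5, 0.85):
        print(frac, {k: np.round(v, 2) for k, v in simulate(frac_missing=frac).items()})
```

Because this sketch uses a single hypothetical missingness pattern that depends only on fever, it should not be expected to reproduce the specific directions of bias reported in the abstract, which were obtained under various patterns. Making `p_miss` depend on `admit` or on `wbc` itself is one way to explore other patterns.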


Language: en
