Multiple imputation of multiple multi-item scales when a full imputation model is infeasible

Abstract

Background: Missing data in a large scale survey presents major challenges. We focus on performing multiple imputation by chained equations when data contain multiple incomplete multi-item scales. Recent authors have proposed imputing such data at the level of the individual item, but this can lead to infeasibly large imputation models.

Methods: We use data gathered from a large multinational survey, where analysis uses separate logistic regression models in each of nine country-specific data sets. In these data, applying multiple imputation by chained equations to the individual scale items is computationally infeasible. We propose an adaptation of multiple imputation by chained equations which imputes the individual scale items but reduces the number of variables in the imputation models by replacing most scale items with scale summary scores. We evaluate the feasibility of the proposed approach and compare it with a complete case analysis. We perform a simulation study to compare the proposed method with alternative approaches: we do this in a simplified setting to allow comparison with the full imputation model.

Results: For the case study, the proposed approach reduces the size of the prediction models from 134 predictors to a maximum of 72 and makes multiple imputation by chained equations computationally feasible. Distributions of imputed data are seen to be consistent with observed data. Results from the regression analysis with multiple imputation are similar to, but more precise than, results for complete case analysis; for the same regression models a 39 % reduction in the standard error is observed. The simulation shows that our proposed method can perform comparably against the alternatives.

Conclusions: By substantially reducing imputation model sizes, our adaptation makes multiple imputation feasible for large scale survey data with multiple multi-item scales. For the data considered, analysis of the multiply imputed data shows greater power and efficiency than complete case analysis. The adaptation of multiple imputation makes better use of available data and can yield substantively different results from simpler techniques.
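
The reduction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that, when one item is imputed, the other items of the same scale stay in the model as individual predictors while every other scale enters only through a summary score; the abstract says only that "most scale items" are replaced, so this particular rule is an assumption. The function name `reduced_predictors`, the `_score` suffix, and the toy scales and covariates are all hypothetical.

```python
from typing import Dict, List


def reduced_predictors(
    target_item: str,
    scales: Dict[str, List[str]],
    covariates: List[str],
) -> List[str]:
    """List the predictors used to impute one scale item.

    Assumed rule (not confirmed by the source): keep the other items of
    the target's own scale, and represent each other scale by a single
    summary score instead of its individual items.
    """
    # Find the scale that contains the item being imputed.
    own = next(name for name, items in scales.items() if target_item in items)
    same_scale_items = [item for item in scales[own] if item != target_item]
    # One summary-score variable per other scale (hypothetical naming).
    other_scores = [f"{name}_score" for name in scales if name != own]
    return same_scale_items + other_scores + list(covariates)


# Toy data: three scales with 4, 3 and 5 items, plus two covariates.
scales = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2", "c3", "c4", "c5"],
}
covariates = ["age", "sex"]

# Full item-level model: every other item plus the covariates.
full = [i for items in scales.values() for i in items if i != "a1"] + covariates
reduced = reduced_predictors("a1", scales, covariates)

print(len(full), len(reduced))
```

In this toy setting the item-level model for `a1` has 13 predictors and the reduced model has 7. The gap widens as the number of scales and items grows, which is the effect the abstract reports for the case study (134 predictors down to at most 72).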

Identity

Description: The Identity category includes attributes that support the identification of the resource.

Field Value
PID https://www.doi.org/10.6084/m9.figshare.c.3604730.v1
PID https://www.doi.org/10.6084/m9.figshare.c.3604730
URL http://dx.doi.org/10.6084/m9.figshare.c.3604730.v1
URL http://dx.doi.org/10.6084/m9.figshare.c.3604730
Access Modality

Description: The Access Modality category includes attributes that report the modality of exploitation of the resource.

Field Value
Access Right not available
Attribution

Description: Authorships and contributors

Field Value
Author Plumpton, Catrin
Author Morris, Tim
Author Hughes, Dyfrig
Author White, Ian
Publishing

Description: Attributes about the publishing venue (e.g. journal) and deposit location (e.g. repository)

Field Value
Collected From Datacite
Hosted By figshare
Publication Date 2016-01-01
Publisher Figshare
Additional Info
Field Value
Language UNKNOWN
Resource Type Collection
keyword FOS: Biological sciences
keyword FOS: Mathematics
keyword arxiv.Statistics::Applications
keyword arxiv.Statistics::Methodology
system:type other
Management Info
Field Value
Source https://science-innovation-policy.openaire.eu/search/other?orpId=dedup_wf_001::ba292a83530985684229335d9cd4f48e
Author jsonws_user
Last Updated 20 December 2020, 02:44 (CET)
Created 20 December 2020, 02:44 (CET)