Statistical modelling of selective non-participation in health examination surveys
Abstract
Health examination surveys aim to collect reliable information on the health and risk factors
of a population of interest. Missing data occur when some invitees do not participate in the
survey. If non-participation is associated with the variables to be studied, then the estimates
based only on the participants cannot be generalised to the population of interest. In this
case, the estimates suffer from selection bias, which can mislead decision-makers.
The purpose of this thesis is to develop statistical methods that reduce selection bias in
cross-sectional data by using additional data sources. The data we use come from
the National FINRISK Study, and we aim to estimate the prevalences of self-reported daily
smoking and self-reported heavy alcohol consumption. The sources of additional information
are follow-up data consisting of hospitalisations and causes of death, and re-contact data, which are
questionnaire data collected by contacting the non-participants of the health examination again.
Follow-up data, once the follow-up period has elapsed, give indirect information about
the health behaviour of non-participants at the time of the health examination, while the re-contact
data give information similar to the health examination survey. This thesis presents methods
for utilising these sources of additional information. Multiple imputation is applied
to make use of the re-contact data, and Bayesian statistical modelling is implemented to
make use of the follow-up data.
The thesis demonstrates that the use of additional data sources and these statistical methods leads to prevalence estimates for daily smoking and heavy alcohol consumption that are
higher than those obtained from the participants only. Multiple imputation can be utilised for
prevalence estimation if the re-contact data are available. Bayesian modelling is appropriate
for situations where re-contact data are not available but follow-up data are, provided the
follow-up period is long enough to reveal differences between the participants and
non-participants.
This thesis presents means for reducing the selection bias caused by non-participation.
Reducing the magnitude of the bias is important for obtaining more reliable information, for
example to support decision-making. The statistical methods used in this thesis can also be
applied in fields of research other than health studies.
Main Author
Format
Theses
Doctoral thesis
Published
2018
Series
Subjects
ISBN
978-951-39-7352-0
Publisher
University of Jyväskylä
The permanent address of the publication
https://urn.fi/URN:ISBN:978-951-39-7352-0
ISSN
1457-8905
Language
English
Published in
Report / University of Jyväskylä. Department of Mathematics and Statistics