Bayesian Autoregressive Multilevel Modeling of Burden of Diseases, Injuries and Risk Factors in Iran 1990 - 2013
Amir Kasaeian PhD Candidate1,2, Mohammad Reza Eshraghian PhD1, Abbas Rahimi Foroushani PhD1, Sharareh R. Niakan Kalhori PhD2,3, Kazem Mohammad PhD1, Farshad Farzadfar MD DSc2,4
Abstract
Background: Statistical modeling and developing new methods for estimating the burden of diseases, injuries and risk factors is a fundamental concern in studying a country's health situation for better health management and policy making. The Bayesian autoregressive multilevel model is a strong method for this kind of study, though in complex situations it has its own challenges. Our study aims to describe how to model space and time data through an autoregressive multilevel model and to address the challenges that arise in complex situations.
Method: We will obtain data from different published and unpublished secondary data sources, including population-based health surveys (e.g. NHS, DHS, STEP) at national and provincial levels, and we will also assess epidemiological studies via systematic review for each disease, injury and risk factor over the period 1990 - 2013. These data generally have a multilevel hierarchy and are also correlated over time. However, statistical analysis of data on diseases, injuries and risk factors primarily faces the problem of information scarcity: data are generally too scarce to ensure reliable estimates in many practical problems. There may also be nonlinear changes over time, different kinds of uncertainty in the data, and incompatible geographical data. We describe a Bayesian autoregressive multilevel modeling approach that provides a natural solution to these problems through its ability to sensibly combine information from several sources of data and available prior information. In this hierarchical model, the levels of each hierarchy borrow information from each other, and lower levels also borrow information from higher levels. [...]
Discussion: Our analyses will include the different existing sources of data in Iran for 24 years through a rational and reasonable model to estimate the burden of diseases, injuries and risk factors for Iran at national, regional and provincial levels while considering several kinds of uncertainty. Comprehensive and realistic estimates are always in demand; they will be obtained through suitable statistical modeling that considers all dimensions, and can then be used for making better decisions in real situations.
Keywords: Autoregressive time series, burden of diseases, Iran, MCMC, multilevel models, NASBOD
Cite this article as: Kasaeian A, Eshraghian MR, Rahimi Foroushani A, Niakan Kalhori SR, Mohammad K, Farzadfar F. Bayesian autoregressive multilevel modeling of burden of diseases, injuries and risk factors in Iran 1990 - 2013. Arch Iran Med. 2014; 17(1): 22 - 27.
Authors' affiliations: 1Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran, 2Non-Communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran, 3Social Determinants of Health Research Centre, School of Health, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran, 4Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Research Institute, Tehran University of Medical Sciences, Tehran, Iran.
Corresponding authors: Kazem Mohammad PhD, Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran. Address: Poursina Avenue, P.O. Box 6446, Tehran 14155, Iran. Tel: 021 88951396; Email: mohamadk@tums.ac.ir. Farshad Farzadfar MD DSc, Non-communicable Diseases Research Center, Endocrinology and Metabolism Research Institute, Tehran University of Medical Sciences, [...] Ave, Tehran, Iran. Postal Code: 1599666615, Tel/Fax: 98-21-88913543, E-mail: f-farzadfar@tums.ac.ir.
Accepted for publication: 3 December 2013
Archives of Iranian Medicine, Volume 17, Number 1, January 2014

Introduction
Precise assessment of global, regional, and country health conditions and trends is crucial for evidence-based decision making in public health.1 The Global Burden of Disease Study (GBD) is the latest and most reliable analysis to reveal the importance of taking different approaches to the challenges facing global health.2 The GBD study results provide a data-rich structure for comparing the effects and burden of different diseases, injuries, and risk factors on premature death and disability between populations.3-13 But these results do not describe differences within a population, which means nothing is known about what is going on within a country, here specifically Iran. Knowing and comparing the health situation within regions and provinces helps to understand the differences and similarities better and also to better map the emerging epidemics of diseases, which in turn helps health policy makers to [...]ous effects and extra burden of those diseases. The only study of the burden of diseases and injuries in Iran dates back to 2003, which [...] disparity between these six provinces and also indicated a transition from communicable diseases to non-communicable diseases and road traffic injuries.14 In line with the GBD study, the National and Subnational Burden of Diseases study 2013 (NASBOD) is a systematic effort to [...] in Iran.15 It also takes into account care systems, currently available data on health systems, and viable, systematic and relevant nationwide studies carried out in previous years.
This study is an endeavor to assess and evaluate the burden of diseases at national and provincial levels in Iran by means of the most recent valid and reliable qualitative and quantitative research methods and the experience gained from the Global Burden of Diseases 2010 (GBD) study. Moreover, the knowledge, expertise, and skills of health [...]tions of burden of diseases.
[...]bility of comparison and contrast of health conditions in different provinces and regions, an advantage that is conducive to fostering health and hygiene in these regions and that provides health policy makers with the necessary documents for better health policy making and resource allocation. Also, a comparison of the situations at provincial levels will ultimately boost the health condition throughout the country in a fair and balanced manner.
[...] statistical modeling and improved methods for estimating time trends of diseases, injuries, and risk factors. We will use advanced methods and, when necessary, extend current methods to develop Bayesian time series multilevel models for 31 provinces from 1990 to 2013. We will present the data, methods and the [...] and provincial 1990 - 2013 trends and their uncertainties in the population mean (whatever the measure is) of diseases and injuries or risk factors for all provinces in the four regions included in the NASBOD study, to allow meaningful comparison over time and between provinces. Iran is divided into four regions (eastern south, north and eastern north, west, and center) on the basis of two criteria: epidemiological homogeneity and geographical contiguity. The study covers urban and rural areas of the country.
The main purpose of this article is to explain a Bayesian autoregressive multilevel model and all its components, together with the challenges of complex data that will appear in the study of the burden of diseases, risk factors and injuries.

Methods
Study design
Statistical analysis of data on diseases, injuries and risk factors primarily faces the problem of information scarcity. Data are generally too scarce to ensure reliable estimates in many practical problems. In the present study there are 24 provinces in the beginning year, 1990; however, over the period of 24 years the number grows to 31 provinces at the end of the study. This means there should be at least 576 data points (24 provinces × 24 years), which is very unlikely in a study of diseases, risk factors and injuries. This problem is more serious [...] that we encounter geographical incompatibility, which is the other issue of concern.
We describe a Bayesian autoregressive multilevel modeling approach that provides a natural solution to these problems through its ability to sensibly combine information from several sources of data and available prior information. Such modeling strategies, which capture geographical and time patterns in the data, will reduce estimation error.
We will develop this model to estimate the prevalence of diseases and injuries or the mean of risk factors by age group, sex and province over the period 1990 - 2013. We analyze each gender independently and make estimates for each age group-province-year unit.
In this multilevel model provinces are nested in subregions, subregions are nested in regions, and regions are nested in the country level. Accordingly, lower levels borrow information from higher levels, and the levels of each hierarchy also borrow information from each other. In fact, the amount of borrowing depends on the availability and scarcity of data: the richer the data, the less borrowing within and across levels is needed, and vice versa.
The other point is that trends might not be linear over time; this non-linearity will be modeled in the form of a linear trend plus a smooth non-linear trend, both hierarchically.
Also, because of heterogeneity between community-based studies, they might have larger variation than nationally representative studies. The model is able to capture this variation by including a time-varying offset for non-provincial data. These variance components will be estimated empirically.
Another problem that might occur is a non-linear association between prevalence and/or mean measurements and age, since the association might change at different ages, especially in older age groups. In such a condition, we will use a cubic spline age model or [...]
We will determine the values of different kinds of uncertainty, such as sampling uncertainty in the original data, uncertainty associated with inconsistency between years in national data, uncertainty relevant to data sources that are not provincial, and uncertainty associated with statistical methods for crosswalking between prevalence (categorical measure) and mean (continuous measure).
Finally, a Bayesian model with Markov Chain Monte Carlo [...]rior distribution of model parameters, which represent uncertainties, will be used to achieve the posterior distribution of prevalence or mean [...] indeed from the data itself in an integrated and direct way. Uncertainty intervals are also computed for prevalence and mean.

Data sources
We will obtain data from different published and unpublished secondary data sources, including population-based health surveys (e.g. NHS, DHS, STEP) at national and provincial levels, and also epidemiological studies via systematic review for each disease, injury and risk factor. Some data are obtainable from censuses, household expenditure surveys, demographic surveillance, and disease and death registries. Data from systematic review are evaluated via a quality assessment process used in GBD to review the included studies and to exclude the poor studies. This process has three parts: general information about the study, quality of sampling, and quality of measurement. Data from population-based or community-based surveys, household expenditure surveys and censuses will also be included in the study after data cleaning for plausible ranges of variables and outlier detection.
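The cleaning step described above (plausible ranges and outlier detection) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which outlier rule is used, so Tukey's interquartile-range rule is swapped in here, and the function names, the plausible range of 60 to 300 mmHg and the sample values are all hypothetical.

```python
import statistics


def flag_implausible(values, lower, upper):
    """Return the indices of values outside the plausible range [lower, upper]."""
    return [i for i, v in enumerate(values) if not (lower <= v <= upper)]


def flag_iqr_outliers(values, k=1.5):
    """Return the indices of values more than k * IQR beyond the quartiles.

    Tukey's rule is an assumption made for this sketch; the source does not
    name its outlier-detection method.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low_fence or v > high_fence]


# Hypothetical systolic blood pressure means (mmHg) extracted from ten studies.
sbp_means = [118, 122, 125, 130, 135, 128, 121, 400, 119, 127]

out_of_range = flag_implausible(sbp_means, lower=60, upper=300)
outliers = flag_iqr_outliers(sbp_means)
print(out_of_range, outliers)
```

In this toy input both checks flag the same record (the value 400), which would then be reviewed against its source rather than dropped automatically.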
[...] for each year and province, including information on mean or prevalence (depending on the analysis), sample sizes, standard deviations [...] embed the survey weights in age group-sex-province-year groups.
Since the mean of the measure and its uncertainty are inputs of the model, in the analyses of risk factors, for the studies that reported [...] width. For studies that reported mean, sample size and standard deviation (SD), we estimate the standard error (SE) as SD/(n^0.5).

Covariates
A covariate is a variable that has a positive or negative relationship with a disease, risk factor or injury in the NASBOD study.
We will use covariates to inform the estimation process in our models. For conditions with lots of data, covariates play a minimal role in the estimation process; for conditions with little data, however, the role of covariates is very important. In fact, time-varying province-level covariates can help inform the units [...] multiple sources.
Some of the frequently used covariates associated with the risk factors or diseases under study are (i) urbanization, measured as the proportion of a province's population that lives in urban areas, (ii) province availability of multiple food types for citizens' consumption, (iii) wealth index, estimated from the assets asked about in yearly household expenditure surveys, (iv) years of schooling, which is educational attainment in years, obtainable from household expenditure surveys, (v) population density, the proportion of the province with a population density over 1000 people per square kilometer, (vi) mean BMI, mean body mass index (kg/m^2) for males and females older [...]en) obtainable from census, (viii) completeness of vital registration (% of deaths captured), obtainable from census and vital registration data, (ix) vehicles, 2+4 wheels (per capita), accessible from country Road Statistics. However, some variables such as neonatal mortality rate (per 1000), diabetes prevalence (% of population), smoking prevalence (% of population), systolic blood pressure (mmHg), and indoor and outdoor air pollution, which are themselves estimates from the NASBOD study, will be used as covariates for estimating other diseases, risk factors and injuries.

Crosswalking
[...] use homogeneous data. Non-homogeneous data will lead to wrong estimates. Sometimes, depending on the primary outcome, we need to obtain a continuous measurement from prevalence or vice versa, for example mean FPG from diabetes prevalence when the relevant study has reported just prevalence. Another example is when one measurement can be obtained from another measurement [...] one measurement is also the other issue which is necessary. One [...] intake to point prevalence of alcohol intake. [...] metrics) and use the beta generated as the adjustment factor for a [...] is that the amount of necessary data should be relatively high and overlapping information from the same source is needed to generate relationships. This technique is the so-called crosswalk or metadata mapping method.
In total, crosswalk is a method of data conversion that enables searching data across heterogeneous resources and is a useful tool for making similar data comparable.

Statistical analysis
Multilevel models, which are also called hierarchical mostly because the parameters of the within-level regressions at the lowest level are controlled by the hyper-parameters of the upper-level model, are the basis of our analysis. Multilevel modeling allows estimating heterogeneity within as well as across levels or units.16
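As a hedged illustration of how upper-level hyper-parameters control lower-level estimates, the sketch below computes a partially pooled mean for one unit as a precision-weighted average of the unit's own mean and the overall mean. The function name, the variance values and the example numbers are hypothetical and are not taken from the NASBOD model; the weighting is the standard normal-normal approximation found in textbook treatments of multilevel models.

```python
def partial_pooling_estimate(unit_mean, n_obs, overall_mean,
                             within_var, between_var):
    """Approximate multilevel estimate for one unit (e.g. one province).

    unit_mean    : unpooled mean observed in the unit
    n_obs        : number of observations in the unit
    overall_mean : completely pooled mean over all units
    within_var   : variance of observations within a unit (assumed known)
    between_var  : variance of true unit means around the overall mean
    """
    if n_obs == 0:
        # No data in the unit: the estimate falls back to the pooled mean.
        return overall_mean
    data_precision = n_obs / within_var
    prior_precision = 1.0 / between_var
    weight = data_precision / (data_precision + prior_precision)
    return weight * unit_mean + (1.0 - weight) * overall_mean


# Hypothetical example: overall mean 130, unit mean 140.
sparse = partial_pooling_estimate(140.0, 9, 130.0, within_var=225.0, between_var=25.0)
rich = partial_pooling_estimate(140.0, 900, 130.0, within_var=225.0, between_var=25.0)
print(sparse)  # 135.0, halfway between the unit mean and the overall mean
print(rich)    # close to 140, the unit's own mean dominates
```

The two calls show the borrowing behaviour in miniature: a unit with few observations is pulled strongly toward the overall mean, while a data-rich unit keeps an estimate close to its own mean.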
[...] linear time slope. The second component of the model is the nonlinear time effect. The covariate effect is the third one. Age is the other important component, which will be smoothed via a cubic spline. Since there are different kinds of data sources, one component is [...] component which is multiplicative with the study random effect.

The multilevel hierarchy component of the time trends
An important trait of multilevel models is that each parameter [...] comparable parameters of other groups or units with similar characteristics. In other words, a shrinkage effect towards the population mean is present when using multilevel models. The amount of shrinkage depends on the variance between the random [...] number of individuals is observed in some groups. In such cases, there is a large reduction of the uncertainty, since information from other groups or units with smaller variability is incorporated in the posterior estimates.17 This is our main rationale for using Bayesian multilevel models.
In our project, studies are nested in provinces, provinces are nested in sub-regions, sub-regions are in turn nested in regions, and all are nested in the country. This is the structure of the data. The [...] pooling estimates from the model. Partial pooling is a compromise between two extremes: no pooling and complete pooling. Complete pooling is when we combine all observations of a given level, and no pooling is the opposite. In this scenario, the multilevel estimate for a given province is approximated by a weighted average of the observations in the province (the unpooled estimate, yJ) and the mean over all provinces (the completely pooled estimate, Yall). So, depending on the availability and sparseness of [...] by means of no pooling, partial pooling and complete pooling.18 This situation is repeated in each hierarchy.

Nonlinear time effect component
Nonlinear changes over time in each province will be captured using a term which is the sum of province, sub-region, region and country components, and each of these four components is assigned a Gaussian autoregressive prior to allow the model to distinguish the extent of nonlinearity that exists at each level.19, 20
In particular, the vectors of each component have a normal prior with zero mean and precision parameters. The model-estimated precision parameters will determine the degree of smoothing at each level. We expect the provincial precision parameter to be the lowest and the country precision parameter to be the highest, as the provincial trends of a disease have more extra-linear vari[...] issue of concern here, which will be achievable by constraining [...]ling orthogonality between the linear and nonlinear parts of the time trend is that each can be explained independently. For provinces with no data, we will take the Moore-Penrose pseudoinverse for computation because of some technical matters.21

Covariate effects component
The covariates which we will use in our model are categorized in two groups: province-level covariates and study-level covariates. Province-level covariates include covariates like (i) wealth index, (ii) urbanization, (iii) multiple food types based on a principal component analysis (PCA) on Household Expenditure Data, (iv) years of schooling, (v) body mass index, (vi) completeness of vital registration, etc.
The effects of some of these province-level covariates on the risk factors or diseases will be allowed to change linearly over time. These covariates will be smoothed using a moving average [...] of yearly variation of covariates.
The study-level covariates, however, include study coverage and study-level urbanization. The study-level coverage covariate, which describes the type of data used, has four categories: (i) provincial data with sampling weights, (ii) provincial data without weights, (iii) district data, and (iv) individual community data.
The next study-level covariate indicates whether the study was conducted in a rural, an urban, or a mixed rural and urban population. These two covariates will help us account for bias in the data sources. Since non-provincial studies are mostly performed in areas of special interest because of a health problem, their results will not be representative of the whole province. They might also have larger variation than provincially representative studies. As mentioned before, the model considers a time-varying offset for district and community data, and additional variance components for district and community data and for provincial data without sampling weights. These variance components will be estimated empirically and let provincial data with sampling weights have a stronger effect on estimates than other sources.
The covariates and their interactions will be chosen based on substantive considerations and their predictive power through in[...].22 We are not seeking causal effects of these independent variables.

Age association component
Almost all risk factors and diseases have a nonlinear association with age; for example, for some diseases the age association might [...] spline model to smooth this association.23 We will use a baseline age and then subtract that baseline from all age values.
Since the age association between provinces might change fur[...] normal distribution with zero mean and [...] variances that each [...] has [...].24
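A minimal sketch of a cubic spline basis in centered age is given below, assuming a truncated power basis. The source does not state the baseline age, the number of knots or their positions, so the baseline of 50 years, the knots at 45 and 60 years and the function name are hypothetical choices made only for illustration.

```python
def cubic_spline_basis(ages, baseline=50.0, knots=(45.0, 60.0)):
    """Build truncated-power cubic spline columns for a list of ages.

    Each row holds z, z^2, z^3 with z = age - baseline, followed by one
    column (age - knot)^3 for ages above each knot and 0 otherwise.
    The intercept is left to the rest of the model.
    """
    rows = []
    for age in ages:
        z = age - baseline          # center age on the baseline
        row = [z, z ** 2, z ** 3]
        for knot in knots:
            d = age - knot
            row.append(d ** 3 if d > 0 else 0.0)
        rows.append(row)
    return rows


basis = cubic_spline_basis([50.0, 65.0])
print(basis[0])  # [0.0, 0.0, 0.0, 125.0, 0.0]
print(basis[1])  # [15.0, 225.0, 3375.0, 8000.0, 125.0]
```

Regressing an outcome on these columns gives a curve that is cubic between knots and smooth at the knots; letting the coefficients vary by province with zero-mean normal priors is one way to express the province-specific age patterns described above.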
We treat age as a continuous variable in this model. This is the reason we extract age groups from studies as narrow bands (5 years), so that their mid-points can be used as continuous measurements.
[...] variables in the model. We assign a normal prior with a variance [...] random effect, e_i:

$$
\operatorname{var}(e_i) =
\begin{cases}
\nu_w & \text{if study } i \text{ is weighted provincial} \\
\nu_u & \text{if study } i \text{ is unweighted provincial} \\
\nu_d & \text{if study } i \text{ is district} \\
\nu_c & \text{if study } i \text{ is community}
\end{cases}
$$

[...] or prevalence of the measurement under study even after accounting for sampling variability. So, the term $\nu_w$ enables us to explain this variability. $\nu_w$ can also explain study design and quality matters. We assume that random effects from community studies have greater variance than random effects from district studies and so forth, i.e. $\nu_w < \nu_u < \nu_d < \nu_c$. This constraint indicates that studies with limited coverage not only have means or prevalences greater or lesser than the province mean or prevalence, but also have more variability.

Residual age-study variation
Age patterns inside communities within a given province may differ and may not be consistent with the province age pattern. This kind of within-study variation will not be captured by the e terms, as they are the same across all observations in a given study. Thus, an additional variance component for each study, $\tau_i^2$, will be accommodated in the model:

$$
\tau_i^2 =
\begin{cases}
\tau_w^2 & \text{if study } i \text{ is weighted provincial} \\
\tau_u^2 & \text{if study } i \text{ is unweighted provincial} \\
\tau_d^2 & \text{if study } i \text{ is district} \\
\tau_c^2 & \text{if study } i \text{ is community}
\end{cases}
$$

Again, there is less variation in weighted provincial studies than in unweighted provincial studies and so on, i.e. $\tau_w < \tau_u < \tau_d < \tau_c$. This consideration for the model comprises the smooth age in residual terms, allowing not only each province but also each study to have its own cubic spline in age.

Computation
[...] method. All statistical computation programs will be written and run in the R language. As is well known, to achieve better estimates from the model we should jointly sample random effects with their hyperparameters, since there is a heavy dependency between the parameters.25
We will not marginalize over the mean parameters in the model, since this may introduce off-diagonal structure into the likelihood covariance and require manipulating large variance-covariance matrices to calculate the marginal likelihood.
A main step in running MCMC is to ensure that the MCMC sampler will converge to the posterior distribution and that estimating is [...] draws.24
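One common way to check convergence across parallel chains is the Gelman-Rubin potential scale reduction factor. The source text does not name the diagnostic it relies on, so the sketch below is an assumption offered for illustration; the function name and the simulated chains are hypothetical, and the chain counts are kept small for brevity.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for one scalar parameter.

    chains: list of equal-length lists of post-burn-in draws, one per chain.
    Values near 1 suggest the chains are sampling the same distribution.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled_var = (n - 1) / n * within + between / n
    return (pooled_var / within) ** 0.5


rng = random.Random(0)
# Four simulated chains that agree, and four whose means disagree.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
stuck = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 0.0, 3.0, 3.0)]

print(gelman_rubin(mixed))  # close to 1
print(gelman_rubin(stuck))  # well above 1
```

A value well above 1 for any monitored parameter indicates that the chains have not yet mixed and that more iterations, or a different sampler tuning, are needed before the draws are combined.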
For each model, we will start with 20 chains in parallel at randomly selected starting values. Then, after 5000 iterations of burn-in to tune the Metropolis proposal variances, we will run each chain for 50000 more iterations. Next, we will combine [...] MCMC is that uncertainty arises naturally from the data via estimation in an integrated and simple manner.

Model checking
[...]ting, and achieving a tradeoff between these two needs great attention. The ideal model is flexible enough to capture important complications while still keeping its external validity and interpretability.
We will examine our model using posterior predictive checks to verify that we have not left out any key interaction and also [...] Posterior predictive checks are a well-designed and smart tool for in[...].26 We will compare the observed dataset with a given number of replicated datasets, e.g. 500, from the model's posterior predictive distribution for other risk factors. The smaller the difference between these predictions and the observed data, the more consistent our model is with the data.
For cross-validating the model, we will divide the provinces into [...] become similar regarding rich and sparse density. For each group of provinces we will do a 10-fold cross-validation so that we drop [...] when predicting that 10 percent of data. We will do this for every 10 percent of the data and combine the resulting estimates of prediction error.27
At each iteration of the MCMC we will draw a prediction from the main model and will build 95 % prediction intervals from the predictions across all iterations.

Discussion
[...] burden of diseases, risk factors and injuries across provincial and regional levels over recent years in Iran.15 The only study of the burden of diseases and injuries in Iran dates back to 2003, which [...] significant disparity between these provinces and also a transition from communicable diseases to non-communicable diseases and road [...].14
############################################### burden of
diseases study in Iran and even in the Middl
e-East and one of the few subnational
studies al
l
over the worl
d.28-3
1 We wil
l
obtain l
ong-term trends of preval
ence of
diseases, risk factors and injuries under NASBOD study for each age group, sex,
province, sub-region, region and the whol
e country. Then we wil
l
estimate heal
th
inequal
ities respectivel
y. Al
l
the time trends wil
l
be report- ed together with
their uncertainty interval
s. We wil
l
report esti- mates for al
l
province-years,
subregion-years, region-years that many of them suffers from poor data.As mentioned
before provincial
and subnational
studies of bur- den of diseases inside countries
provided heal
th pol
icymakers with a sol
id perspective of heal
th situation al
l
over
the country and therefore hel
ped them in better heal
th management and fu- ture
pl
anning to control
the progressive epidemics of al
l
domi- nant diseases. The other
advantage of the present study compared with the onl
y previous one in Iran and
other subnational
studies in the worl
d is that its methodol
ogical
and anal
ytical
approach is very cl
ose to GBD study 2010 guidel
ines together with their main
investigators invol
vement. What mentioned above are just the epidemiol
ogical
achievements of such a study which wil
l
be a hel
pful
l
andmark for pol
icy makers in
heal
th systems.The NASBOD project achievements are not onl
y very important from
epidemiol
ogical
perspective but al
so from statistical
point of view because of
handl
ing the compl
exities existing in the na- ture of this study. These
compl
exities wil
l
be model
ed with new advanced statistical
model
s especial
l
y
Bayesian autoregressive mul
til
evel
model
s as expl
ained in this paper.Though the
detailed main discussion of the results will be provided after running the model
and releasing the results, we can [...]

Multilevel models are among the rare approaches for modeling aggregated data such as
those encountered in the NASBOD study.

One of the main advantages of multilevel models is that they assess effects at
different levels. Considering higher-level units as a random [...]el variation in
the total population and therefore leads to unbiased standard error estimates and
independent residuals of the model.16 Another advantage is that missing data, which
occur frequently in large surveys, are handled very simply via these
models.16

Though handling missing data is one of the advantages of multilevel models, and we
will simply use these models together with informative priors to impute missing
information, our model suffers from data scarcity, especially in older age groups
and in the earlier years of the study. Our model also suffers from low quality and
non-representativeness of data in the earlier years of the study, mainly before
2000. Thus, a relatively large amount of data will make our inferences more robust.

It
is clear from the literature that many approaches have been developed for missing
data imputation, but almost all of them use simple methodology such as
bootstrapping, as in Amelia,32 and [...] complex situations resembling the NASBOD
study. The only disadvantage is that the modeling process and the interpretation of
the results may be complex, both of which can be overcome through advanced [...]

Data gaps may be the main limitation of our study, just as in the modeling for the
GBD 2010 study.10-13
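
As a simplified illustration of how a multilevel model can fill data gaps by
borrowing information across units, the following Python sketch computes shrinkage
estimates under a two-level normal model. It is not the NASBOD model: it has no
autoregressive time component, and the function name `partial_pool`, the known
variances `sigma2` and `tau2`, and the three hypothetical provinces are assumptions
made only for illustration.

```python
import numpy as np


def partial_pool(unit_means, unit_ns, sigma2, tau2):
    """Shrinkage estimates for unit-level means in a two-level normal model.

    Assumes a known within-unit variance (sigma2) and a known between-unit
    variance (tau2); both are simplifications made for illustration only.
    """
    means = np.asarray(unit_means, dtype=float)
    ns = np.asarray(unit_ns, dtype=float)
    observed = ns > 0

    # Precision contributed by each unit's own data (zero where nothing was observed).
    data_prec = ns / sigma2
    prior_prec = 1.0 / tau2

    # Overall mean: observed units weighted by 1 / (sigma2 / n_j + tau2).
    safe_ns = np.where(observed, ns, 1.0)
    weights = np.where(observed, 1.0 / (sigma2 / safe_ns + tau2), 0.0)
    safe_means = np.where(observed, means, 0.0)
    overall = np.sum(weights * safe_means) / np.sum(weights)

    # Posterior mean: precision-weighted average of unit data and the overall mean.
    # The variance is conditional on the overall mean (its own uncertainty is ignored).
    post_mean = (data_prec * safe_means + prior_prec * overall) / (data_prec + prior_prec)
    post_var = 1.0 / (data_prec + prior_prec)
    return overall, post_mean, post_var


# Three hypothetical provinces: well surveyed, barely surveyed, and not surveyed.
overall, est, var = partial_pool(
    unit_means=[10.0, 20.0, np.nan], unit_ns=[100, 1, 0], sigma2=4.0, tau2=1.0
)
for name, m, v in zip(["A", "B", "C"], est, var):
    print(f"province {name}: estimate {m:.2f}, variance {v:.2f}")
```

In this sketch the province with no data receives the overall mean with the full
between-province variance, the barely surveyed province is pulled strongly toward
the overall mean, and the well-surveyed province stays close to its own mean. This
mirrors, in a much reduced form, how uncertainty grows where information is scarce.
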
The other limitation is the geographical incompatibility that occurs at the
provincial level. This is not a serious problem in multilevel modeling, since there
were only slight changes during the study period and they can be handled with
special techniques. However, it may be a serious problem at the district level, and
more advanced models should be developed at
this phase of the study in the near future.

The other sensible models that can be used in the NASBOD study are spatio-temporal
models,33 which will be developed [...] Bayesian autoregressive multilevel models to
develop ensemble models, which will produce estimates that are model-independent,
more reliable and more accurate. Ensemble models are weighted combinations of the
posterior distributions of individual models and provide lower error for point
estimates and more accurate uncertainty intervals.34-36 Moreover, ensemble models
capture uncertainty due to both the parameters in any single model and the uncertainty
[...]

Generally speaking, the main advantages of the model described are: estimating
long-term trends using a Bayesian autoregressive multilevel model to predict the
mean and prevalence of risk factors and diseases by including non-linear age
associations and time trends; incorporating study coverage as well as variance
compo[...] our model to use all the data and track provincial representative [...]
variance components to be greater and have larger uncertainty for data sources with
less representativeness; and, ultimately, uncertainty intervals from the Bayesian
model that represent the true availability of information.

Though we are to
develop a sophisticated model, based on real needs and the complexities of real
situations, to estimate missing information, this does not obviate the need for
gathering [...]

All that has been said about modeling and its challenges in complex conditions
itself creates career opportunities for young researchers to learn and train
further, and this capacity building will ultimately lead to knowledge production in
the country.

As a bottom line, obtaining estimates of time
trends after modeling all diseases, risk factors and injuries under the NASBOD
study can help anybody who works in health systems, especially health policy makers
and politicians, to trace, understand and monitor the epidemiological transition of
non-communicable diseases across the country, and then to launch prevention plans to
reduce the burden of non-communicable diseases, risk factors and injuries and
consequently achieve the new health goal of the World Health Assembly in 2012,2
which is reducing avoidable mortality from non-communicable diseases (NCDs) by 25%
by 2025 (the 25 by 25 goal).

Authors' Contributions
General design was prepared by Farshad Farzadfar and Amir Kasaeian.
The models were designed by Farshad Farzadfar, Kazem Mohammad, Amir Kasaeian,
Mohammad Reza Eshraghian and Abbas Rahimi Foroushani. The primary draft was
prepared by Amir Kasaeian and revised by all co-authors. All authors have [...]

Acknowledgments
The study is funded by the Ministry of Health and Medical Education of the Islamic
Republic of Iran and Setad-e-Ejraie Farmane Imam. The authors would like to express
thanks to Dr. Masoud Moradi for his precise editing of the text and Ms Rosa Hagh
Shenas for her efforts in managing coordinative and administrative processes.

References
1. Chan M. From new
estimates to better data. The Lancet. 2012; 380(9859): 2054.
2. Horton R. Non-communicable diseases: 2015 to 2025. The Lancet. 2013; 381(9866): 509-510.
3. Lim SS, Vos T, Flaxman AD, Danaei G, Shibuya K, Adair-Rohani H, et al. A comparative risk assessment of burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet. 2013; 380(9859): 2224-2260.
4. Murray CJ, Vos T, Lozano R, Naghavi M, Flaxman AD, Michaud C, et al. Disability-adjusted life years (DALYs) for 291 diseases and injuries in 21 regions, 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet. 2013; 380(9859): 2197-2223.
5. Vos T, Flaxman AD, Naghavi M, Lozano R, Michaud C, Ezzati M, et al. Years lived with disability (YLDs) for 1160 sequelae of 289 diseases and injuries 1990-2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet. 2013; 380(9859): 2163-2196.
6. Salomon JA, Wang H, Freeman MK, Vos T, Flaxman AD, Lopez AD, et al. Healthy life expectancy for 187 countries, 1990-2010: a systematic analysis for the Global Burden Disease Study 2010. The Lancet. 2013; 380(9859): 2144-2162.
7. Salomon JA, Vos T, Hogan DR, Gagnon M, Naghavi M, Mokdad A, et al. Common values in assessing health outcomes from disease and injury: disability weights measurement study for the Global Burden of Disease Study 2010. The Lancet. 2013; 380(9859): 2129-2143.
8. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, Aboyans V, et al. Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet. 2013; 380(9859): 2095-2128.
9. Wang H, Dwyer-Lindgren L, Lofgren KT, Rajaratnam JK, Marcus JR, [...] countries, 1970-2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet. 2013; 380(9859): 2071-2094.
10. Danaei G, Finucane MM, Lu Y, Singh GM, Cowan MJ, Paciorek CJ, et al. National, regional, and global trends in fasting plasma glucose and diabetes prevalence since 1980: systematic analysis of health examination surveys and epidemiological studies with 370 country-years and 2·7 million participants. The Lancet. 2011; 378(9785): 31-40.
11. Danaei G, Finucane MM, Lin JK, Singh GM, Paciorek CJ, Cowan MJ, et al. National, regional, and global trends in systolic blood pressure since 1980: systematic analysis of health examination surveys and epidemiological studies with 786 country-years and 5·4 million participants. The Lancet. 2011; 377(9765): 568-577.
12. Finucane MM, Stevens GA, Cowan MJ, Danaei G, Lin JK, Paciorek CJ, et al. National, regional, and global trends in body-mass index since 1980: systematic analysis of health examination surveys and epidemiological studies with 960 country-years and 9·1 million participants. The Lancet. 2011; 377(9765): 557-567.
13. Farzadfar F, Finucane MM, Danaei G, Pelizzari PM, Cowan MJ, Paciorek CJ, et al. National, regional, and global trends in serum total cholesterol since 1980: systematic analysis of health examination surveys and epidemiological studies with 321 country-years and 3·0 million participants. The Lancet. 2011; 377(9765): 578-586.
14. Naghavi M, Abolhassani F, Pourmalek F, Lakeh MM, Jafari N, Vaseghi S, et al. The burden of disease and injury in Iran 2003. Population Health Metrics. 2009; 7(1): 9.
15. Farzadfar F, Delavari A, Malekzadeh R, Mesdaghinia A, Jamshidi [...]rics. Arch Iran Med. 2014; 17(1): 7-15.
16. Goldstein H. Multilevel statistical models: Wiley; 2011.
17. Ntzoufras I. Bayesian modeling using WinBUGS: Wiley; 2011.
18. Gelman A. Data analysis using regression and multilevel/hierarchical models: Cambridge University Press; 2007.
19. Breslow NE, Clayton DG. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association. 1993; 88(421): 9-25.
20.
[...]tions: CRC Press; 2005.
21. Harville DA. Matrix algebra from a statistician's perspective: Springer; 2008.
22. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian [...] Society: Series B (Statistical Methodology). 2002; 64(4): 583-639.
23. Durrleman S, Simon R. Flexible regression models with cubic splines. Statistics in Medicine. 1989; 8(5): 551-561.
24. Marin J-M, Robert CP. Bayesian core: a practical approach to computational Bayesian statistics: Springer; 2007.
25. Chib S, Carlin BP. On MCMC sampling in hierarchical longitudinal models. Statistics and Computing. 1999; 9(1): 17-26.
26. Gelman A, Meng X-L, Stern H. Posterior predictive assessment of model [...] Statistica Sinica. 1996; 6(4): 733-760.
27.
Hastie T, Tibshirani R, Friedman J, Franklin J. The elements of statistical learning: data mining, inference and prediction. The Mathematical Intelligencer. 2005; 27(2): 83-85.
28. Bradshaw D, Nannan N, Groenewald P, Joubert J, Laubscher R, Nijilana B, et al. Provincial mortality in South Africa, 2000-priority-setting for now and benchmark for the future. South African Medical Journal. 2008; 95(7): 496.
29. Stevens G, Dias RH, Thomas KJ, Rivera JA, Carvalho N, Barquera S, et al. Characterizing the epidemiological transition in Mexico: national and subnational burden of diseases, injuries, and risk factors. PLoS Medicine. 2008; 5(6): e125.
30. Begg SJ, Vos T, Barker B, Stanley L, Lopez AD. Burden of disease and injury in Australia in the new millennium: measuring health loss from diseases, injuries and risk factors. Medical Journal of Australia. 2008; 188(1): 36.
31. Asaria P, Fortunato L, Fecht D, Tzoulaki I, Abellan JJ, Hambly P, et al. Trends and inequalities in cardiovascular disease mortality across 7932 English electoral wards, 1982-2006: Bayesian spatial analysis. International Journal of Epidemiology. 2012; 41(6): 1737-1749.
32. Honaker J, King G, Blackwell M. Amelia II: A program for missing data. R Package version 1.5-5. 2011.
33. Parsaeian M, Farzadfar F, Zeraati H, Mahmoudi M, Rahimighazikalayeh G, Navidi I, et al. Application of spatio-temporal model to estimate burden of diseases, injuries and risk factors in Iran 1990 - 2013. Arch Iran Med. 2014; 17(1): 28-32.
34. Vrugt JA, Robinson BA. Treatment of uncertainty using ensemble methods: Comparison of sequential data assimilation and Bayesian model averaging. Water Resources Research. 2007; 43(1): W01411.
35. Gneiting T, Raftery AE. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association. 2007; 102(477): 359-378.
36. Raftery AE, Gneiting T, Balabdaoui F, Polakowski M. Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review. 2005; 133(5): 1155-1174.

Archives of Iranian Medicine, Volume 17, Number 1, January 2014