
2/22/2011 Cross-sectional studies 1

Study designs: Cross-sectional studies,
ecologic studies (and confidence intervals)
Victor J. Schoenbach, PhD
Department of Epidemiology
Gillings School of Global Public Health
University of North Carolina at Chapel Hill
www.unc.edu/epid600/

Principles of Epidemiology for Public Health (EPID600)

2
Signs from around the world
In a Copenhagen airline ticket office:
We take your bags and send them in all
directions.
3
Signs from around the world
In a Norwegian cocktail lounge:
Ladies are requested not to have
children in the bar.
4
Signs from around the world
Rome laundry:
Ladies, leave your clothes here and
spend the afternoon having a good time.
5
Faster keyboarding - 1
I cdnuolt blveiee taht I cluod aulaclty uesdnatnrd waht I
was rdanieg. The phaonmneal pweor of the hmuan mnid,
aoccdrnig to a rscheearch at Cmabrigde Uinervtisy. It
dn'seot mttaer in waht oredr the ltteers in a wrod are, the
olny iprmoatnt tihng is taht the frist and lsat ltteer be in
the rghit pclae. The rset can be a taotl mses and you can
sitll raed it wouthit a porbelm.
Gary C. Ramseyer's First Internet Gallery of Statistics Jokes
http://davidmlane.com/hyperstat/humorf.html (#162)


6
Faster keyboarding - 2
Most of my friends could read this with understanding
and rather quickly I might add. Then I had them read a
statistical bit of literature:
Miittluvraae asilyans sattes an idtenossiy ctuoonr epilsle
is the itternoiecsno of a panle pleralal to the xl-yapne and
the sruacfe of a btiiarave nmarol dbttiisruein.
Gary C. Ramseyer's First Internet Gallery of Statistics Jokes
http://davidmlane.com/hyperstat/humorf.html (#162)



10/15/2001 Cross-sectional studies 8
Today's outline
Cross-sectional studies (and sampling)
Ecologic studies
Confidence intervals
2/10/2009 Cross-sectional studies 9
Cross-sectional studies
Cross-sectional studies include surveys
People are studied at a point in time, without
follow-up.
Can combine a cross-sectional study with follow-up
to create a cohort study.
Can conduct repeated cross-sectional studies to
measure change in a population.

2/22/2011 Cross-sectional studies 10
Cross-sectional studies
Number of uninsured Americans rises to 50.7
million. (USA Today, 9/17/2010; data from Census Bureau)
In 2007-2008, almost one in five children older than
5 years was obese. (Health, United States, 2010; data from
the National Health and Nutrition Examination Survey)
35% (~7.4 million) of births to U.S. women during
the preceding 5 years were mistimed or unwanted
(2002 National Survey of Family Growth, Series 23, No. 25, Table 21)
[Source: www.cdc.gov/nchs/]
2/10/2009 Cross-sectional studies 11
Cross-sectional studies
Incidence information is not available from a typical
cross-sectional study
Sometimes can reconstruct incidence from historical
information
Example: the incidence proportion of quitting
smoking, called the quit ratio:
ex-smokers / ever-smokers
is calculated from survey data.
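As a minimal sketch of the quit-ratio arithmetic, assuming hypothetical survey counts (the numbers and the function name `quit_ratio` are illustrative, not from the lecture) and taking ever-smokers to be ex-smokers plus current smokers:

```python
def quit_ratio(ex_smokers: int, current_smokers: int) -> float:
    """Quit ratio = ex-smokers / ever-smokers (ever = ex + current)."""
    ever_smokers = ex_smokers + current_smokers
    if ever_smokers == 0:
        raise ValueError("no ever-smokers in the sample")
    return ex_smokers / ever_smokers


# Hypothetical counts from a single cross-sectional survey
print(quit_ratio(ex_smokers=300, current_smokers=200))
```

Both counts come from the same one-time survey, which is why this incidence-like measure can be reconstructed without follow-up.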
10/15/2001 Cross-sectional studies 12
Measure prevalence at a point in time
Snapshot of a population, a still life
Can measure attitudes, beliefs, behaviors, personal or
family history, genetic factors, existing or past health
conditions, or anything else that does not require
follow-up to assess.
The source of most of what we know about the
population
2/22/2011 Cross-sectional studies 13
Population census
A cross-sectional study of an entire
population
Provides the denominator data for
many purposes (e.g., estimation of
rates, assessing generalizability,
projecting from smaller studies)
A huge effort: people can be difficult to
find and to count, and may not want to
provide data
Some countries maintain accurate and
current registries of the entire country
2/22/2011 Cross-sectional studies 14
National surveys conducted by NCHS
National Health Interview Survey (NHIS):
household interviews
National Health and Nutrition Examination
Survey (NHANES): interviews and physical
examinations
National Survey of Family Growth (NSFG):
household interviews
National Health Care Survey (NHCS):
medical records
2/22/2011 Cross-sectional studies 15
National surveys
Designed to be representative of the entire country
Modes: household interview, telephone, mail
Employ complex sampling designs to optimize efficiency
(tradeoff between information and cost)
Logistically challenging (answering machines, cellphones, . . .)
See presentation by Dr. Anjani Chandra at
www.minority.unc.edu/institute/2003/materials/slides/Chandra-20030522.ppt
10/15/2001 Cross-sectional studies 16
Example: National Health Interview Survey
Conducted every year in U.S. by National
Center for Health Statistics (CDC)
Stratified, multistaged, household survey
that covers the civilian noninstitutionalized
population of the United States
Redesigned every decade to use new
census

2/10/2009 Cross-sectional studies 17
multistaged
Improves logistical feasibility and reduces costs
(though reduces precision)
1. Divide population into primary sampling units
(PSUs)

PSU = primary sampling unit: metropolitan statistical
area, county, group of adjacent counties

2/10/2009 Cross-sectional studies 18
multistaged
2. Select sample of census block groups (SSUs)
within each selected PSU
3. Map each selected census block group or
examine building permits
4. Select one cluster of 4-8 housing units
dispersed evenly throughout the block
NCHS draws a new representative sample for
each week's interviews
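The multistage steps above can be sketched roughly as follows. This is a simplified illustration, not the NHIS procedure: the sampling frame is made up, every stage uses simple random selection, and the names `multistage_sample`, `n_psus`, `n_blocks`, and `cluster_size` are assumptions introduced here.

```python
import random


def multistage_sample(frame, n_psus, n_blocks, cluster_size, seed=0):
    """Select PSUs, then block groups within each PSU, then housing units.

    frame: dict mapping PSU name -> dict of block group name -> list of units.
    Simple random selection at every stage is a simplifying assumption.
    """
    rng = random.Random(seed)
    sample = []
    for psu in rng.sample(sorted(frame), n_psus):             # stage 1: PSUs
        blocks = frame[psu]
        for block in rng.sample(sorted(blocks), n_blocks):    # stage 2: block groups
            units = rng.sample(blocks[block], cluster_size)   # stage 3: housing units
            sample.extend((psu, block, unit) for unit in units)
    return sample


# Hypothetical frame: 20 PSUs x 10 block groups x 50 housing units
frame = {
    f"PSU{p}": {f"BG{b}": list(range(50)) for b in range(10)}
    for p in range(20)
}
households = multistage_sample(frame, n_psus=4, n_blocks=3, cluster_size=6)
print(len(households), "housing units from", len({h[0] for h in households}), "PSUs")
```

Interviewers visit only the selected PSUs, which is the logistical saving; the cost is that households in the same cluster tend to resemble each other, which reduces precision.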
10/15/2001 Cross-sectional studies 19
stratified
US divided into 1,900 PSUs
Largest 52 PSUs are self-representing
Rest of PSUs divided into 73 categories (strata),
based on socioeconomic and demographic variables
Sampling takes place separately within each category
(stratum)


7/30/2010 Cross-sectional studies 20
Sample size and precision
Sample size   Lower 95%   Point estimate   Upper 95%   Width
        100        0.17             0.25        0.33    0.16
        400        0.21             0.25        0.29    0.08
        900        0.22             0.25        0.28    0.06
       1600        0.23             0.25        0.27    0.04
(p = 0.25; p(1 - p) = 0.188; square root of p(1 - p) = 0.43301)
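The table values can be reproduced approximately with the usual normal-approximation interval, p ± 1.96 × sqrt(p(1 - p)/n). The sketch below assumes that this is the formula behind the table (the slide does not say so) and uses a helper name, `wald_ci`, introduced here for illustration.

```python
import math


def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


for n in (100, 400, 900, 1600):
    lower, upper = wald_ci(0.25, n)
    print(f"n={n:>4}  lower={lower:.2f}  upper={upper:.2f}  width={upper - lower:.2f}")
```

Because the standard error shrinks with the square root of n, quadrupling the sample size only halves the width. The table's widths are differences of the rounded bounds, so they can differ from the unrounded width in the last digit.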
3/6/2006 Cross-sectional studies 21
Weighted sampling
Age group    Hypothetical pop (1,000's)   Unweighted sample   Weighted sample
20-39 yrs                        18,000                 900               400
40-59 yrs                        18,000                 900               400
60-69 yrs                         8,000                 400               400
Total                            44,000               2,200             1,200
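In the "unweighted" design every stratum is sampled at the same rate (1 per 20,000 people), so no weights are needed. In the "weighted" design each stratum contributes 400 people, so each respondent must be weighted by the number of people he or she represents. The sketch below uses the population and sample sizes from the table; the stratum prevalences are made up purely for illustration, and the function names are assumptions.

```python
# (population in persons, equal-allocation sample size) from the table
strata = {
    "20-39": (18_000_000, 400),
    "40-59": (18_000_000, 400),
    "60-69": (8_000_000, 400),
}

# HYPOTHETICAL prevalence observed in each stratum's sample (illustration only)
sample_prev = {"20-39": 0.10, "40-59": 0.20, "60-69": 0.30}


def sampling_weights(strata):
    """Weight = number of people in the population each respondent represents."""
    return {group: pop / n for group, (pop, n) in strata.items()}


def weighted_prevalence(strata, sample_prev):
    weights = sampling_weights(strata)
    cases = sum(weights[g] * n * sample_prev[g] for g, (_, n) in strata.items())
    total = sum(weights[g] * n for g, (_, n) in strata.items())
    return cases / total


def unweighted_prevalence(strata, sample_prev):
    cases = sum(n * sample_prev[g] for g, (_, n) in strata.items())
    total = sum(n for _, n in strata.values())
    return cases / total


print(sampling_weights(strata))
print(round(weighted_prevalence(strata, sample_prev), 3))
print(round(unweighted_prevalence(strata, sample_prev), 3))
```

Ignoring the weights would over-represent the oversampled 60-69 stratum and, with these made-up prevalences, overstate the population prevalence.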
10/15/2001 Cross-sectional studies 22
stratified
Also place census blocks into categories and
sample within each
Oversample some strata

2/10/2009 Cross-sectional studies 23
Defined population
Studies, especially cross-sectional studies, are easiest to
interpret when they are based in a population that has some
existence apart from the study itself (defined population)
1. Political subdivision (city, county, state)
2. Institutional (HMO, employer, profession)
Probability sampling enables statistical generalizability to
the defined population
2/22/2011 Cross-sectional studies 24
Surveys of sentinel populations
HIV seroprevalence survey in three county STD
clinics in central NC in 1988
3,000 anonymous, unlinked, leftover sera
Anonymous questionnaire for demographics
and risk factors
[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV
seroprevalence in sexually transmitted disease clients in a low-prevalence southern
state. Ann Epidemiol 1993;3:281-288]
10/15/2001 Cross-sectional studies 25
HIV seroprevalence
[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV
seroprevalence in sexually transmitted disease clients in a low-prevalence southern
state. Ann Epidemiol 1993;3:281-288]
Group % HIV+
Homosexual men 46
Bisexual men 25
Heterosexual men 1.6
Women 0.6
Total 2.5

10/14/2003 Cross-sectional studies 26
Seroprevalence (% HIV+) by risk factors
Characteristic Gay Hetero Women
Syphilis (history/current) 53 9.0 3
Gonorrhea (history) 37 2.6 1
Anal intercourse 41 1.7 2
Paid for sex 5.2
[Schoenbach VJ, Landis SE, Weber DJ, Mittal M, Koch GG, Levine PH. HIV
seroprevalence in sexually transmitted disease clients in a low-prevalence southern state.
Ann Epidemiol 1993;3:281-288]
10/15/2001 Cross-sectional studies 27
Interpretation
Measures prevalence: if incidence is our
real interest, prevalence is often not a good
surrogate measure
Studies only survivors and stayers
May be difficult to determine whether a
cause came before an effect (exception:
genetic factors)
10/15/2001 Cross-sectional studies 28
Other points
Can select participants by exposure or from the population overall
Can select by disease, though the study then may not be
distinguishable from a case-control study with
prevalent cases
10/15/2001 Cross-sectional studies 29
Outline
Cross-sectional studies (and sampling)
Ecologic studies
Confidence intervals
10/15/2001 Cross-sectional studies 30
Ecologic studies
Most study designs (cross-sectional,
case-control, cohort, intervention trials) can be carried
out with individuals or with groups
Group-level studies which use routinely collected
data are easier and less costly
Group-level studies that involve interventions
may not be easier or less costly
3/6/2006 Cross-sectional studies 31
Types of group-level variables
Summary of individual-level variable (e.g.,
median household income, % with high
school diploma)
Property of the aggregate (e.g.,
neighborhood grocery stores, seat belt
legislation, community competence)
2/22/2011 Cross-sectional studies 32
Interpretation
Link between summary exposure variable and
individual-level outcome must be inferred
Inference from group to individual is not
always sound
Example: Male Circumcision and HIV
Source: Bongaarts J, et al. The relationship between male circumcision and HIV infection in African populations. AIDS 1989; 3(6): 373-7.
2/22/2011 Cross-sectional studies 33
(Slope indicates strength of relationship;
r indicates linearity)
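To illustrate the distinction between slope and r, here is a small sketch with made-up group-level points (not the circumcision/HIV data): two data sets with the same least-squares slope but different correlation coefficients. The function name `slope_and_r` is introduced here for illustration.

```python
import math


def slope_and_r(xs, ys):
    """Least-squares slope and Pearson correlation for paired values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Hypothetical group-level exposure and outcome summaries
exposure = [1, 2, 3, 4]
on_the_line = [2, 4, 6, 8]   # points fall exactly on a line
scattered = [3, 3, 5, 9]     # same slope, more scatter around the line

print(slope_and_r(exposure, on_the_line))
print(slope_and_r(exposure, scattered))
```

Both data sets show the same change in outcome per unit of exposure; r drops only because the second set of points lies less tightly along the line.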
10/15/2001 Cross-sectional studies 34
Outline
Cross-sectional studies (and sampling)
Ecologic studies
Confidence intervals
3/8/2006 Cross-sectional studies 35
Confidence intervals
Provide a plausible range for the quantity
being estimated
Width indicates the precision of an estimate
for a given level of confidence
Confidence intervals quantify only random
error from sampling variation, not systematic
error from nonresponse, study design, etc.
10/15/2001 Cross-sectional studies 36
Confidence level vs. precision
The more vague my estimate, the more
confident I can be that it includes the
population parameter: I am 100%
confident that the prevalence of HIV is
between 0 and 100%.
The more specific my estimate, the lower
my confidence: I am 0% confident that
the prevalence of HIV is 5.23%
10/12/2004 Cross-sectional studies 37
Confidence intervals: interpretation
Simple interpretations are typically not
precise
Precise interpretations are typically not
simple
10/15/2001 Cross-sectional studies 38
Simple but imprecise
There is 95% confidence that the interval
contains the true value

True, but begs the question of how to
define "confidence"
10/15/2001 Cross-sectional studies 39
Simple but imprecise
There is a 95% probability that the interval
contains the true value

Not quite correct: probability (as
conventionally defined) applies to a process,
not to a single instance
3/7/2006 Cross-sectional studies 40
Probability applies to a process: example
A 95% confidence interval can be viewed as a
measurement or estimation process that will
be correct (the interval includes the true
value of the parameter) 95% of the time and
incorrect 5% of the time.
Let us make up another estimation process
that will be correct (about) 95% of the time.
6/29/2002 Cross-sectional studies 41
Why probability applies to a process
Estimate your gender by flipping a coin 5 times -
if the result is 5 heads estimate your gender to
be its opposite; otherwise estimate your gender
to be what you think it is now.
Probability that estimate will be correct is
(1 - Probability of 5 heads) = 1 - (1/2)^5 = 0.97 = 97%
Probability that estimate will be incorrect is 3%
6/29/2002 Cross-sectional studies 42
Why probability applies to a process
So we now have a measurement process that
will be correct 97% of the time. We will use it
to measure your gender.
Flip the coin 5 times, and suppose you get 5
heads
Is there a 97% probability that you are of the
opposite sex?
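A small simulation can make the point concrete. It is a sketch under one assumption that the slides imply but do not state: what you currently think your gender is, is correct. The process is then right about 97% of the time over many repetitions, yet in the particular runs that produce 5 heads it is always wrong.

```python
import random
from fractions import Fraction

P_FIVE_HEADS = Fraction(1, 2) ** 5   # 1/32
P_CORRECT = 1 - P_FIVE_HEADS         # 31/32, about 0.97


def run_process(rng):
    """One run of the coin procedure: returns (five_heads, estimate_correct)."""
    five_heads = all(rng.random() < 0.5 for _ in range(5))
    # Assumption: the person's current belief is right, so the estimate
    # is wrong exactly when 5 heads tell us to flip it.
    return five_heads, not five_heads


def simulate(trials, seed=0):
    rng = random.Random(seed)
    results = [run_process(rng) for _ in range(trials)]
    overall = sum(correct for _, correct in results) / trials
    after_five_heads = [correct for heads, correct in results if heads]
    conditional = (
        sum(after_five_heads) / len(after_five_heads) if after_five_heads else None
    )
    return overall, conditional


overall, conditional = simulate(100_000)
print(float(P_CORRECT), overall, conditional)
```

The 97% describes the long-run behavior of the procedure, not the chance that any one result, once observed, is right.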
2/22/2011 Cross-sectional studies 43
Precise but not simple
A 95% confidence interval is:
1. obtained by using a procedure that will include
the population parameter being estimated 95%
of the time
2. the set of all population values which are likely
to yield a sample like the one we obtained
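The first definition can be checked by simulation. The sketch below repeatedly samples from a population with a known prevalence, builds a normal-approximation 95% interval each time, and counts how often the interval includes the true value. The true prevalence, sample size, number of repetitions, and function names are all assumptions chosen for illustration.

```python
import math
import random


def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def coverage(p_true, n, reps, seed=0):
    """Fraction of repeated studies whose 95% CI includes p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        cases = sum(rng.random() < p_true for _ in range(n))  # one study
        lower, upper = wald_ci(cases / n, n)
        hits += lower <= p_true <= upper
    return hits / reps


print(coverage(p_true=0.25, n=400, reps=2000))
```

The result should be close to, though not exactly, 0.95, both because the number of repetitions is finite and because the normal approximation is itself approximate.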
10/15/2001 Cross-sectional studies 44
Suppose that this line represents the value
of the parameter we are trying to estimate
True value
10/15/2001 Cross-sectional studies 45
Possible estimates of that parameter in N
identical studies (shows sampling variation)
[Figure: dot plot of the N study estimates, clustered around the true value]
Study estimates
True value
10/15/2001 Cross-sectional studies 46
One possible true value and how it would
manifest, on average, in N identical studies
[Figure: dot plot of the sampling distribution around the true value, with the central 95% marked]
95% of the distribution
True value
10/15/2001 Cross-sectional studies 47
Estimate from one study of a given size
Estimate
?
10/14/2003 Cross-sectional studies 48
[Figure: dot plot of the sampling distribution for one candidate true value, shown relative to the estimate]
A possible true value with < 2.5% chance of
being observed at or beyond the estimate
95% of the distribution
Estimate
?
10/15/2001 Cross-sectional studies 49
[Figure: dot plot of the sampling distribution for one candidate true value, shown relative to the estimate]
A possible true value with > 2.5% probability
of being observed at or beyond the estimate
95% of the distribution
Estimate
?
10/15/2001 Cross-sectional studies 50
[Figure: dot plot of the sampling distribution for one candidate true value, shown relative to the estimate]
A possible true value with > 2.5% probability
of being observed at or beyond the estimate
95% of the distribution
Estimate
?
10/15/2001 Cross-sectional studies 51
[Figure: dot plot of the sampling distribution for one candidate true value, shown relative to the estimate]
A possible true value with < 2.5% probability of
being observed at or beyond the estimate
95% of the distribution
Estimate
?
10/14/2003 Cross-sectional studies 52
[Figure: four dot plots, one sampling distribution for each candidate true value, shown relative to the estimate]
What the confidence interval represents
95% confidence interval
?
10/15/2001 Cross-sectional studies 53
[Figure: nine dot plots of sampling distributions for candidate true values spanning the interval]
What the confidence interval represents
95% confidence interval
3/8/2006 Cross-sectional studies 54
One possible true value and how it would
manifest, on average, in N identical studies
[Figure: dot plot of the sampling distribution around the true value]
1.96 x s.e. | 1.96 x s.e.
True value
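10/15/2001 Cross-sectional studies 55
Confidence intervals: another take
10/15/2001 Cross-sectional studies 56
One possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 57
Another possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 58
A 3rd possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 59
A 4th possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 60
A 5th possible population
[Figure: population diagram with 1 case marked (O)]
10/15/2001 Cross-sectional studies 61
A 6th possible population
[Figure: population diagram with 4 cases marked (O)]
10/15/2001 Cross-sectional studies 62
etc.
[Figure: population diagram with 4 cases marked (O)]
10/15/2001 Cross-sectional studies 63
There are 1.6 x 10^60 possible populations
(from no cases to all cases)
[Figure: population diagram with 4 cases marked (O)]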
10/15/2001 Cross-sectional studies 64
Suppose this is the population
(prevalence = 15%)
[Figure: population diagram with 30 cases marked (O)]
10/15/2001 Cross-sectional studies 65
Take a sample (n=10)
[Figure: the same population diagram, with 30 cases marked (O)]
10/15/2001 Cross-sectional studies 66
The sample
[Figure: the sample of 10, with 2 cases marked (O)]
10/15/2001 Cross-sectional studies 67
Make point estimate of prevalence
[Figure: the sample of 10, with 2 cases marked (O)]
6/29/2005 Cross-sectional studies 68
Interval estimate
What are all the possible populations that
would be expected to yield this prevalence
in a sample of size 10?

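10/15/2001 Cross-sectional studies 69
This one is not possible
[Figure: population diagram with 1 case marked (O)]
3/8/2006 Cross-sectional studies 70
Possible, but VERY UNLIKELY
[Figure: population diagram with 2 cases marked (O)]
3/8/2006 Cross-sectional studies 71
Not quite 2.5% probability (2.1%, in fact)
[Figure: population diagram with 5 cases marked (O)]
3/8/2006 Cross-sectional studies 72
Yields just about 2.5% (3%, actually) probability of
selecting 2 (or more) cases in 10
[Figure: population diagram with 6 cases marked (O)]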
3/8/2006 Cross-sectional studies 73
One possible true value and how it would
manifest, on average, in N identical studies
[Figure: dot plot of the sampling distribution around the true value, with the central 95% marked]
95% of the distribution
True value
3/8/2006 Cross-sectional studies 74
Just above 2.5% (actually 2.6%) probability of
selecting 2 (or fewer) cases in 10
[Figure: population diagram with more than half of the people marked as cases (O)]
3/8/2006 Cross-sectional studies 75
Just below 2.5% (actually 2.4%) probability of
selecting 2 (or fewer) cases in 10
[Figure: population diagram with more than half of the people marked as cases (O)]
3/8/2006 Cross-sectional studies 76
Interval estimate for 2/10
Lower bound: 2.5% (5 cases)
Upper bound: 55% (110 cases)
Meaning: Our sample of 10 with 2 cases provides
evidence to exclude, at conventional error
tolerance, populations with fewer than 5 cases or
more than 110 cases. Populations with 5-110
cannot be excluded as likely sources for this
sample.
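The tail probabilities quoted on the preceding slides can be approximated with the hypergeometric distribution (sampling 10 people without replacement). The sketch assumes a population of 200, inferred from the slides (5 cases = 2.5%, 110 cases = 55%, and 2^200 is about 1.6 x 10^60); the helper names are introduced here for illustration.

```python
from math import comb

N = 200   # assumed population size, inferred from "2.5% (5 cases)"
n = 10    # sample size
x = 2     # cases observed in the sample


def pmf(k, cases):
    """P(exactly k cases in the sample) when the population holds `cases` cases."""
    return comb(cases, k) * comb(N - cases, n - k) / comb(N, n)


def p_at_least(k_min, cases):
    return sum(pmf(k, cases) for k in range(k_min, min(n, cases) + 1))


def p_at_most(k_max, cases):
    return sum(pmf(k, cases) for k in range(0, min(k_max, cases) + 1))


print(f"possible populations: {2**N:.1e}")
for cases in (1, 2, 5, 6):
    print(f"{cases:>3} cases: P(2 or more in sample) = {p_at_least(x, cases):.3f}")
for cases in (109, 110):
    print(f"{cases:>3} cases: P(2 or fewer in sample) = {p_at_most(x, cases):.3f}")
```

Populations whose tail probability falls below about 2.5% on either side are the ones the sample lets us exclude; the rest make up the interval estimate.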
3/8/2006 Cross-sectional studies 77
Interval estimate for 2/10
Actual population prevalence was 15%,
which in fact is between 2.5% and 55%.
2.5% to 55% is a very wide interval, i.e.,
a very imprecise estimate
To make it more precise, we need a
larger sample

78
Signs from around the world: Germany
A sign posted in Germany's Black Forest:
It is strictly forbidden on our black forest
camping site that people of different sex, for
instance, men and women, live together in
one tent unless they are married with each
other for that purpose.
79
Signs from around the world: Finland
On the faucet in a Finnish washroom:
To stop the drip, turn cock to right.
