
Causal Comparative Research

Like correlational research, causal-comparative research is sometimes treated as a type of
descriptive research because it too describes conditions that already exist. Causal-comparative
research, however, also attempts to determine reasons, or causes, for the existing condition.
Causal-comparative research is thus a unique type of research, with its own research
procedures.
As you learned in Chapter 1, in causal-comparative research the researcher attempts to
determine the cause, or reason, for existing differences in the behavior or status of groups or
individuals. In other words, established groups are already different on some variable, and the
researcher attempts to identify the major factor that has led to this difference. Such research is
sometimes called ex post facto, which is Latin for "after the fact," because both the effect
and the alleged cause have already occurred and must be studied in retrospect. For example, a
researcher may hypothesize that participation in preschool education is the major factor
contributing to differences in the social adjustment of first graders. To examine this
hypothesis, the researcher would select a sample of first graders who had participated in
preschool education and a sample of first graders who had not and would then compare the
social adjustment of the two groups. If the children who participated in preschool education
exhibited the higher level of social adjustment, the researcher's hypothesis would be
supported. Thus, the basic causal-comparative approach involves starting with an effect
(i.e., social adjustment) and seeking possible causes (i.e., did preschool affect it?).
A variation of the basic approach starts with a cause and investigates its effect on
some variable; such research is concerned with questions such as "What is the effect of X?" For
example, a researcher may investigate the long-range effect of failure to be promoted to the
seventh grade on the self-concept of children. The researcher may hypothesize that children
who are socially promoted (i.e., promoted despite failing grades) have higher self-concepts at
the end of the seventh grade than children who are retained or held back to repeat the sixth
grade. At the end of a school year, the researcher would identify a group of seventh graders
who had been socially promoted to the seventh grade the year before and a group of sixth
graders who had repeated the sixth grade (i.e., the cause). The self-concepts of the two
groups (i.e., the effect) would be compared. If the socially promoted group exhibited higher
scores on a measure of self-concept, the researcher's hypothesis would be supported. The
basic approach, which involves starting with effects and investigating causes, is sometimes
referred to as retrospective causal-comparative research. The variation, which starts with
causes and investigates effects, is called prospective causal-comparative research.
Retrospective causal-comparative studies are much more common in educational research.
Beginning researchers often confuse causal-comparative research with both
correlational research and experimental research. Correlational and causal-comparative
research are probably confused because of the lack of variable manipulation common to both
and the similar cautions regarding interpretation of results. There are definite differences,
however. Causal-comparative studies typically involve two (or more) groups of participants
and one dependent variable, whereas correlational studies involve two (or more) variables
and one group of participants. Also, causal-comparative studies focus on differences
between groups, whereas correlational studies involve relations among variables. A common
misconception held by beginning and even more experienced researchers is that causal-comparative
research is somehow better or more rigorous than correlational research. Perhaps
this misconception arises because the term causal-comparative sounds more official than
correlation, and we all have heard the research mantra: "Correlation does not imply
causation." In fact, however, both causal-comparative and correlational methods fail to
produce true experimental data, a point to remember as you continue your causal-comparative
and correlational research.
It is understandable that causal-comparative and experimental research are at first
difficult to distinguish; both attempt to establish cause-effect relations, and both involve
group comparisons. The major difference is that in experimental research the independent
variable, the alleged cause, is manipulated by the researcher, whereas in causal-comparative
research the variable is not manipulated because it has already occurred (as a result,
researchers prefer the term grouping variable, rather than independent variable). In other
words, in an experimental study, the researcher selects a random sample from a population
and then randomly divides the sample into two or more groups. In this way, the researcher
manipulates the independent variable; that is, the researcher determines who is going to get
what treatment. Any participant's group assignment is independent of any characteristic he or
she may possess. In causal-comparative research, in contrast, individuals are not randomly
assigned to treatment groups because they are in established groups (e.g., male/female;
college graduates/nongraduates) before the research begins. In causal-comparative
research the groups already differ in terms of the key variable in question. The difference
was not brought about by the researcher; it is not independent of the participants'
characteristics.

Grouping variables in causal-comparative studies cannot be manipulated (e.g.,
socioeconomic status), should not be manipulated (e.g., number of cigarettes smoked per
day), or simply are not manipulated but could be (e.g., method of reading instruction).
Indeed, it is impossible or not feasible to manipulate an independent variable for a large
number of important educational problems. For instance, researchers cannot control an
organismic variable, which is a characteristic of a subject or organism. Age and sex are
common organismic variables. Additionally, ethical considerations often prevent
manipulation of a variable that could be manipulated but should not be, particularly when the
manipulation may cause physical or mental harm to participants. For example, suppose a
researcher were interested in determining the effect of mothers' prenatal care on the
developmental status of their children at age 1. Clearly, it would not be ethical to deprive a
group of mothers-to-be of prenatal care for the sake of a research study when such care is
considered to be extremely important to the health of both mothers and children. Thus,
causal-comparative research permits investigation of a number of variables that cannot be studied
experimentally.
Figure 9.1 shows grouping variables often studied in causal-comparative research.
These variables are used to compare two or more levels of a dependent variable. For
example, a causal-comparative researcher may compare the retention of facts by
participants younger than 50 to the retention of facts by participants older than 50, the
attention span of students with high anxiety to that of students with low anxiety, or the
achievement of first graders who attended preschool to the achievement of first graders who
did not attend preschool. In each case, preexisting participant groups are compared.
Like correlational studies, causal-comparative studies help to identify variables
worthy of experimental investigation. In fact, causal-comparative studies are sometimes
conducted solely to identify the probable outcome of an experimental study. Suppose, for
example, a superintendent were considering implementing computer-assisted remedial math
instruction in his school system. Before implementing the instructional program, the
superintendent might consider trying it out on an experimental basis for a year in a number of
schools or classrooms. However, even such limited adoption would require costly new
equipment and teacher training. Thus, as a preliminary
CONDUCTING A CAUSAL COMPARATIVE STUDY
The basic causal-comparative design is quite simple, and although the grouping
variable is not manipulated, control procedures can be exercised to improve interpretation of
results. Causal-comparative studies also involve a wider variety of statistical techniques than
the other types of research thus far discussed.
Design and Procedure
The basic causal-comparative design involves selecting two groups that differ on
some variable of interest and comparing them on some dependent variable. As Table 9.1
indicates, the researcher selects two groups of participants, which are sometimes referred to as
experimental and control groups but should more accurately be referred to as comparison
groups. The groups may differ in two ways: either one group possesses a characteristic that
the other does not (Case A), or both groups have a characteristic but to differing degrees or
amounts (Case B). An example of Case A is a comparison of two groups, one composed of
children with brain injuries and the other composed of children without brain injuries. An
example of Case B is a comparison of two groups, one
composed of individuals with strong self-concepts and the other composed of individuals
with weak self-concepts. Another Case B example is a comparison of the algebra achievement
of two groups, those who had learned algebra via traditional instruction and those who had
learned algebra via computer-assisted instruction. In both Case A and Case B designs, the
performance of the groups is compared using some valid measure selected from the types of
instruments discussed in Chapter 6.
Definition and selection of the comparison groups are very important parts of the
causal-comparative procedure. The variable differentiating the groups must be clearly and
operationally defined because each group represents a different population and the way in
which the groups are defined affects the generalizability of the results. If a researcher wanted to
compare a group of students with an unstable home life to a group of students with a stable
home life, the terms unstable and stable would have to be operationally defined. An unstable
home life could refer to any number of things, such as life with a parent who abuses alcohol,
who is violent, or who neglects the child. It could refer to a combination of these or other
factors. Operational definitions help define the populations and guide sample selection.
Random selection from the defined populations is generally the preferred method of
participant selection. The important consideration is to select samples that are representative
of their respective populations. Note that in causal-comparative research the researcher samples
from two already existing populations, not from a single population. The goal is to have groups
that are as similar as possible on all relevant variables except the grouping variable. To
determine the equality of groups, information on a number of background and current status
variables may be collected and compared for each group. For example, information on age, years
of experience, gender, and prior knowledge may be obtained and examined for the groups being
compared. The more similar the two groups are on such variables, the more homogeneous they are on
everything but the variable of interest. This homogeneity makes for a stronger study and reduces
the number of possible alternative explanations of the research findings. Not surprisingly,
then, a number of control procedures correct for identified inequalities on such variables.
Control Procedures
Lack of randomization, manipulation, and control are all sources of weakness in a
causal-comparative study. In other study designs, random assignment of participants to
groups is probably the best way to try to ensure equality of groups, but random assignment is
not possible in causal-comparative studies because the groups are naturally formed before the
start of the study. Without random assignment, the groups are more likely to be different on
some important variable (e.g., gender, experience, age) other than the variable under study.
This other variable may be the real cause of the observed difference between the groups. For
example, a researcher who simply compared a group of students who had received preschool
education to a group who had not may conclude that preschool education results in higher
first-grade reading achievement. However, if all preschool programs in the region in which the
study was conducted were private and required high tuition, the researcher would really be
investigating the effects of preschool education combined with membership in a well-to-do
family. Perhaps parents in such families provide early informal reading instruction for their
children. In this case, it is very difficult to disentangle the effects of preschool education
from the effects of affluent families on first-grade reading. A researcher aware of the situation
could control for this variable by studying only children of well-to-do parents. Thus, the two
groups to be compared would be equated with respect to the extraneous variable of parents'
income level. This example is but one illustration of a number of statistical and nonstatistical
methods that can be applied in an attempt to control for extraneous variables.
The following sections describe three control techniques: matching, comparing
homogeneous groups or subgroups, and analysis of covariance. These and other
techniques are discussed in more detail in Chapter 10.
Matching
Matching is a technique for equating groups on one or more variables. If researchers
identify a variable likely to influence performance on the dependent variable, they may
control for that variable by pair-wise matching of participants. In other words, for each
participant in one group, the researcher finds a participant in the other group with the same
or a very similar score on the control variable. If a participant in either group does not have a
suitable match, the participant is eliminated from the study. Thus, the resulting matched
groups are identical or very similar with respect to the identified extraneous variable. For
example, if a researcher matched participants in each group on IQ, a participant in one group
with an IQ of 140 would be matched with a participant with an IQ at or near 140 in the other
group. A major problem with pair-wise matching is that invariably some participants have no
match and must therefore be eliminated from the study. The problem becomes even more
serious when the researcher attempts to match participants on two or more variables
simultaneously.
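The pair-wise matching procedure described above can be illustrated with a minimal Python sketch. The participant IDs, the IQ scores, the tolerance of 3 points, and the function name match_pairs are all hypothetical choices made for illustration; the greedy nearest-score strategy is only one of several ways a researcher might carry out the matching.

```python
def match_pairs(group_a, group_b, tolerance):
    """Greedy one-to-one matching on a single control variable.

    group_a, group_b: lists of (participant_id, score) tuples.
    Returns (pairs, unmatched_a, unmatched_b).
    """
    available = sorted(group_b, key=lambda p: p[1])
    pairs, unmatched_a = [], []
    for pid, score in sorted(group_a, key=lambda p: p[1]):
        best = None
        for candidate in available:
            diff = abs(candidate[1] - score)
            # Keep the closest candidate that falls within the tolerance.
            if diff <= tolerance and (best is None or diff < abs(best[1] - score)):
                best = candidate
        if best is None:
            # No suitable match: this participant is dropped from the study.
            unmatched_a.append((pid, score))
        else:
            available.remove(best)
            pairs.append(((pid, score), best))
    # Whatever is left in `available` had no match either.
    return pairs, unmatched_a, available


# Hypothetical IQ scores for two preexisting groups.
preschool = [("P1", 140), ("P2", 102), ("P3", 95), ("P4", 121)]
no_preschool = [("N1", 138), ("N2", 100), ("N3", 97), ("N4", 80)]

pairs, dropped_a, dropped_b = match_pairs(preschool, no_preschool, tolerance=3)
for a, b in pairs:
    print(a, "matched with", b)
print("Eliminated:", dropped_a + dropped_b)
```

With these made-up scores, one participant from each group has no partner within the tolerance and is eliminated, which illustrates the loss of participants noted above.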
Comparing Homogeneous Groups or Subgroups
Another way to control extraneous variables is to compare groups that are
homogeneous with respect to the extraneous variable. In the study about preschool
attendance and first-grade achievement, the decision to compare children only from well-to-do
families is an attempt to control extraneous variables by comparing homogeneous groups. If,
in another situation, IQ were an identified extraneous variable, the researcher could limit
groups only to participants with IQs between 85 and 115 (i.e., average IQ). This procedure
may lower the number of participants in the study and also limit the generalizability of the
findings because the sample of participants includes such a limited range of IQ.
A similar but more satisfactory approach is to form subgroups within each group to
represent all levels of the control variable. For example, each group may be divided into
subgroups based on IQ: high (e.g., 116 and above), average (e.g., 85 to 115), and low (e.g., 84
and below). The existence of comparable subgroups in each group controls for IQ. This
approach also permits the researcher to determine whether the target grouping variable
affects the dependent variable differently at different levels of IQ, the control variable. That is,
the researcher can examine whether the effect on the dependent variable is different for each
subgroup.
If subgroup comparison is of interest, the best approach is not to do separate analyses
for each subgroup but to build the control variable into the research design and analyze the
results with a statistical technique called factorial analysis of variance. A factorial analysis of
variance (discussed further in Chapter 13) allows the researcher to determine the effects of the
grouping variable (for causal-comparative designs) or independent variable (for
experimental designs) and the control variable both separately and in combination. In other words,
factorial analysis of variance tests for an interaction between the independent/grouping variable
and the control variable such that the independent/grouping variable operates differently at
each level of the control variable. For example, a causal-comparative study of the effects of
two different methods of learning fractions may include IQ as a control variable. One potential
interaction between the grouping and control variables would be that a method involving
manipulation of blocks is more effective than other methods for students with lower IQs, but
the manipulation method is no more effective than other methods for students with higher
IQs.
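The idea of an interaction can be made concrete with a small Python sketch that compares cell means in a 2 x 2 layout (method by IQ level). The scores below are invented for illustration, and the code only computes descriptive cell means and an interaction contrast; it is not a full factorial analysis of variance, which would also test whether the contrast is statistically significant.

```python
from statistics import mean

# Hypothetical fraction-test scores, keyed by (method, IQ level).
scores = {
    ("blocks", "lower IQ"): [78, 82, 80],
    ("other", "lower IQ"): [68, 72, 70],
    ("blocks", "higher IQ"): [88, 90, 92],
    ("other", "higher IQ"): [89, 91, 90],
}

cell_means = {cell: mean(values) for cell, values in scores.items()}


def method_effect(iq_level):
    """Difference between the two methods within one IQ subgroup."""
    return cell_means[("blocks", iq_level)] - cell_means[("other", iq_level)]


effect_lower = method_effect("lower IQ")
effect_higher = method_effect("higher IQ")

# If the method worked the same way at both IQ levels, this would be near 0.
interaction_contrast = effect_lower - effect_higher

print("Method effect, lower IQ:", effect_lower)
print("Method effect, higher IQ:", effect_higher)
print("Interaction contrast:", interaction_contrast)
```

In these made-up data the blocks method has a 10-point advantage in the lower-IQ subgroup and no advantage in the higher-IQ subgroup, which is the pattern of interaction described in the example above.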
Analysis of Covariance
Analysis of covariance is a statistical technique used to adjust initial group differences on
variables used in causal-comparative and experimental studies. In essence, analysis of
covariance adjusts scores on a dependent variable for initial differences on some other
variable related to performance on the dependent variable. For example, suppose we planned a
study to compare two methods, X and Y, of teaching fifth graders to solve math problems. Suppose
further that, on a test of math ability given prior to introducing the new teaching methods, we
found that the group to be taught by method Y scored higher than the group to be taught by method X.
This difference suggests that the method Y group will be superior to the method X group at
the end of the study because members of that group began with higher math ability than
members of the other group. Analysis of covariance statistically adjusts the scores of the
method Y group to remove the initial advantage so that at the end of the study the results can
be fairly compared, as if the two groups started equally.
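The adjustment at the heart of analysis of covariance can be sketched in a few lines of Python. The pretest and posttest scores are hypothetical, and the code shows only the standard adjusted-means step (each group's posttest mean is shifted by the pooled within-group regression slope times the group's distance from the grand pretest mean); a complete analysis would also include a significance test, which is omitted here.

```python
from statistics import mean

# Hypothetical (pretest math ability, posttest score) data for each method.
groups = {
    "X": {"pre": [50, 55, 60], "post": [60, 66, 69]},
    "Y": {"pre": [70, 75, 80], "post": [75, 79, 86]},
}

# Pooled within-group slope of posttest on pretest.
sum_xy = 0.0
sum_xx = 0.0
for data in groups.values():
    pre_mean, post_mean = mean(data["pre"]), mean(data["post"])
    for pre, post in zip(data["pre"], data["post"]):
        sum_xy += (pre - pre_mean) * (post - post_mean)
        sum_xx += (pre - pre_mean) ** 2
slope = sum_xy / sum_xx

# Grand mean of the covariate across all participants.
all_pre = [pre for data in groups.values() for pre in data["pre"]]
grand_pre_mean = mean(all_pre)

# Adjusted mean: what each group's posttest mean would be expected to be
# if the group had started at the grand pretest mean.
adjusted = {
    name: mean(data["post"]) - slope * (mean(data["pre"]) - grand_pre_mean)
    for name, data in groups.items()
}

for name, data in groups.items():
    print(name, "raw mean:", mean(data["post"]), "adjusted mean:", adjusted[name])
```

In these invented data the method Y group has the higher raw posttest mean, but once its initial advantage is removed the adjusted mean for method X is the higher of the two, which shows why unadjusted comparisons can mislead.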

Data Analysis and Interpretation


Analysis of data in causal-comparative studies involves a variety of descriptive and
inferential statistics. All the statistics that may be used in a causal-comparative study may
also be used in an experimental study, and a number of them are described in Chapters 12 and
13. Briefly, however, the most commonly used descriptive statistics are the mean, which
indicates the average performance of a group on a measure of some variable, and the standard
deviation, which indicates the spread of a set of scores around the mean, that is, whether the
scores are relatively close together and clustered around the mean or widely spread out
around the mean (see Chapter 12). The most commonly used inferential statistics are the t
test, used to determine whether the scores of two groups are significantly different from one
another; analysis of variance, used to test for significant differences among the scores for three
or more groups; and chi square, used to compare group frequencies, that is, to see if an event
occurs more frequently in one group than another (see Chapter 13).
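As a rough illustration of these statistics, the following Python sketch computes the mean and standard deviation for two groups, a pooled-variance t statistic for the difference between their means, and a chi-square statistic for a 2 x 2 table of frequencies. All of the numbers are hypothetical, and the sketch stops at the test statistics; judging significance would require comparing them with critical values from a table or statistical software.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical achievement scores for two comparison groups.
group_1 = [12, 14, 16, 18, 20]
group_2 = [8, 10, 12, 14, 16]

# Descriptive statistics.
print("Group 1 mean/SD:", mean(group_1), round(stdev(group_1), 2))
print("Group 2 mean/SD:", mean(group_2), round(stdev(group_2), 2))

# Independent-samples t statistic with a pooled variance estimate.
n1, n2 = len(group_1), len(group_2)
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / df
t_stat = (mean(group_1) - mean(group_2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
print("t =", t_stat, "with df =", df)

# Chi-square statistic for hypothetical frequencies:
# rows are groups, columns are "event occurred" / "event did not occur".
observed = [[30, 20], [20, 30]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi_square += (count - expected) ** 2 / expected
print("chi-square =", chi_square)
```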
Again, remember that interpreting the findings in a causal-comparative study
requires considerable caution. Without randomization, manipulation, and control, it is
difficult to establish cause-effect relations with any great degree of confidence. The
cause-effect relation may in fact be the reverse of the one hypothesized (i.e., the alleged cause may
be the effect and vice versa). Reverse causality is not a reasonable alternative in every case,
however. For example, preschool training may affect reading achievement in third grade, but
reading achievement in third grade cannot affect preschool training. Similarly, one's gender may
affect one's achievement in mathematics, but one's achievement in mathematics certainly does not
affect one's gender! When reverse causality is plausible, it
should be investigated. For example, it is equally plausible that excessive absenteeism
produces, or leads to, involvement in criminal activities as it is that involvement in criminal
activities produces, or leads to, excessive absenteeism. The way to determine the correct
order of causality (which variable caused which) is to determine which one occurred first. If, in
the preceding example, a period of excessive absenteeism were frequently followed by a
student getting in trouble with the law, then the researcher could reasonably conclude that
excessive absenteeism leads to involvement in criminal activities. On the other hand, if a
student's first involvement in criminal activities were preceded by a period of good attendance
but followed by a period of poor attendance, then the conclusion that involvement in criminal
activities leads to excessive absenteeism would be more reasonable.
The possibility of a third, common explanation is also plausible in many situations.
Recall the example of parental attitude affecting both self-concept and achievement,
presented earlier in the chapter. As mentioned, one way to control for a potential common
cause is to compare homogeneous groups. For example, if students in both the strong
self-concept group and the weak self-concept group could be selected from among children whose
parents had similar attitudes, the effects of parents' attitudes would be removed because both groups
would have been exposed to the same parental attitudes. To investigate or control for
alternative hypotheses, the researcher must be aware of them and must present evidence that
they are not better explanations for the behavioral differences under investigation.
Causal-Comparative Research: Definition and Purpose
1. In causal-comparative research, the researcher attempts to determine the cause, or
reason, for existing differences in the behavior or status of groups.
2. The basic causal-comparative approach is retrospective; that is, it starts with an effect
and seeks its possible causes. A variation of the basic approach is prospective; that is,
it starts with a cause and investigates its effect on some variable.
3. An important difference between causal-comparative and correlational research is that
causal-comparative studies involve two (or more) groups of participants and one
grouping variable, whereas correlational studies typically involve two (or more)
variables and one group of participants. Neither causal-comparative nor
correlational research produces true experimental data.
4. The major difference between experimental research and causal-comparative research
is that in experimental research the researcher can randomly form groups and
manipulate the independent variable. In causal-comparative research the groups are
already formed and already differ in terms of the variable in question.
5. Grouping variables in causal-comparative studies cannot be manipulated, should
not be manipulated, or simply are not manipulated but could be.
6. Causal-comparative studies identify relations that may lead to experimental studies,
but only if a relation is established clearly. The alleged cause of an observed
causal-comparative effect may in fact be the supposed cause, the effect, or a third variable
that may have affected both the apparent cause and the effect.
7. The basic causal-comparative design involves selecting two groups differing on
some variable of interest and comparing them on some dependent variable. One group
may possess a characteristic that the other does not, or one group may possess more of
a characteristic than the other.
8. Samples must be representative of their respective populations and similar with
respect to critical variables other than the grouping variable.
Control Procedures
9. Lack of randomization, manipulation, and control are sources of weakness in a
causal-comparative design. It is possible that the groups are different on some other major
variable besides the target variable of interest, and this other variable may be the
cause of the observed difference between the groups.
10. Three approaches to overcoming problems of initial group differences on an extraneous
variable are matching, comparing homogeneous groups or subgroups, and analysis of
covariance.
Data Analysis and Interpretation
11. The descriptive statistics most commonly used in causal-comparative studies are the
mean, which indicates the average performance of a group on a measure of some
variable, and the standard deviation, which indicates how spread out a set of scores is,
that is, whether the scores are relatively close together and clustered around the
mean or widely spread out around the mean.
12. The inferential statistics most commonly used in causal-comparative studies are the
t test, which is used to determine whether the scores of two groups are significantly
different from one another; analysis of variance, used to test for significant
differences among the scores for three or more groups; and chi square, used to see if
an event occurs more frequently in one group than another.
13. Interpreting the findings in a causal-comparative study requires considerable
caution. The alleged cause may in fact be the effect, and vice versa, or a
third factor may be the cause of both variables. The way to determine the correct
order of causality is to determine which one occurred first.

Comparing Longitudinal Academic Achievement of Full-Day and Half-Day Kindergarten Students

ABSTRACT
The authors compared the achievement of children who were enrolled in full-day
kindergarten (FDK) to a matched sample of students who were enrolled in half-day
kindergarten (HDK) on mathematics and reading achievement in Grades 2, 3, and 4, several
years after they left kindergarten. Results showed that FDK students demonstrated
significantly higher achievement at the end of kindergarten than did their HDK counterparts,
but that advantage disappeared quickly by the end of the first grade. Interpretations and
implications are given for that finding.
Key words: academic achievement of full- and half-day kindergarten students, mathematics and
reading success in elementary grades
Coinciding with increases in pre-kindergarten enrollment and the number of parents working
outside of the home, full-day kindergarten (FDK) has become exceedingly popular in the
United States (Gullo & Maxwell, 1997). The number of students attending FDK classes in
the United States rose from 30% in the early 1980s (Holmes & McConnell, 1990) to 55% in
1998 (National Center for Education Statistics, 2000), reflecting societal changes and newly
emerging educational priorities. Whereas kindergarten students were once required to perform
basic skills, such as reciting the alphabet and counting to 20, they are now expected to
demonstrate reading readiness and mathematical reasoning while maintaining the focus and
self-control necessary to work for long periods of time (Nelson, 2000).
In contrast, the popularity of half-day kindergarten (HDK) has decreased for similar
reasons. For example, parents prefer FDK over HDK for the time it affords (Clark & Kirk,
2000) and for providing their children with further opportunities for academic, social, and
personal enrichment (Aten, Foster, & Cobb, 1996; Cooper, Foster, & Cobb, 1998a, 1998b).
The shift in kindergarten preferences has resulted in a greater demand for research on
the effects of FDK in comparison with other scheduling approaches (Gullo & Maxwell,
1997). Fusaro (1997) cautioned that "Before a school district decides to commit additional
resources to FDK classes, it should have empirical evidence that children who attend FDK
manifest greater achievement than children who attend half-day kindergarten" (p. 270).
According to the literature, there is mounting evidence that supports the academic, social, and
language development benefits of FDK curricula (Cryan, Sheehan, Wiechel, &
Bandy-Hedden, 1992; Hough & Bryde, 1996; Karweit, 1992; Lore, 1992; Nelson, 2000). Successful
FDK programs specifically extend traditional kindergarten objectives and use added class
hours to afford children more opportunities to fully integrate new learning (Karweit, 1992).
Furthermore, most education stakeholders support FDK because they believe that it provides
academic advantages for students, meets the needs of busy parents, and allows primary
school teachers to be more effective (Ohio State Legislative Office of Education Oversight
[OSLOEO], 1997).
Length of School Day
According to Wang and Johnstone (1999), the major argument for full-day kindergarten is
that additional hours in school would better prepare children for first grade and would result
in a decreased need for grade retention (p. 370). Furthermore, extending the kindergarten day
provides educational advantages resulting from increased academic emphasis, time on task,
and content coverage (Karweit, 1992; Nelson, 2000; Peck, McCaig, & Sapp, 1998).
Advocates of FDK also contend that a longer school day allows teachers to provide a relaxed
classroom atmosphere in which children can experience kindergarten activities in a less
hurried manner (McConnell & Tesch, 1986). Karweit (1992) argued that consistent school
schedules and a longer school day help parents to better manage family and work
responsibilities while providing more time for individualized attention for young children.
Critics of FDK express concern that children may become overly tired with a full
day of instruction, that children might miss out on important learning experiences at home,
and that public schools should not be in the business of providing custodial child care for
5-year-olds (Elicker & Mathur, 1997, p. 461). Peck and colleagues (1998) argued that some
FDK programs use the extra time to encroach on the first-grade curriculum in an ill-advised
attempt to accelerate children's cognitive learning. However, in a 9-year study of
kindergarten students, the Evansville-Vanderburgh School Corporation (EVSC; 1988) found
that school burnout and academic stress were not issues for FDK students. Others conclude
convincingly that the events that occur in classrooms (e.g., teacher philosophy, staff
development), rather than the length of the school day, determine whether curricula and
instruction are developmentally appropriate for young students (Clark & Kirk, 2000; Elicker
& Mathur, 1997; Karweit, 1994).

Parent Choice
A critical factor driving the growth of FDK is greater parent demand for choice in
kindergarten programs. Although surveys of parents with children in HDK often mention the
importance of balancing education outside the home with quality time in the home, Elicker
and Mathur (1997) found that a majority of these parents would select an FDK program for
their child if given the opportunity. However, Cooper and colleagues (1998a) found that
parents of FDK students were even more supportive of having a choice of programs than
were parents of HDK students.
Although some parents expressed concern about the length of time that children were
away from home, most were content with the option of FDK (Nelson, 2000). In addition to
the belief that FDK better accommodates their work schedules (Nelson), parents of full-day
children expressed higher levels of satisfaction with program schedule and curriculum, citing
benefits similar to those expressed by teachers: more flexibility; more time for
child-initiated, in-depth, and creative activities; and less stress and frustration (Elicker & Mathur,
1997, p. 459). Furthermore, Cooper and colleagues (1998a) found that parents of full-day
students were happy with the increased opportunities for academic learning afforded by FDK
programs.
Student Achievement
Most researchers who compared the academic achievement levels of FDK and HDK
kindergarten students found improved educational performance within FDK programs (Cryan
et al., 1992; Elicker & Mathur, 1997; Holmes & McConnell, 1990; Hough & Bryde, 1996;
Koopmans, 1991; Wang & Johnstone, 1999). In a meta-analysis of FDK research, Fusaro
(1997) found that students who attended FDK demonstrated significantly higher academic
achievement than did students in half-day programs. Hough and Bryde (1996) matched six
HDK programs with six FDK programs and found that FDK students outperformed HDK
students on language arts and mathematics criterion-referenced assessments. In a study of
985 kindergarten students, Lore (1992) found that 65% of the students who attended a FDK
program showed relatively stronger gains on the reading and oral comprehension sections of
the Comprehensive Test of Basic Skills. In a 2-year evaluation of a new FDK program,
Elicker and Mathur (1997) reported that FDK students demonstrated significantly more
progress in literacy, mathematics, and general learning skills, as compared with students in
HDK programs. However, some researchers have not found significant differences between
the academic achievement of students from FDK and HDK programs (e.g., Gullo &
Clements, 1984; Holmes & McConnell, 1990; Nunnally, 1996).
Longitudinal Student Achievement

Evidence supporting the long-term effectiveness of FDK is less available and more
inconsistent than is its short-term effectiveness (Olsen & Zigler, 1989). For example, the
EVSC (1988) reported that FDK students had higher grades than did HDK students
throughout elementary and middle school, whereas Koopmans (1991) found that the
significance of the differences between all-day and half-day groups disappears in the long run
[as] test scores go down over time in both cohorts (p. 16). Although OSLOEO (1997)
concluded that the academic and social advantages for FDK students were diminished after
the second grade, Cryan and colleagues (1992) found that the positive effects from the added
time offered by FDK lasted well into the second grade.
Longitudinal research of kindergarten programming conducted in the 1980s (Gullo,
Bersani, Clements, & Bayless, 1986; Puleo, 1988) has been criticized widely for its
methodological flaws and design weaknesses. For example, Elicker and Mathur (1997)
identified the noninclusion of initial academic abilities in comparative models as a failing of
previous longitudinal research on the lasting academic effects of FDK.
Study Rationale
In 1995, the Poudre School District (PSD) implemented a tuition-based FDK program in
addition to HDK classes already offered. Although subsequent surveys of parent satisfaction
indicated that parents believed FDK provided children with further opportunities for academic enrichment
(Aten et al., 1996; Cooper et al., 1998a, 1998b), researchers have not determined the veracity
of these assumptions. Thus, we conducted the present study to address this gap in the
empirical evidence base.
Research Questions
Because of the inconclusiveness in the research literature on the longitudinal academic
achievement of FDK versus HDK kindergarten students, we did not pose a priori research
hypotheses. We developed the following research questions around the major main effects
and interactions of the kindergarten class variable (full day vs. half day), covariates (age and
initial ability), and dependent variables (K-5 reading and mathematics achievement).
1. What difference exists between FDK and HDK kindergarten students in their
mathematics and reading abilities as they progress through elementary school, while
controlling for their initial abilities?
2. How does this differential effect vary, depending on student gender?

Methods
Participants
The theoretical population for this study included students who attended elementary
school in moderately sized, middle-to-upper class cities in the United States. The actual
sample included 489 students who attended FDK or HDK from 1995 to 2001 at one
elementary school in a Colorado city of approximately 125,000 residents. Because this study
is retrospective, we used only archival data to build complete cases for each student in the
sample. Hence, no recruitment strategies were necessary.
Students were enrolled in one of three kindergarten classes; 283 students (57.9%)
attended half-day classes (157 half-day morning and 126 half-day afternoon) and 206
students (42.1%) attended full-day classes. Students' ages ranged from 5 years 0 months to 6
years 6 months upon entering kindergarten; overall average age was 5 years 7 months. The
total study included 208 girls (44.0%) and 265 boys (56.0%). The majority of students received
no monetary assistance for lunch, which was based on parent income (89.0%, n = 424); 49
students (10.0%) received some assistance. Twenty-six students (5.3%) spoke a language at
home other than English. The majority of students (90.5%, n = 428) were Caucasian; 31
students (6.3%) were Hispanic; and 14 students (2.8%) were African American, Native
American, or Asian American. Those data reflect the community demographics within the
school district. Because of the potential for individual identification based on the small
number of students within the various ethnic groups and those receiving lunch assistance, our
analyses excluded ethnicity and lunch assistance as control variables.
Intervention
We excluded from the study students who switched during the academic year from FDK to
HDK (or vice versa). FDK comprised an entire school day, beginning at 8:30 a.m. and ending
at 3:00 p.m. HDK morning classes occurred from 12:15 p.m. to 3:00 p.m. FDK recessed at
lunch and provided a 30-min rest period in the afternoon when students typically napped,
watched a video, or both. HDK students also recessed but did not have a similar rest period.
Both kindergarten programs employed centers (small ability-based groups) as part of their
reading and mathematics instruction, and all kindergarten teachers met weekly to discuss and
align their curriculum. The amount of time spent on reading instruction was two or three
times greater than that dedicated to mathematics.
Reading Curriculum. The kindergarten reading curriculum was based predominantly on the
Open Court system, which emphasizes phonemic awareness. Students learned to segment and
blend words by pronouncing and repronouncing words when beginnings and endings were
removed. Teachers also included daily letters to the class, on which students identified the
letters of the day and circled certain words. Teachers also read stories to students, helped
students write capital and lowercase letters and words, and encouraged them to read on their
own and perform other reading activities. Teachers expected the students to know capital and
lowercase letters and their sounds, and some words by sight, when they completed
kindergarten.
Mathematics Curriculum. The kindergarten mathematics curriculum was predominantly
workbook based and integrated into the whole curriculum. Students worked with mathematics
problems from books, played number games with the calendar, counted while standing in
line for lunch and recess, and practiced mathematical skills in centers. Once a week, the
principal came into the kindergarten classes and taught students new mathematics games with
cards or chips. The games included counting-on, skip-counting, and simple addition and
subtraction. Students were expected to leave kindergarten knowing how to count and perform
basic numerical operations (i.e., adding and subtracting 1).
Measures
Initial Reading-ability Covariate. When each participant entered kindergarten, school
personnel (kindergarten teacher or school principal) assessed them for their ability to recognize
capital and lowercase letters and to produce their sounds. This letter-knowledge assessment
requested that students name all uppercase and lowercase letters (shown out of order) and
make the sounds of the uppercase letters. Students received individual testing, and school
personnel recorded the total number of letters that the student identified correctly out of a
possible 78 letters. Letter-name and letter-sound knowledge are both essential skills in reading
development (Stage, Sheppard, Davidson, & Browning, 2001). Simply put, theory suggests that
letter-name knowledge facilitates the ability to produce letter sounds, whereas letter-sounding
ability is the foundation for word decoding and fluent reading (Ehri, 1998; Kirby &
Parrila, 1999; Treiman, Tincoff, Rodriguez, Mouzaki, & Francis, 1998). Predictive validity is
evidenced in the numerous studies in which researchers have reported high correlations
(r = .60 to r = .90) between letter-naming and letter-sounding ability and subsequent reading
ability and achievement measures (Daly, Wright, Kelly, & Martens, 1997; Kirby & Parrila,
1999; McBride-Chang, 1999; Stage et al., 2001).
Initial Mathematics Ability Covariate. When the students entered kindergarten, school
personnel (kindergarten teacher or school principal) assessed their initial mathematics ability.
The assessment consisted of personnel asking students to identify numbers from 0 to 10. They then
recorded the total number that the students named out of a possible 11. The ability to recognize
numbers and perform basic numerical operations, such as counting to 10, is recognized as an important
indicator of kindergarten readiness (Kurdek & Sinclair, 2001). Researchers have shown that
basic number skills (counting and number recognition) in early kindergarten predict
mathematics achievement in first grade (Bramlett, Rowell, & Madenberg, 2000) and in fourth
grade (Kurdek & Sinclair).
K-2 Reading Fluency Dependent Variable: One-Minute Reading (OMR) Assessment. The
school principal assessed K-2 reading achievement by conducting 1-min, grade-appropriate
reading samples with each student at the beginning and end of the school year. The
kindergarten reading passage contained 67 words, the first-grade passage had 121 words, and
the second-grade passage included 153 words. Students who finished a passage in less than 1
min returned to the beginning of the passage and continued reading until the minute expired.
The principal recorded the total number of words that a student read correctly in 1 min.
Students who read passages from grades higher than their own were excluded from subsequent
analysis.
The OMR is a well-known curriculum-based measure of oral fluency that is
theoretically and empirically linked to concurrent and future reading achievement (Fuchs,
Fuchs, Hosp, & Jenkins, 2001). Scores on the OMR correlate highly with concurrent criteria
(r = .70 to .90; Parker, Hasbrouck, & Tindal, 1992). Evidence of oral fluency criterion
validity includes high correlations with teacher student-ability judgments (Jenkins & Jewell,
1993), standardized student achievement test scores (Fuchs, Fuchs, & Maxwell, 1988;
Jenkins & Jewell), reading inventories (Parker et al., 1992), and reading comprehension tests
(Hintze, Shapiro, Conte, & Basile, 1997; Kranzler, Brownell, & Miller, 1998).
Dependent Variables for Reading- and Mathematics-Achievement-Level Tests: Reading
and Mathematics Levels. The Northwest Evaluation Association (NWEA) developed
standardized reading- and mathematics-level tests for the Poudre School District. NWEA generated the tests from a large data bank of
items that were calibrated on a common scale using Rasch measurement techniques. The tests
measure student performance on a Rasch unit (RIT) scale that denotes a student's
ability, independent of grade level. The elementary school conducted reading and mathematics
level tests once a year in the spring with all second- through sixth-grade students who could read
and write. NWEA (2003) reported that the levels tests correlate highly with other
achievement tests, including the Colorado State Assessment Program test (r = .84 to .91) and
the Iowa Tests of Basic Skills (r = .74 to .84). Test-retest reliability results were similarly
favorable, ranging from .72 to .92, depending on grade level and test (NWEA).

Results
Rationale for analyses. We considered several alternatives when we analyzed the data from
this study. Our first choice was to analyze the data by using three multiway mixed analyses of
covariance (ANCOVAs) with kindergarten group and gender as the between-groups factors
and the repeated measurements over time as the within-subjects factor. However, we rejected
that analytic technique for two reasons. First and foremost, all three analyses evidenced
serious violations of sphericity. Second, this analytic design requires that all cases have all
measures on the dependent variable (the within-subjects factor). That requirement reduced our
sample size by as much as 75% in some of the analyses when compared with our final choice
of separate univariate, between-groups ANCOVAs.
Our second choice was to analyze the data with three 2 x 2 (kindergarten group [full day vs.
half day] x gender) between-groups multivariate analyses of covariance (MANCOVAs) with the
multiple dependent measures included simultaneously in the analysis. Field (2000)
recommended switching from repeated-measures ANCOVAs to MANCOVAs when sample
sizes are relatively high and violations of sphericity are fairly severe, as in our
situation. Unfortunately, there also are difficulties when researchers use MANCOVAs. First,
the analysis and interpretation of MANCOVA are extraordinarily complex and cumbersome.
More important, a number of statisticians (e.g., Tabachnick & Fidell, 1996) have counseled
against using MANCOVA when strong intercorrelations exist between the dependent
measures. Finally, our data violated the homogeneity of covariance matrices, which is an
additional assumption of MANCOVA.
Our final choice was to conduct separate univariate ANCOVAs with appropriate Bonferroni
adjustments to prevent inflation in the Type I error rate. For the OMR, we began our analysis
with five 2 x 2 (kindergarten group x gender) ANCOVAs with initial reading ability as the
covariate. We measured OMR at the end of kindergarten and at the beginning and end of
first and second grades. The alpha level was set at .01 for each of the five analyses.
For the reading-level analyses, we conducted three 2 x 2 ANCOVAs because reading
achievement tests were given in the spring of the second, third, and fourth grades. The alpha
level was set at .017 for each of the analyses. For the mathematics-level analyses, we
conducted three 2 x 2 ANCOVAs with the mathematics achievement tests given in spring of
the second, third, and fourth grades. The alpha level was also set at .017 for those analyses.
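The per-analysis alpha levels reported here are what a Bonferroni division of a familywise error rate across each family of analyses would give. The sketch below assumes a familywise alpha of .05, which the text does not state explicitly, and the function name is illustrative only.

```python
def bonferroni_alpha(familywise_alpha: float, n_tests: int) -> float:
    """Return the per-test alpha under a Bonferroni adjustment."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return familywise_alpha / n_tests


# Assumption: a familywise alpha of .05 (not stated in the text).
omr_alpha = bonferroni_alpha(0.05, 5)    # five OMR analyses
level_alpha = bonferroni_alpha(0.05, 3)  # three reading- or mathematics-level analyses

print(round(omr_alpha, 3), round(level_alpha, 3))  # prints: 0.01 0.017
```

Under that assumption, the five OMR analyses are each tested at .01 and each set of three level-test analyses at roughly .017, matching the values in the text.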
Assessing assumptions. We began our univariate ANCOVA analyses by testing for univariate
and multivariate normality. Univariate normality existed in all 11 analyses, at least with
respect to skewness. There were two instances in which univariate kurtosis exceeded
acceptable boundaries for normality. Although there were a limited number of instances
in which multivariate normality was mildly violated, visual inspection of the histograms and
Q-Q plots suggested no substantive deviation from normality, except for the OMR test given at
the end of kindergarten. Hence, we eliminated the test from our final set of analyses. Given
the large sample sizes and the relative robustness of ANCOVA against violations of
normality, we proceeded with the remaining 10 ANCOVAs.
We next assessed the assumption of homogeneity of regression slopes, which, if violated,
generates much more difficulty in the interpretation of the results of the analyses. None of
the five OMR analyses violated that assumption. However, the third-grade reading-level
analysis violated the assumption. Hence, we removed that analysis from the study, leaving
only two analyses of reading achievement at the second- and fourth-grade levels.
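Homogeneity of regression slopes means that the relationship between the covariate and the outcome is about the same in each kindergarten group. As a rough illustration only, the sketch below computes the least-squares slope separately for two groups; the scores are hypothetical and the helper name is an assumption. A formal check would instead test a group-by-covariate interaction term, which this sketch does not do.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    return sxy / sxx


# Hypothetical data: covariate = initial letter knowledge,
# outcome = words read correctly in 1 min.
fdk_x, fdk_y = [10, 30, 50, 70], [12, 22, 32, 42]
hdk_x, hdk_y = [10, 30, 50, 70], [8, 18, 28, 38]

fdk_slope = ols_slope(fdk_x, fdk_y)
hdk_slope = ols_slope(hdk_x, hdk_y)
print(fdk_slope, hdk_slope)
```

In these made-up data the two slopes are equal, so the groups differ only in level; clearly unequal slopes would signal the kind of violation described for the third-grade reading-level analysis.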
Finally, we assessed the correlation between the covariate and the dependent variable.
We began by assuming that the participants' age (measured in months) might be correlated
significantly with the dependent variables and should be included in our analyses as a
covariate. Tables 1 and 2 show the results of this analysis and that none of the correlations
were statistically significant. Hence, we did not include age in the analyses as a covariate.
Initial reading and mathematics abilities were the other covariates included in the
analyses. Our a priori assumption was that those covariates had to correlate significantly with
their appropriate dependent variable to be included in the analyses. As Tables 1 and 2 show,
all of the final correlations were statistically significant, confirming the propriety of their use
as covariates.
Findings
Tables 3, 4, and 5 show the source tables for the OMR, the reading levels, and the
mathematics levels, respectively. In each table, the kindergarten grouping independent
variable is included in the table, regardless of whether it achieved statistical significance.
Gender, on the other hand, is included in the source tables only in those analyses in which it
achieved statistical significance (second-grade mathematics achievement).
Table 3 shows that kindergarten class was statistically significant at the end of
kindergarten, F(1, 400) = 35.08, p < .001, at the beginning of first grade, F(1, 261) = 11.43, p < .01,
and at the end of first grade, F(1, 194) = 6.26, p < .05. The covariate, as expected, was strongly
significant at all levels, and gender was not statistically significant at any level in the
analyses. Significance levels and the estimates of effect size declined as the participants
progressed in school within and across academic years.

Table 4 shows that the covariate was highly significant (as expected) but with no
statistically significant effect for either kindergarten class or gender. Table 5 shows a
pattern similar to that in the two preceding tables, with (a) a statistically significant covariate, (b) absence of
statistical significance for the kindergarten class, and (c) declining estimates of effect size as
time in school increased. Gender was statistically significant at the second grade.
Table 6 shows the subsample sizes, means, standard deviations, and corrected effect
sizes for each of the two kindergarten alternatives across all dependent measures. The only
effect size estimate whose magnitude approaches Cohen's (1998) standard for minimal
practical significance (.25) is the first one reported in Table 6 (.44). That effect size indicates
that FDK confers a small-to-moderate advantage on reading ability at the end of the
kindergarten experience. At the beginning and end of first grade, that advantage is no longer
practically significant, although it is still positive. Beginning in second grade, the advantage
in reading and mathematics is neither practically significant nor positive for FDK students.
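The text does not say how the corrected effect sizes were computed. As one possible reading, the sketch below computes a standardized mean difference from group means and standard deviations and applies Hedges' small-sample correction; that choice of correction is an assumption, the function names are illustrative, and the numbers are hypothetical rather than taken from Table 6.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    d = (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


# Hypothetical group summaries (not values from the study).
g = hedges_g(mean1=30.0, sd1=10.0, n1=50, mean2=25.0, sd2=10.0, n2=50)

# The .25 threshold is the standard for minimal practical significance cited in the text.
practically_significant = abs(g) >= 0.25
print(round(g, 3), practically_significant)
```

A positive value favors the first group; comparing its magnitude with the .25 standard mirrors the way the effect sizes in Table 6 are interpreted.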
Follow-up Interviews
As a follow-up to our analyses, we interviewed the four kindergarten teachers in January 2004 for
their views on (a) the kindergarten curriculum, (b) their perceived differences between FDK
and HDK programming, and (c) their explanations for the findings that we observed between
FDK and HDK students in reading and mathematics achievement. The teachers were women
who had taught for 14, 9, 8, and 6 years, respectively. They had previously taught FDK and
HDK kindergarten and had been teaching kindergarten at the elementary school research site
for 10, 9, 6, and 4 years, respectively. Two of the teachers were still teaching kindergarten; the
other two teachers were teaching second and sixth grades, respectively. One teacher
admitted that she had a half-day bias, whereas another teacher was a proponent of full-day
kindergarten.
All interviews consisted of open-ended questions and lasted between 30 min and 1 hr. The
interviews were tape-recorded, transcribed, and returned to the teachers for review. After
approval of the transcripts, we coded the interviews by using constant comparative analytic
techniques (Strauss & Corbin, 1994), which involved inductively identifying themes and
developing written summaries.
When questioned about the differences between FDK and HDK, all teachers stated that they
would have expected FDK students, in general, to perform better academically than HDK
students at the end of kindergarten. They attributed the difference to the increased time that
FDK students spent reviewing and practicing material. However, consistent with our findings,
all teachers were equally doubtful that the differences would last. They believed that the
academic disparity between FDK and HDK students would disappear during first through third
grades. For example, one teacher stated that "kids, by third grade, catch up or things kind
of level out so I don't think there'd be much of a difference."
Although teachers agreed that the FDK advantage probably did not extend past early
elementary education, their explanations for the ephemeral differences varied and fell into
three general categories: (a) effects of differentiated instruction, (b) individual student
development, and (c) individual student attributes.
Differentiated Instruction. All teachers, in various ways, suggested that differentiated
instruction would need to occur in every grade subsequent to kindergarten to, at least
partially, maintain the higher achievement levels evidenced by FDK students. When asked to
define differentiated instruction, one teacher said:
What it means to me is that I need to meet that child where they are. I mean I need to have
appropriate instruction for that child... I need to make sure that they're getting what they need
where they are... But, I think you need to set the bar pretty high and expect them to reach
that; on the other hand, I think you need to not set it so high that you're going to frustrate the
kids that aren't ready.
However, the kindergarten teachers recognized the challenges of using differentiated
instruction and were careful not to place blame on first-through third-grade teachers. One
teacher stated, "I am not saying that not everyone does differentiated instruction. But I think
that you have to be careful you don't do too much whole group teaching to a group of kids
that's way past where they're at." Although all of the teachers agreed that differentiated
instruction would be necessary to maintain differences after kindergarten, not all of them
believed that this technique would be singularly sufficient. Some teachers believed strongly
that the leveling out was predominantly a result of individual student development or
student attributes, or both, rather than teaching methods.
Student Development. Two teachers felt that the leveling out of academic differences
between FDK and HDK students by second grade resulted from natural developmental
growth occurring after kindergarten. They explained:
You have kids that cannot hear a sound. They cannot hear, especially the vowel sounds. They
are not ready to hear those. They are not mature enough to hear those sounds. You could go
over that eight billion times and they just aren't ready to hear those sounds. They go into first
grade and they've grown up over the summer and... it clicks with them. And they might have
been low in my class, but they get to first grade and they're middle kids. They've kind of
reached where their potential is.
I mean, there's a big developmental gap in K, 1, 2 and by third grade the kids that look [sic]
behind, if they're going to be average or normal, they catch up by then... like some kids in
second grade, they still struggle with handwriting and reversal and by now it's a red flag if
they're still doing that, developmentally everything should be fitting together in their little
bodies and minds and they should be having good smooth handwriting in the right direction.
And if that's not happening then that's a flag. And by third grade... if they're not performing like
an average student then there's something else that needs to be looked at. So it's a
development thing and it's just when kids are ready.
Yet, both of those teachers acknowledged that HDK students do have to work to catch up to
FDK students, citing (a) less time spent on material, (b) differences in FDK and HDK
teachers' instructional philosophies, and (c) lack of familiarity with all-day school as
disadvantages that HDK students must overcome in first grade to equal their FDK
counterparts.
Student Attributes. A final explanation that teachers offered for the leveling out of differences
suggested that individual student attributes accounted for student differences in subsequent
grades. Three teachers believed that, no matter what kindergarten program students attended,
their inherent level of academic ability or level of parent involvement, or both, were most
important in eventually determining how individual students would compare with other
students. For example, "I think they get to where their ability is, regardless of... You can give
them a good start and I think that can make a difference, but a high kid is going to be high
whether they were in full or half. And those gray kids, you can give them a boost and they
can be higher than maybe they would have been in half-day, you know, you can give them a
better start."
Thus, these three teachers believed that student attributes, such as inherent ability or degree
of parent involvement in their schooling, would ultimately play a more significant role in
how students would eventually compare with one another in second and third grades,
regardless of whether they attended FDK or HDK programs.
Discussion
What can be determined about the effects of FDK versus HDK kindergarten as a result of our
analyses? Children who attend FDK can and do learn more through that experience than do
their HDK counterparts. Nonetheless, the additional learning appears to decline rapidly, so
much so that by the start of first grade, the benefits of FDK have diminished to a level that
has little practical value. That effect was consistent across two measures of reading and one
measure of mathematics. That effect also was consistent across gender, given that there was a
gender by kindergarten group interaction in only one of the analyses.
Our findings are consistent with past meta-analytic research (Fusaro, 1997) and high-quality
empirical studies (e.g., Hough & Bryde, 1996) in that FDK confers initial benefits on
academic achievement but that these benefits diminish relatively rapidly (OSLOEO, 1997). We
are unclear why the rapid decline occurs, but we offer this insight from several school
administrators and teachers with whom we interacted in our discussions of these data:
Teachers in the first few grades are so concerned with students who enter their classes
[with] nonexistent reading and math skills that they spend the majority of their time
bringing these students up to minimal math and reading criteria at the expense of
working equally hard with students whose reading and math achievement are above
average. Hence, the high-achieving students' gains at the end of kindergarten
gradually erode over the next few years with lack of attention.
We concur with Fusaro (1997) that districts must make their choices involving FDK
with a full understanding of what the benefits may be for academic achievement and
nonachievement outcomes. Our findings of initial gains place the onus of maintaining those
gains on schools and teachers through their own internal policies, procedures, and will to
sustain those gains.
Our study, of course, is not without limitations. We studied only one school, albeit
over a relatively long period of time, with well-established measures and with reasonably
well-equated groups. The greatest reservation we have about the generalizability of our
findings clearly focuses on the predicted decline in long-term benefits of FDK for schools
making it a priority to assure that teachers take each student as far as possible during the academic
year rather than to move all students to a common set of expected learning at the end of the
academic year. We recognize that school policies, procedures, and culture play important
roles in the variability in student achievement, regardless of the skill levels of students
entering first grade. Although our results will likely generalize to a wide variety of
elementary school children, they will generalize most readily to those children who attend
schools whose instructional policies and practices in the early grades are similar to the school
in this study.

NOTE
The authors appreciate the thoughtful participation of Suzie Gunstream and the other
elementary teachers whose invaluable practitioner insights helped us make sense of the
findings.
