American Educational Research Journal June 2011, Vol. 48, No. 3, pp. 763–793 DOI: 10.3102/0002831210385446 © 2011 AERA. http://aerj.aera.net

An Integrated Curriculum to Improve Mathematics, Language, and Literacy for Head Start Children
John W. Fantuzzo, Vivian L. Gadsden, and Paul A. McDermott
University of Pennsylvania

JOHN W. FANTUZZO is the Albert M. Greenfield Professor of Human Relations at the University of Pennsylvania Graduate School of Education, 3700 Walnut Street, Philadelphia, PA 19104-6216; e-mail: johnf@gse.upenn.edu. His research interests include early childhood risk, early childhood education, Head Start, and child maltreatment and family violence.
VIVIAN L. GADSDEN is the William T. Carter Professor of Child Development and Education, the director of the National Center on Fathers and Families, and the associate director of the National Center on Adult Literacy at the University of Pennsylvania Graduate School of Education; e-mail: viviang@gse.upenn.edu. Her research interests include literacy and at-risk youth, fathers and families, intergenerational learning, and parental engagement.
PAUL A. MCDERMOTT is a professor at the University of Pennsylvania Graduate School of Education; e-mail: drpaul4@verizon.net. His research interests include multivariate statistics, multilevel modeling, longitudinal analysis, item response theory, and test construction.

This article reports on the development and field trial of an integrated Head Start curriculum (Evidence-Based Program for Integrated Curricula [EPIC]) that focuses on comprehensive mathematics, language, and literacy skills. Seventy Head Start classrooms (N = 1,415 children) were randomly assigned to one of two curriculum programs: EPIC or the Developmental Learning Materials Early Childhood Express, with curricula implemented as stand-alone programs. EPIC included instruction in mathematics, language, literacy, and approaches-to-learning skills; formative assessment; and a learning community for teachers. Multilevel growth modeling through four direct assessments revealed significant main effects and growth rates in mathematics and listening comprehension favoring EPIC, controlling for demographics, special needs, and language status. Both programs produced significant growth rates in literacy.

KEYWORDS: at-risk students, early childhood, child development, hierarchical modeling, classroom research

Children from economically disadvantaged households, especially those from minority families in large urban areas, are among the most vulnerable to poor academic outcomes (Aber, Jones, & Raver, 2007). Research demonstrates that these children show significantly lower reading proficiency relative to peers at all grade levels (Jencks & Phillips, 1998), with reading achievement gaps exceeding 1.2 standard deviations manifested as early as preschool (Jencks & Phillips, 1998; West, Denton, & Reaney, 2002). Research has also found that children living in poverty, from all ethnic groups, perform worse in mathematics than their nonimpoverished peers (Chatterji, 2006; Chernoff, Flanagan, McPhee, & Park, 2007). Moreover, findings from the Early Childhood Longitudinal Study showed that only 40% of children from low-income homes demonstrated mathematics proficiency, versus 65% of children from middle-income homes and 87% of children from high-income homes (Chernoff et al., 2007).
While such comparative data do not tell the full story of the experiences of children from low-income households, they underscore some of the academic challenges facing these young children as they prepare to enter formal schooling and as schools prepare to welcome them. These studies also point to the importance of identifying useful approaches to addressing the challenges they face. Several studies have found positive associations between comprehensive early childhood educational experiences and cognitive achievement for vulnerable young children (Campbell, Pungello, Miller-Johnson, Burchinal, & Ramey, 2001; Fantuzzo et al., 2005; Schweinhart, 2004). For example, Head Start, the nation's primary early childhood program for low-income children, has evidenced positive cognitive outcomes for children. A recent nationally representative, randomized control evaluation of the benefits of Head Start found that participating children performed better on cognitive outcomes (i.e., prereading, prewriting, and vocabulary) than children who participated in other early childhood programs (U.S. Department of Health and Human Services [USDHHS], 2005). Taking into consideration the demonstrable benefits of quality early childhood education for children from low-income homes and the paucity of rigorous evaluations of specific curricular programs, the Institute of Education Sciences commissioned the Preschool Curriculum Evaluation Research (PCER) initiative in 2002 (PCER Consortium, 2008). Although wrap-around early childhood programs, such as the Abecedarian Project and the Perry Preschool Project, had evidenced positive outcomes for children from low-income homes, implementation of these programs required intensive service increments that restructured the daily routine of programs, making them very costly to operate and operable only in selected settings able and willing to restructure their programs.
As a result of reviewing these program characteristics, PCER raised the question of whether preschool programs that were integrated into the daily routines of early childhood programs could provide similar benefits. PCER addressed this question by using randomized control trials (RCTs) to evaluate the efficacy of comprehensive preschool curricula for improving the mathematics, language, and literacy skills of older preschool children (4 to 5 years old). The PCER evaluation focused on the impact of curricula in 14 preschool programs. Programs were selected for evaluation if they had sufficient standardized training procedures and published materials to support implementation of the curriculum by research resources other than the curriculum developer. The selected programs were evaluated against the local curricula, with classrooms being assigned to either the experimental curriculum or the comparison local curriculum. Each program site provided professional development and support during the 1st year and an RCT during the 2nd year to test the efficacy of a full year of implementation. Overall, 2,911 children, 315 preschool classrooms, and 208 preschools from 16 geographical locations participated. School readiness was assessed using a battery of standardized measures in the fall and spring of the trial year and once more a year later. Findings from the evaluation indicated that no single comprehensive program produced significant mathematics, language, or literacy differences when compared to controls; only combinations of programs were effective. The findings also indicated that no single combination of programs produced significant differences in both mathematics and language or literacy outcomes. In the PCER evaluation, only 2 of the 14 programs yielded significant mathematics, language, or literacy outcomes.1 Notably, both of these programs combined all or part of the Developmental Learning Materials (DLM) Early Childhood Express curriculum (Schiller, Clements, Sarama, & Lara-Alecio, 2003) with other curricula. The first of these combinations was the DLM with Open Court Reading (Adams et al., 2000). Here, the impact of the combined programs was compared to the impact of the local control curriculum for 297 children across 30 classrooms. The DLM–Open Court teachers were provided 6 days of initial professional development and support and monthly 2-hour professional development meetings, with half of the teachers receiving mentoring visits. The DLM–Open Court curriculum produced significant improvements over controls in literacy and language but not in mathematics. A second combination added Pre-K Mathematics and the DLM Early Childhood Express Math Software to a variety of standard programs already in use, like High/Scope (Weikart & Schweinhart, 2005) and the Creative Curriculum (Dodge, Colker, & Heroman, 2002). The condition in which Pre-K Mathematics was implemented with DLM Early Childhood Express Math Software provided computer-based mathematics activities supplemented by 4 days of teacher professional development and support during the training year and 2 days of refresher training in the trial year, plus twice monthly professional development and support sessions throughout both years.
Children in the experimental condition demonstrated significant improvements over controls in seven mathematics subskill areas. No improvements over controls were found for literacy or language areas. The DLM (Schiller et al., 2003) is a comprehensive, research-based curriculum created according to state and national early childhood guidelines. The DLM targets children's cognitive, social-emotional, aesthetic, and physical development through 20 thematic units that can be used individually or collectively. Each unit consists of 3 to 6 weekly themes that address language and early literacy, math, science, social studies, fine arts, health and safety, personal and physical development, and technology. Each thematic unit includes approximately 200 hands-on learning activities that are designed to promote children's social, emotional, intellectual, aesthetic, and physical development. Randomized control trial studies have indicated that the DLM has been used with varying levels of success to impact the mathematics, language, and literacy outcomes of older preschool children (PCER Consortium, 2008). Findings from PCER have been supported by reviews of randomized control studies conducted by the What Works Clearinghouse, which found no discernible effects on oral language, print knowledge, phonological processing, or math for the Creative Curriculum when used alone. The Creative Curriculum is a comprehensive curriculum for 3- to 5-year-old children and focuses on children's social-emotional, physical, cognitive, and language development. According to the What Works Clearinghouse, one study of the Creative Curriculum met its evidence standards, and two studies met its evidence standards with reservations. The studies included a total of 844 children from 101 classrooms in more than 88 preschools in Tennessee, North Carolina, and Georgia. No studies of High/Scope and Open Court met the What Works Clearinghouse's standards of evidence, although several experimental and quasi-experimental studies that are not randomized control trials have found positive effects. High/Scope is well known for its focus on language and literacy, mathematics, science, social-emotional development, physical development, and the arts. Its reports indicate that the program improves children's school success, later socioeconomic success, and social responsibility. The plan-do-review sequence of the program encourages children to achieve their goals through decision-making and problem-solving situations throughout the day. Open Court includes eight thematic units that focus on issues ranging from children's identity to transitions. Phonological, phonemic, and print awareness, as well as comprehension, are incorporated into each session, which typically includes 1.5 to 2 hours of daily instruction.
The PCER studies cast considerable light on the relative efficacy of stand-alone curriculum packages, indicating that none outperformed local business-as-usual preschool curricula for mathematics, language, or literacy outcomes. Improvement was evident only when different programs were combined with one another or when circumscribed parts of a published program were applied on top of an existing local program. Moreover, all findings pertained to 4-year-olds only, leaving unaddressed the more than one third of preschoolers in Head Start who are between 3 and 4 years old (USDHHS, 2008a). Furthermore, combination and add-on programs are logistically problematic. They tend to impede the formation of a clear program model definition, making it difficult to determine the footprint and impact of the program itself. Combinations and add-ons also raise significant questions about the feasibility (in terms of both cost and personnel) of implementation or replication independent of highly resourced experimental studies. This is a critical point inasmuch as a major goal of PCER was to determine the efficacy of comprehensive interventions that early childhood programs (such as Head Start) could actually use. At least three questions emerge: What conceptual framework should guide the design of integrated interventions in the context of Head Start? Are programs being developed and tested within Head Start realistic for the context where they are expected to be applied? And can those programs be implemented with the existing set of resources?
A developmental-ecological conceptual framework (Bronfenbrenner & Morris, 1998; Zigler, Gilliam, & Jones, 2006) is a fitting model to inform the design of integrated curricula to maximize cognitive and language skills in a Head Start context. This approach considers multiple and transactional competencies as well as changes in functioning across these competencies over time. Development is understood in terms of the central tasks that children are expected to perform as a function of their age and culture (e.g., transitioning to school). To meet the challenges these tasks present, children bring to bear all of their competencies across skill areas. Because development is seen as a progressive process, children's ability to use their competencies to negotiate tasks at one point of development has an impact on their ability to negotiate tasks at later points (Shonkoff & Phillips, 2000). In this approach, proximal context (e.g., the classroom) plays an important role in determining the course of development. Development does not result from a single influential factor within the child; rather, it is determined by multiple simultaneous influences occurring both within and around the child. These multiple influences, in transaction with children's prior adaptation history and current capacities, determine developmental success (Luthar et al., 2000). McCall (2009), in a recent issue of the Society for Research in Child Development's Social Policy Report, highlights the importance of developing realistic evidence-based programming using a developmental-ecological perspective.
He asserts that we cannot just develop and test a comprehensive program without realizing that context matters. He emphasizes that to enhance the effectiveness and replicability of programs, we must develop and evaluate programs for policy-relevant populations with careful consideration of the natural context. Furthermore, he suggests that being mindful of the resources required to implement an effective program, and implementing and evaluating it in partnership with the existing context, is essential to producing robust and sustainable outcomes, especially for children in resource-challenged contexts.
Drawing upon a developmental-ecological approach, this article reports on the development and efficacy trial of a new, stand-alone preschool curriculum program designed to improve mathematics, language, and literacy among Head Start children while being mindful of the natural context. It is known as the Evidence-Based Program for an Integrated Curriculum (EPIC; Fantuzzo, Gadsden, & McDermott, 2003). Without combining different programs or superimposing additional curricula onto an already existing program, EPIC is a unified program intended to systematically incorporate the components of content, instruction, professional development, and repeated criterion-based assessments. Its culminating field trial is then extended to include all of the age levels (3 to 5 years) enrolled in Head Start classrooms across one of the nation's largest urban areas, with randomization to contrast it against the DLM, the one published curriculum shown by PCER to be consistently associated with any improvements over preexisting local curricula. Unlike PCER, EPIC's RCT was able to concentrate exclusively on low-income children and to assess differential growth rates while testing for relevant covariation and interactions (see Willett, Singer, & Martin, 1998) with children's age upon program entry, special needs, and language status; prior curricular exposure; variation at the child level between time intervals separating assessments; child age variation between classrooms; number of classroom assistants; and variation in teachers' Head Start and career-long teaching experiences.

Method
Participants

Conducted through academic year 2007–2008 (AY07–08), the experimental trial focused on a sample of 1,415 students, comprising the enrollments of 70 classes drawn randomly from the 250 Head Start classes operated by the school district of Philadelphia, Pennsylvania. The sample included 50.2% females and 49.8% males, ranging in age from 35 to 70 months (M = 50.1, SD = 6.8). In contrast to PCER, 34.6% of children were younger than 4 years old. Approximately 12.8% were considered dual-language learners (DLLs), and 9.3% had special needs status. Ethnicity was not systematically reported, with estimates confounded by multiple attributions and 13.9% with no attributions.
For this reason, ethnicity was not used as a covariate in subsequent analyses. Based on reported attributions, 60.6% of children were African American, 14.5% Latino, 4.2% Caucasian, and 6.0% other ethnic minorities. Nearly one third of children had been enrolled in those same classes during academic year 2006–2007 (AY06–07), the training year that preceded the trial year, whereas the other two thirds were newly enrolled for the AY07–08 trial year with no prior preschool exposure. As pertains nationwide to children enrolled in Head Start, children met the Head Start requirements for admission, which require 90% of enrolled children to be from families whose incomes were below the federal poverty level or who were eligible for public assistance (USDHHS, 2008b).

Interventions

EPIC. The development of EPIC centered on three activities: (a) building curriculum modules guided by theoretical and empirical literature pointing to critical cognitive and language abilities that are indicators of early school success and practices that promote development of these abilities, (b) integrating the modules through pilot research, and (c) conducting a large randomized control trial. The EPIC integrated curriculum was primarily focused on areas that have been found to be critical to children's development of language and literacy: alphabet knowledge and phonemic awareness, print concepts, vocabulary, listening comprehension, and mathematics (see McCardle, Scarborough, & Catts, 2001; National Institute of Child Health and Human Development, 2000; National Early Literacy Panel, 2009; Snow, 1991; Snow, Burns, & Griffin, 1998), bolstered by intentional instruction in approaches to learning. Mathematics development, which is reliably observed throughout children's first 5 years, conveys both mediating and causal effects associated with later mastery of cultural symbol systems and general strategic approaches to learning (Kilpatrick, Swafford, & Findell, 2001). During the development phase of the integrated curriculum, two sources of information were used to provide an evidence base for the scope and sequence of EPIC: Head Start's National Indicators (USDHHS, 2006) and the Prekindergarten Pennsylvania Learning Standards for Early Childhood (Pennsylvania Department of Education and Department of Public Welfare, 2005). Data were collected on the Head Start population at multiple times across the pilot years using a range of measures: the Peabody Picture Vocabulary Test–III (PPVT-III; Dunn & Dunn, 1997), the Oral and Written Language Scales (OWLS; Carrow-Woolfolk, 1995), the Expressive One-Word Picture Vocabulary Test–Revised (Gardner, 1990), the Test of Early Mathematics Ability–Third Edition (TEMA-3; Ginsburg & Baroody, 2003), the Preschool Child Observation Record (High/Scope Educational Research Foundation, 2003), and the Learning Express (McDermott et al., 2009).
Information about skill progression was used by curriculum developers to detect empirically the skills recently mastered and the subskills being newly encountered by most children. Thereafter, curriculum contents were sequenced in a similar fashion, such that the main foci of lessons comported to the empirical levels at which most children were functioning. Longitudinal studies of preschool learning behaviors provided the evidence base for the approaches-to-learning component of the scope and sequence (McDermott & Fantuzzo, 2000). With respect to context, EPIC was developed in partnership with exemplary Head Start teachers in a large, urban public school district and was designed to fit within the existing expectations for the delivery of Head Start services (i.e., in alignment with state standards and Head Start indicators). The program was delivered by Head Start teachers as part of their routine practice, supported by indigenous supervisory staff, and conducted within the Head Start program's regular allotment of professional development resources. For the duration of the study, EPIC was the sole intervention for the participating teachers rather than an add-on intervention to an existing program. The EPIC program consists of its integrated curriculum practices, curriculum-based assessment, and professional training and support. The EPIC curriculum couples tested methods of instruction with the EPIC Scope and Sequence to provide intentional and systematic classroom experiences across eight units of instruction. The EPIC Scope and Sequence is an evidence-based mapping of the primary skill areas and serves as the foundation for the curriculum. The EPIC Scope defines subskills in each cognitive skill area. Each curriculum unit targets a specific set of integrated instructional objectives that reflect skill levels of the EPIC Scope and Sequence. The objectives serve as the basis for the unit activities, resulting in intentional instruction of targeted skills along an evidence-based developmental sequence. The EPIC curriculum incorporates evidence-based best practices derived from studies that had separately tested curriculum methods for advancing early language and literacy (Wasik & Bond, 2001) and mathematics (Frye, 1991) skills with Head Start children. These instructional methods were built into the daily classroom routine. Routine experiences include interactive reading, large-group and small-group activities, transition activities, environmental changes, and home connections. Interactive reading involves reading storybooks and dialoguing with children about the books using key vocabulary. Each unit of the curriculum contains key books chosen for their ability to support targeted skills and to expand, connect, and enhance concept development. Large-group EPIC experiences foster a sense of belonging and encourage children's interest in and motivation for learning. Teachers dialogue with children to introduce key concepts and vocabulary and to engage in activities that apply concepts in practice. Small-group activities provide opportunities for introducing, teaching, and practicing skills and support a more focused observation of children to adapt instruction to the individual skill levels of each student.
Transition activities are designed to support continuous learning during the short periods of time in which students move from one structured group activity to another. EPIC environmental changes establish the environment as the "third teacher" by using specially designed stations, bulletin board displays, and innovative props and visual cues to reinforce key skills and vocabulary for children, parents, and teachers. Classroom areas are modified, as needed, to reinforce targeted concepts and vocabulary featured in each unit. The EPIC curriculum also capitalizes on the role that families play as home educators. EPIC home connections are weekly home learning activities that parallel classroom learning experiences. They provide family members with concrete opportunities to reinforce children's learning of key vocabulary and concepts. They were developed to foster ongoing, two-way exchanges with family members about students' skill development and to celebrate families' contributions to student achievement. Additionally, a distinctive component of the EPIC intervention was a set of evidence-based approaches-to-learning modules designed to enhance mathematics, language, and literacy skills development (McDermott & Fantuzzo, 2000; Shure & DiGeronimo, 1996). During the EPIC development and piloting phase, four learning behavior modules were integrated into the cognitive skills scope and sequence: Attention Control, Frustration Tolerance, Group Learning, and Task Approach. Attention Control emphasizes developing skills related to focusing attention and completing tasks (i.e., persistence and effectively dealing with distractions). Frustration Tolerance is designed to help children recognize and verbalize frustration and use effective strategies to deal with it (e.g., taking a break, asking for help, and practicing with assistance). Group Learning targets cooperative learning skills (e.g., taking turns, helping others, contributing to group activities, and completing activities with others). Task Approach focuses on skills related to generating new ideas, trying out these different ideas to solve problems (e.g., brainstorming, creating alternative solutions, and testing them), and carrying out a simple plan. EPIC also includes curriculum-based assessments that are used by teachers to identify the individual competencies and learning needs of each child. EPIC Integrated Check-Ins (ICIs) are brief assessments of skill levels across the integrated scope and sequence of the curriculum. These skills directly map onto state standards for early childhood education and the national Head Start indicators. They include alphabet knowledge, phonemic awareness, vocabulary, print concepts, listening comprehension, mathematics, motor, social-emotional, and approaches-to-learning skills. Each skill is assessed across a developmental sequence of five levels that have been established and validated by empirical research (Fantuzzo, Gadsden, & McDermott, 2008).
ICIs are completed for each child by a teacher as part of the routine implementation of the curriculum. These formative assessments are embedded in units as standardized curriculum activities that are repeated three times throughout the year. ICIs help teachers monitor children's progress and create a classroom profile of individual student differences in ability levels to inform instruction. Another critical element of the EPIC intervention is the means by which teachers and teaching assistants receive professional development and ongoing support as they use the ICIs and implement the integrated curriculum. EPIC uses a learning community model of professional development based on distributed leadership principles (Spillane, 2006). This model seeks to build reciprocal teaching and learning relationships among educators who have different levels of expertise and experience with the EPIC curriculum. Productive learning relationships are established within and across classrooms by having experienced educators share effective classroom strategies as they implement the curriculum. The EPIC learning community meets routinely throughout the year in three different learning contexts: teaching teams, small groups, and a large group. A teaching team includes a classroom teacher and a teaching assistant. Teaching teams meet on a weekly basis to review children's responses to the curriculum activities from the previous week and to plan for their coordinated implementation in the upcoming week. Small groups meet prior to the onset of each curriculum unit and consist of five to six teaching teams and a mentor teacher. The mentor teachers are experienced EPIC teachers who are trained to help their peers implement EPIC while concurrently implementing EPIC in their own classrooms. In small-group learning community meetings, teams share their experiences with the previous unit and are introduced to the new unit. Mentor teachers are in regular weekly communication with their teams and the Head Start educational coordinator. The educational coordinator is given special support to gain expertise in the EPIC curriculum and is assigned to visit classrooms and engage in on-site support. Large-group learning community meetings occur quarterly and provide all involved with an opportunity to discuss implementation issues and share best practices. In our comparison classrooms for the EPIC study, teachers in the DLM group used the Preschool Child Observation Record (High/Scope Educational Research Foundation, 2003) to conduct individual assessments of children and monitor their progress. A comparable number of assessments was conducted in both conditions. Allocation of indigenous staff to support curriculum implementation was comparable for EPIC and DLM classrooms. Both had access to mentor teachers and educational coordinators and the same number of professional development days for meetings. The difference was the model of professional development and the curriculum and assessment content.
DLM teachers received professional development in the form of didactic workshops. Educational coordinators presented material on specific topics or introduced new regulations to groups of teachers and teaching assistants in their regions. Teachers were provided opportunities to ask questions or discuss content.

Intervention implementation. At the end of each unit of the EPIC integrated curriculum, teaching teams reported the degree to which they implemented each routine component of the curriculum: interactive reading, large-group activities, small-group activities, environmental changes, and transition activities. On average, they reported completing 97% (SD = .02) of interactive reading activities, 89% (SD = .02) of large-group activities, 86% (SD = .05) of small-group activities, and 80% (SD = .04) of transition activities. The overall implementation rate across the year was 88% (SD = .04). These ratings matched supervisor reports of implementation. In addition, teachers and teaching assistants were asked to provide anonymous ratings of their overall satisfaction with the EPIC program on a 4-point Likert-type scale: not satisfied, somewhat satisfied, satisfied, or very satisfied. Across the year, an average of 98% (SD = .02) of teachers and teaching assistants rated themselves as satisfied or very satisfied with EPIC. There was no significant difference between teacher and teaching assistant independent ratings. Data on the fidelity of implementation of EPIC and DLM across the year were collected by the educational coordinators of the Head Start program as part of their routine supervision of classrooms. Educational coordinators, at the beginning of each year, received preparation to assess the fidelity of curriculum implementation. This preparation included how to collect teachers' lesson plans and observe their classrooms at regular intervals throughout the year. They were instructed to note where the teacher was in the sequence of the curriculum and whether there was a general correspondence between the plan and the observed activity. A 5-point overall rating of implementation was used: 1 = very poor, 2 = poor, 3 = fair, 4 = well, and 5 = very well. Ratings were collected at three points in time in accordance with the program's evaluation of student progress and parent reporting. Overall implementation of both EPIC and DLM was rated well. For EPIC classrooms, the median rating was 4, with a range of 3 to 5. The median for the DLM classrooms was also 4, with a range of 3 to 5. There were no significant differences in supervisor ratings across programs.

Instrumentation

The central focus of this study was the relative effectiveness of curricula over the period of an RCT. This required assessments of child cognitive growth at multiple time points that would allow estimation of stable growth trajectories rather than simple comparisons of group means at the close of the study. The multiple assessments were intended to inform the growth rates throughout a school year and also to fortify against the complete loss of outcomes information associated with participants who attend part but not all of the year.
Pilot studies (McDermott, Angelo, Waterman, & Gross, 2006) had clarified that the best norm-referenced tests (NRTs; e.g., the Test of Early Reading Ability, 3rd ed. [TERA-3], Reid, Hresko, & Hammill, 2001; and the OWLS, Carrow-Woolfolk, 1995) were poorly suited to the task because the average growth ranges for Head Start children were essentially trivial (four to five additional items correctly answered over the school year), reducing the sensitivity of those tests for a conventional pretest-posttest design and effectively vitiating any growth sensitivity over briefer intervals. These results followed from the fact that, for requisite nomothetic and commercial purposes, popular NRTs feature item content that is properly centered around the 50th percentile in difficulty, whereas the nation's Head Start population is relatively challenged, with performance centering around the 15th to 20th percentile (U.S. Department of Education, 2007), leaving markedly few items appropriate for Head Start children. In this respect, it was deemed imperative that any demonstration of effectiveness for a curriculum should be based on a wide-scoped definition of the component subskills that comprised any content area rather than a set of narrowly circumscribed subskills that would not be representative of a larger cognitive domain. Nor would experimental results be generalizable were tested subskills to be chosen as a function of their tendency to inflate the apparent usefulness of a given curriculum. Within this context, we constructed and validated a set of measures that were highly sensitive to growth over brief intervals, that broadly represented the target cognitive domains, and that could be administered repeatedly over the school year with minimal time investment and maximal precision. Performance was assessed through the Learning Express (LE; McDermott et al., 2009), an individually administered adaptive battery referenced to Head Start's National Indicators (USDHHS, 2006) and the Prekindergarten Pennsylvania Learning Standards for Early Childhood (Pennsylvania Department of Education and Department of Public Welfare, 2005). Although all LE content is unique and not identical to any government or commercial tests, content breadth was further demonstrated by the alignment of each item to the content of Head Start's National Reporting System (USDHHS, 2003), the TERA-3 (Reid et al., 2001), the PPVT-III (Dunn & Dunn, 1997), the OWLS (Carrow-Woolfolk, 1995), the Expressive One-Word Picture Vocabulary Test–Revised (Gardner, 1990), the TEMA-3 (Ginsburg & Baroody, 2003), the Preschool Child Observation Record (High/Scope Educational Research Foundation, 2003), and the Galileo Skills Inventory (Version 2; Assessment Technology, 2002). The LE contains 325 items distributed over two equated forms (to reduce practice effects) and four subscales (Alphabet Knowledge, Vocabulary, Listening Comprehension, and Mathematics).
A total of 56 distinct subskills are featured, with each subscale incorporating multiple subskills representing varied complexity and breadth and requiring varied response modes from children (i.e., pointing, vocal expression, and object manipulation). Subscales are calibrated via two-parameter logistic item response models, with scaled scores (M = 200, SD = 50) derived through Bayesian expected a posteriori estimation. Basal adaptive testing is applied to ensure administration of all subscales within 20 to 30 minutes. Dimensionality was confirmed through full-information bifactor analysis, with all subscales at .90 or higher in reliability and producing information curves designed for precise longitudinal measurement across equated forms (McDermott et al., 2009). Concurrent validity was supported through relationships with NRTs and teachers' assessments of literacy and numeracy (McDermott et al., 2009). Based on May and June 2005 assessments, each LE subscale correlated significantly and most highly with its hypothesized counterpart NRT scale. Specifically, LE Alphabet Knowledge correlated .68 with the TERA-3, LE Vocabulary correlated .69 with the PPVT-III, LE Listening Comprehension .63 with the OWLS, and LE Mathematics .69 with the TEMA-3. Similarly, spring 2007 LE Alphabet Knowledge, Vocabulary, and Listening Comprehension scores, respectively, correlated .62, .56, and .52 with teachers' concurrent Child Observation Record (COR) Language and Literacy scale scores, and LE Mathematics correlated .63 with COR Mathematics scores. Predictive validity was assessed through correlation of October 2006 LE Alphabet Knowledge, Vocabulary, and Listening Comprehension scores, respectively, with the counterpart COR Language and Literacy scores from late spring 2007 (rs = .57, .56, and .52) and of LE Mathematics with COR Mathematics (r = .58). Note also that, as revealed in the content validity analyses of LE items as compared to items in those various external criterion measures (see McDermott et al., 2009), the LE subscales cover a markedly broader array of skills than the NRT devices and cover many skills prescribed by the national Head Start Indicators (USDHHS, 2006) not covered through NRTs.
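As a brief illustration of the scaling just described (not taken from the LE technical documentation), the two-parameter logistic model and the reported score metric can be sketched as follows; the linear transformation in the last line is an assumption for illustration only:

P(X_i = 1 \mid \theta) = \frac{1}{1 + \exp[-a_i(\theta - b_i)]}   % 2PL: a_i = item discrimination, b_i = item difficulty
\hat{\theta}_{\mathrm{EAP}} = E(\theta \mid \mathbf{x})           % Bayesian expected a posteriori estimate given item responses x
\mathrm{Scaled\ score} = 200 + 50\,\hat{\theta}_{\mathrm{EAP}}    % assumed linear form placing scores on the M = 200, SD = 50 metric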
Procedures

The Learning Express battery was administered to each individual child by a trained assessor during a single session ordinarily taking 20 minutes but no more than 30 minutes. Private locations were identified in each Head Start center for individual testing. Children were escorted to testing in the order of the class list, with no more than 5 children removed for testing simultaneously and always with the teacher's knowledge. Standardized questions inquired as to each child's status in terms of special needs, English as a primary or secondary language, and health at the time of testing, as well as the teacher's discretion as to whether testing was advisable. A flipbook binder of item stimuli was placed on a table and oriented toward the child. As each successive item was exposed to the child, the assessor asked a question (which appears in print on the reverse side of the item page, facing the assessor) requiring the child to point to the correct choice, vocally express the answer, or manipulate objects. A standardized prompt was also available for non sequitur child responses or no response. Assessments were conducted at four points in time throughout the training and trial years: October, January, March, and June.
The Learning Express has equivalent alternate forms. Items were designed in pairs whose members were intended to reflect comparable content and equal difficulty, with one member of a pair assigned to Form A and the other to Form B. For all item response theory (IRT) equating studies, equivalent-groups equating with linking items was applied, where forms were of equal length and multiple-groups calibration was used with children randomly assigned to forms. Equating accuracy was tested through comparison of the uniformity of all four moments of the distributions (means, variances, skewness, kurtosis) across forms via Kolmogorov's D (Conover, 1999) at each wave. Equating was deemed successful should the distribution of scaled scores across forms remain identical after equating (per Kolen & Brennan, 2004). Thus, Kolmogorov's D estimated the similarity of the two form distributions for each subscale at each wave, and the Kolmogorov-Smirnov goodness-of-fit index tested the probability that D was greater than the observed value under the null hypothesis of no difference between forms (Conover, 1999). For every subscale and at every wave, the score distributions for the equated forms were essentially equivalent, yielding very small D values (M = .06, SD = .02, range = .03 to .09). To minimize practice effects over repeated waves of assessment within a school year, the two forms were applied in a counterbalanced fashion. Children appearing as odd numbers on a class list were administered Form A at Wave 1, whereas those appearing as even numbers were administered Form B. Administration was reversed for each subsequent wave such that, for example, approximately half of the children during AY06–07 received form sequence ABAB and half BABA. Each year, the order of classrooms to be assessed was random for Wave 1, and from wave to wave, there was an effort to maintain the same approximate order for assessing each child (e.g., a given child assessed at the start of Wave 1 was likely to be assessed at the start of other waves). This process served to minimize disparities between children in the time intervals separating their assessments (although time measures were kept to correct for any such disparities in subsequent individual growth modeling).
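To make the form-equivalence check concrete, a minimal sketch of the two-sample Kolmogorov-Smirnov comparison is shown below in Python; the data, sample sizes, and variable names are hypothetical placeholders rather than study values.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical scaled scores (M = 200, SD = 50) for children randomly assigned
# to Form A or Form B of one subscale at a single assessment wave.
form_a_scores = rng.normal(loc=200, scale=50, size=600)
form_b_scores = rng.normal(loc=200, scale=50, size=600)

# Kolmogorov's D is the maximum distance between the two empirical CDFs;
# a small D (and a nonsignificant p value) is consistent with successful equating.
result = ks_2samp(form_a_scores, form_b_scores)
print(f"D = {result.statistic:.3f}, p = {result.pvalue:.3f}")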
Child assessments were conducted at each classroom by an independent team of 45 trained assessors in AY06–07 and 38 trained assessors in AY07–08. Assessors consisted of undergraduate- or graduate-level students who were recruited at the beginning of each academic year through e-mail to psychology and education departments in the greater Philadelphia region. Ages ranged from 18 to approximately 60 years (median ages in the mid- to late 20s), with more than 40% being ethnic minorities (primarily African American) and nearly 20% male. Thirty-five hours of professional development and support were provided during the training and trial years in early September, followed by 15 to 20 hours of practice administrations. The assessment team was employed to evaluate Head Start children's growth. They did not work with and were not informed about the EPIC implementation activity. Their assessment results were not shared with those directing the implementation until after the trial was completed.

Data Analysis

Multiple sets of multilevel individual growth-curve models were constructed for each content area over the trial year. The temporal variability of performance within children across the four assessment waves comprised Level 1 in these models, whereas performance variation between children within classrooms was held at Level 2 and variation between classrooms at Level 3. Because there was variability among testing dates within any given wave, time in days was applied as the principal time-varying measure (the metameter) and zero-centered around the mean testing day during Wave 4, May to June 2008 (final status). The final-status metameter was appropriate because the contrasts for the effects of primary interest (those involving EPIC vs. DLM) were necessarily focused on performance at the close of the trial year. Random intercepts and slopes were estimated and tested for each model at both the child level (Level 2) and the classroom level (Level 3). The initial model for each content area was an unconditional type entering only the metameter with classroom means as outcomes in order to decompose hierarchically the unexplained content variance within and between children and between classrooms. Given the disparate time intervals separating waves and distinguishing the timing between any given child's assessments, the unconditional models included those assuming compound symmetry for the within-child covariance matrices as well as models positing a spatial power law (essentially a first-order autoregressive model sensitive to the differential intervals), a Gaussian spatial model, a spherical spatial model, and a completely generalized (unstructured) model (Wolfinger, 1993, 1996). Models were compared through Akaike's information criterion (Burnham & Anderson, 1998, 2004) and chi-square deviance tests (Littell, Milliken, Stroup, Wolfinger, & Schabenberger, 2006) to identify the best error covariance structure for each content area. Additionally, linear, quadratic, and cubic fixed effects and higher-order random slopes were tested for each content area.
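A schematic of the unconditional three-level linear growth model described above is given below in standard hierarchical notation (the notation is ours and parallels the combined-model statements in the note to Table 1): assessment occasions within children at Level 1, children within classrooms at Level 2, and classrooms at Level 3, with Time centered at final status.

\text{Level 1 (occasions within children):}\quad Y_{tij} = \pi_{0ij} + \pi_{1ij}\,\mathrm{Time}_{tij} + e_{tij}
\text{Level 2 (children within classrooms):}\quad \pi_{0ij} = \beta_{00j} + r_{0ij}, \qquad \pi_{1ij} = \beta_{10j} + r_{1ij}
\text{Level 3 (classrooms):}\quad \beta_{00j} = \gamma_{000} + u_{00j}, \qquad \beta_{10j} = \gamma_{100} + u_{10j}

Here γ_000 is the average score at final status, γ_100 the average growth rate per day, and the random terms carry the variance components that are decomposed hierarchically in Table 1.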
Subsequent conditional models sequentially added and tested covariates and interactions under full maximum-likelihood constraints, and the final model for each content area featured only statistically significant random and fixed effects, which were extracted through restricted maximum-likelihood estimation (per Littell, Milliken, Stroup, & Wolfinger, 1996, and Milliken & Johnson, 2002). As an exception to the latter, the main effect for EPIC versus DLM was retained for each model whether significant or not, as required to provide full information on the relative performance of the curricula. Post hoc comparisons of outcomes were based on Tukey-Kramer (Searle, Speed, & Milliken, 1980) contrasts of least-squares means (means adjusted for any group imbalance), and where significant fixed effects were discovered concerning EPIC versus DLM performance, effect sizes (Cohen's d) were estimated from the least-squares means and standard deviations for final-status (Wave 4) score estimates. Power analyses (per Raudenbush, 1997, and Raudenbush & Liu, 2001) indicated that, assuming statistical significance at p < .05, power = .80, intraclass r < .10, and 20% attrition over 2 years, the initial sample was sufficient to detect small to moderate experimental effects.
The covariates assessed for each model included, at the child level, age in months at entry into the study (opening of school, 2007), gender (female vs. male), participation in the training year (vs. none), DLL status (vs. none), and special needs status (vs. none) and, at the classroom level, EPIC (vs. DLM) curriculum, mean age for the child's classroom, teacher's total years of teaching experience, teacher's total years teaching Head Start, and the number of adult classroom volunteers. Age at entry was selected as the age covariate rather than incrementing age over the trial year due to the substantial collinearity of the latter with the time metameter, and any additional variable reflecting a child's total years in Head Start or other preschool experience was excluded because of its essential collinearity with the participation-in-training-year covariate.
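As a rough sketch (the estimation software is not named in this section), a simplified version of such a conditional model could be specified in Python with statsmodels as follows, with classroom random intercepts and slopes, child random intercepts, and the covariates listed above. It omits the child-level random slopes, the intercept-slope covariances, and the alternative error covariance structures evaluated in the article, and every column name is a hypothetical placeholder.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format file: one row per child per assessment wave, with
# score, time_days (centered at final status), entry_age, training_year, dll,
# special_needs, epic, classroom, and child_id columns (names are placeholders).
data = pd.read_csv("learning_express_long.csv")

model = smf.mixedlm(
    "score ~ time_days * entry_age + training_year + dll + special_needs"
    " + epic + time_days:epic",
    data,
    groups="classroom",                        # Level 3: classrooms
    re_formula="~time_days",                   # classroom random intercept and slope
    vc_formula={"child": "0 + C(child_id)"},   # Level 2: child random intercepts
)
result = model.fit(reml=True)  # restricted maximum likelihood, as for the final models
print(result.summary())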

Results
Given 80 classrooms randomly assigned to either the EPIC or DLM program condition in fall AY06–07, 70 (35 DLM, 35 EPIC) remained at the start of the AY07–08 trial year and through the duration of the study (attrition rate = 12.50%), while child attrition over the trial year averaged 8.24% across the four content areas, with no appreciable departure for any particular area. These attrition rates did not approach the potential 20% attrition over 2 years that was projected in the prior power analyses. The classroom attrition occurring during the training year was never due to teachers declining participation but rather to classroom closings given low enrollment, school closings, or teacher transfers and retirements. The more complex error covariance structures afforded no statistically significant improvement in model fit over models assuming compound symmetry or random covariance structures; thus, the models assuming compound symmetry were adopted.
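The deviance comparisons behind that decision can be illustrated with a small helper; the log-likelihoods and degrees of freedom below are placeholders, not values from the study.

from scipy.stats import chi2

def deviance_test(llf_simple, llf_complex, extra_params):
    """Likelihood-ratio (deviance) test of a simpler error covariance structure
    nested within a more complex one, both fit to the same data by ML."""
    deviance = 2.0 * (llf_complex - llf_simple)
    p_value = chi2.sf(deviance, df=extra_params)
    return deviance, p_value

# Placeholder log-likelihoods for, e.g., compound symmetry versus an
# unstructured within-child covariance model.
dev, p = deviance_test(llf_simple=-23650.4, llf_complex=-23648.9, extra_params=8)
print(f"deviance = {dev:.2f}, p = {p:.3f}")  # a large p favors keeping the simpler structure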

Moreover, higher-order fixed and random effects were uniformly nonsignificant statistically and frequently produced boundary cases, whereas linear models uniformly yielded significant effects and optimal model fit. Table 1 shows the hierarchical decomposition of variance for each unconditional model (exact model specifications are reported in the table note). Across content areas, the preponderance of potentially explainable variance was that between children (M = 65.90%, SD = 10.46), with more than 70% of Vocabulary and Mathematics variability and just over 50% of Listening Comprehension variability lying between individual children. Relatively small amounts of variability (for general comparisons, see Raudenbush, Martinez, & Spybrook, 2007; Snijders, 2005) were detected between classrooms (M = 5.54%, SD = 1.87), the percentage never exceeding 7%. This reflects more substantial homogeneity among the classrooms, as might be expected for the Head Start population. Based on the number of classrooms, the harmonic mean classroom enrollment (20.04), and the average intraclass r (.056), the experimental design effect (Snijders, 2005) was 2.12.
As noted, an array of child- and classroom-level covariates was tested. The final models incorporate only those statistically significant and either structurally fundamental to the experimental design or for which marked disparities were evident across treatment conditions at Wave 1 of the trial year. Inasmuch as the largest portion of variability in young children's cognitive performance is generally accounted for by age (M = 25.19%, SD = 4.76, in this study; refer to McDermott, 1995, for population estimates), age at entry was included in each conditional model. The binary indicator of children's training-year inclusion also was included to identify trial participants enrolled in treatment classrooms through the training year (30.00% of children). Given randomization at the classroom level in AY06–07, equitable distribution of child characteristics across the EPIC and DLM curricula could not be assumed. At the opening of the trial year, it was found that 64.09% of DLL children were enrolled in EPIC classes and 56.15% of children with special needs in DLM classes. Such disparities were apparent as well at the opening of the training year. Thus, DLL and special needs status were applied as covariates in all conditional models. Sequencing of covariate entry and treatment of interaction effects proceeded as recommended by Bauer and Curran (2006), Singer and Willett (2003), and Willett et al. (1998). Table 1 also posts parameter estimates and significance levels for the conditional models incorporating all statistically significant random and fixed effects and for the effect of interest (EPIC vs. DLM) whether significant or not (full model specifications appear in the table note). Every model evinced a significant growth rate for LE scores (the time-in-days effect), the largest being a 0.24-point increment in Mathematics scores per day (or 7.24 per month) and the smallest a 0.15-point increment in Listening Comprehension scores per day (4.54 per month).
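As a worked check on the hierarchical decomposition in Table 1, the unconditional variance components for Alphabet Knowledge can be converted into the percentage shares (and the classroom-level intraclass correlation) reported there:

# Unconditional variance components for Alphabet Knowledge (Table 1):
between_children = 1517.27    # between children within classrooms
between_classrooms = 158.49   # between classrooms
residual = 669.24             # temporal (within-child) residual

total = between_children + between_classrooms + residual
print(f"Temporal (Level 1):           {100 * residual / total:.2f}%")           # 28.54%
print(f"Between children (Level 2):   {100 * between_children / total:.2f}%")   # 64.70%
print(f"Between classrooms (Level 3): {100 * between_classrooms / total:.2f}%") # 6.76%

# Classroom-level intraclass correlation; averaging this quantity over the four
# content areas gives the value of roughly .056 used in the design-effect discussion.
icc_classroom = between_classrooms / total
print(f"Classroom intraclass correlation = {icc_classroom:.3f}")  # 0.068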
Table 1
Multilevel Individual Growth Curve Analyses for Cognitive Areas Over Cluster-Randomized Trial Year

                                          Alphabet                      Listening
Effect                                    Knowledge(a)   Vocabulary(b)  Comprehension(c)  Mathematics(d)

Unconditional Means Models

Random effects (parameter estimates)
  Between children within classrooms
    Intercepts                            1517.27****    1522.69****     890.93****       1629.73****
    Covariance                               0.51           1.63****       1.45****           0.14
    Slopes                                   0.01****       0.01****      <0.01*              0.01****
  Between classrooms
    Intercepts                             158.49***      127.72***       47.75**           142.31***
    Covariance                               0.38**                                            0.15
    Slopes                                  <0.01**                                           <0.01**
  Residuals                                669.24****     483.08****     780.77****         378.72****
Random effects (% variance in hierarchical decomposition)
  Temporal (Level 1)                        28.54          22.64           45.45              17.61
  Between children (Level 2)                64.70          71.37           51.77              75.77
  Between classrooms (Level 3)               6.76           5.99            2.78               6.62
Fixed effects (parameter estimates)
  Intercepts                               229.19****     225.79****     219.79****         228.46****
  Time in days                               0.22****       0.19****       0.14****            0.23****

Intercepts- and Slopes-as-Outcomes Models

Random effects (parameter estimates)
  Between children within classrooms
    Intercepts                            1127.14****     994.50****     628.82****        1064.20****
    Covariance                               0.97***        0.43*                              0.83****
    Slopes                                   0.01****       0.01****                           0.01****
  Between classrooms
    Intercepts                             156.22****      75.64****      24.80*            133.01****
    Covariance                               0.40**                                            0.22*
    Slopes                                   0.00**                                           <0.00**
  Residuals                                669.28****     485.57****     799.12****         381.43****
Fixed effects (parameter estimates)
  Intercept                                233.00****     231.52****     225.39****         235.05****
  Time in days                               0.22****       0.19****       0.15****            0.24****
  Age (in months) at entry                   2.34****       2.15****       2.12****            2.90****
  Time × Age                                <0.01*          0.01****       0.01****            0.00****
  Training Year Inclusion                   10.36****       9.07***       15.13****           17.50****
  Age × Training Year Inclusion                                             0.94*               1.12*
  Dual-language learner                     19.22****      50.78****      31.92****           24.16****
  Special needs status                      23.17****      20.89****      23.14****           29.62****
  EPIC (vs. DLM) curriculum                  4.08            0.46           5.29*               9.04*
  Time × EPIC curriculum                                                    0.03*               0.02*

Note. Models assess cognitive growth over four time points via restricted maximum likelihood estimation assuming compound symmetric error covariance structures. Effects not appearing in models were excluded for failure to achieve statistical significance. EPIC = Evidence-Based Program for Integrated Curricula; DLM = Developmental Learning Materials Early Childhood Express.
a. Wave 1, N = 1,208; Wave 4, N = 1,106; total number of observations = 4,759. Unconditional model: Alphabet Knowledge: Y_ijk = γ_000 + γ_100 Time_ijk + (μ_00k + μ_10k Time_ijk) + (μ_0jk + μ_1jk Time_ijk) + r_ijk. Conditional model: Alphabet Knowledge: Y_ijk = γ_000 + γ_100 Time_ijk + γ_010 EntryAge_j + (γ_110 Time_ijk × EntryAge_j) + γ_020 TrainingYear_j + γ_030 DLL_j + γ_040 SpecialNeeds_j + γ_001 EPIC_k + (μ_00k + μ_10k Time_ijk) + (μ_0jk + μ_1jk Time_ijk) + r_ijk.
b. Wave 1, N = 1,207; Wave 4, N = 1,104; total number of observations = 4,763. Unconditional model: Vocabulary: Y_ijk = γ_000 + γ_100 Time_ijk + (μ_00k + μ_10k Time_ijk) + (μ_0jk) + r_ijk. Conditional model: Vocabulary: Y_ijk = γ_000 + γ_100 Time_ijk + γ_010 EntryAge_j + (γ_110 Time_ijk × EntryAge_j) + γ_020 TrainingYear_j + γ_030 DLL_j + γ_040 SpecialNeeds_j + γ_001 EPIC_k + (μ_00k + μ_10k Time_ijk) + (μ_0jk) + r_ijk.
c. Wave 1, N = 1,209; Wave 4, N = 1,107; total number of observations = 4,701. Unconditional model: Listening Comprehension: Y_ijk = γ_000 + γ_100 Time_ijk + (μ_00k + μ_10k Time_ijk) + (μ_0jk) + r_ijk. Conditional model: Listening Comprehension: Y_ijk = γ_000 + γ_100 Time_ijk + γ_010 EntryAge_j + (γ_110 Time_ijk × EntryAge_j) + γ_020 TrainingYear_j + (γ_030 EntryAge_j × TrainingYear_j) + γ_040 DLL_j + γ_050 SpecialNeeds_j + γ_001 EPIC_k + (γ_101 Time_ijk × EPIC_k) + (μ_00k) + (μ_0jk) + r_ijk.
d. Wave 1, N = 1,208; Wave 4, N = 1,106; total number of observations = 4,761. Unconditional model: Mathematics: Y_ijk = γ_000 + γ_100 Time_ijk + (μ_00k + μ_10k Time_ijk) + (μ_0jk + μ_1jk Time_ijk) + r_ijk. Conditional model: Mathematics: Y_ijk = γ_000 + γ_100 Time_ijk + γ_010 EntryAge_j + (γ_110 Time_ijk × EntryAge_j) + γ_020 TrainingYear_j + (γ_030 EntryAge_j × TrainingYear_j) + γ_040 DLL_j + γ_050 SpecialNeeds_j + γ_001 EPIC_k + (γ_101 Time_ijk × EPIC_k) + (μ_00k + μ_10k Time_ijk) + (μ_0jk + μ_1jk Time_ijk) + r_ijk.
*p < .05. **p < .01. ***p < .001. ****p < .000.

children, the most noticeable being a full standard deviation gap in Vocabulary (M = 31.52 points over all content areas), and special needs children underperformed non-special-needs children by approximately half of a standard deviation (M = 24.20 points over content areas). On the other hand, participation in the training year translated into a distinct advantage for children (M score increment = 13.01 points), with the greatest advantage manifest for Mathematics achievement (17.50 points) and Listening Comprehension (15.12 points). Also, the significant interaction between time and age in every model echoed the tendency of slopes to be steeper for children who were younger at entry and flatter for those who were older; likewise, the interaction between age and training-year participation reflected flatter slopes for Listening Comprehension and Mathematics among children who were both older at entry and involved in the training year. Such slope distinctions in cognitive growth among young children, where the youngest children manifest markedly steep slopes that slowly flatten as they approach age 5 and beyond, are a well-known phenomenon in early childhood development (Shonkoff & Phillips, 2000). No significant distinctions were found among classrooms in terms of variability associated with teaching experience or numbers of adult volunteers.

Statistically significant main effects were found for the superiority of the EPIC curriculum over the DLM curriculum at the conclusion of the experiment, and for the diverging slopes of the two curricula over time, in favor of the EPIC curriculum. For Listening Comprehension, the main effect was F(1, 67) = 5.20, p = .0258, and the interaction was F(1, 4629) = 6.53, p = .0106. The model incorporated significant random effects for intercept variation at both child and classroom levels, but no statistically consequential slope effects were detected at either level. Although the significant interaction effect accounted for 77.3% of the variation in between-groups slope variance, it did so in a context where the overall available between-groups slope variance was itself trivial. Thus, the interaction should not be interpreted to indicate differential growth rates at the conclusion of the trial year, whereas the simple main effect for the EPIC (vs. DLM) curriculum clearly points to EPIC superiority at year's end. With the grand mean at final status (the middle of Wave 4) at 222.85 scaled score points (the fixed-effects intercept for the Listening Comprehension model in Table 1), the EPIC M = 223.89 and the DLM M = 218.68, a difference of 5.21 points. Given that these least-squares means correct for any curricular-group size imbalance, the overall Wave 4 estimated SD (31.03) was applied to yield an effect size of .17 (see Note 2). EPIC and DLM estimated means for the middle of Wave 1 (222 days preceding final status) were not statistically discrepant (p = .7827), indicating that the curricular groups were essentially equivalent 1 month into the trial year, with neither curriculum having an apparent starting advantage.

The results for Mathematics were more pronounced. Specifically, the main effect for EPIC versus DLM was F(1, 67) = 6.85, p = .0110, and the interaction was F(1, 4619) = 4.19, p = .0407. Whereas the grand mean at final status was 235.05 points, the EPIC M = 234.76 and the DLM M = 225.72, a difference of 9.04 points (estimated SD = 40.94, effect size = .22). The EPIC and DLM performance discrepancy was not statistically significant for the first wave of the trial year (October 2007, p = .2236), nor was it significant 1 month after cluster randomization in the preceding training year (October 2006, p = .2596), indicating no advantage for either curriculum 1 month into the experiment. Superiority of the EPIC curriculum was detectable by Wave 2, January 2008 (p = .0400), with the EPIC growth trajectory continuing to pull away from the DLM trajectory over the trial year.

Table 2 presents, for Listening Comprehension and Mathematics, the breakdown of least-squares means and their standard errors for the main DLM versus EPIC effect and for DLM versus EPIC within covariate subgroups over the trial waves. The covariates are those yielding significant main effects as reported in Table 1. Comparison of DLM and EPIC means for a given covariate (e.g., DLL vs. non-DLL) at a given time point (e.g., Wave 4) indicates relative curricular performance. Because age (in months) at entry was a continuous variable in the models, Table 2 partitions that variable, for convenient interpretation, into children who were younger than 4 years versus 4 years and older at initial enrollment into a given curriculum. The listed means are corrected for imbalance in cell sizes (although actual cell sizes are posted), and they reflect the better performance of older children, of those participating in the training year, and of EPIC versus DLM, even among DLL children and those having special needs.
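Expressed as a worked calculation (restating the arithmetic already reported above rather than adding new results), the curricular effect sizes follow directly from the least-squares means and the Wave 4 predicted-score standard deviations described in Note 2:

$$d_{\mathrm{Listening\ Comprehension}} = \frac{223.89 - 218.68}{31.03} \approx .17, \qquad d_{\mathrm{Mathematics}} = \frac{234.76 - 225.72}{40.94} \approx .22.$$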

Discussion
The primary purpose of this study was to examine the efficacy of EPIC as a stand-alone, comprehensive program for improving cognitive school readiness outcomes for children from low-income households in the context of urban Head Start centers. EPIC was developed in partnership with exemplary Head Start educators and was designed to meet state and federal requirements for preschool programs. EPIC was evaluated against the DLM, with effects tested across important subgroups of children. DLM was a relevant comparison program because it was associated with the best cognitive effects for preschool children in the PCER studies (PCER Consortium, 2008). EPIC and DLM were implemented with comparable fidelity and resources. Analyses revealed main effects for a comprehensive set of mathematics and listening comprehension skills, with children in the EPIC program performing better than children in the DLM program, controlling for age, prior preschool experience, and special needs and language status. Additionally, interactions between type of program and relevant subgroups (i.e., 3-year-old children, DLL children, and children with special needs) were tested. There were no significant differences between the programs in Vocabulary and Alphabet Knowledge. Irrespective of program, a pattern of underperformance was found for DLL children and children with special needs compared to their non-DLL and non-special-needs counterparts. Moreover, a significant interaction between time and age in every model indicated that younger children evidenced steeper slopes than older children. Overall, the EPIC findings are notable when compared with the 14 PCER studies: No single program showed positive outcomes across mathematics, language, and literacy, and only two found any positive cognitive outcomes. Neither of the two employed a single comprehensive program; rather, they implemented various combinations of programs or add-ons of some or all of the DLM program.
Table 2
Estimated Least-Squares Means (and Standard Errors) by Curriculum Within Samples Over the Trial Year

Listening Comprehension
  Full sample: Wave 1, DLM 190.67 (1.75), n = 608, EPIC 190.16 (1.75), n = 601; Wave 2, DLM 203.73 (1.53), n = 598, EPIC 205.95 (1.53), n = 605; Wave 3, DLM 211.16 (1.56), n = 583, EPIC 214.92 (1.56), n = 599; Wave 4, DLM 218.52 (1.65), n = 544, EPIC 223.81 (1.71), n = 563.
  3-year-olds at entry: Wave 1, DLM 165.87 (2.35), n = 204, EPIC 165.22 (2.40), n = 183; Wave 2, DLM 184.03 (2.06), n = 212, EPIC 186.12 (2.12), n = 194; Wave 3, DLM 194.34 (2.10), n = 211, EPIC 198.00 (2.16), n = 196; Wave 4, DLM 204.56 (2.27), n = 202, EPIC 209.77 (2.34), n = 188.
  4- and 5-year-olds at entry: Wave 1, DLM 203.66 (1.82), n = 404, EPIC 203.01 (1.78), n = 418; Wave 2, DLM 214.11 (1.56), n = 386, EPIC 216.20 (1.53), n = 411; Wave 3, DLM 220.04 (1.59), n = 372, EPIC 223.70 (1.57), n = 403; Wave 4, DLM 225.93 (1.76), n = 342, EPIC 231.14 (1.73), n = 375.
  Training year inclusion: Wave 1, DLM 201.28 (2.67), n = 202, EPIC 200.63 (2.69), n = 205; Wave 2, DLM 214.34 (2.53), n = 178, EPIC 216.44 (2.56), n = 186; Wave 3, DLM 221.77 (2.55), n = 173, EPIC 225.43 (2.59), n = 184; Wave 4, DLM 229.13 (2.64), n = 162, EPIC 234.34 (2.68), n = 167.
  No training year inclusion: Wave 1, DLM 186.21 (1.78), n = 514, EPIC 185.56 (1.76), n = 494; Wave 2, DLM 199.27 (1.55), n = 420, EPIC 201.37 (1.54), n = 419; Wave 3, DLM 206.69 (1.58), n = 410, EPIC 210.35 (1.57), n = 415; Wave 4, DLM 214.05 (1.72), n = 382, EPIC 219.26 (1.71), n = 396.
  Dual-language learner: Wave 1, DLM 163.13 (2.92), n = 56, EPIC 162.48 (2.75), n = 103; Wave 2, DLM 176.19 (2.79), n = 56, EPIC 178.29 (2.61), n = 109; Wave 3, DLM 183.62 (2.81), n = 45, EPIC 187.28 (2.63), n = 109; Wave 4, DLM 190.98 (2.89), n = 47, EPIC 196.19 (2.71), n = 106.
  Non-dual-language learner: Wave 1, DLM 195.12 (1.76), n = 552, EPIC 194.47 (1.80), n = 498; Wave 2, DLM 208.19 (1.56), n = 542, EPIC 210.29 (1.59), n = 496; Wave 3, DLM 215.61 (1.56), n = 538, EPIC 219.27 (1.63), n = 490; Wave 4, DLM 222.97 (1.71), n = 497, EPIC 228.18 (1.77), n = 457.
  Special needs status: Wave 1, DLM 170.80 (3.08), n = 61, EPIC 170.15 (3.14), n = 47; Wave 2, DLM 183.87 (2.96), n = 64, EPIC 185.96 (3.03), n = 47; Wave 3, DLM 191.29 (2.98), n = 60, EPIC 194.95 (3.04), n = 48; Wave 4, DLM 198.65 (3.05), n = 55, EPIC 203.86 (3.12), n = 47.
  Non-special-needs status: Wave 1, DLM 192.84 (1.78), n = 547, EPIC 192.19 (1.76), n = 554; Wave 2, DLM 205.91 (1.56), n = 534, EPIC 208.00 (1.55), n = 558; Wave 3, DLM 213.33 (1.58), n = 523, EPIC 216.99 (1.58), n = 551; Wave 4, DLM 220.69 (1.72), n = 489, EPIC 225.90 (1.73), n = 516.

Mathematics
  Full sample: Wave 1, DLM 177.37 (2.24), n = 607, EPIC 181.10 (2.23), n = 601; Wave 2, DLM 200.05 (2.19), n = 597, EPIC 206.27 (2.17), n = 604; Wave 3, DLM 212.94 (2.31), n = 583, EPIC 220.57 (2.31), n = 598; Wave 4, DLM 225.72 (2.52), n = 543, EPIC 234.76 (2.51), n = 563.
  3-year-olds at entry: Wave 1, DLM 149.81 (2.80), n = 203, EPIC 153.54 (2.86), n = 183; Wave 2, DLM 176.05 (2.70), n = 211, EPIC 182.27 (2.75), n = 193; Wave 3, DLM 190.96 (2.80), n = 212, EPIC 198.59 (2.85), n = 194; Wave 4, DLM 205.74 (3.02), n = 201, EPIC 214.77 (3.07), n = 189.
  4- and 5-year-olds at entry: Wave 1, DLM 191.48 (2.29), n = 404, EPIC 195.21 (2.24), n = 418; Wave 2, DLM 212.34 (2.22), n = 386, EPIC 218.56 (2.18), n = 411; Wave 3, DLM 224.20 (2.34), n = 371, EPIC 231.83 (2.30), n = 404; Wave 4, DLM 235.95 (2.57), n = 342, EPIC 244.99 (2.52), n = 374.
  Training year inclusion: Wave 1, DLM 189.41 (3.20), n = 196, EPIC 193.14 (3.23), n = 196; Wave 2, DLM 212.10 (3.17), n = 179, EPIC 218.32 (3.20), n = 186; Wave 3, DLM 224.98 (3.26), n = 172, EPIC 232.62 (3.28), n = 185; Wave 4, DLM 237.76 (3.41), n = 162, EPIC 246.80 (3.43), n = 166.
  No training year inclusion: Wave 1, DLM 172.03 (2.26), n = 411, EPIC 175.76 (2.24), n = 405; Wave 2, DLM 194.71 (2.21), n = 418, EPIC 200.93 (2.19), n = 418; Wave 3, DLM 207.60 (2.33), n = 411, EPIC 215.24 (2.31), n = 413; Wave 4, DLM 220.38 (2.54), n = 381, EPIC 229.42 (2.52), n = 397.
  Dual-language learner: Wave 1, DLM 156.44 (3.57), n = 56, EPIC 160.17 (3.37), n = 103; Wave 2, DLM 179.13 (3.54), n = 56, EPIC 185.35 (3.33), n = 109; Wave 3, DLM 192.01 (3.62), n = 45, EPIC 199.65 (3.41), n = 108; Wave 4, DLM 204.79 (3.76), n = 47, EPIC 213.83 (3.55), n = 105.
  Non-dual-language learner: Wave 1, DLM 180.60 (2.25), n = 551, EPIC 184.33 (2.29), n = 498; Wave 2, DLM 203.29 (2.20), n = 541, EPIC 209.51 (2.24), n = 495; Wave 3, DLM 216.18 (2.32), n = 538, EPIC 223.81 (2.36), n = 490; Wave 4, DLM 228.96 (2.53), n = 496, EPIC 237.99 (2.57), n = 458.
  Special needs status: Wave 1, DLM 150.45 (3.63), n = 61, EPIC 154.18 (3.70), n = 47; Wave 2, DLM 173.13 (3.60), n = 63, EPIC 179.35 (3.67), n = 47; Wave 3, DLM 186.02 (3.68), n = 60, EPIC 193.66 (3.74), n = 49; Wave 4, DLM 198.80 (3.81), n = 55, EPIC 207.84 (3.87), n = 46.
  Non-special-needs status: Wave 1, DLM 180.07 (2.26), n = 546, EPIC 183.80 (2.25), n = 554; Wave 2, DLM 202.75 (2.22), n = 534, EPIC 208.97 (2.20), n = 557; Wave 3, DLM 215.64 (2.34), n = 523, EPIC 223.27 (2.32), n = 549; Wave 4, DLM 228.42 (2.55), n = 488, EPIC 237.46 (2.53), n = 517.

Note. Entries are estimated population marginal means corrected for cell imbalance based on all variables in a given model. Parenthetical entries are associated standard errors. Estimates are centered on the midpoint of each respective wave. Values for the covariate age (in months) at entry as applied in the Table 1 models are, for convenience of interpretation, presented for children younger than 4 versus 4 years old and older at program entry. Cell sizes indicate the actual number of available cases for a given wave. DLM = Developmental Learning Materials Early Childhood Express; EPIC = Evidence-Based Program for Integrated Curricula.
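The entries in Table 2 are population marginal (least-squares) means of the kind that mixed-models software produces when means are requested for the curriculum-by-wave cells; Note 2 describes the accompanying Tukey-Kramer adjustment for comparisons. The fragment below is a hypothetical sketch of such a request in SAS PROC MIXED (data set and variable names are again invented for illustration), not the study's own code; here wave is treated as a categorical factor purely to obtain per-wave means.

  /* Hypothetical request for least-squares means by curriculum within wave, */
  /* with Tukey-Kramer adjusted pairwise comparisons (see Note 2).           */
  proc mixed data=epic_long method=reml;
    class classroom_id child_id curriculum wave;
    model score = curriculum wave curriculum*wave
                  entry_age training_year dll special_needs / ddfm=satterthwaite;
    random intercept / subject=classroom_id;
    random intercept / subject=child_id(classroom_id);
    lsmeans curriculum*wave / diff adjust=tukey;
  run;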

Positive mathematics and language findings in the present study reflect some distinctive features of this randomized controlled trial relative to PCER. First, in this study, EPIC and DLM were implemented as stand-alone programs with comparable resources and fidelity. Second, the efficacy of EPIC was evaluated across all of the children in the Head Start classrooms. The PCER studies demonstrating cognitive effects included only older preschool children of mixed socioeconomic backgrounds; the present study included younger preschool children (3-year-olds) as well as older children. Furthermore, it was intentionally focused on producing effects for a policy-relevant population of urban, low-income minority children. While PCER studies included low-income children, they did not explicitly assess effects for these children in their evaluations. Third, the present study included a comprehensive set of control variables (e.g., years in preschool, number of adults in the classroom) and tested interactions between program and relevant subgroups of children.

Children participating in EPIC also demonstrated better mathematics outcomes than children in DLM. Findings from PCER showed that implementation of the full DLM program plus the Open Court Reading curriculum did not produce significant mathematics outcomes. In fact, the only PCER study to show positive mathematics outcomes combined PreK Mathematics with the computer-based DLM mathematics supplement, implemented on top of a variety of existing comprehensive programs (e.g., High/Scope and Creative Curriculum).
The clear mathematics effect for EPIC is encouraging as evidence grows that early mathematics achievement may be a better predictor of later academic success than early reading (Duncan et al., 2007; Ginsburg, Lee, & Boyd, 2008).

Children receiving EPIC evidenced superior listening comprehension outcomes compared to children in DLM. Although DLM was associated with positive language outcomes in the PCER study, it was not implemented in that study as a stand-alone program. Instead, it was combined with the Open Court Reading curriculum, a language curriculum targeting listening comprehension. This may explain why the DLM program was associated with a significant impact on language in the PCER study but not as a stand-alone program in the present study. The findings on listening comprehension are especially salient given its importance to preschool children's oral language, vocabulary development, and engagement in reading more broadly (Skarakis-Doyle & Dempsey, 2008). Listening comprehension enhances children's understanding of stories and other texts that are read aloud to them and that they read to themselves. It enables children to remember what they read and communicate with others about what they read (Armbruster, Lehr, & Osborn, 2001, p. 48). Because so much of the classroom instruction to which young children are exposed centers on books being read to them or on oral instruction, their listening comprehension becomes paramount to their ability to engage the text, for example, to make predictions, respond to questions about the text, share their own interests, generate and respond to questions, and reenact the story. As a result, children are supported in using a range of metacognitive abilities that deepen their engagement with literacy and other learning.

While EPIC did not surpass DLM in Vocabulary and Alphabet Knowledge, as it had in Mathematics and Listening Comprehension, it compared well against DLM. This is significant because DLM, in combination with the Open Court curriculum, was the only program in PCER to evidence positive preschool literacy outcomes, including vocabulary and alphabet skills. In the present study, both EPIC and DLM were comparably effective in evidencing significant growth rates in Vocabulary and Alphabet Knowledge. The significant growth rates for Vocabulary and Alphabet Knowledge were 5.7 and 6.7 scaled score points per month, respectively (approximately one standard deviation of growth over the trial year).

Overall, these effects underscore the importance of intentional mathematics instruction integrated with language and literacy instruction. The findings suggest that comprehensive mathematics instruction in early childhood may have a generalization effect to other important school readiness areas, such as listening comprehension. Mathematics has been found to be a good predictor of early school performance (Duncan et al., 2007) and may be conceived of as a kind of prototypic learning and mental discipline, one that, like listening comprehension, requires conceptual understanding, adaptive reasoning, and a discipline about rules, steps, sequences, and comparisons (Kilpatrick et al., 2001).
Moreover, mathematics and listening comprehension are related to important foundational skills supported by intentional instruction and by an emphasis on learning behaviors, an additional focus of EPIC. Kilpatrick and his colleagues (Kilpatrick et al., 2001) describe mathematics proficiency as comprising interdependent strands of conceptual understanding, adaptive reasoning, strategic competence, productive disposition, and procedural fluency, all of which draw on approaches-to-learning skills such as attention control, frustration tolerance, group learning, and task approach. The findings from the present study point to a need for future research to explore the possibilities of using mathematics skill instruction as a mediator or precursor to foster good listening comprehension, and to examine the mediating or moderating effect of instruction in learning behaviors on a comprehensive set of cognitive school readiness competencies.

With respect to special subpopulations, there were significant main effects across time. Regardless of program, there were distinctive differences between DLL children, children with special needs, and 3-year-old preschool children and their counterparts across all skill areas. The DLL children and children with special needs evidenced growth rates similar to those of their non-DLL and non-special-needs peers; however, they consistently lagged behind in their performance, with substantial gaps across all skill areas. These findings add support to national mandates for more effective ways to bolster preschool programming to enhance the learning experiences of these special subpopulations (Espinosa, 2005; Odom et al., 2004). The purpose of this study was not to examine the relative effectiveness of EPIC with special populations, and the sample sizes for these subgroups therefore did not provide adequate statistical power to explore interactions. This is a limitation of the study that should be considered in future research.

Another interesting developmental finding is the differential cognitive and language growth rates for 3-year-old children compared to the rates of older preschool children. Across skills, the slopes for the younger children were steeper, irrespective of program. The slowing in growth rates for the older children could not be attributed to a ceiling effect in the assessments, because the Learning Express manifested no such capping phenomenon and was intentionally designed to measure fine gradients of change, even among more advanced learners. Thus, these findings may speak to the cognitive and brain development literature supporting evidence of steeper growth in earlier years, with changes in patterns of growth for older preschool and elementary-school-age children (Shonkoff & Phillips, 2000).

Last, the implications of the development and testing of EPIC in context for early childhood education lie in the need for realistic evidence-based programming that can be used by early childhood educators (McCall, 2009, p. 3). All components of EPIC were developed intentionally to be responsive to the context of urban Head Start centers.
EPIC was developed in collaboration with Head Start teachers, administrators, and parents to be a comprehensive, integrated intervention for early childhood programs serving primarily minority children in urban Head Start centers. As such, it includes integrated curriculum-based assessments and an integrated curriculum that meets Head Start performance standards and state early childhood standards. It also includes a model of ongoing professional development that uses indigenous personnel and fits within program resource and time allocations for professional development. High anonymous ratings of satisfaction with the EPIC program from teachers and teaching assistants (98% satisfaction) reflect successful efforts to fit EPIC to educators' context.

The current evaluation compared EPIC as a stand-alone program to another well-documented comprehensive stand-alone program; both were implemented with comparable fidelity and resources. The evaluation included all the children in the participating classrooms, and effects were tested across relevant subgroups of keen interest to Head Start. The documentation of the relative efficacy of EPIC to bring about changes in important sets of early mathematics and reading skills, in context and in partnership, addresses the national need for realistic evidence-based programming research.

Three significant contemporary realities of early childhood education in the United States create the necessity for realistic evidence-based programming research. First, we have decades of persistent achievement gaps in mathematics and reading that document our urgent need for more effective educational intervention for young low-income minority students, who are disproportionately segregated in large urban and rural areas in the United States (Lee & Burkam, 2002). Second, as a result of this urgent need, we have a proliferation of mandates and standards requiring Head Start and state-funded prekindergarten programs to implement comprehensive educational interventions to advance a host of physical, cognitive, language, and social-emotional skills through the use of scientifically based assessments and curricula (Hyson, 2008; Scott-Little, Kagan, & Frelow, 2006). Third, this proliferation of standards has surpassed the actual capacities of many of the early childhood programs serving low-income preschool children: in too many cases, the demand far exceeds program capacities, and many programs with insufficient resources, professional development time, and expertise are struggling to meet these requirements. They are forced to adopt commercially available products that promise to meet all the requirements but have little or no empirical evidence. Available evidence-based, comprehensive interventions, moreover, often neglect to ensure that the program can be effective in the actual early childhood context where it will need to be implemented (i.e., with programming responsive to the needs of the targeted population within the context of the existing personnel and resources). These three realities call for applied, evidence-based programming research that a priori provides correction for intervention-context misfits.
McCall (2009), in his timely article, asserts that realistic evidence-based programming requires (a) development of comprehensive programs in context and in partnership with early childhood educators in that context, (b) evaluation of programs in context with careful attention to indigenous capacities and resources, and (c) consideration of these important context and process variables when attempting to replicate the intervention. The EPIC evaluation in the present study represents a step toward realistic programming research that increases the likelihood that evidence-based programs will be used and replicated to enhance their effectiveness for the groups of young children most in need of a high-quality early childhood education.

Notes
This research was supported by the U.S. Department of Health and Human Services, National Institute of Child Health and Human Development, the Administration for Children and Families, the Office of the Assistant Secretary for Planning and Evaluation, and the U.S. Department of Education's Office of Special Education and Rehabilitative Services (Grant Nos. P21 HD043758-01 and R01HD46168-01). The authors are grateful for the contributions of Lauren Angelo, Douglas Frye, Daryl Greenfield, Cleo Jacobs, Staci Perlman, Yumiko Sekino, Myrna Shure, Faith Sproul, Heather Warley, Barbara Wasik, and Clare Waterman. A number of outstanding Head Start leaders and educators contributed to this research, including Susan Bowdon, Elaine Brenner, Stephanie Childs, Janet Luckey, Judy McDowell, Donna Piekarski, Jennifer Plumer Davis, Linda Stulz, Susan Whittaker, Donna Widmaier, and Susan Winkelspecht (contributors listed alphabetically).
1. It is important to note that Preschool Curriculum Evaluation Research evaluations were conducted with relatively small sample sizes, making it harder to detect small or moderate effects.
2. Least-squares means are population parameter estimates and do not have SDs as are available for ordinary sampling distributions. Rather, SEs are estimated and applied to test hypotheses regarding differences between means via the Tukey-Kramer adjustment (Kramer, 1956; Searle, Speed, & Milliken, 1980). Effect size estimates, such as Cohen's d and Hedges's g, require an SD to specify the measurement scale in the formula denominator (Rosenthal, 1994). Typically, SDs specific to each given contrast group would be pooled to estimate the proper SD for d or g. Such a correction would be inappropriate here because the least-squares means are by their nature already corrected for contrast-group imbalance, such that pooled SDs would no longer represent the measurement scale and would tend to bias estimates. Alternatively, the SD for the predicted scores of all children at Wave 4 is applied here to represent the measurement scale.

References
Aber, J. L., Jones, S. M., & Raver, C. C. (2007). Poverty and child development: New perspectives on a defining issue. In A. J. Lawrence, S. J. Bishop-Josef, S. M. Jones, K. T. McLearn, & D. A. Phillips (Eds.), Child development and social policy: Knowledge for action (pp. 129-145). Washington, DC: American Psychological Association.
Adams, M., Bereiter, C., Carruthers, I., Case, R., Hirshberg, J., & McKeough, A. (2000). Open court reading. Columbus, OH: SRA/McGraw-Hill.
Armbruster, B., Lehr, F., & Osborn, J. (2001). Put reading first: The research building blocks for teaching children to read, K-3. Washington, DC: National Institute for Literacy.

Assessment Technology. (2002). Galileo Skills Inventory Version 2. Tucson, AZ: Author.
Bauer, D. J., & Curran, P. J. (2006). Multilevel modeling of hierarchical and longitudinal data using SAS: Course notes. Cary, NC: SAS Institute.
Bronfenbrenner, U., & Morris, P. (1998). The ecology of developmental process. In W. Damon (Series Ed.) & R. Lerner (Vol. Ed.), Handbook of child psychology: Vol. 1. Theoretical models of human development (5th ed., pp. 993-1028). New York, NY: Wiley.
Burnham, K. P., & Anderson, D. R. (1998). Model selection and inference: A practical information-theoretic approach. New York, NY: Springer-Verlag.
Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33, 261-304.
Campbell, F. A., Pungello, E. P., Miller-Johnson, S., Burchinal, M., & Ramey, C. T. (2001). The development of cognitive and academic abilities: Growth curves from an early childhood educational experiment. Developmental Psychology, 37, 231-242.
Carrow-Woolfolk, E. (1995). Oral and Written Language Scales: Listening Comprehension scale. Circle Pines, MN: American Guidance Service.
Chatterji, M. (2006). Reading achievement gaps, correlates, and moderators of early reading achievement: Evidence from the Early Childhood Longitudinal Study (ECLS) kindergarten to first grade sample. Journal of Educational Psychology, 98, 489-507.
Chernoff, J. J., Flanagan, K. D., McPhee, C., & Park, J. (2007). Preschool: First findings from the preschool follow-up of the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B) (NCES 2008-025). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education.
Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York, NY: Wiley.
Dodge, D., Colker, L., & Heroman, C. (2002). The creative curriculum for preschool (4th ed.). Washington, DC: Teaching Strategies.
Duncan, G. J., Dowsett, C. J., Claessens, A., Magnuson, K., Huston, A. C., Klebanov, P., . . . Japel, C. (2007). School readiness and later achievement. Developmental Psychology, 43, 1428-1446.
Dunn, L. M., & Dunn, L. M. (1997). Peabody Picture Vocabulary Test-Third Edition, Form A. Circle Pines, MN: American Guidance Service.
Espinosa, L. M. (2005). Curriculum and assessment considerations for young children from culturally, linguistically, and economically diverse backgrounds. Psychology in the Schools, 42, 837-853.
Fantuzzo, J., Gadsden, V., & McDermott, P. (2003). Evidence-based program for the integration of curricula (EPIC): A comprehensive initiative for low-income preschool children (Grant Nos. P21 HD043758-01 and R01HD46168-01). Washington, DC: U.S. Department of Health and Human Services.
Fantuzzo, J., Gadsden, V., & McDermott, P. (2008, June). Evidence-Based Program for the Integration of Curricula. Invited presentation at Head Start's 9th National Research Conference, Washington, DC.
Fantuzzo, J., Rouse, H. L., McDermott, P., Sekino, Y., Childs, S., & Weiss, A. (2005). Early childhood experiences and kindergarten success: A population-based study of a large urban setting. School Psychology Review, 34, 571-588.
Frye, D. (1991). Cognitive strategies, learning and educational software. In L. Birnbaum (Ed.), Proceedings of the 1991 International Conference on the Learning Sciences (pp. 213-261). Charlottesville, VA: Association for the Advancement of Computing in Education.



Gardner, M. F. (1990). Expressive One-Word Picture Vocabulary Test-Revised. Novato, CA: Academic Therapy.
Ginsburg, H. P., & Baroody, A. J. (2003). Test of Early Mathematics Ability-Third Edition, Form A. Austin, TX: PRO-ED.
Ginsburg, H. P., Lee, J. S., & Boyd, J. S. (2008). Mathematics education for young children: What it is and how to promote it. Social Policy Report, 22, 3-23.
High/Scope Educational Research Foundation. (2003). Preschool child observation record (2nd ed.). Ypsilanti, MI: High/Scope Press.
Hyson, M. (2008). Enthusiastic and engaged learners: Approaches to learning in the early childhood classroom. New York, NY: Teachers College Press.
Jencks, C., & Phillips, M. (1998). The Black-White test score gap: Why it persists and what can be done. Brookings Review, 16(2), 24-27.
Kilpatrick, J., Swafford, J., & Findell, B. (Eds.). (2001). Adding it up: Helping children learn mathematics. Washington, DC: National Academy Press.
Kolen, M. J., & Brennan, R. L. (2004). Test equating: Methods and practice (2nd ed.). New York, NY: Springer-Verlag.
Kramer, C. Y. (1956). Extension of multiple range tests to group means with unequal numbers of replications. Biometrics, 12, 307-310.
Lee, V. E., & Burkam, D. T. (2002). Inequality at the starting gate: Social background differences in achievement as children begin school. Washington, DC: Economic Policy Institute.
Littell, R. C., Milliken, G. A., Stroup, W. W., & Wolfinger, R. D. (1996). SAS systems for mixed models. Cary, NC: SAS Institute.
Littell, R. C., Milliken, G. A., Stroup, W. W., Wolfinger, R. D., & Schabenberger, O. (2006). SAS systems for mixed models (2nd ed.). Cary, NC: SAS Institute.
Luthar, S. S., Cicchetti, D., & Becker, B. (2000). The construct of resilience: A critical evaluation and guidelines for future work. Child Development, 71(3), 543-562.
McCall, R. B. (2009). Evidence-based programming in the context of practice and policy. Social Policy Report, 28, 3-19.
McCardle, P., Scarborough, H. S., & Catts, H. W. (2001). Predicting, explaining, and preventing children's reading difficulties. Learning Disabilities Research and Practice, 16(4), 230-239.
McDermott, P. A. (1995). Sex, race, class, and other demographics as explanations for children's ability and adjustment: A national appraisal. Journal of School Psychology, 33, 75-91.
McDermott, P. A., Angelo, L. E., Waterman, C., & Gross, K. S. (2006, April). Building IRT scales for maximum sensitivity to learning growth patterns over multiple short intervals in Head Start. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA.
McDermott, P. A., & Fantuzzo, J. W. (2000). Learning-in-time and teaching-to-learn: Study of the unique contributions of learning behaviors to school readiness (Head Start-University Partnership Grant No. 90-YD-0080). Washington, DC: U.S. Department of Health and Human Services, Administration on Children, Youth, and Families.
McDermott, P. A., Fantuzzo, J. W., Waterman, C., Angelo, L. E., Warley, H. P., Gadsden, V. L., & Zhang, X. (2009). Measuring preschool cognitive growth while it's still happening: The Learning Express. Journal of School Psychology, 47, 337-366.
Milliken, G. A., & Johnson, D. E. (2002). Analysis of messy data: Vol. 3. Analysis of covariance. New York, NY: Chapman & Hall/CRC.

National Early Literacy Panel. (2009). Developing early literacy: A scientific synthesis of early literacy development and implications for intervention. Washington, DC: National Institute for Literacy.
National Institute of Child Health and Human Development. (2000). Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction (NIH Publication No. 00-4769). Report of the National Reading Panel. Washington, DC: Government Printing Office.
Odom, S., Vitztum, J., Wolery, R., Lieber, J., Sandall, S., Hanson, M., . . . Horn, E. (2004). Preschool inclusion in the United States: A review of research from an ecological systems perspective. Journal of Research in Special Educational Needs, 4, 17-49.
Pennsylvania Department of Education and Department of Public Welfare. (2005). Pre-Kindergarten: Pennsylvania Learning Standards for Early Childhood. Harrisburg, PA: Author.
Preschool Curriculum Evaluation Research Consortium. (2008). Effects of preschool curriculum programs on school readiness (NCER 2008-2009). Washington, DC: Government Printing Office.
Raudenbush, S., Martinez, A., & Spybrook, J. (2007). Strategies for improving precision in group-randomized experiments. Educational Evaluation and Policy Analysis, 29, 5-29.
Raudenbush, S. W. (1997). Statistical analysis and optimal design for cluster randomized trials. Psychological Methods, 2, 173-185.
Raudenbush, S. W., & Xiao-Feng, L. (2001). Effects of study duration, frequency of observation, and sample size on power in studies of group differences in polynomial change. Psychological Methods, 6, 387-401.
Reid, D. K., Hresko, W. P., & Hammill, D. D. (2001). Test of Early Reading Ability-Third Edition, Form A. Austin, TX: PRO-ED.
Rosenthal, R. (1994). Parametric measures of effect size. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 231-244). New York, NY: Sage.
Schiller, P., Clements, D. H., Sarama, J., & Lara-Alecio, R. (2003). DLM Early Childhood Express. Columbus, OH: SRA/McGraw-Hill.
Schweinhart, L. (2004). The High/Scope Perry Preschool Study through age 40: Summary, conclusions, and frequently asked questions. Ypsilanti, MI: High/Scope Press.
Scott-Little, C., Kagan, S., & Frelow, V. (2006). Conceptualization of readiness and the content of early learning standards: The intersection of policy and research? Early Childhood Research Quarterly, 21, 153-173.
Searle, S. R., Speed, F. M., & Milliken, G. A. (1980). Population marginal means in the linear model: An alternative to least squares means. American Statistician, 34, 216-221.
Shonkoff, J., & Phillips, D. (2000). From neurons to neighborhoods: The science of early childhood development. Washington, DC: National Academy Press.
Shure, M. B., & DiGeronimo, T. F. (1996). Raising a thinking child: Help your young children to resolve everyday conflicts and get along with others. New York, NY: Pocket Books.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York, NY: Oxford University Press.
Skarakis-Doyle, E., & Dempsey, L. (2008). The detection and monitoring of comprehension errors by preschool children with and without language impairment. Journal of Speech, Language, and Hearing Research, 51(5), 1227-1243.



Snijders, T. A. B. (2005). Power and sample size in multilevel linear models. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in behavioral science (Vol. 3, pp. 1570-1573). New York, NY: Wiley.
Snow, C. (1991). The theoretical basis for relationships between language and literacy in development. Journal of Research in Childhood Education, 6, 5-10.
Snow, C., Burns, M. S., & Griffin, P. (1998). Preventing reading difficulties in young children. Washington, DC: National Academy Press.
Spillane, J. P. (2006). Distributed leadership. San Francisco, CA: Jossey-Bass.
U.S. Department of Education. (2007). Reading First and Early Reading First: Student achievement, teacher empowerment, national success. Retrieved from http://www.ed.gov/nclb/methods/reading
U.S. Department of Health and Human Services. (2003). Head Start National Reporting System: Direct child assessment fall and spring. Washington, DC: Administration for Children and Families, Administration on Children, Youth, and Families, and Head Start Bureau.
U.S. Department of Health and Human Services. (2005). Head Start impact study: First year findings. Washington, DC.
U.S. Department of Health and Human Services. (2006). Head Start child outcomes framework. Washington, DC: Administration for Children and Families, Administration on Children, Youth, and Families, and Head Start Bureau.
U.S. Department of Health and Human Services. (2008a). Head Start facts sheet. Washington, DC: Administration for Children and Families. Retrieved from http://www.acf.hhs.gov/programs/ohs/about/fy2008.html
U.S. Department of Health and Human Services. (2008b). Head Start family income guidelines for 2008 (ACF-IM-HS-08-05). Washington, DC: Administration for Children and Families, Office of Head Start.
Wasik, B. A., & Bond, M. A. (2001). Beyond the pages of a book: Interactive book reading and language development in preschool classrooms. Journal of Educational Psychology, 93, 243-250.
Weikart, D., & Schweinhart, L. (2005). The High/Scope curriculum for early childhood care and education. In J. L. Roopnarine & J. E. Johnson (Eds.), Approaches to early childhood education (4th ed., pp. 277-294). Upper Saddle River, NJ: Prentice Hall.
West, J., Denton, K., & Reaney, L. (2002). Children's reading and mathematics achievement in kindergarten and first grade (NCES 2002-125). Washington, DC: U.S. Department of Education, National Center for Education Statistics.
Willett, J. B., Singer, J. D., & Martin, N. C. (1998). The design and analysis of longitudinal studies of development and psychopathology in context: Statistical models and methodological recommendations. Development and Psychopathology, 10, 395-426.
Wolfinger, R. D. (1996). Heterogeneous variance: Covariance structures for repeated measures. Journal of Agricultural, Biological, and Environmental Statistics, 1, 205-230.
Wolfinger, R. D. (1993). Covariance structure selection in general mixed models. Communications in Statistics - Simulation and Computation, 22, 1079-1106.
Zigler, E., Gilliam, W., & Jones, S. (2006). A vision for universal preschool education. New York, NY: Cambridge University Press.

Manuscript received October 6, 2009
Final revision received August 11, 2010
Accepted August 24, 2010
