
TEACHERS COLLEGE, COLUMBIA UNIVERSITY

TEST SCORES AND TEACHER SELECTION

AN EMPIRICAL ANALYSIS FOR TURKEY

M. ALPER DINCER
4/26/2011

[In 2002 Turkey replaced a decentralized model of teacher hiring with a teacher selection model that operates through centralized testing. This study evaluates the impact of this new teacher selection policy on the mathematics and science test scores of 8th graders. The findings show that a 0.17 standard deviation increase in test scores can be attributed to the new teacher selection policy and that the estimated impact is much higher for below-median achievers and for students with female teachers. The findings also provide evidence that the new teacher selection policy assigns more teachers to relatively poor schools and classrooms.]
1. Introduction: test scores
The primary and secondary education systems in Turkey have been undergoing a restructuring since the late 1990s in response to rapid changes in the structure of its economy and the demographics of its young population. One of the main goals of this restructuring is to increase the quality of learning outcomes in Turkey (Aksit, 2007). It is therefore important to investigate empirically whether these reform efforts achieve the intended outcomes.

The Trends in International Mathematics and Science Study (TIMSS) and the Programme for International Student Assessment (PISA) periodically measure student achievement on an international scale and assemble information about students, their families and schools. With the help of these projects it is possible to track student achievement in participating countries and to make cross-country comparisons. These projects therefore provide the data needed to analyze the trend of learning outcomes in Turkey.

A representative sample of students in the 8th grade, which is the final grade of mandatory schooling in Turkey, participated in TIMSS 1999 and 2007. The average mathematics and science scores of students in Turkey in 1999 were 429 and 433, whereas the international average scores were 487 and 488, respectively. Similarly, the average mathematics and science scores of students in Turkey in 2007 were 433 and 454, whereas the international average scores were 488 and 500, respectively. Thus students in Turkey performed below the international average in both cycles. The following table gives the percentages of students reaching the TIMSS international benchmarks:


Table 1: Percentages of students reaching the TIMSS international benchmarks

Subject      Year  Advanced  High  Intermediate  Low
Mathematics  1999      1       6        20        38
Mathematics  2007      5      10        18        26
Science      1999      1       5        19        37
Science      2007      3      13        24        31

Source: (Martin et al., 2001a), (Martin et al., 2001b), (Martin, Mullis, Foy, & Olson, 2008a), (Martin, Mullis, Foy, & Olson, 2008b)

As a cautionary note, it should be stated that these percentages are not directly comparable between 1999 and 2007 for Turkey (Martin, et al., 2008a, 2008b). However, these figures present the same pattern in mathematics and science for the students in Turkey: there are more students at the advanced and high international benchmark levels and fewer students at the low international benchmark level (footnote 1).

PISA offers more definitive information about the trend of learning outcomes of students in Turkey. Similar to TIMSS, PISA measures the reading, mathematics and science test scores of a sample of students which is representative of the 15-year-old student population in each participating country. Turkey participated in PISA in 2003, 2006 and 2009; the trend in mathematics scores is comparable between 2003 and 2009 and the trend in science scores is comparable between 2006 and 2009 (OECD, 2010).

1
For the description of these benchmark proficiency levels please see (Martin, et al., 2008b) and (Martin, et al., 2008a).

According to PISA results the average mathematics score of 15-year-old students in Turkey increased by 22 points (more than 0.2 of a standard deviation) and the average science score of 15-year-old students in Turkey increased by 30 points (approximately 0.3 of a standard deviation) (OECD, 2010).
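The conversion from score points to standard deviation units quoted above can be reproduced with a short sketch. It assumes a scale standard deviation of 100 points, the value to which the PISA scale was originally standardized across OECD countries; the function name `points_to_sd` is introduced here for illustration only.

```python
# Convert score-point changes into standard-deviation units.
# ASSUMPTION: a scale standard deviation of 100 points (the value the PISA
# scale was originally standardized to across OECD countries).
PISA_SCALE_SD = 100.0


def points_to_sd(point_change, scale_sd=PISA_SCALE_SD):
    """Express a change in scale points as a fraction of one standard deviation."""
    return point_change / scale_sd


math_gain_sd = points_to_sd(22)     # mathematics, 2003 to 2009
science_gain_sd = points_to_sd(30)  # science, 2006 to 2009
print(f"mathematics: {math_gain_sd:.2f} SD, science: {science_gain_sd:.2f} SD")
```

Under this assumption the gains correspond to 0.22 and 0.30 of a standard deviation, in line with the magnitudes reported in the text.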

PISA data also show in which segment of the student body these improvements occurred. The percentage of students who fall below proficiency Level 2 decreased from 52 to 42 in mathematics, and in science the same percentage dropped from 47 to 30. On the other hand, the percentages of top performers did not show any increase or decrease between the respective periods (Figure 1).

These figures on the trend of average student achievement in mathematics and science in Turkey highlight at least three important facts. First, in the period after 1999, average achievement in mathematics and science has been increasing among students who are either in grade 8 or 15 years old. Second, this increase in average student achievement is not homogeneous; it is much stronger at the lower end of the achievement distribution in these subjects. Third, these improvements in average student achievement in Turkey are not due to inflation in test score scales: the average performance of students in Turkey is converging to international benchmarks as defined by either TIMSS or PISA. This convergence is fairly rapid, at least according to the measure PISA provides.

These facts immediately raise several questions: Are these changes in student achievement related to the restructuring of the education system in Turkey? If yes, which aspects of the reform initiative in Turkey led to higher learning outcomes in mathematics and science? Is it possible to identify the channels through which the policy intervention leads to increases in student achievement? This study attempts to offer some candidate answers to these questions.


Figure 1: PISA indicators

[Figure not reproduced. Panels, by country: score point change in mathematics performance between 2003 and 2009; score point change in science performance between 2006 and 2009; percentage of students below proficiency Level 2 in 2003 and 2009 and in 2006 and 2009.]

Source: (OECD, 2010)

2. Possible explanations

OECD (2010) stresses the role of the Basic Education Programme (BEP) in improving learning outcomes in Turkey. This World Bank-supported programme defined the framework for the education reform initiative in Turkey under Law No. 4306 (footnote 2). With this legislation the Ministry of National Education (MONE) aimed to expand primary school education, improve the quality of education and overall student outcomes, close the performance gap between boys and girls, provide equal opportunities, match the performance indicators of the European Union, develop school libraries, increase the efficiency of the education system, ensure that qualified personnel were employed, integrate information and communication technologies into the education system and create local learning centers, based in schools, that are open to everyone (footnote 3).

In response to these efforts the attendance rate in the eight-year primary education system soared from 85 to 100 percent. Similarly, the attendance rate in the pre-primary education system increased from 10 to 25 percent. These increases led to an expansion of the education system by 3.5 million pupils. These quantitative expansions were accompanied by qualitative improvements: during the same period average class size was reduced from approximately 40 to 30, conditions were improved in all rural schools and computer laboratories were established in every primary school. The cost of the BEP exceeded the equivalent of USD 11 billion (OECD, 2010).

2
http://mevzuat.meb.gov.tr/html/24.html
3
http://www.meb.gov.tr/Stats/Apk2002/502.htm
OECD (2010) as well as MONE also highlight the importance of the recent curriculum change in mathematics and science (TTKB, 2008). New curricula were launched in the 2006-2007 school year, starting from the 6th grade. Similarly, mathematics and language curricula were updated, and a new science curriculum came into force in the 2008-2009 school year, starting from the 9th grade. According to the Board of Education (TTKB) the aim of this change was to update the content of school education as well as to change the teaching philosophy and culture within schools.

Although the new curricula are the preferred explanation of MONE and some other research institutions in Turkey (footnote 4) for the increased learning outcomes, the connection is not clear and there are several problems with this specific explanation. First, given that TIMSS covers the period between 1999 and 2007, the new curricula cannot explain the improvement in learning outcomes which is evident in the TIMSS data. Second, average mathematics achievement in PISA is comparable only between 2003 and 2009, not between 2006 and 2009. Therefore the timing of the inception of the new curricula and the measured increase in average mathematics achievement in Turkey do not overlap. Third, the students who were subject to the curriculum change in science are 9th graders, who constitute only a portion of the PISA 2009 sample in Turkey; moreover, they experienced the new curriculum for only two semesters. It is not clear whether these students could drive a 0.3 standard deviation increase in average student achievement in science between 2006 and 2009.

4
http://www.setav.org/public/HaberDetay.aspx?Dil=tr&hid=57559&q=pisa-yi-dogru-okumak,
http://www.tepav.org.tr/upload/files/1292255907-
8.PISA_2009_Sonuclarina_Iliskin_Bir_Degerlendirme.pdf

As mentioned earlier, one of the targets of the BEP was to ensure that qualified personnel were employed. In line with this goal the teacher selection policy in Turkey was changed in 2002, which might have affected teacher quality in public primary and secondary institutions. In what follows I present a brief review of the literature on teacher quality and then describe the nature of this policy intervention.

3. Why is teacher quality important? How is teacher quality measured?

Learning outcomes are affected by many factors, including students' ability, potential, enthusiasm and behavior; school management, resources and atmosphere; curriculum and content; and teacher ability, preparation, attitudes and practices. Schools and classrooms are elaborate and dynamic environments, and identifying the education production function and the underlying technology continues to be a major challenge of educational research. This problem has many aspects, ranging from research design and methodology to data availability. Researchers are usually forced to use measures which are only partial indicators of learning, and in many cases it is not possible to apply the relevant methodologies. Therefore the results, interpretations and policy implications of such studies are regularly questioned.

Keeping this caveat in mind some general inferences can be drawn from the body of

research on the determinants of learning. First, out-of-school factors such as the ability,

motivation, parental characteristics, neighborhood and socioeconomic status are the

strongest predictors of learning and it is not easy to change these factors through policy

intervention in the short run.

Second, among the factors which are open to policy influence teacher quality is the most

important school input affecting learning. Santiago (2002), Schacter and Thum (2004)

and Eide, Goldhaber and Brewer (2004) present extensive and detailed reviews of this

line of research.
Differences in teacher quality may lead to substantial differences in student achievement. In order to understand the relative significance of teacher quality, Rivkin et al. (2005) analyze a unique matched panel data set from the UTD Texas Schools Project which allows them to identify teacher quality based on student performance. They conclude that the contribution of a ten-student reduction in class size to learning is smaller than that of a one standard deviation increase in teacher quality.

In another study, Rockoff (2004) analyzes a 10-year panel of test scores and teacher assignments to understand how much teachers affect learning. The panel structure allows him to focus on differences in the performance of the same student with different teachers and to separate the variation in teacher quality from the variation in students' characteristics. His analysis shows that variation in teacher quality explains 23 percent of the variation in test scores which is potentially open to policy influence.

Third, teacher characteristics such as qualifications, teaching experience and teacher education do not exhibit consistently clear and strong effects on student achievement. Hanushek (2002, 2003) reviews the studies focusing on the United States and concludes that overall there are no systematic effects of characteristics such as teacher education or teacher experience. Thus it is challenging to identify the components which characterize the quality of teachers.

In the same reviews Hanushek (2002, 2003) also highlights that there is convincingly strong support for the effects of teachers' academic ability as measured by teacher test scores. In line with Hanushek's inference, the National Council on Teacher Quality (NCTQ) (2004) reports that a teacher's academic aptitude has a clear, measurable effect on learning and that this finding is robust and consistent. The same report emphasizes that a teacher's literacy ability as measured by standardized tests has a greater impact on learning than any other measurable teacher characteristic. Thus a broad conclusion emerges from research connecting teacher quality to teachers' test scores: teachers' test scores may be a good measure of teacher quality if these tests measure academic aptitude.

Interestingly, there are some studies from Turkey which are in line with these findings. Several studies which analyze PISA 2006 data for Turkey show that students who were taught by teachers who passed rigorous testing procedures have higher test scores (Alacaci & Erbas, 2010; Dincer & Uysal, 2010).

The literature leads to two main conclusions in this respect. First, teacher quality is an essential ingredient of education production and it is open to policy influence. Second, screening teachers with tests which measure academic ability may lead to an increase in teacher quality.

4. Basic characteristics of the teacher labor market in Turkey

The main characteristic of the teacher labor market in Turkey is the excess supply of teachers. As of 2010, approximately 327 thousand teachers are waiting to be employed by the public sector and the number of applicants is three to four times higher than the number of open teaching positions (Figure 2). This reserve of teachers awaiting appointment is a significant population given that the number of teachers employed in the public sector is 680 thousand. MONE also predicts that the optimal number of employed teachers in the public education system is 717 thousand (footnote 5). Under these circumstances the gap between the supply of and the demand for teachers widens cumulatively.

5
http://icden.meb.gov.tr/digeryaziler/MEB_ic_denetim_faaliyet_raporu_2009.pdf
As of 2010, MONE announced 782 open positions for mathematics teachers and received 2798 applications. For science these figures are 861 and 3546 (footnote 6), respectively. The gap is more or less evident in every subject; thus the excess supply is not specific to particular subjects.
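As a quick check on the "three to four times" figure, the ratio of applications to open positions can be computed directly from the numbers quoted above. This is only an illustrative sketch; the dictionary keys and the function name are chosen here and do not come from the source data.

```python
# Open positions and applications in 2010, as quoted in the text.
open_positions = {"mathematics": 782, "science": 861}
applications = {"mathematics": 2798, "science": 3546}


def applicants_per_position(subject):
    """Number of applications received for each open teaching position."""
    return applications[subject] / open_positions[subject]


ratios = {subject: applicants_per_position(subject) for subject in open_positions}
for subject, ratio in ratios.items():
    print(f"{subject}: {ratio:.2f} applications per open position")
```

Both ratios fall roughly between 3.5 and 4.2 applications per open position.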

Figure 2: The number of open positions and applicants by subject

[Bar chart not reproduced: number of open positions and number of applicants for Math, Science and Tech, Physics, Biology and Chemistry; vertical axis from 0 to 4000.]

Source: Author's own calculations from http://personel.meb.gov.tr/ana_sayfa.asp

A candidate explanation for this excess supply might be very attractive teacher salaries. However, teacher salaries in Turkey are not attractive at all. In the public sector the starting salary of a teacher is around USD 14,000 and it does not improve much with experience (Figure 3). The salary of a teacher with 15 years of experience is around USD 16,000 (OECD, 2009).

Dolton and Gutierrez (2011) present a cross-country analysis of teacher pay and

performance by taking the relative earning distribution in each country into account.

6
http://personel.meb.gov.tr/ana_sayfa.asp
Their analysis confirms that the teacher salaries are not especially attractive in Turkey

and the salary-experience profile is flat (Figure 4).

Figure 3: Ratio of salary after 15 years of experience to GDP per capita

[Bar chart not reproduced: ratio by country for OECD countries, including Turkey and the OECD average; vertical axis from 0.5 to 2.5.]

Source: (OECD, 2009)

Figure 4: Average teacher wage-experience profile in Turkey

Source: (Dolton & Marcenaro Gutierrez, 2011)

Therefore the starting salaries and the expectation of relatively higher salaries in the teaching profession cannot explain the excess supply in the teacher labor market in Turkey.

Another important feature of the teacher labor market is that all public servants in Turkey are protected by law and by unions, so that job separation is a very unlikely event. As a result the teaching profession offers substantial job security and, given the presence of very high chronic unemployment rates, individuals value job security heavily. One study (Caner & Okten, 2010) analyzes the college major choice decision in a risk and return framework using university entrance exam data from Turkey and shows that individuals are very sensitive to risk in their career choices.

It should also be noted that total enrollment in education faculties in Turkey has increased steadily over time: it rose from 33 thousand in 2007 to 45 thousand in 2008 and 54 thousand in 2009, while MONE expands the teaching force by approximately 40 thousand each year (footnote 7).

Thus a combination of an intense demand for job security and increased quotas of education faculties may provide a more sensible explanation for the excess supply in the teacher labor market in Turkey.

5. Legal framework of teacher selection in Turkey

7
http://www.ogretmenportali.net/HaberGoster/228716e4-64bf-4b55-bb17-
fc0ee89baf38/atanmayan-ogretmen-ordusu-buyuyor.aspx

There are three main legal sources which regulate the hiring of teachers in Turkey. First, teachers working in the public sector are subject to Law No. 657. This law has defined the rights as well as the legal obligations of public servants since 1965. Second, the regulation of the tests concerning the assignments of public servant candidates has described the testing procedure for public servant posts since 2002. Third, MONE's regulation of teacher assignment and replacement explains how the testing procedure and test results apply to the teacher selection process. The current version of this regulation was enacted in 2010 and it has been changed many times in the past according to the needs of MONE.

The regulation of the tests concerning the assignments of public servant candidates forms a turning point in teacher selection, because it caused a radical change in teacher selection policy in Turkey.

In the teacher selection system before the enactment of this regulation, i.e. prior to 2002, any eligible teacher candidate was able to apply to any available position announced by MONE. The applications were processed in the provincial offices of MONE and the final decision was then made by the headquarters of MONE in the capital, Ankara (Figure 5).

Figure 5: A presentation of teacher selection system before 2002

This system was a cause of concern in MONE as well as in the State Planning Organization (SPO) (SPO, 1989). One of the main issues of the pre-2002 system highlighted by MONE was a constant imbalance of the teacher population across regions. According to the Research and Development department of MONE, one preliminary report of the 1993 National Education Assembly stressed that more than 10 percent of teachers employed by MONE in urban areas did not teach a single class. Another issue documented in MONE's records on the pre-2002 system was that political pressures and interventions damaged the principles of fairness and equality in teacher employment and caused unrest among teachers (EARGED, 1995). Indeed, it was publicly well known that having connections in provincial offices as well as in the capital was essential to get hired. Thus nepotism was a general worry about this selection process.

Following the enactment of the above-mentioned testing regulation, the Center of Student Selection and Placement (OSYM) launched a central examination known as the Public Servant Selection Examination (KPSS). This exam has two sessions. In the first session teacher candidates have to answer 120 multiple choice questions on Turkish, Mathematics, History, Citizenship, General Culture and Geography in 180 minutes. In the second session they have to answer 120 multiple choice questions on educational psychology, educational programs and teaching, and educational guidance in 180 minutes. Applicants are then assigned to teaching positions centrally by MONE according to their test scores in the central examination and their ranked list of preferred teaching positions (Figure 6). OSYM conducts the exam annually, and a teacher candidate who fails to be placed in a teaching position has to take the exam again in the following year.
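The central assignment step described above can be illustrated with a minimal sketch. The text does not spell out MONE's exact placement rules, so the code assumes a simple score-ordered procedure in which the highest-scoring candidate chooses first and each candidate receives the highest-ranked preferred position that still has an open post. The function `assign_positions`, the tie-breaking rule and all candidate data are hypothetical.

```python
def assign_positions(candidates, capacities):
    """Assign candidates to positions in descending order of test score.

    candidates: list of (name, score, ranked_preferences) tuples
    capacities: dict mapping position name -> number of open posts
    Returns a dict mapping each candidate name to a position, or None if unplaced.
    """
    remaining = dict(capacities)  # copy so the caller's dict is left untouched
    assignment = {}
    # Highest score chooses first; ties are broken by name (an assumption).
    for name, _score, preferences in sorted(candidates, key=lambda c: (-c[1], c[0])):
        assignment[name] = None  # stays None unless a preferred post is still open
        for position in preferences:
            if remaining.get(position, 0) > 0:
                remaining[position] -= 1
                assignment[name] = position
                break
    return assignment


# Hypothetical candidates and posts, for illustration only.
candidates = [
    ("Ayse", 88.5, ["Ankara", "Izmir"]),
    ("Mehmet", 91.0, ["Ankara", "Van"]),
    ("Zeynep", 79.0, ["Ankara", "Izmir"]),
    ("Can", 75.5, ["Ankara"]),
]
capacities = {"Ankara": 1, "Izmir": 1, "Van": 1}
result = assign_positions(candidates, capacities)
print(result)
```

In this toy example the two lowest-scoring candidates remain unplaced even though a post in Van is still open, because neither listed it; under the system described in the text such candidates would have to sit the exam again the following year.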

Figure 6: A hypothetical presentation of teacher selection after 2002

In this teacher selection system it is not possible to game the hiring process, nor is it possible to exploit nepotism in order to obtain a teaching position. Thus it is reasonable to claim that the central examination and the allocation of teaching positions based on test scores address the problem of lack of fairness. However, two questions remain to be answered: Does the new system ensure that qualified teachers are employed? Does this system have an impact on the regional imbalance of the teacher population? The first question is critical because it concerns one of the main goals of the BEP. The second question is critical because it concerns a chronic problem of the education system (EARGED, 1995; SPO, 1989).

6. Data

In order to answer these research questions I employed the TIMSS 1999 (footnote 8) and TIMSS 2007 (footnote 9) data sets for Turkey. These data sets have several important qualities which make them well suited to the questions of interest.

First, as mentioned earlier, these projects assess a representative sample of 8th graders in the participating countries. The 8th grade is the final grade of primary education in Turkey and thus students in the sample should have spent at least a couple of years in their current institutions.

Second, it is possible to link teachers to students in the same classroom which makes

these data sets especially attractive for this analysis.

8
http://timss.bc.edu/timss1999.html
9
http://timss.bc.edu/timss2007/index.html

Third, the TIMSS project conducts four questionnaires, i.e. student, school, mathematics teacher and science teacher questionnaires. The student and teacher questionnaires contain extensive information about demographic and socioeconomic characteristics of students and teachers. In addition, the school questionnaire contains information on school location, resources and governance.

Fourth, the information collected in 1999 and 2007 is comparable to a certain extent. The questionnaires in 1999 and 2007 do not overlap extensively; however, most of the essential information is available in both data sets.

Fifth and most importantly, the policy change which is evaluated in this study falls between 1999 and 2007, the years in which Turkey participated in TIMSS. This gives me a reasonable number of observations that are subject to the policy change, which was launched in 2002.

Lastly, the teacher experience is reported in years such as 1, 2, 3 etc. but not in year

categories such as 0-4, 5-8 etc. This distinction is crucial for this analysis because the

data on teacher experience in TIMSS allow me to define the treatment and control groups

with respect to the inception date of the policy change.

7. Methodology and empirical analysis

For the empirical analysis, I first merged the student, school and teacher data sets for 1999 and 2007 and combined the 1999 and 2007 TIMSS data sets. Then I defined the treatment group as the students whose teachers have four or fewer years of experience. This assumption is necessary because I do not observe whether the teachers were selected via the central examination or not. I claim that this definition of the treatment group approximates the ideal case.

The justification of this assumption rests on the timing of the TIMSS administration and the central examination. The first central examination in Turkey was conducted in July 2002; OSYM announced the test scores in August 2002 (footnote 10) and MONE distributed the teaching posts based on the announced test scores in September, October and November 2002 (footnote 11). The TIMSS 2007 assessment in Turkey, on the other hand, was conducted in April, May and June 2007 (Olson, Martin, Mullis, & Arora, 2008). Thus a teacher who was selected through the first central examination would have been assigned to a post in September 2002 at the earliest and would have answered the TIMSS teacher questionnaire in June 2007 at the latest. In this hypothetical example the teacher would have less than five years of experience at the time of the TIMSS assessment. Therefore the treatment group is assumed to be as defined above.
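The timing argument can be checked with a short date calculation. The sketch below takes the months from the text and assumes specific days of the month for illustration; the helper `completed_years` and the function `is_treated` are names introduced here, not part of the original analysis.

```python
from datetime import date


def completed_years(start, end):
    """Whole years elapsed between two dates."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1  # the anniversary in the final year has not been reached yet
    return years


# Months are taken from the text; the days of the month are assumptions.
earliest_assignment = date(2002, 9, 1)     # first postings after the 2002 exam
latest_questionnaire = date(2007, 6, 30)   # end of the TIMSS 2007 administration

max_experience = completed_years(earliest_assignment, latest_questionnaire)

TREATMENT_CUTOFF = 4  # "four or fewer years of experience"


def is_treated(years_of_experience):
    """Treatment indicator: teacher plausibly hired after the 2002 policy change."""
    return years_of_experience <= TREATMENT_CUTOFF


print(max_experience, is_treated(max_experience))
```

Under these assumed dates, a teacher hired through the first central examination has at most four completed years of experience when answering the questionnaire, which is why the cutoff is set at four years.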

However, this is an imperfect measure of selection via central examination. First, teacher turnover leads to measurement error, because it is possible to quit and later return to teaching; this may be an issue especially for female teachers, who may substitute child raising for teaching for a couple of years. Second, OSYM conducted another central examination, known as the Central Elimination Examination for Institutions (KMS), in 2001 (footnote 12). KMS was different from KPSS and it is not clear how many teaching posts were distributed based on KMS scores, or whether KMS scores were the sole determinant of the teacher assignments. This issue may also lead to measurement error.

10
http://www.osym.gov.tr/belge/1-6128/2002-sinavlari.html
11
http://personel.meb.gov.tr/sayfa_goster.asp?ID=207
12
http://www.osym.gov.tr/belge/1-12485/2001-sinavlari.html

Keeping these shortcomings in mind, I compared the difference in average student achievement between treatment and control groups in 1999 and 2007 with a basic differences-in-differences approach. The main assumption of this approach is that the change in mean test scores that the control group experiences over time reflects the same change that the treatment group would have experienced had they not been exposed to the treatment. Another important assumption of the differences-in-differences approach is that unobserved characteristics have the same distribution across time points and across treatment groups. I will discuss the validity of these assumptions in the subsequent sections.
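The basic differences-in-differences comparison described above can be sketched on simulated data. The example is not the specification estimated in this study: it omits control variables, survey weights and clustering, and every number in it (sample size, effect sizes, noise) is made up. It only illustrates that the difference of group-mean changes equals the coefficient on the interaction term in an ordinary least squares regression with cycle and treatment indicators.

```python
import numpy as np

# Simulated data; all values are hypothetical and for illustration only.
rng = np.random.default_rng(0)
n = 4000
post = rng.integers(0, 2, size=n)      # 1 = later cycle, 0 = earlier cycle
treated = rng.integers(0, 2, size=n)   # 1 = teacher with four or fewer years of experience
assumed_effect = 0.2                   # made-up treatment effect in standard deviation units
score = (
    0.10 * post
    - 0.30 * treated
    + assumed_effect * post * treated
    + rng.normal(0.0, 1.0, size=n)
)


def group_mean(t, p):
    """Mean score in the cell defined by treatment status t and cycle p."""
    return score[(treated == t) & (post == p)].mean()


# Change over time in the treatment group minus change in the control group.
did_means = (group_mean(1, 1) - group_mean(1, 0)) - (group_mean(0, 1) - group_mean(0, 0))

# The same estimate as the interaction coefficient in an OLS regression.
X = np.column_stack([np.ones(n), post, treated, post * treated])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
did_ols = coef[3]

print(f"difference of mean changes: {did_means:.4f}")
print(f"OLS interaction coefficient: {did_ols:.4f}")
```

Because the regression is saturated in the two indicators, the two estimates coincide up to floating-point error; adding control variables, as in the models of Table 2, would break this exact equality.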

For the differences-in-differences analysis I have estimated the following regression

models:

Table 2: Difference-in-Differences estimations


In these regression models the dependent variable is either the mathematics or the science test score. It should be mentioned, however, that TIMSS does not provide point estimates of mathematics and science test scores; instead it gives five plausible values of mathematics and science ability. For the sake of simplicity I averaged the five plausible values for each subject and used the average as the measure of the subject test score. The TIMSS 2007 Technical Report highlights that taking the average of the plausible values will not yield suitable estimates of individual student scores (Olson, et al., 2008). In this analysis I therefore repeated some of the estimations with the plausible values themselves and compared the point estimates and the standard errors of the population parameter of interest, i.e. the differences-in-differences coefficient. In all cases the point estimates were very close to each other and the standard errors were slightly larger, which did not affect the statistical significance levels.
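One common way to work with plausible values, rather than their average, is to run the estimation once per plausible value and pool the results with Rubin's combining rules. The sketch below assumes this procedure; the source does not state exactly how the repeated estimations were combined, and the coefficient estimates and variances used here are hypothetical.

```python
import numpy as np


def combine_plausible_values(estimates, sampling_variances):
    """Pool per-plausible-value results with Rubin's combining rules.

    estimates: one coefficient estimate per plausible value
    sampling_variances: the squared standard error of each estimate
    Returns the pooled estimate and its standard error.
    """
    estimates = np.asarray(estimates, dtype=float)
    sampling_variances = np.asarray(sampling_variances, dtype=float)
    m = len(estimates)
    pooled_estimate = estimates.mean()
    within = sampling_variances.mean()   # average sampling variance
    between = estimates.var(ddof=1)      # variance across plausible values
    total_variance = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, np.sqrt(total_variance)


# Hypothetical estimates from five plausible values, each with standard error 0.05.
pv_estimates = [0.16, 0.18, 0.17, 0.19, 0.15]
pv_variances = [0.05 ** 2] * 5
estimate, std_error = combine_plausible_values(pv_estimates, pv_variances)
print(f"pooled estimate: {estimate:.3f}, standard error: {std_error:.4f}")
```

The pooled standard error exceeds the per-value standard error of 0.05 because it adds the variation across plausible values, which matches the pattern of slightly larger standard errors noted in the text.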

In these regression models a cycle indicator stands for the TIMSS cycle (1999 or 2007) and the treatment variable equals 1 if the subject teacher has four or fewer years of experience. Observed information regarding teachers, students, classes and schools enters the regression models as control variables (Table 3).

The list of control variables was constructed within the limitations of the data. The variables available in the TIMSS 1999 and 2007 data sets do not overlap to a significant degree, and in some cases the necessary variables are available in both data sets but the scales of measurement are different. This was a serious issue for the school location variable, for example. All in all, I experimented with every variable which is available in both data sets. The number of missing observations also partially affected the list of control variables.
Table 3: List of Control Variables

Teacher characteristics   Class characteristics                    Student characteristics    School resources
Sex                       Diversity in academic ability            Sex                        An indicator for school resources
Age                       Diversity in socioeconomic background    Age                        Location
Subject degree            Presence of disruptive students          Parental education
Experience                Class size                               # books at home
                          Instructional time                       Computer at home
                                                                   Language spoken at home

Following the difference-in-differences analysis with mathematics and science achievement I
utilized another aspect of the data structure: the treatment variable varies by subject. This
means that the same student may have a mathematics teacher who has four or fewer years of
experience whereas his or her science teacher may have more than four years of experience (or
vice versa). Given that both the mathematics and science test scores are observed for each
student, this structure allows me to employ individual fixed effects. For that purpose I pooled
the mathematics and science data sets and incorporated student fixed effects into the regression
models defined in Table 2. This approach allowed me to relax one of the assumptions associated
with the difference-in-differences approach. After adding student fixed effects to the model I
do not have to assume that unobserved student and school characteristics have the same
distribution across time points and across treatment groups. However, I still have to assume
that unobserved class characteristics have the same distribution across time points and across
treatment groups (Table 4). Lastly, it should be mentioned that other studies employ very
similar identification strategies, such as Lavy (2010). In that study the researcher
establishes a causal link between instructional time and student achievement by making use of
the within-individual variation in test scores and the within-subject variation in
instructional time. In essence the identification strategy I employ is identical to the
approach used by Lavy (2010), with the one exception that I embed it in a
difference-in-differences framework (Table 4).

Table 4: Fixed effects and difference-in-differences estimations

Although this identification strategy allows me to relax some of the assumptions of the
difference-in-differences approach, it has its own shortcomings. First, it automatically reduces
the sample size, and this problem becomes more pronounced in sub-group analysis. Second, it is
not possible to decompose the effect into learning gains in mathematics and learning gains in
science.
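The logic of combining student fixed effects with difference-in-differences can be illustrated with a small sketch. It is a simplified stand-in for the regressions, not the estimation code behind Table 6: it takes each student's mathematics-minus-science score gap, which removes anything constant within the student, and then compares how the gap associated with having a less experienced mathematics teacher changes between the two cycles. The function names, the data and the restriction to students whose exposure gap is 0 or 1 are all assumptions made for the illustration.

```python
from statistics import mean


def within_student_did(records):
    """Difference-in-differences of within-student math-minus-science gaps.

    Each record is a dict with keys math, science, treat_math,
    treat_science and post (1 for the later cycle).
    Simplifying assumption: only students whose exposure gap
    (treat_math - treat_science) is 0 or 1 enter the comparison.
    """
    cells = {}
    for r in records:
        gap = r["math"] - r["science"]   # the student fixed effect cancels here
        exposure = r["treat_math"] - r["treat_science"]
        cells.setdefault((r["post"], exposure), []).append(gap)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


def make_student(ability, treat_math, treat_science, post, effect=15.0):
    """Hypothetical noiseless scores: ability, a subject shift, teacher terms."""
    math_score = ability + 10.0 + 5.0 * treat_math + effect * treat_math * post
    science_score = ability + 5.0 * treat_science + effect * treat_science * post
    return {"math": math_score, "science": science_score,
            "treat_math": treat_math, "treat_science": treat_science,
            "post": post}


# Ability differs sharply between treated and untreated students on purpose
students = [
    make_student(400.0, 1, 0, post=0),
    make_student(520.0, 0, 0, post=0),
    make_student(380.0, 1, 0, post=1),
    make_student(540.0, 0, 0, post=1),
]
estimate = within_student_did(students)
print(estimate)  # 15.0
```

Although the treated students in this toy data have much lower ability, the estimate equals the built-in effect of 15 points, because ability drops out of the within-student gap. Characteristics that differ between a student's mathematics and science classes do not drop out, which is why the class-level assumption discussed above remains.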

8. Findings

The following table gives the estimated values of the coefficient of interest under the
different specifications described in Table 2 and also presents sub-group estimates of this
coefficient. The analysis was conducted separately for mathematics and science test scores
(Table 5).


Table 5: Estimation results of difference-in-differences

Mathematics
          Whole sample              Female teacher sample     Male teacher sample       Below median achievers    Above median achievers
          Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R²
Model 1   -17.86    [13.94]  0.04   -23.53    [22.02]  0.07   -15.50    [17.44]  0.03   -2.38     [6.32]   0.03   -11.90    [8.50]   0.05
Model 2   -28.15*   [16.46]  0.08   -25.76    [20.89]  0.19   -20.10    [21.03]  0.06   -4.34     [7.47]   0.04   -18.43*   [10.30]  0.08
Model 3   -27.12    [17.26]  0.10   -10.76    [21.23]  0.26   -28.35    [24.38]  0.09   -2.80     [7.12]   0.05   -16.59    [11.18]  0.10
Model 4   -14.19    [14.02]  0.27   -7.01     [18.09]  0.38   -10.20    [20.20]  0.24   -2.40     [6.82]   0.10   -8.38     [9.29]   0.21
Model 5   -0.61     [13.56]  0.30   6.45      [17.84]  0.40   0.55      [19.41]  0.26   0.22      [6.66]   0.11   -0.26     [9.14]   0.23
Obs.      6,750                     2,757                     3,993                     3,354                     3,396

Science
          Whole sample              Female teacher sample     Male teacher sample       Below median achievers    Above median achievers
          Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R²
Model 1   -15.49    [12.30]  0.07   -42.53**  [16.51]  0.10   4.36      [17.24]  0.05   -3.41     [6.31]   0.01   -8.46     [6.25]   0.12
Model 2   -17.42    [12.34]  0.09   -28.14    [17.63]  0.12   -2.40     [17.70]  0.08   -5.99     [5.76]   0.03   -5.87     [6.94]   0.14
Model 3   -17.85    [13.71]  0.14   -15.22    [21.73]  0.18   -12.49    [17.03]  0.17   -4.83     [6.05]   0.04   -4.79     [7.31]   0.18
Model 4   -6.89     [10.63]  0.31   -4.98     [17.31]  0.35   -5.00     [13.05]  0.31   -2.70     [5.23]   0.10   -2.21     [6.52]   0.27
Model 5   -6.29     [10.56]  0.31   -11.90    [18.50]  0.36   -2.73     [12.71]  0.31   -4.15     [5.32]   0.11   -1.38     [6.53]   0.27
Obs.      7,085                     3,131                     3,954                     3,536                     3,549

Robust standard errors in brackets, clustered at the class level. *** p<0.01, ** p<0.05, * p<0.1
The results in Table 5 draw attention to several important issues. First, the standard errors
are very large. Among the 50 point estimates of the treatment effect only three are
statistically different from zero at the 10 percent significance level or better. Second,
almost all of the point estimates have a negative sign. Third, the point estimates are not
stable. In Model 1, without any control variables, the point estimates are negative and large;
however, the addition of teacher, class, student and school characteristics to the regression
model pulls this negative treatment effect towards zero. In some sub-groups the addition of
these control variables also leads to sign changes. A closer look at the female teacher and
male teacher sub-groups shows that this problem is much more severe in the female teacher
sub-group. All in all, the difference-in-differences analysis does not provide any information
about the possible impact of the treatment on student learning. Because of the very large
standard errors the treatment effect may be negative, zero or positive. However, the analysis
also shows that observed class, student and school characteristics do not have the same
distribution across time points and across treatment groups, given that the point estimates are
unstable and change signs. It is therefore very likely that unobserved class, student and school
characteristics do not have the same distribution across time points and across treatment
groups either, which violates the assumptions underlying the difference-in-differences
approach. This may also be a sign of differential assignment of teachers with four or fewer
years of experience to classrooms between 1999 and 2007. In what follows I incorporate student
fixed effects into the regression models in order to take into account factors at the student
and school levels (Table 6). However, teacher and class characteristics vary between the
subjects; thus the regressions contain controls for observed teacher and class characteristics.
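The remark that the treatment effect may be negative, zero or positive can be made concrete with an approximate confidence interval. The sketch below uses the whole-sample Model 5 mathematics estimate from Table 5 and assumes a normal approximation with a critical value of 1.96. That approximation is an assumption of this illustration, since the reference distribution for class-clustered standard errors is not stated in the text, and the function name is hypothetical.

```python
def normal_confidence_interval(coef, std_err, z=1.96):
    """Approximate 95% interval: coef plus or minus z times the standard error."""
    half_width = z * std_err
    return coef - half_width, coef + half_width


# Whole sample, mathematics, Model 5 in Table 5: -0.61 with standard error 13.56
low, high = normal_confidence_interval(-0.61, 13.56)
print(round(low, 2), round(high, 2))  # -27.19 25.97
```

Under this approximation the interval runs from roughly -27 to +26 score points, so it covers sizeable negative and sizeable positive effects alike.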


Table 6: Estimation results of student fixed effects and difference-in-differences

Mathematics & science scores combined
          Whole sample              Female teacher sample     Male teacher sample       Below median achievers    Above median achievers
          Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R²
Model 1   3.68      [10.62]  0.01   15.10     [12.92]  0.04   7.11      [10.85]  0.03   3.94      [12.59]  0.02   2.94      [8.44]   0.00
Model 2   4.28      [9.36]   0.09   16.07     [12.61]  0.10   8.59      [13.17]  0.14   -0.60     [12.49]  0.20   2.42      [8.13]   0.03
Model 3   14.77**   [6.89]   0.22   41.56**   [18.32]  0.30   17.63     [14.13]  0.23   20.67***  [5.32]   0.52   6.17      [9.91]   0.06
Obs.      4619                      612                       1166                      2959                      1675

Robust standard errors in brackets, clustered at the class level. *** p<0.01, ** p<0.05, * p<0.1

Table 7: Alternative treatment definitions

Math & science scores combined – Model 5
            Whole sample              Female sample             Male sample               Below median achievers    Above median achievers
            Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R² Coef      Std Err  Adj R²
5-8 years   -4.63     [8.18]   0.22   6.64      [19.58]  0.30   0.55      [23.93]  0.23   -14.89**  [6.64]   0.51   -0.29     [8.58]   0.06
9-20 years  -0.55     [7.01]   0.23   -16.63    [16.99]  0.31   1.53      [10.58]  0.27   -13.69*   [8.04]   0.53   6.92      [5.85]   0.06
20+ years   -7.34     [8.32]   0.22   4.23      [17.16]  0.30   -33.28**  [14.63]  0.24   12.04     [9.98]   0.51   -7.36     [6.14]   0.06

Robust standard errors in brackets, clustered at the class level. *** p<0.01, ** p<0.05, * p<0.1

The results in Table 6 stand in contrast to the results in Table 5. Generally the standard
errors are smaller; more interestingly, with one exception all of the point estimates of the
treatment effect are positive. The point estimates are not sensitive to the addition of teacher
characteristics to the regression; however, they are very sensitive to the addition of class
characteristics. According to Model 3, i.e. after controlling for teacher and class
characteristics, the impact of the treatment is estimated precisely for the whole, female
teacher and below median achievers samples.

The standard deviation of the dependent variable in the whole sample is 89. Thus the impact of
the 2002 policy change on student achievement is around 0.17 standard deviations. However, the
sub-group analysis shows that this impact is channeled mostly through female teachers. The
estimated treatment effect in the female teacher sample is 2.8 times as large as in the whole
sample, whereas in the male teacher sample the impact is not precisely estimated. Another
important inference that can be drawn from Table 6 is that below median achievers benefit more
from the new teacher selection than above median achievers. Thus the treatment effect is
concentrated on below median achievers. Lastly, the sensitivity of the point estimates to the
addition of class characteristics is in line with the findings in Table 5. This may be due to
the within-school (between-classroom) differential assignment of teachers with four or fewer
years of experience to classrooms between 1999 and 2007.
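The two figures quoted above follow directly from the Model 3 coefficients in Table 6, as the short check below shows. The function name is hypothetical; the inputs are the whole-sample coefficient (14.77), the female teacher sample coefficient (41.56) and the whole-sample standard deviation of the dependent variable (89).

```python
def in_sd_units(coef, outcome_sd):
    """Express a coefficient in standard deviations of the outcome."""
    return coef / outcome_sd


whole_effect = in_sd_units(14.77, 89.0)   # whole sample, Model 3
female_to_whole = 41.56 / 14.77           # female teacher sample vs whole sample

print(round(whole_effect, 2))     # 0.17
print(round(female_to_whole, 1))  # 2.8
```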

The findings in Table 6 provide evidence in favor of a positive and moderately large treatment
effect. It may therefore be claimed that, within the Turkish context, teacher selection with
centralized testing may lead to higher learning outcomes than a decentralized recruitment
system. However, other underlying factors could potentially explain the findings in Table 6.
For example, there may be a secular increase in the quality of education faculties in Turkey.
If this is the case, the estimated impact may be due to the quality increase in education
faculties instead of the new teacher selection policy. Along the same lines, more and more
high-ability high school students may be opting for education faculties, so that the ability
distribution of the pool of teacher candidates shifts over time. However, if these arguments
were true I would expect to detect positive estimates of the treatment effect for other
segments of teachers as well. In order to test these arguments I divided the sample of teachers
who have more than four years of experience into three parts such that the sizes of the
subsamples are equal. These segments are 5-8, 9-20 and 20+ years of experience. Each category
in turn defined an alternative treatment variable, and I repeated the individual fixed effects
exercise with the full model, which includes teacher and class characteristics as controls. In
Table 7 none of the point estimates is both positive and statistically significant;
additionally, the statistically insignificant point estimates are small compared with the
positive point estimates in Table 6. Thus I failed to detect any positive treatment effect with
the alternative treatment definitions. It is therefore more likely that the estimated impact
is due to the new selection policy rather than a secular increase in the quality of education
faculties or of the student body.

9. Conclusion

These findings are suggestive in nature and are not suitable for making causal inferences.
Combining individual fixed effects with difference-in-differences allows for a relatively
precise estimate of the treatment effect. The remaining problem with this approach is the lack
of a complete set of classroom characteristics. The point estimates are sensitive to the
classroom characteristics, and unobserved classroom characteristics may bias the estimate.
However, the analysis as a whole indicates that the possible direction of this bias is
downward.

The findings also provide a reasonable explanation for the trend in TIMSS and PISA results.
First, since the analyzed period precedes the curriculum reform in Turkey, the findings cannot
be attributed to the curriculum reform. Second, the findings show a concentrated impact on
below median achievers and no impact on above median achievers. This is in line with what we
observe across PISA cycles for students in Turkey.

The findings are also in accordance with the literature on teacher quality. As mentioned
earlier, a teacher's academic ability is one of the most robust indicators of a teacher's
effectiveness (Hanushek, 2002, 2003; NCTQ, 2004). Basturk (2008) shows that test scores in the
college entrance exam are highly predictive of the KPSS test score. It is therefore reasonable
to interpret success in KPSS as an indication of higher academic ability.

Lastly, Table 8 depicts the degree of differential assignment of teachers to schools and
classrooms. The table can be interpreted as reflecting MONE's attempts to ensure a more
balanced distribution of teachers across resource-rich and resource-poor regions. As mentioned
earlier, MONE as well as SPO were concerned about the imbalance of the teaching force across
regions.

After the introduction of the central examination the teaching force became much more female,
and the new teachers were assigned to classrooms that were much more diverse in terms of
socioeconomic background and had fewer resources for instruction. The students in these
classrooms were more likely to speak Turkish at home only sometimes (rather than always), had
fewer books at home, and their parents were more likely to have less than lower secondary
education.

Table 8: Differential teacher assignment between 1999 and 2007


                              1999                2007
                              TREAT=0   TREAT=1   TREAT=0   TREAT=1
Teacher's sex (%)
  Female                      41        40        35        67
  Male                        59        60        65        33
Wide range of backgrounds in class (%)
  not at all                  12        6         16        28
  a little                    49        48        35        20
  quite a lot                 31        36        38        23
  a great deal                8         10        11        29
Resources for math instruction (%)
  low                         32        27        19        31
  medium                      65        66        72        65
  high                        4         7         9         5
Language at home
  Always Turkish              93        84        94        78
  Sometimes Turkish           6         14        6         20
  Never Turkish               1         2         1         2
# books at home
  0-10                        20        27        20        37
  11-25                       36        40        36        41
  26-100                      29        21        27        15
  101-200                     9         5         10        5
  200+                        6         6         7         2
Parental education
  University Degree           10        5         10        2
  Completed Post-Secondary    21        13        4         2
  Completed Secondary         68        80        72        71
  Less Than Lower-Secondary   2         2         13        22
  Do Not Know                 0         0         1         2

References

Aksit, N. (2007). Educational reform in Turkey. International Journal of Educational

Development, 27(2), 129-137.

Alacaci, C., & Erbas, A.K. (2010). Unpacking the inequality among Turkish schools:

Findings from PISA 2006. International Journal of Educational Development,

30(2), 182-192.

Basturk, R. (2008). Predictive validity of the science and technology pre-service teachers’

civil servant selection examination. Elementary Education Online, 7(2), 323-332.

Caner, A., & Okten, C. (2010). Risk and career choice: Evidence from Turkey.

Economics of Education Review, 29(6), 1060-1075.

Dincer, M.A., & Uysal, G. (2010). The determinants of student achievement in Turkey.

International Journal of Educational Development, 30(6), 7.

Dolton, P., & Marcenaro Gutierrez, O.D. (2011). If you pay peanuts do you get

monkeys? A cross country analysis of teacher pay and pupil performance.

Economic Policy, 26(65), 5-55.

EARGED. (1995). Ogretim Yukunun Analizi. Ankara: MONE.

Eide, E., Goldhaber, D., & Brewer, D. (2004). The teacher labour market and teacher

quality. Oxford Review of Economic Policy, 20(2), 230.

Hanushek, E.A. (2002). Publicly provided education: National Bureau of Economic

Research Cambridge, Mass., USA.

Hanushek, E.A. (2003). The failure of input-based schooling policies. The Economic

Journal, 113(485), F64-F98.


Lavy, V. (2010). Do Differences in School’s Instruction Time Explain International

Achievement Gaps in Math, Science, and Reading? Evidence from Developed

and Developing Countries: National Bureau of Economic Research.

Martin, M.O., Mullis, I.V.S., Foy, P., & Olson, J.F. (2008a). TIMSS 2007: International

Mathematics Report: Findings from IEA's Trends in International Mathematics

and Science Study at the Fourth and Eighth Grades: IEA TIMSS & PIRLS

International Study Center, Lynch School of Education, Boston College.

Martin, M.O., Mullis, I.V.S., Foy, P., & Olson, J.F. (2008b). TIMSS 2007: International

Science Report: Findings from IEA's Trends in International Mathematics and

Science Study at the Fourth and Eighth Grades: IEA TIMSS & PIRLS

International Study Center, Lynch School of Education, Boston College.

Martin, M.O., Mullis, I.V.S., O’Connor, K.M., Chrostowski, S.J., Gregory, K.D., Smith,

T.A., & Garden, R.A. (2001a). Mathematics benchmarking report: TIMSS

1999—Eighth grade. Chestnut Hill, MA: International Study Center.

Martin, M.O., Mullis, I.V.S., O’Connor, K.M., Chrostowski, S.J., Gregory, K.D., Smith,

T.A., & Garden, R.A. (2001b). Science benchmarking report: TIMSS 1999—

Eighth grade. Chestnut Hill, MA: International Study Center, Lynch School of

Education, Boston College.

NCTQ. (2004). Increasing the Odds How Good Policies Can Yield Better Teachers:

NCTQ.

OECD. (2009). Education at a Glance 2009: OECD Indicators: Organization for

Economic Cooperation and Development.

OECD. (2010). PISA 2009 Results: Learning Trends: OECD.


Olson, J.F., Martin, M.O., Mullis, I.V.S., & Arora, A. (2008). TIMSS 2007: Technical

Report: International Association for the Evaluation of Educational Achievement.

Rivkin, S.G., Hanushek, E.A., & Kain, J.F. (2005). Teachers, schools, and academic

achievement. Econometrica, 73(2), 417-458.

Rockoff, J.E. (2004). The impact of individual teachers on student achievement:

Evidence from panel data. The American Economic Review, 94(2), 247-252.

Santiago, P. (2002). Teacher demand and supply: Improving teaching quality and

addressing teacher shortages. OECD Education Working Papers.

Schacter, J., & Thum, Y.M. (2004). Paying for high-and low-quality teaching. Economics

of Education Review, 23(4), 411-430.

SPO. (1989). Altinci bes yillik kalkinma plani 1990-1994. Ankara: SPO.

TTKB. (2008). İlkögretim Matematik Dersi 6–8 Sınıflar Öğretim Programı ve Kılavuzu

(Teaching Syllabus and Curriculum Guidebook for Elementary school mathematics

course: Grades 6 to 8). Ankara: Ministry of National Education (MONE)
