
Knowledge, interests and the many meanings of evaluation:
a developmental perspective
Albaek E. Knowledge, interests and the many meanings of
evaluation: a developmental perspective.
Scand J Soc Welfare 1998: 7: 94-98. © Blackwell, 1998.

The problems addressed by evaluation research and practice
are the problems that emerged in the efforts to form,
consolidate, and reform the modern welfare state. Each of
these problems is linked with stakeholders with divergent
and often conflicting interests. One of the major
characteristics of the history of evaluation research has been
efforts to develop evaluation theory, design and methodology
that are responsive to the knowledge interests of different
stakeholders.

Without doubt the first evaluation ever recorded is
found in the Book of Genesis: "And God saw
everything that he had made, and behold, it was very
good" (Butler, 1986). Since man is created in God's
image, evaluation is an immanent part of human
existence. Perhaps this is the single most decisive
feature to differentiate us from other living creatures:
not only are we aware of things, we also reflect on and
put a value on things. We evaluate in everyday
ordinary practice; and we evaluate in the form of
formalized, systematized and sometimes research-based
investigations of public activities, policies,
programs, and projects. The latter form is what is
today usually understood by the concept of
"evaluation" and the object of this paper.
Although its historical roots go back very far in
time, present-day evaluation had its decisive take-off
in the US in the 1960s - in terms of scope of activities,
theoretical and methodological foundation and
disciplinary professionalization. There is no doubt
that the US became an international trendsetter for the
institutionalization of evaluation that has taken place
in all OECD countries during the last 30 years,
although with considerable comparative differences
(Rist, 1990).
The object of this paper is the development in
evaluation research and practice since its take-off in
the 1960s. It is based on two claims:

1. Although evaluation quickly moved into other
fields, the development of evaluation research and
practice is to a large extent related to the
development in modern welfare state provisions
and services. The problems addressed by evaluation
research and practice are the problems that emerged
in the efforts to form, consolidate, and reform the
modern welfare state.
2. Each of these problems is linked with stakeholders
with divergent and often conflicting interests. One
of the major characteristics of the history of
evaluation research has been efforts to develop
evaluation theory, design and methodology that are
responsive to the knowledge interests of different
stakeholders.

E. Albaek
University of Aarhus, Denmark

Key words: evaluation; knowledge interests; design; utilization

Erik Albaek, Department of Political Science, University of
Aarhus, Universitetsparken, DK-8000 Aarhus C, Denmark
Accepted for publication June 23, 1997

Euphoria: welfare state expansion and evaluation take-off in the 1960s
Throughout the OECD, the 1960s became the heyday
of the development of the mixed economy and the
welfare state. Unprecedented economic growth made
it possible to redirect resources from the private to the
public sector and to provide welfare services that were
often the result of deliberate political choices (Keman,
1993).
This was also the case in the US where a liberal -
the peculiar US term for left-of-center - climate
combined with a political will to act reigned. In the
words of the all-time most famous civil rights activist,
the liberals had "a dream". A dream about a future
USA characterized by prosperity, welfare, and social
justice, especially for those social groups who had so
far been deprived. In the well-known enthusiastic
American political rhetoric, a "New Frontier" had to
be conquered to realize the dream of the "Great
Society". To do so, a "War on Poverty" was declared.

© Blackwell Publishers Ltd and Foreningen Scandinavian Journal of Social Welfare 1998.
Published by Blackwell Publishers, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA

Never in history had the public economy been so
huge, in absolute numbers or as a share of GDP, and
never had it grown so fast. Never had the public sector
engaged in such a wide array of activities, and never
had it intervened so deeply in other spheres of society.
Never had the public sector employed so many. And
exactly because the welfare state expansion and
proliferation happened so fast, a considerable
knowledge vacuum appeared.
First, the great zeal for reform was not matched by
well-founded knowledge about what to do. A number
of areas had a long tradition of reforms, research and
even evaluation, for example, in education (not only in
the US, but internationally). Most areas, however,
were virgin territory as far as reform was concerned
and policy formulating agencies suffered from a
considerable lack of substantial knowledge about
how to attack the problems.
Second, the growth in the public sector
simultaneously resulted in an increase in principal-agent
related problems. Policy formulating officials
therefore had an interest in reducing implementing
agencies' present information asymmetry advantages
in order to increase the possibilities of holding them
accountable. There was very little knowledge of
whether the public or publicly funded agencies and
institutions were actually doing what they were
supposed to be doing (output) and whether their
activities had the intended effects (outcome); and it
was an even bigger question whether they did it in a
cost-efficient manner. It therefore became necessary to
gather information. This was especially so in the US:
first, because it is part of the American ethos to make
sure that you get value for money (Weiss, 1993);
second, because the welfare state has been a politically
more controversial construct in the US than in many
other countries, which increased the incentives to
document - positively or negatively - public sector
effectiveness and efficiency. Consequently, it is no
coincidence that evaluations came in demand and were
produced not only by policy formulating agencies, but
also by accountability units of government, for
example, the GAO.
Information on what could be done and what had
been done was first and foremost in demand among
elite level policy officials, elected as well as
appointed. Their institutional positions were
legitimized by two principles of governance which
made their knowledge interests top-down oriented: 1)
the normative regulation of the relationship between
politically elected policy officials and implementing
agencies in the parliamentary chain of governance,
fundamental to representative democracies. This
implies that politically responsible officials set out
guidelines for and control implementing agencies
(Olsen, 1978); 2) the principle of professional
governance in welfare provision relying on exclusive
expert knowledge. A final reason for the strong top-down
orientation was "the irony of democracy" in the
US that progressive reform has often been initiated
and supported more by the elite than by the mass level
of politics (Dye & Zeigler, 1975).
In the 1960s the US federal government was
invaded by progressive reformers. They were the
best and the brightest graduates from US elite
institutions of higher education. They knew the
potential information suppliers, either in person or
by temperament: i.e. social science researchers. Thus,
the foundation was created so that evaluation research
could develop into a discipline and industry in its own
right.
Utilizing social science in public policy making had
clear ideological advantages. At that time, ideologies
were declared dead and replaced by rational decision
making. So in a nation united in enthusiasm over
having reached the space age through scientific
progress, nothing could be more appropriate than
using scientific knowledge to solve societal problems
(Nelson, 1977). Especially when a clear symmetry
between the then dominant models of rational decision
making and positivist science allowed for the
utilization of social science in public policy making
without jeopardizing the roles traditionally assigned to
the policy maker and the social scientist. The policy
maker was still the chief reformer, but now used
scientific evidence; the social scientist was still the
supplier of scientific evidence, but was now allowed to
assist in reforming society (Albaek, 1995).
But there were also substantial reasons for a
transfer of the dominant positivist paradigm in the
academic social science disciplines to the emergent
applied social science discipline intended to evaluate
public policy activities. First, it was natural to turn to
the long established academic disciplines for
inspiration. Second, and even more important, the
positivist paradigm was capable of supplying
information that matched policy officials' knowledge
interests. The prototypical experiment in positivist
social science 1) matched trial-and-error reform logic
to an extent that one of the founding fathers of
evaluation research called for an "experimenting
society" (Campbell, 1969; 1988); and 2) held the
promise of generating knowledge about potential
nationwide replicability of policies or programs. This
was possible ex ante in the form of true experiments or
ex post through, for instance, statistical manipulation
in quasi-experimental designs. Furthermore, its strong
reliance on numbers and statistics made the positivist
paradigm suitable for providing accountability
information, including information which could be
coupled with cost-efficiency and cost-benefit oriented
analyses.


Realistic expectancy adaptation: welfare state and evaluation consolidation in the 1970s
It is almost a law of nature that when you come down
from a high you will get a hangover, and this is what
happened in relation to the mixed economy and
welfare statism in the 1970s. After the happy trip
through the 1960s, the OECD economies, among other
things because of the 1970s oil crisis, were struck by
an almost unthinkable combination of problems in the
form of high unemployment, low economic growth,
high inflation, large budget and balance of payments
deficits. All these problems were supposed to have
been eradicated. In welfare provision many problems,
which had been considered tame during the 1960s
progress euphoria, turned out to be wicked and thus
far less receptive to solutions than expected (Rittel &
Webber, 1973). The 1970s became the decade when
people started talking about the crisis of the welfare
state. The crisis actually did not result in cutbacks;
on the contrary, the welfare state continued to grow
and consolidate in the 1970s (Alber, 1988). But with
the consolidation the expectations became more
realistic.
After a while it became clear that the expectation of
creating scientifically tested, generalizable and
replicable knowledge had been far too high; that goal
attainment measurements were hampered or even
precluded by unrealistic rationality preconditions; that
for the same reasons the expected rational-instrumental
evaluation research utilization did not
happen; and that implementation was not technical-administrative
as expected, but on the contrary a
highly political process.
The implementation issue in particular brought new
policy actors on to the stage, at first managers and staff
operating welfare deliveries, then users and clients.
They had other knowledge interests and therefore
demanded other types of information than the output/
impact related type. Not that they had no interest in
output/impact, but their primary knowledge interest as
practitioners consisted of knowing whether their daily
practice was on the right track. Whereas evaluation in
the 1960s was a top-down management tool for policy
principals, evaluation in the 1970s developed bottom-up
oriented implementation tools for program
operators. This development of evaluation theory,
design and methodology had two dimensions.
Its purpose was to improve policy management at
various levels of government and at the program,
project or institution levels. Since public administration and service delivery had traditionally been
considered purely technical matters, very little was
known about implementation and implementation
processes. Implementation was identified as the
"missing link" in the understanding of the
politico-administrative system (Hargrove, 1975), and now a
search for this link was initiated - inspired by political,
administrative and organizational science.
Implementation research became a booming industry,
as an academic as well as an applied discipline. There
was a parallel development of evaluation design and
methodology for program monitoring with focus on
input and process data. Not only was it meaningful to
monitor the implementation process because it is the
very precondition for program effectiveness, but also
because it is often difficult, impossible or misleading
to measure output and impact (Majone, 1988).
There was also a substantial reason for evaluation to
focus on the implementation process. In the
summative impact evaluation based on (quasi-)
experimental designs, the intervention/program is a
black box. But what the professional operators
involved in welfare delivery needed was knowledge
on precisely what went on inside the box, i.e.,
knowledge on which program mechanisms operate to
bring about intended effects. To formulate and test
substantial theories of program processes, process
evaluation was developed - as distinct from the
organizational aspects of the process, the object of
implementation research and analysis.
As a rule, the experimental design is less useful for
process than for output and impact analyses, because it
is often necessary to include large amounts of
contextual data. For the same reason, it is typically a
good idea to use qualitative data in this type of
analysis. However, as far as both design and method
were concerned there was not much to work with.
Therefore, implementation and evaluation research
needed to develop new research designs and methods.
Contrary to the common understanding, the applied
rather than the academic social sciences became the
front-runners of methodological development during
the 1970s. This created a considerable level of
self-consciousness and professionalization among
evaluation researchers, who distanced themselves
more and more from the traditional academic
standards and developed their own research and
practice standards. Furthermore, implementation and
process evaluation studies made it natural for the
researchers to start a dialogue with the responsible
managers and staff in connection with program
development and monitoring. In such formative
research designs the researcher role took on an
element of consultancy.
Evaluation utilization studies revealed that in
actuality the attempts to de-politicize policy making
and evaluation research in the 1960s were often a
legitimizing cloak for political and symbolic research
utilization. Accordingly, the practical evaluation work
revealed that there were more principals with
diverging and often conflicting interests than had been
assumed in the notion of the rational decision process,
just as the actors in the implementation process had
various and only partly coinciding - substantial as well
as institutional - interests. In other words, it became
obvious that evaluation has many stakeholders.
Evaluation research responded by developing
theory and designs that take the political nature of
evaluations into account. At one end of the scale
goal-free evaluations were developed, which in
their design avoid dealing with possible goals or
interests altogether (Scriven, 1973). At the other end
of the scale are stakeholder evaluations, which do
take several or all interests involved into account
(Bryk, 1983).
With many interests involved, it also became clear
that a policy or program could not be viewed from one
single panoptic point as assumed in the rational-scientific
evaluation design. With inspiration from
interpretivist humanist philosophy of science,
methods were developed to uncover how a policy or
program looks from the perspectives of different
knowledge interests (Smith, 1983; Smith & Cantley,
1985), and interactionist or constructivist approaches
to producing a negotiated program understanding
(Guba & Lincoln, 1989). In the latter instance, the
evaluation researcher takes on the role of broker.

The 1980s and after: evaluation in the service of public sector reform
After the explosive growth in the 1960s and the solid
consolidation in the 1970s, the notion developed
around 1980 that the welfare state had lost its
dynamism and instead had become stuck in outmoded
ways of thinking and inflexible structures and
processes. The suggested cure for these ailments was
major surgery in the form of public sector reforms
(OECD, 1980). Such reforms had both an
effectiveness and an efficiency dimension, although
the latter dimension has received most political
attention. Inspired by public choice theory,
governments of various political colors launched
remarkably similar reform initiatives in all OECD
countries - some more spectacular (e.g. USA and GB)
than others.
The reforms questioned the governance principles
of hierarchy and professionalism. Instead, other
governance instruments were suggested as substitutes
or supplements. One suggestion was decentralization.
However, decentralization increases the information
asymmetry in favor of the implementing agencies. In
order to make up for this asymmetry, three options
were suggested: 1) government by the market gives
users of welfare services a quasi-market free "exit"
choice, for example, free hospital and school choice,
which through the market mechanisms forces public
institutions to adapt to consumer preferences; 2) an
increased "voice", for example, in the form of elected
user boards in public institutions is not based on the
market logic but on a participatory-democratic right to
participate in the decision making in matters that relate
to important aspects of day-to-day life; 3) evaluation
of performance, quality, effectiveness and efficiency,
including measurements of public sector consumer
preferences and user satisfaction.
Policy principals' need for control information is
increasingly provided by evaluation, which is
increasingly incorporated into standard audit and
budgeting procedures. Such evaluations are top-down
oriented and often in conflict with the bottom-up
evaluations developed in the 1970s, among other
reasons because in the latter evaluation types "cozy
relationships" (Guba & Lincoln, 1989, p. 33) can
develop between evaluator and professionals/front-line
staff so that the evaluations contribute to shirking
under the cloak of children's, the elderly's or patients'
needs.
As far as the content of welfare policies is
concerned there has also been cause for new ideas
and reforms. In a number of areas, earlier attempts to
help especially weak social groups failed. One reason
could be that the program theory as such was not valid,
because it was based on the system's, including the
professionals', premises. As an alternative, program
and evaluation designs were developed, intended to
empower such groups to actively formulate their
wishes and demands on their own premises (Fetterman
et al., 1996). The underlying concept of this way of
thinking is the right to participate in decisions with
consequences for one's own life, and the fact that only
responsiveness to user wishes and demands and their
active support will result in effective problem solving.
Empowerment and related evaluation types are
theoretically rooted in action research. According to
empowerment evaluators, they take on the role of
"midwife". If not for its negative clinical
connotations, "therapist" would have been a more
appropriate role metaphor.

Lessons learned
The history of evaluation has increased our awareness
of the many stakeholders' divergent substantial
interests (for example, psychologists and doctors
often have quite different theoretical and professional
approaches to the treatment of mental patients) as well
as of institutional interests (budget maximizing and
turf protection). Different stakeholders will have
different knowledge interests, among other reasons
because the way the public sector is structured and
functions is not legitimized by one but by several and
often conflicting governance principles: the
parliamentary chain of governance, professionalism,
corporatism, local government, market logic, and
users' right to participate in the governance of
institutions that have an impact on their lives. This
means that evaluation stakeholders will have some
knowledge interests in common, but at the same time
they will have knowledge interests which are specific
to them and sometimes in conflict with others'
knowledge interests (Albaek, 1996). Some evaluation
designs and methods are better suited to accommodate
some stakeholders' information needs than others.
This is one of the reasons that so many different
meanings are attached to, and so many bitter battles
are fought over, the concept of evaluation that it is
sometimes hard to figure out that it is one and the
same phenomenon. Still, there is no one-to-one
correspondence between evaluation theory, design
and method on the one hand and stakeholders'
knowledge interests on the other. For instance,
empowerment evaluation may be legitimized both
from a public choice (sympathy with consumer
preferences) and a participatory democratic point of
view (users' right to form their own lives). And the
strategically most conscious top-down control
evaluation would negotiate a constructed agreement
with implementing agencies on acceptable evaluation
standards in order to reduce the latter's post-evaluation
room for critical maneuvering.
The bottom line - as seen by a political scientist - is
this: only in the best of all worlds can everybody be
satisfied. But we do not live in that world. And we
never will. Therefore we have to take politics seriously
- in evaluation theory and in practice.

References
Albaek E (1995). Policy evaluation: design and utilization. In:
Rist RC, ed. Policy Evaluation: Linking Theory to Practice.
Aldershot, Edward Elgar.
Albaek E (1996). Why all this evaluation? Theoretical notes and
empirical observations on the functions and growth of
evaluation with Denmark as illustrative case. Canadian
Journal of Program Evaluation 11(2): 1-34.
Alber J (1988). Is there a crisis of the welfare state?
Cross-national evidence from Europe, North America, and Japan.
European Sociological Review 4(3): 181-207.
Bryk AS, ed. (1983). Stakeholder-based evaluation. San
Francisco, CA, Jossey-Bass.
Butler R (1986). Program evaluation: a central perspective. In:
Policy management and policy assessment. Royal Institute
of Public Administration/Peat Marwick. Cited from Gray A,
Jenkins B (1990) Policy evaluation in a time of fiscal stress.
In: Rist RC, ed. Policy and program evaluation:
perspectives on design and utilization. Brussels, IIAS.
Campbell DT (1969). Reforms as experiments. American
Psychologist 24: 409-429.
Campbell DT (1988). The experimenting society. In: Campbell
DT. Methodology and epistemology for social science:
selected papers, edited by E. Samuel Overman. Chicago,
Ill, University of Chicago Press.
Dye TR, Zeigler LH (1975). The irony of democracy: an
uncommon introduction to American politics. North
Scituate, MA, Duxbury Press.
Fetterman DM, Kaftarian SJ, Wandersman A, eds. (1996).
Empowerment evaluation: knowledge and tools for
self-assessment & accountability. Thousand Oaks, CA, Sage.
Guba E, Lincoln Y (1989). Fourth generation evaluation.
Thousand Oaks, CA, Sage.
Hargrove E (1975). The missing link: the study of the
implementation of social policy. Washington, DC, Urban
Institute.
Keman H (1993). Proliferation of the welfare state:
Comparative profiles of public sector management,
1965-70. In: Eliassen KA, Kooiman J, eds. Managing public
organizations: lessons from contemporary European
experience. London, Sage.
Majone G (1988). Policy analysis and public deliberation. In:
Reich RB, ed. The power of public ideas. Cambridge, MA,
Harvard University Press.
Nelson RR (1977). The moon and the ghetto: an essay on public
policy analysis. New York, W.W. Norton & Co.
OECD (1980). Strategies for change and reform in public
management. OECD Report 1980: 13. Paris, OECD.
Olsen JP (1978). Folkestyre, byråkrati og korporativisme.
[Democracy, bureaucracy and corporativism]. In: Olsen
JP, ed. Politisk organisering. Oslo, Universitetsforlaget.
Rist RC, ed. (1990). Program evaluation and the management
of government: patterns & prospects across eight nations.
New Brunswick, Transaction Publishers.
Rittel HW, Webber M (1973). Dilemmas in a general theory of
planning. Policy Sciences 4(2): 155-169.
Scriven M (1973). Goal free evaluation. In: House ER, ed.
School evaluation: the politics and process. Berkeley, CA,
McCutchan.
Smith G, Cantley C (1985). Assessing health care: a study in
organizational evaluation. Milton Keynes, Open University
Press.
Smith JK (1983). Quantitative versus interpretative: the
problem of conducting social inquiry. In: House ER, ed.
Philosophy of evaluation. San Francisco, CA, Jossey-Bass.
Weiss CH (1993). Lessons from the U.S. evaluation experience.
Politica 25(1): 64-76.
