
Informetrics and Webometrics for Measuring Impact, Visibility, and Connectivity in
Science, Politics, and Business
Irene Wormell
University College of Borås, Swedish School of Information and Library Studies
EXECUTIVE SUMMARY
The paper reports the application of classical bibliometric methods to evaluate the
impact of scientific, political, and business developments. The novel aspect is to
regard the Web as a citation network where the traditional information entities
(scientific articles and the citations given to and taken from them) are replaced by
webpages with external and internal links. The methodological studies are intended
to test the usability of large international citation databases and the World Wide
Web as feasible and reliable tools in quantitative analyses for gathering useful
information for business intelligence. Formerly, the impact of authors and their
scientific production was measured by the average citation frequencies of journals
publishing their research: the Journal Impact Factor (JIF), calculated by the Institute
for Scientific Information (ISI) in the United States and published annually in the
Journal Citation Reports (JCR), the most frequently used quantitative indicator to
measure the quality/value/impact of research works published in the core
international journals. It has been suggested that, by calculating the number of
webpages pointing to a given site, analogously, a Web Impact Factor can be calculated
as a way of comparing the attractiveness of sites or domains on the World Wide Web.
A key to webometric studies has been the use of large-scale Web search engines such
as Alta Vista that allow measurements to be made of the total number of pages in a
Web space and links to those Web spaces. These search engines provide similar
possibilities for the investigation of links between documents to those provided by the
citation databases created by ISI. To illustrate the scope and nature of informetric
research methods applied in competitive intelligence analysis, a short presentation is
given of samples of research studies carried out at the Centre for Informetric Studies in
Copenhagen (CIS). © 2001 John Wiley & Sons, Inc.
Competitive Intelligence Review, Vol. 12(1) 12–23 (2001)
Introduction
Citation analysis is a wide-ranging area of bibliometrics¹ that studies the citations
to and from documents (when document A is mentioned in document B). The mention
most typically occurs in the reference list of document B. Mention may, however,
also occur in the text of documents, such as in the endnotes, footnotes, or
bibliography. Since 1971, the Institute for Scientific Information (ISI) in
Philadelphia has undertaken a systematic analysis of journal citation patterns across
science and technology, social sciences, arts and humanities, producing three
separate citation indexes covering the different disciplines.
The idea of measuring average citation frequencies, that is, the Journal Impact
Factor (JIF), as one of the most widely used quantitative indicators was developed
more than 14 years ago. This is a two-year impact factor, defined as the number of
citations received in a given year by the articles a journal published in the two
preceding calendar years, divided by the number of articles the journal published in
those two years. Since 1984 ISI has produced annually the Journal Citation Reports
(JCR), which is an instrument to measure the quality/value of a journal on the basis
of its citation frequency over the last two years. This is a quantitative method
extensively used for the evaluation of the research performance of individuals and
institutions, as well as by librarians for journal selection and weeding. For
publishers of the core scientific journals, the JIF indicators normally serve as the
basic quantitative data for evaluating their product and estimating its market
potential and position. In spite of severe limits in the methodology used for the
production of the JIF, it is today the common instrument to evaluate/rank
international scientific journals in order to measure their impact on a given field.
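The two-year calculation can be written out as a small function. This is only a minimal sketch: the function name is hypothetical and the figures in the usage lines are invented for illustration, not taken from the JCR.

```python
def journal_impact_factor(citations_to_prev_two_years: int,
                          articles_prev_two_years: int) -> float:
    """Two-year impact factor for one census year.

    citations_to_prev_two_years: citations received in the census year by
        items the journal published in the two preceding calendar years.
    articles_prev_two_years: number of articles the journal published in
        those two preceding years.
    """
    if articles_prev_two_years <= 0:
        raise ValueError("the journal published no articles in the window")
    return citations_to_prev_two_years / articles_prev_two_years


# Hypothetical journal: 120 citations in 1996 to its 1994-1995 articles,
# of which there were 80.
jif_1996 = journal_impact_factor(120, 80)
print(round(jif_1996, 2))
```

Because the denominator counts only the two preceding years, the indicator says nothing about how long a journal's articles continue to be cited, which is one of the methodological limits mentioned above.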
With the creative use of advanced information retrieval techniques in these three
large international citation databases, today it is possible to navigate the
literature and trace the patterns of knowledge production or monitor scientific,
technical, and social developments. The symbolic role played by the citation in
representing the content of a document is an extensive dimension of information
retrieval, and can expand the scope of information seeking by retrieving not only
those papers that have cited a seminal work, but also those that are related to the
citing references.
This methodology offers vast possibilities for tracing trends and developments in
society, science, and business. However, this type of information is visible only to
the intelligent searcher or CI professional who has learned to read between the lines
of electronic information. Citation and co-citation links form a unique network in
the scientific and technical communication system and constitute a valuable tool for
discovering the less visible relations between authors, subject areas, journals,
organizations, countries, etc. The quantitative analysis of citations and
bibliographic data provides insight into the deeper segments of subject domains,
markets, or technical solutions. It assists in determining which data will need to be
mined in a further competitive intelligence analysis, and it is also a means of
deciding on the granularity of each selected type of data.
Carrying the citation network over to the Web by analogy, we have replaced the
citations by webpages external to a given site that point at least once to that site;
self-citations are replaced by the webpages internal to that site that point at least
once to the same site. Webometrics is still in its experimental stage, testing whether
the classical bibliometric methods for impact analysis are feasible and reliable when
applied to the Web. CIS researchers have carried out some pioneering work in
calculating national, sector, and institutional impact factors in the dynamic
real-time environment of the Web.
The paper will present some cases of informetric studies applied to the analysis of
the international publishing market, trend analysis in the modern Welfare state, and
finally a methodological test of the viability of Web Impact Factors as a comparison
tool and useful supplement to monitoring the status of Web locations.
Assessing the International Impact of Scientific Journals
To illustrate the scope and nature of informetric analyses applied to the evaluation
of the market for international scientific journals, the first sample describes the
analysis of the international impact of seven selected scientific journals. It aims
to move beyond the simplistic use of the data of ISI's JCR and the common evaluation
methods in impact analysis. It provides a deeper insight into the real impact of the
international scientific journals and their market. Regarding the expanding
electronic publishing market and the sharp competition between journals, the analysis
has great relevance to marketing and publishing strategies, as well as to the
development of editorial policies adjusted to the changed market conditions
(Wormell, 1998b).
¹ Bibliometrics is the application of various statistical analyses to study patterns of
authorship, publication, and literature use. It is the quantitative study of
literature as reflected in bibliographies and databases. Sometimes bibliometrics and
informetrics are used synonymously, but generally informetrics covers a larger area
than bibliometrics. Informetrics includes all of bibliometrics as well as the
mathematical and statistical analysis of bibliometric-like patterns found in other
areas of life. An overlapping area is scientometrics, which analyzes the structure and
development of scholarly communication, information-seeking behavior, and
government policy as related to the sciences. Webometrics is still in its
experimental stage in testing whether the classical bibliometric methods for impact
analysis applied to the Web are feasible and reliable means of comparing
websites.
The sample of selected journals was designed to include core library and information
science journals with the reputation of having an international readership. The
list is defined below.
1. Libri           International Journal of Libraries and Information Services
2. Scientometrics  An International Journal for all Quantitative Aspects of the
                   Science of Science, Communication in Science and Science Policy
3. JASIS           Journal of the American Society for Information Science
4. J Doc           Journal of Documentation
5. IPM             Information Processing & Management
6. C&RL            College & Research Libraries
7. Comp J          The International Journal of Computing
The current status as well as the historical developments of the journals are
described since 1972, from the beginning of the Social Sciences Citation Index. The
analyses cover two five-year citation windows, 1987–1991 and 1992–1996, with two-year
publication windows, 1987–1988 and 1992–1993.
The geographical distribution was categorized in the
following regions:
Europe
North America
South America
Asia & Pacific
Australia
Africa
Subscription data for 1996 was provided directly by the publishers. Knowing that these
figures are commercially sensitive data, the necessary guarantee for the confidential
handling of the information was given. Thus, the published results of the analysis
are only shown in the form of a percentage while the actual numbers of subscribers
are hidden.
A statistical analysis was carried out to test how strong the correlation is between
the geographical distribution of authors, citations and subscriptions, as well as to
determine the significance of differences in their distribution patterns. Considering
the small size of the sample, the results of the calculations were interpreted and
fine-tuned by regression analysis and also weighted by other possible factors.
The Pearson test examined first the correlation between author and citation data in
the two periods of the investigation. Influence, cause and effect relationship, and
regional effect were measured to verify the homogeneity of the data. Second, a
similar test checked the correlation between authors and citations in the two periods
of time. Finally, the likelihood ratio chi-square test was run to analyze the
significance of the difference in the distribution of authors and subscriptions as
well as of citations and subscriptions. These three steps of the analysis have been
executed for the seven selected journals.
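The two kinds of test named here can be sketched with the standard library alone. The sketch below is illustrative and makes several assumptions: the regional counts and subscription shares are invented (the real subscription figures were confidential), the likelihood ratio statistic is computed in its goodness-of-fit form G = 2 Σ O ln(O/E), and the statistic is compared against the tabulated chi-square critical value for 5 degrees of freedom at the 95% level. The study's exact test set-up is not given in the text.

```python
import math

REGIONS = ["Europe", "North America", "South America",
           "Asia & Pacific", "Australia", "Africa"]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def likelihood_ratio_g(observed, expected_shares):
    """G = 2 * sum(O * ln(O / E)), with E = total observed * expected share."""
    total = sum(observed)
    g = 0.0
    for o, share in zip(observed, expected_shares):
        if o > 0:  # a zero count contributes nothing to the sum
            g += o * math.log(o / (total * share))
    return 2.0 * g


# Hypothetical counts per region, in the order of REGIONS.
authors = [40, 35, 2, 8, 5, 10]
citations = [42, 39, 1, 9, 5, 4]
# Hypothetical subscription shares per region (they sum to 1).
subscription_shares = [0.45, 0.21, 0.04, 0.15, 0.05, 0.10]

r = pearson_r(authors, citations)
g = likelihood_ratio_g(authors, subscription_shares)

# Tabulated chi-square critical value: 5 degrees of freedom, alpha = 0.05.
CRITICAL_5DF_95 = 11.07
verdict = "differs significantly" if g > CRITICAL_5DF_95 else "no significant difference"
print(f"Pearson r, authors vs. citations: {r:.3f}")
print(f"G, authors vs. subscriptions: {g:.2f} ({verdict})")
```

With only six regions the degrees of freedom are few, which is why the text stresses that the results for such a small sample need careful interpretation.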
At the 95% confidence level, the chi-square test generally showed a weak or no
correlation between the distribution pattern of authors, citations, and subscriptions.
The statistical significance of these two distribution patterns is, however, so
critical that it is meaningful to analyze them. The test generated a critical value
for the statistical significance of the differences for each of the journals. The
cases where the values are found to be exceptionally deviant signal situations that
might attract special attention on the part of the researcher. The results of these
quantitative analyses are useful in answering some questions and/or in raising new
ones for publishers, editors, and users when they are diagnosing the journals for
various purposes.
In the following we will present the results of the analysis for one of the selected
seven journals, Libri. For the rest of the analysis the reader is referred to CIS
Report 7, which can be obtained by contacting the Centre for Informetric Studies in
Copenhagen (Wormell, 1998a).²
² Centre for Informetric Studies in Copenhagen, Royal School of Library and
Information Science, Birketinget 6, DK-2300 Copenhagen S, Denmark. Tel:
45 32 58 60 66; Fax: 45 32 84 02 01; website: www.db.dk/cis/
Libri: International Journal of Libraries and Information Services is one of the
oldest international library journals. It publishes original articles on all aspects
of libraries and information services. International visibility, scholarly
publishing, and good review articles were the main features of the journal. During
the 1980s, however, the scholarly level of the journal fell, and the share of applied
research contributions increased. Severe budget cuts in Western libraries caused
significant loss both in European and U.S. subscriptions.
In 1989 the publisher set up a new editorial team and tried to move the journal in a
new direction. The changed publishing policy, the efforts of the new editors,
promotion, new design for the cover, etc., resulted in a positive development for the
journal. During 1992–1993, for instance, we can see (Figure 1) an increase in the
number of authors coming from North America and a reduction in the number from
Africa.
In the case of Libri, as regards the intellectual input (authors), as well as the
concentration of users (measured here as the number of citations given to the
journal), Europe and North America are the dominant regions. For North America,
however, this concentration of users is not reflected in the number of subscriptions.
The available subscription data for 1996 indicates some market opportunities here for
the journal.
The low 21% share of subscriptions in that region, compared with the 35% share of the
authors and the 39% share of citation impact, points to a potential to increase the
number of subscriptions in the region (see Figure 3).
The chi-square test showed low likelihood ratios between authors and subscriptions
(p 0.002) and between citations and subscriptions (p 0.006), which normally would be
regarded as values of no correlation. However, the calculated values provide useful
indications about the size of differences in these distribution patterns, which call
for attention and highlight the special situations. For example, compared with the
calculated normal distribution of subscriptions, at present Europe has few authors
and few citations; in contrast, North America has (too) many authors and citations;
and Africa has (too) many authors.
Figure 1.
The international visibility of Libri, seen as the geographical distribution of
authors writing in Libri during the two publication windows.
Figure 2.
The international citation impact of Libri, seen through the geographical
distribution of citations given to the journal.
Figure 3.
The 1996 subscription map of Libri, with percentage of subscriptions in the
six regions.
For a small sample like this, the calculated critical
values have to be even more carefully analyzed and
weighted with other possible causes than in the case of
large samples.
It must be noted that subscriptions are not the only
registered form of the use and the revenue generated
by the journal: Article delivery services, databases, elec-
tronic publishing, and various forms of copyright
charges, for example, all represent different patterns in
the use of the journal and its market position. For a
fuller examination of the distribution pattern of users
and the revenue of the journal, it is necessary to com-
plete the analysis with additional data.
To analyze the subject character of journals citing
Libri, and to evaluate how much interdisciplinary impact
the journal has in its topical frame, we traced, with on-
line citation analysis, the subject areas from which the
external citations were coming. The export of knowledge
from Libri had the following distribution patterns during
the two periods of the investigation, indicating a very
weak interdisciplinary impact:
1987–1991
  24   92.3%  INFORMATION SCIENCE & LIBRARY SC.
   1    3.8%  COMPUTER APPLICATIONS & CYBERNETICS
   1    3.8%  ECONOMICS
   1    3.8%  EDUCATION & EDUCATIONAL RESEARCH
   1    3.8%  OPERATIONS RESEARCH & MANAGEMENT SCIENCE
   1    3.8%  PLANNING & DEVELOPMENT
1992–96
  39  100.0%  INFORMATION SCIENCE & LIBRARY SC.
   1    2.6%  COMPUTER SCIENCE, INFORMATION SYSTEMS
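The percentages in this distribution can be reproduced with a few lines of code. The sketch rests on an inference that the text does not state: that the 1987–1991 shares are taken over 26 citing items (24 of 26 gives 92.3%) and the 1992–96 shares over 39, with one citing item allowed to fall into more than one subject category, which is why the counts add up to more than the total.

```python
def category_shares(category_counts, total_citing_items):
    """Percentage of citing items falling into each subject category."""
    return {name: round(100.0 * count / total_citing_items, 1)
            for name, count in category_counts.items()}


counts_1987_1991 = {
    "Information Science & Library Science": 24,
    "Computer Applications & Cybernetics": 1,
    "Economics": 1,
    "Education & Educational Research": 1,
    "Operations Research & Management Science": 1,
    "Planning & Development": 1,
}
counts_1992_1996 = {
    "Information Science & Library Science": 39,
    "Computer Science, Information Systems": 1,
}

# Assumed totals (inferred from the printed percentages, not given in the text).
shares_first = category_shares(counts_1987_1991, total_citing_items=26)
shares_second = category_shares(counts_1992_1996, total_citing_items=39)

for name, share in shares_first.items():
    print(f"{share:5.1f}%  {name}")
```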
The knowledge export of the journals is an indicator that measures whether the given
journal has the scientific strength and impact to break through the traditional
borders of the home field, and whether it can attract authors and citations from
neighboring disciplines. Other possible indicators are, for example, citation
half-life, most cited authors, and the balance between external citations and
self-citations.
Having analyzed simultaneously the three categories of actors in the global
information market (authors, citations, subscribers), the study presents a
methodology for evaluating the international visibility and impact of scientific
journals, as well as their market shares and competitive relations to other journals.
The following questions helped to establish how international the scientific journals
are in scope and impact:
Is the journal a national, international, continental, or intercontinental product?
What is the origin of the intellectual input (authors writing in the journal)?
In which regions are the users concentrated (geographical distribution of citations)?
Where does the export of the knowledge published in the journal go (from which subject areas do the citations come)?
Does the distribution of users correspond with the distribution of subscribers?
The geographical distribution of authors-citations-
subscriptions is measured and shown in diagrams. By
inference analysis the hypothesis is tested whether there
is a relation between the distribution pattern of users
and subscriptions. The correlation between these facts
can have many explanations, but certainly it can raise
some useful thoughts and ideas among the publishers
about unexplored market potential.
Based on the correlation between the geographical distribution pattern of authors,
citations, and subscriptions it was possible to define a new robust indicator for the
international visibility and impact of the scientific journals. The analysis of the
statistical significance of correlation and differences gave some useful data, the
importance of which to marketing and publishing strategies is obvious.
Informetric Analysis of the Welfare State
While in the science and technology disciplines there are long traditions of using
quantitative analysis based on bibliometric methods, in the field of social sciences
and humanities the use of these methods has until now been limited.
Due to the differences in the traditions and techniques of subject representation
between the science and humanities disciplines, and the different patterns of
scientific communication, quantitative analysis for monitoring the literature in the
soft sciences requires other techniques and attitudes.
Therefore, the challenge of the present project was to explore these differences and
improve the existing methodologies by applying them to a soft, wide, and complex
subject domain within the social sciences, such as Welfare. The Welfare State was
chosen as the subject of the study because it is possible to draw many Danish,
Nordic, and international perspectives within the domain, and it is of interest in
today's societies. The historical dimension of the Welfare concept was another
attribute that caught the interest of the research team.
The aim of the analysis was threefold:
1. To study the past and current developments of the Welfare State as research
phenomenon, and to show what metric studies can offer to the exploration of the
deeper segments of knowledge production and political programs in a broad social
science domain.
2. To improve the analytical techniques and methods in handling large bibliographical
data sets in the social sciences for informetric analyses.
3. To test the usability of the issue-tracking methodology to monitor the critical
issues of the modern Danish Welfare State (to analyze the relationship between
information flows coming from the research, economic, political, and social systems
in Denmark, and to track some of the key issues through these sources to see how they
move and develop over time).
The first part of the study used the technique of coordinated online searches in
clusters of international bibliographic databases to map the development patterns in
international Welfare research. The quantitative analysis of the number of
publications and word frequencies was combined with similarity measures and other
statistical methods to produce tables, diagrams, and clusters showing how the topical
areas within the research field have developed through the last 25 years, divided in
three periods of time. The results of the analyses are quantitative data combined
with some flagged issues for the consideration of experts and strategic planners
(Wormell, 2000a). Figure 4 illustrates one type of cluster analysis carried out to
map the subject domain of international welfare research in 1990–1997.
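How word frequencies and a similarity measure can lead to a cluster diagram is sketched below. The text does not say which similarity measure or clustering rule was used in the study, so cosine similarity and single-linkage merging are assumptions made for illustration, and the four topical areas with their term counts are invented.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two term-frequency dictionaries."""
    dot = sum(a.get(t, 0) * b.get(t, 0) for t in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_linkage(profiles):
    """Agglomerative clustering; returns merge steps (cluster, cluster, similarity).

    Read from first to last, the steps describe a dendrogram from its leaves
    to its root.
    """
    clusters = [frozenset([name]) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Single linkage: similarity of the closest pair of members.
            sim = max(cosine(profiles[x], profiles[y]) for x in c1 for y in c2)
            if best is None or sim > best[2]:
                best = (c1, c2, sim)
        c1, c2, sim = best
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), round(sim, 3)))
    return merges


# Hypothetical term frequencies for four topical areas.
profiles = {
    "social security": {"pension": 8, "benefit": 6, "insurance": 4},
    "health care": {"hospital": 9, "insurance": 5, "benefit": 1},
    "labour market": {"unemployment": 7, "benefit": 5, "pension": 1},
    "pensions": {"pension": 10, "benefit": 3},
}

merges = single_linkage(profiles)
for left, right, sim in merges:
    print(f"{sim:.3f}  {left} + {right}")
```

Areas that share vocabulary merge early and at high similarity; an area that joins only at the last step uses a largely separate vocabulary.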
The second part of the study focused on Denmark, aiming to unearth important past,
current, and future conditions related to the development of the Welfare concept,
specifically to monitor how the concept of Welfare moved from the theoretical to the
empirical problems of the Welfare State. Using the methodology of issue tracking, the
analysis followed the development of selected topics in the research environment;
their implementation in the political, legislative, and social system of the country;
and finally the reflections they caused in the popular press and media. The purpose
was to trace possible future trends and to make forecasts and scenarios
(Wormell, 2000b).
Issue tracking is a useful methodology if one is interested in following how a concept
(originating from an innovation or a new idea) moves through the path of various
publication forms, for example:
Theoretical research → applied research → techniques and engineering → popular press
and mass media → legislation
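The path can be made concrete by recording, for each publication form, the first year in which a concept appears. The sketch below is a simplified illustration: the records, the search term, and the plain substring match are all hypothetical, and real issue tracking would run over bibliographic and full-text databases.

```python
def first_appearance(records, term):
    """Earliest year in which `term` occurs in each publication form.

    records: iterable of (publication_form, year, text) tuples.
    """
    first = {}
    for form, year, text in records:
        if term.lower() in text.lower():
            if form not in first or year < first[form]:
                first[form] = year
    return first


def observed_path(first):
    """Publication forms ordered by the year the term first appeared in them."""
    return [form for form, _ in sorted(first.items(), key=lambda item: item[1])]


# Hypothetical records for one tracked concept.
records = [
    ("theoretical research", 1988, "A model of activation policy"),
    ("applied research", 1991, "Evaluating activation policy in municipalities"),
    ("applied research", 1993, "Activation policy outcomes"),
    ("popular press and mass media", 1993, "Activation policy divides parliament"),
    ("legislation", 1994, "Act on activation policy"),
    ("popular press and mass media", 1992, "Budget debate continues"),
]

first = first_appearance(records, "activation policy")
path = observed_path(first)
print(" -> ".join(path))
```

Comparing the observed order with the expected path shows whether a concept moved through the publication forms in the usual sequence or skipped a stage.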
The Danish national study was based on the current
three main questions in the criticism of the Welfare
State:
Economic aspects: Can we afford it?
Legitimacy: Do the people believe in it and how much do they support it?
Functionality: How does it work?
Through analysis of the information flow between research, media, and the political
system, the study was designed to show how the economic, legitimacy, and
functionality aspects of the Welfare State in Denmark can be traced by informetric
analysis, focusing on the following types of quantitative data:
Publications originating from publicly funded research
Identification of the
Number of books and articles published in the selected topical areas
Publication and term frequency analysis of the popular press and mass media
Welfare-related words and expressions (the language of Welfare)
Legislation work and the political activities through some significant types of documents
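A term frequency analysis of this kind reduces to counting vocabulary terms per period. The following is a minimal sketch under stated assumptions: the vocabulary and the article snippets are hypothetical, tokens are matched exactly (no stemming, so "benefits" would not count as "benefit"), and the function name is made up for illustration.

```python
import re
from collections import Counter, defaultdict


def term_frequency_by_year(articles, vocabulary):
    """Count vocabulary terms per year; `articles` holds (year, text) pairs."""
    per_year = defaultdict(Counter)
    for year, text in articles:
        # Runs of letters only; digits, punctuation and underscores split tokens.
        tokens = re.findall(r"[^\W\d_]+", text.lower())
        per_year[year].update(t for t in tokens if t in vocabulary)
    return dict(per_year)


# Hypothetical "language of Welfare" vocabulary and press snippets.
welfare_terms = {"welfare", "pension", "unemployment", "benefit"}
articles = [
    (1995, "Welfare reform and pension cuts dominate the welfare debate"),
    (1996, "Unemployment benefit rules change"),
    (1996, "Pension age and welfare costs"),
]

frequencies = term_frequency_by_year(articles, welfare_terms)
for year in sorted(frequencies):
    print(year, dict(frequencies[year]))
```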
A panel of domain experts and other specialists has been involved in the project,
contributing advice and necessary feedback in the evaluation of findings. The purpose
was to improve and develop methodologies to identify trends and current and future
conditions related to the concept of the Welfare State, and to focus them for expert
consideration.
Generally, the published results are limited in delivering qualitative analysis, but
they signal the relevant quantitative data and flag the issues that might be
candidates for further analysis, for example, for the production of indicators in the
mapping of scientific, social, and cultural trends in the development of the modern
Welfare State. They are supposed to be used by domain specialists and social policy
makers as raw material in connection with the further analysis of specific areas, and
hopefully they will contribute to a better understanding of the theoretical or
empirical aspects of the Welfare State.
Figure 4.
Cluster dendrogram, showing the similarity between the 13 main topical research areas in international welfare research in 1990–1997.
In cooperation with the Institute for Future Studies, Denmark, we are planning to
continue these informetric analyses to show the gap between the politicians' and the
population's opinions about the Welfare State in Denmark. This is an extensive
project where informetric analyses will be one of the quantitative methods used in
mapping the opinion of the Danish people and comparing it with the official political
program.
Webometric Study of Impact, Visibility, and
Connectivity in the Web Space
Researchers at the Centre for Informetric Studies in Copenhagen have studied the
interesting idea of utilizing informetric methods on the Web, and have started to lay
the basis for a newly emerging area of webometrics. The novel aspect is to regard the
Web as a citation network wherein the traditional information entities (scientific
articles and the citations given to and from them) are replaced by webpages with
external and internal links. In this context these pages are the entities of
information on the Web, with hyperlinks from them acting as citations. It should be
noted that, although there are other investigations focusing on the Web from
quantitative viewpoints, the CIS studies elaborate on the idea of conducting the same
types of informetric analyses on the Web as is possible via citation databases.
The first webometric study in 1997 tested and described the core of search options
implemented to draw a picture of Denmark's use of the Web compared to Norway and
Sweden. This study also reviewed types of webpages: discipline, size, and number of
links. It is obvious that informetric methods using word counts can be applied here
(Almind & Ingwersen, 1997).
Webometrics can be used for many purposes. In
the context of the Information Superhighway and
Information Society 2000 programs there are enor-
mous possibilities in using the Web and HTML as
analytical tools to measure the relative visibility of a
company/organization/country on the Internet. The
proposed analysis method can be regarded as a tool
for measuring the accuracy of Web search engine
performance and website organization, linking, and
structuring of pages.
Figure 5 shows the most important data elements of webometrics. In ISI's citation
databases the subject contents of documents are described in three forms: Author
Keyword (DE), KeyWords Plus (ID), and Research Fronts (RF). For the subject access
points of webpages an author can use tags such as <EM> and <STRONG>. Frequencies of
terms are measured by some Web indexes. The titles of the webpages are found either
within the <TITLE> or <H1> tag, and can be identified uniquely by the URL of the page.
Whether the author is a person or corporate source can only be identified manually.
Corporate source or affiliation for webpages is given by the first part of the URL,
but the institution hosting a webpage is not necessarily connected with the author of
the webpage.
Figure 5.
Comparison of search codes used on the WWW and in the citation databases.
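The data elements just listed can be pulled out of a page with Python's standard `html.parser` and `urllib.parse` modules. This is a minimal sketch, not a description of how the Web indexes of the time worked: the class name `PageFields` and the sample page are hypothetical, and taking the host name as the "first part of the URL" is an interpretation.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageFields(HTMLParser):
    """Collect the text found inside TITLE, H1, EM and STRONG elements."""

    TRACKED = ("title", "h1", "em", "strong")

    def __init__(self):
        super().__init__()
        self.fields = {tag: [] for tag in self.TRACKED}
        self._open = []  # stack of tracked tags that are currently open

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Pop back to (and including) the matching opening tag.
            while self._open.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            for tag in self._open:
                self.fields[tag].append(text)


# Hypothetical page content; the URL is the CIS address given in the footnote.
url = "http://www.db.dk/cis/"
html = (
    "<html><head><title>Centre for Informetric Studies</title></head>"
    "<body><h1>CIS</h1><p>Research on <em>webometrics</em> and "
    "<strong>impact factors</strong>.</p></body></html>"
)

parser = PageFields()
parser.feed(html)
host = urlparse(url).netloc  # hosting institution, not necessarily the author

print(parser.fields)
print(host)
```

As the text notes, the host name identifies only where a page is served from, so deciding whether the author is a person or a corporate source still requires manual inspection.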
The lack of any enforced conformity of form and
content in the webpages, together with the dynamic and
real-time nature of the Web, creates both advantages
and disadvantages in the analytical work; therefore, it is
an exciting process to test how traditional search meth-
ods function on this new platform.
With reference to the concept of Journal Impact Factor (JIF) as presented in the
introduction to this article, our definition of the Web Impact Factor (WIF) takes the
logical sum of the number of external and self-link webpages pointing to a given
country or website, divided by the number of pages found in that country or website
at a given point in time. The numerator thus consists of the number of link pages,
not the number of links.
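The definition translates directly into set operations, since a page counts once however many links it carries. The sketch below is illustrative only: the function name and the page identifiers are hypothetical, and in practice the three quantities came from search engine queries rather than from explicit page lists.

```python
def web_impact_factors(external_link_pages, self_link_pages, site_pages):
    """Web-IF, external Web-IF and self-link Web-IF for one site or domain.

    external_link_pages: pages outside the site that link to it at least once.
    self_link_pages: pages inside the site that link to it at least once.
    site_pages: number of pages found in the site at the time of the snapshot.
    """
    # Logical sum: every linking page is counted once, not once per link.
    link_pages = set(external_link_pages) | set(self_link_pages)
    return {
        "web_if": len(link_pages) / site_pages,
        "external": len(set(external_link_pages)) / site_pages,
        "self_link": len(set(self_link_pages)) / site_pages,
    }


# Hypothetical snapshot of a 10-page site.
external = {"other.example/a", "other.example/b", "third.example/x", "third.example/y"}
internal = {"site.example/1", "site.example/2", "site.example/3",
            "site.example/4", "site.example/5"}

wif = web_impact_factors(external, internal, site_pages=10)
print(wif)
```

Because self-link pages are themselves pages of the site, the self-link component can never exceed 1.0; only external link pages can push the overall Web-IF above that value.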
The intensity of the links is hence not calculated, in contrast to the traditional
JIF, for which not only the number of different articles citing a journal is
calculated, but also the frequency of the citations given to that journal and placed
in those articles. (This is the reason that the journal or individual IF can be
artificially raised by self-citations.) For the Web-IF, our definition implies that,
in order to count, a new self-link must be placed on a webpage not already holding
such a link; and then both the numerator and the denominator increase with the value
1. In principle, a Web-IF can only increase above 1.0 with the growth of the number of
external link pages that point, at least once, to a particular website. Consequently,
Web-IFs can only compare directly with traditional IFs for which we know the number of
different sources citing a given object: for instance, that Z articles published by
journal X during time T were cited by Y different articles at least once during
time T.
The external link pages can be seen to mirror social communication phenomena, such as
strategic or tactical referral behavior, and pragmatic or common semantic interest in
particular sites on the Web. An external Web-IF becomes a measure of the extension of
the attractiveness of a given site. In addition, self-linkage also reflects the
logical structures used for organizing webpages in the local servers. Unlike
scientific citations to journals, institutions, or individuals, which may be stable
or constantly increase, the number of pages linking up to a particular Web object may
indeed decrease or disappear over time, for example, due to closedowns or
restructuring of websites. Thus, in contrast to the common citation IF calculation, a
retrospective Web-IF is not reproducible.
Ingwersen (1998) demonstrated a workable method
to calculate the WIF for various types of Web domains
over a series of snapshots taken of the Web during a
one-month period in August/September 1997. (For a
more detailed description of the data isolation procedure
and the calculation methods, as well as for the tests dis-
cussed, the reader is referred to his article.)
Figure 6 shows the resulting WIFs in descending order for a selection of smaller and
middle-size countries and large, mainly U.S., web sectors. Figure 7 displays WIFs for
institutions: selected research locations and two well-known scientific journals,
Nature and Science.
For each location in Figure 6, the current Web Im-
pact Factor, Web-IF Self-Link, and the Web-IF Exter-
nal Link are displayed. The isolated number of
webpages per site is shown in the last column. The de-
viation values are mostly negative, implying that the
Web-IF has a lower value, or commonly is more con-
servative than the Simple WIF.
Each country, as well as large segments of the Internet, shows a Web-IF with an
acceptable deviation below 1.7% between the two intermediate arithmetic mean values
isolated by the Logic A and Inverted Logic A operations. Although statistically one
is not permitted to sum up the results of different samples from different snapshots,
we have done so in order to demonstrate a kind of World Web-IF, with a relative mean
of 0.899 and a deviation of 0.29%. The difference between the Web-IF and the Simple
WIF for countries shows a deviation ranging from 7.52 to 1.88%.
Not surprisingly, Norway performs best of the Euro-
pean countries in this selection. From the analysis we
are aware of the Norwegian efforts put into marketing
via the WWW, an effort that seems to pay off in impact
(Almind & Ingwersen, 1997). If the gures from the
mainly U.S. Web sectors are calculated together and
taken as the current estimate, the relative U.S. Web-IF
is 0.943, with the number of webpages being
17,999,611 and 16,981,914 link pages. The reason for
this rather low Web-IF is that the business and academic
educational sectors are very large with quite low Web-
IFs. However, the U.S. Web-IF is higher than the ex-
pected value, that is, the World Web-IF of 0.899.
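The reported U.S. figure can be checked with a short calculation. The sketch below assumes that the Web-IF is the plain ratio of link pages to webpages, which reproduces the value quoted above; the function and variable names are ours and purely illustrative.

```python
def web_impact_factor(link_pages: int, web_pages: int) -> float:
    """Ratio of pages linking to a Web space to pages isolated within it.

    Assumption: this simple ratio is the Web-IF; it reproduces the
    U.S. value quoted in the text.
    """
    if web_pages <= 0:
        raise ValueError("web_pages must be positive")
    return link_pages / web_pages


# Figures quoted in the text for the combined, mainly U.S., Web sectors.
US_WEB_PAGES = 17_999_611
US_LINK_PAGES = 16_981_914
WORLD_WEB_IF = 0.899  # the "World Web-IF" reported in the text

us_web_if = web_impact_factor(US_LINK_PAGES, US_WEB_PAGES)
print(f"U.S. Web-IF: {us_web_if:.3f}")
print(f"Higher than World Web-IF: {us_web_if > WORLD_WEB_IF}")
```

Rounded to three decimals, the ratio agrees with the 0.943 given above and lies above the World Web-IF of 0.899.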
We are aware that some servers registered in generic
domains, such as .edu, are not located in the United
States. A possible bias can be tested following the
method proposed above. Further, locations may actually
block the access of Web crawlers to their servers. So this
activity will slightly raise national Web-IFs, but decrease
institutional IFs. In the national case the denominator
will suffer from blocked out webpages; in the institu-
tional case the numerator will decline in value by the
omitted external link pages.
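The direction of the two effects can be illustrated with a minimal sketch. All page counts below are hypothetical, and the model is deliberately simplified: following the description above, blocking is assumed to remove pages only from the denominator in the national case and only from the numerator in the institutional case.

```python
def web_if(link_pages: int, web_pages: int) -> float:
    """Web-IF taken as link pages divided by webpages (an assumption)."""
    return link_pages / web_pages


# Hypothetical national domain: blocked servers hide 5,000 of its own
# webpages from the crawler, so the denominator shrinks.
national_before = web_if(link_pages=90_000, web_pages=100_000)
national_after = web_if(link_pages=90_000, web_pages=95_000)

# Hypothetical institutional site: blocked external servers hide 50 of
# the pages that link to it, so the numerator shrinks.
institution_before = web_if(link_pages=800, web_pages=1_000)
institution_after = web_if(link_pages=750, web_pages=1_000)

print(f"National:      {national_before:.3f} -> {national_after:.3f}")
print(f"Institutional: {institution_before:.3f} -> {institution_after:.3f}")
```

Under these assumptions the national value rises and the institutional value falls, as argued in the text.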
One may note that the current Japanese Web-IF is far
below the expected mean value for countries and sectors.
This situation leads to considerations about the influence of
language as well as of national cultural and social factors on
the meaning and interpretation of impact factors in general,
also because the Japanese Web-IF for self-linking is the
lowest observed in the data set.
The variation of the Web-IF over different snapshots
taken within short intervals does exist (see, e.g., Finland),
but can evidently be much more significant for smaller
websites (see, e.g., www.sciencemag.org in Figure 7).
National and sector Web-IFs demonstrate far more robustness
in this respect, possibly because the quite restrictive
domain: command cannot be applied to the local servers
illustrated in Figure 7. The host: command used here is
far more unconditional in its functionality.
Figure 6. Selected National Impact Factors for the WWW: Web-IF, Aug. 20 to Sep. 21, 1997.
Figure 7. Selected Institutional Web-IFs, Aug. 20 to Sep. 21, 1997.
We may observe that the Web-IF Self-Link values cen-
ter around 0.5, implying that in general half of the national
webpages contain self-references. In order to exceed a
Web-IF value of 1.0 the results so far indicate that the Ex-
ternal Link Web-IF should take a value of at least 0.6.
Figure 7 demonstrates a greater, yet acceptable,
variation (3%) between the Logic A and Inverted
Logic A results for smaller Web locations. The largest in
the table is the academic sector UK (.ac.uk/) with
481,881 webpages. The only website with a consistent
number of link pages was Nature's local server at the
time of observation. Since only one webpage was de-
tected, there cannot exist any self-link pages. The online
analysis of the Royal School of LIS (www.db.dk/) re-
veals that the isolation method is workable for that size
of Web locations since the approximately correct num-
ber of known webpages was retrieved.
Most important, the data set in Figure 7 demonstrates
fluctuating and quite substantial deviations from the
direct isolation of simple link pages to the "logical sum"
of mean values for self-link and external link pages, that
is, the Simple WIF/Web-IF deviation: from -14.1
over zero to +10.5%.
Such unsatisfactory variations suggest avoiding the use
of the Simple WIF as an impact factor measure, on both
national and institutional levels.
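To make the sign convention of these deviations concrete, one may sketch the calculation as follows. The formula is an assumption on our part: the Simple WIF is taken as the reference value, so that a negative result means the Web-IF is the lower, more conservative figure. The site names and values are hypothetical and are not taken from Figure 6 or Figure 7.

```python
def percent_deviation(web_if: float, simple_wif: float) -> float:
    """Signed deviation of the Web-IF from the Simple WIF, in percent.

    Assumption: the Simple WIF serves as the reference (denominator).
    """
    if simple_wif == 0:
        raise ValueError("simple_wif must be non-zero")
    return (web_if - simple_wif) / simple_wif * 100.0


# Hypothetical (Web-IF, Simple WIF) pairs, for illustration only.
examples = {
    "site_a.example": (0.90, 0.95),
    "site_b.example": (1.10, 1.00),
}

deviations = {
    name: percent_deviation(web_if, simple_wif)
    for name, (web_if, simple_wif) in examples.items()
}
for name, deviation in deviations.items():
    print(f"{name}: {deviation:+.2f}%")
```

A spread that runs from clearly negative to clearly positive values, as in the institutional data, is what makes the Simple WIF unsuitable as a measure.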
Conclusion
As observed previously, the duration of the observation
windows and snapshots, the logical retrieval operations,
and the form of the search arguments are crucial when
generating the data, foremost due to temporary closedowns,
reorganization, size and structure of Web servers,
and the search engines' sampling methods in real
time. One may also point to the fact that page revision
dates can be added to the retrieval arguments proposed
above. Various publication or linkage windows, for
example the last two years' linkage to a particular Web
segment, may consequently be put to analysis.
One may detect at least three spin-off effects of this
and similar webometric studies. First, they may in turn
provide novel insights into the retrieval process on the
WWW. For instance, clusters of websites can be de-
tected by means of link page co-occurrence. Second,
the proposed analysis method can be regarded as a tool
for measuring the accuracy of Web search engine perfor-
mance and website organization, linking, and structuring
of pages. Third, Web Impact Factor studies may open
up a Pandora's Box concerning the validity of the matter,
in particular because most impact factor analyses are
contested. More detailed qualitative investigations of the
nature of intra-web linkage may uncover the significance
and properties of Web-IFs.
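The first spin-off, detecting clusters by link page co-occurrence, can be sketched in a few lines. This is not the procedure of the study but a minimal illustration under our own assumptions: each link page is represented by the list of sites it points to, and two sites co-occur whenever the same page links to both. All page and site names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_link_counts(link_pages: dict[str, list[str]]) -> Counter:
    """Count how often each pair of sites is linked from the same page."""
    counts: Counter = Counter()
    for targets in link_pages.values():
        # Deduplicate and sort so that each pair has one canonical order.
        for pair in combinations(sorted(set(targets)), 2):
            counts[pair] += 1
    return counts


# Hypothetical link pages and the sites they point to.
pages = {
    "page1.example": ["site_a.example", "site_b.example"],
    "page2.example": ["site_a.example", "site_b.example", "site_c.example"],
    "page3.example": ["site_c.example"],
}

counts = co_link_counts(pages)
for (first, second), n in counts.most_common():
    print(f"{first} + {second}: {n}")
```

Pairs with high counts are candidates for belonging to the same cluster; any threshold or clustering step applied afterwards would be a further design choice.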
In conclusion we observe that the proportional distri-
bution of webpages between the Nordic countries pre-
sented in this study conforms to the results obtained in
earlier webometric analyses (Almind & Ingwersen,
1997). We are confident that analyses of the Web-IFs of
national, sector, and larger Web segments or sites are
reliable. For smaller institutional sites on the WWW,
the Web Impact Factors are less dependable; they are,
however, within reach following the proposed method of
calculation. As for traditional IFs, comparisons should be
performed with caution, and preferably be carried out
within the same snapshot.
Implications for CI
The aim is to learn how to use websites and online da-
tabases, not only as a registry, but also as an analytical
tool. This article draws attention to the vast potential of
online databases and to the many new possibilities that
advanced search techniques offer for those who want to
explore online databases, Web, and HTML, not only
for accessing documents or finding facts, but also for
tracing trends and developments in various disciplines
and environments.
Since an increasing number of CI professionals are using
the Web for their job, the proposed analysis method can
be regarded as a useful tool to measure the relative visibility
on the Internet of a company/organization/country, or to
carry out tasks such as issue management, gathering analyti-
cal data for social and business intelligence, or for research
evaluation and innovation studies. Webometric studies
provide novel insights into the retrieval process on the WWW
and, finally, they raise awareness about the pitfalls in the
search engines' sampling and in the organization and
structuring of websites.
The results demonstrate that Web Impact Factors are
calculable with high confidence for national and sector
domains while institutional WIFs should be approached
with caution.
The present study signifies a new approach in
Informetrics, where advanced bibliometric methods are
applied not only to the evaluation of science and
technology (S&T) but also to the analysis of their
societal, business, and other specific relations.
Those who want to know more about the efficient
use of the Web as an information source for research,
and the ways of content analysis of webpages retrieved
by the major search engines, are referred to a newly
published article giving many useful methodological
hints as well as input for discussion on the potential of
the web as information and bibliographic source for the
skilled searcher (Bar-Ilan, 2000).
References
Almind, T., & Ingwersen, P. (1997). Informetric analysis on
the World Wide Web: Methodological approaches to
webometrics. Journal of Documentation, 53(4), 404–426.
Bar-Ilan, J. (2000). The Web as an information source on
informetrics? A content analysis. Journal of the American
Society for Information Science, 51(5), 432–443.
Ingwersen, P. (1998). The calculation of Web Impact Factors.
Journal of Documentation, 54(2), 236–243.
Wormell, I. (1998). Informetric analysis of the international
impact of scientific journals: How international are the
international journals? Journal of Documentation, 54(5),
584–605.
Wormell, I. (2000a). Bibliometric analysis of the welfare state
as a research phenomenon. Scientometrics, 48(2), 203–236.
Wormell, I. (2000b). Critical aspects of the Danish welfare
state - as revealed by issue tracking. Scientometrics, 48(2),
237–250.
About the Author
Irene Wormell is professor of information management at the
University College of Borås, Sweden. Until September 2000
she was head of the Centre for Informetric Studies at the
Royal School of Library and Information Science in
Copenhagen, Denmark. Her major research interest is the strategic use
of information resources in business and social intelligence. She
has undertaken research and consultancy for a wide range of
organizations worldwide. She has published extensively and is
frequently invited to speak at international conferences and
seminars. Most recently she has edited two special journal
issues relevant for CI: "Competitive Intelligence from the
Perspective of Today's Information Professional," FID Review,
Vol. 1, 1999, No. 4/5, and "Frontlines in the Nordic
Bibliometric Research," Scientometrics, Vol. 48, 2000, No. 2.
She is active in the SCIP Scandinavian Chapter and is a promoter
of interdisciplinary research efforts in the area of information
and management sciences. She may be contacted at the
University College of Borås, Swedish School of Information and
Library Studies, Allegatan 1, S-501 90 Borås, Sweden. Tel:
+46 33 164413; Fax: +46 33 164005; e-mail:
irene.wormell@hb.se.