
Core Competency E.

Design, query and evaluate information retrieval systems



Introduction
Information Retrieval Systems (IRSs) are essential tools used by the majority, if
not all, of Library and Information Professionals (LIPs). Information retrieval (IR) occurs
after a user presents an information need in the form of a query to a database. IR is the
process of the database seeking, locating, retrieving and presenting that information to the
user.
An IRS is a database of records. In some cases, these records will all be surrogate
records. In other IRSs they will be both surrogate records and the records themselves. In
either case, an LIP will often act as an intermediary between the person who has the
query and the IRS. IRSs are not uniform in the way they are designed and structured, and
a query often needs to be expressed in a specific way for a specific IRS. It is the LIP's role
to direct the patron to the most appropriate IRS for his/her query and to show the patron
how to translate the information need into a query that is readable by the IRS being used.
Designing an IRS begins the same way almost all user-oriented tasks begin: by
considering the user population's needs. The designers of an IRS must decide
what information is going to be stored and represented in the database, and whether they
want to use indexing that is pre-coordinate, post-coordinate or both. In short, the
designers of an IRS need to consider the user in all their design efforts. They should
be concerned with how to provide the user with the most intuitive, precise IRS possible.
To this end, designers will also want to find ways to reduce the need for
disambiguation, which occurs when an IRS has to determine the meaning of a word that
has one or more homonyms. In LISNews, Jeffrey Beall outlines an example of
disambiguation using the word "boxers." He says, "the word 'boxers' is a homonym with
several different meanings, and the search engine doesn't know which meaning you want.
Boxers are a breed of dog, a category of athlete, and a kind of men's garment" (Beall,
2010). One way to address the problem of disambiguation is through a controlled
vocabulary (as opposed to natural language). A controlled vocabulary limits the
number of possible values that can be used for attributes. Although it doesn't allow for
the intuitive user/IRS interface that natural language does, a controlled vocabulary
reduces ambiguity and often provides more precise results. If the designers of an IRS
want queries to be submitted using a controlled vocabulary, they need to
create and establish those specific terms.
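To make the controlled-vocabulary idea concrete, here is a minimal Python sketch built around Beall's "boxers" example. The vocabulary entries, the CONTROLLED_VOCABULARY name and the resolve_term helper are all hypothetical, invented for illustration; they are not taken from any real thesaurus or IRS.

```python
# Hypothetical controlled vocabulary: each ambiguous natural-language word
# maps to the authorized terms that indexers and searchers may use.
CONTROLLED_VOCABULARY = {
    "boxers": ["Boxer (Dog breed)", "Boxers (Athletes)", "Boxer shorts"],
}


def resolve_term(word):
    """Return the authorized terms for a word, or an empty list if none exist."""
    return CONTROLLED_VOCABULARY.get(word.strip().lower(), [])


# The searcher is shown the authorized choices instead of the system guessing.
for authorized_term in resolve_term("Boxers"):
    print(authorized_term)
```

The point of the sketch is the tradeoff described above: the user has to pick one of the established terms, which is less intuitive than typing free text, but the system no longer has to guess which sense was meant.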
Some of the decisions that the designers of an IRS make in the initial stages involve
tradeoffs in terms of user friendliness. For example, pre-coordinate indexing allows
strings of terms to be combined to describe a complex concept. The advantages of this are
less ambiguity and a higher rate of precision (retrieving only relevant
results). The user may find querying a pre-coordinate indexed IRS more difficult but
might also be happier with the results. Conversely, a post-coordinate indexed IRS is more
intuitive for the user, as post-coordinate IRSs allow for many search terms and
usually use Boolean logic/descriptors. Although this type of indexing may be more
user-friendly, it also tends to yield high-recall results (a high number of relevant records
retrieved) and less precision (a smaller share of the retrieved records being relevant).
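The post-coordinate approach described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the three records, their index terms and the function names (build_index, search_and, search_or) are hypothetical, and a real IRS would be far more elaborate.

```python
# Hypothetical records: record id -> single-concept index terms.
RECORDS = {
    1: {"dogs", "training"},
    2: {"dogs", "nutrition"},
    3: {"cats", "training"},
}


def build_index(records):
    """Build an inverted index: term -> set of record ids indexed under it."""
    index = {}
    for record_id, terms in records.items():
        for term in terms:
            index.setdefault(term, set()).add(record_id)
    return index


def search_and(index, *terms):
    """Boolean AND: records indexed under every one of the terms."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, *terms):
    """Boolean OR: records indexed under at least one of the terms."""
    return set().union(*(index.get(term, set()) for term in terms))


index = build_index(RECORDS)
print(search_and(index, "dogs", "training"))  # terms combined at search time
print(search_or(index, "dogs", "training"))   # broader set, likely less precise
```

Here the terms are combined by the searcher at query time, which is what makes the system post-coordinate. In a pre-coordinate system the indexer would instead have assigned a combined string of terms to each record in advance.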
While recall and precision are key indicators of the effectiveness of an IRS,
Professor Enid Irwin has also introduced us to the concept of SEI, something that was
created by a former SJSU SLIS student. All of the methods of evaluating an IRS are
really looking to measure the value of the results retrieved. But remember, IRSs are
designed to help a specific user with an information need. Thus, the value of the results
retrieved, as interpreted by the user, is often the best way to determine effectiveness. For
example, an IRS could retrieve results that are highly relevant to a user's query but are of
little value (for instance, if the user already has the information retrieved). So, in all
aspects of IRSs, from designing to querying and finally evaluating a system, the user plays
an essential role. In fact, although designers of an IRS might create a system that is as
user-friendly as possible, its effectiveness will also depend on how much knowledge the
user has about IRSs in general and, specifically, about the type of system he or she is
using.

Evidence #1: Assignment 2, Part A; LIBR 202: Database Design
In LIBR 202, Information Retrieval Systems, Professor Enid Irwin arranged
students into groups and asked each to create a database using DB/Textworks. The goal
of this assignment was to use the concepts we had read about (e.g., attributes,
pre-coordination vs. post-coordination, etc.) in a very "hands-on," practical way and design
an IRS (a database). This task included outlining the client (user) group, the scope and
goals of the IRS, the language we would use (natural vs. controlled), the rules of any
controlled languages (pre-coordinate vs. post-coordinate and what the pre-coordinate
levels and terms would be), search guidelines, a list of attributes and indexing rules.
I'm very grateful that LIBR 202 was a mandatory class in the SLIS program. I
don't think, based on my areas of interest, I would have taken the class if it weren't
required, and I really believe I will be a better LIP for having completed this class and this
database assignment. I am also grateful that we were able to complete this assignment in
groups. My teammates and I relied heavily on each other to figure out how to apply
the concepts we understood via lectures and readings to the creation of a database.
Although it was a difficult assignment, it helped me develop a clear competency regarding
the inner workings of IRSs, and I believe I accomplished the goals Professor Irwin stated
when she outlined the database assignment.

Evidence #2: Children's Literature Complete Database Presentation
Aristotle said, "Those that know, do. Those that understand, teach." I can't say
that I will ever be a master of IRSs, but I have developed a thorough understanding of
them and have found that I am capable of sharing my knowledge on the subject with
others. In LIBR 210, Reference and Information Services, Professor Tash asked each of
the students to research a database, compile any necessary information about that
database and present that information to the rest of the class in an Elluminate session.
Although I have used many IRSs in my personal, academic and professional work, I am
including my presentation on the Children's Literature Complete Database (CLCD)
because it systematically highlights how to use CLCD in the most effective way for
users.
As I researched CLCD, I learned about some very valuable tools and practices. I
was familiar with how to use CLCD before beginning the LIBR 210 assignment, but I was
interested in approaching the database from the perspective of learning how to
instruct others in its use. During the process I developed a strong familiarity with
the Help Desk and tutorial functions and, although I thought I knew how to use CLCD
effectively, reviewing these helped me to 1) learn more about how to use the database
than I had gleaned from my previous, self-directed work, and 2) learn to provide
instructional information in a clear way. This was a good reminder that, no matter how
familiar I think I am with an IRS, I should always take care to review the guidelines so I
can query the IRS in a way that will give me high-value results.

Evidence #3: Assignment 2, Part B; LIBR 202: Database Evaluation
The second part of Assignment 2 required us to evaluate the database we had
created. The goal of the assignment was to ascertain the usefulness of our database,
emphasizing concepts we had learned in the semester such as recall, precision and a new
concept called SEI. Developed by Marc Schatkun, a student at San Jose State University,
the Search Effectiveness Index (SEI) is a formula intended to show the end user how
close a given search comes to the perfect search result, where relevant records (P) and
retrieved records (R) = 1 and 1, respectively; non-relevant records retrieved (F) and
relevant records not retrieved = 0 and 0, respectively (Schatkun, 2009). A "perfect
search" would mean that all relevant records were retrieved in a query (and, conversely, no
relevant records were missed). A perfect SEI = 1; a failed search (a query that retrieved
no relevant records) = 0. All other searches fall between these two values and are
expressed either as a decimal or as a percentage. I used SEI as one of the ways I evaluated my
group's (Team One) database. I also used two other methods (recall and
precision) to evaluate the database. Even when these methods can't be employed in a strict
mathematical way, it's important for any LIP to understand the concepts behind them and
their importance in filling a user's information need.
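Recall and precision use the standard textbook definitions, so they can be sketched in a short Python example. The function names and the sample record ids are my own illustration, not data from the Team One database, and SEI is left out because its formula is not reproduced in this essay.

```python
def precision(retrieved, relevant):
    """Share of the retrieved records that are relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Share of the relevant records that were retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Hypothetical search: four records come back, three records are relevant,
# and two records appear in both sets.
retrieved = {1, 2, 3, 4}
relevant = {2, 4, 5}

print(f"precision = {precision(retrieved, relevant):.2f}")
print(f"recall    = {recall(retrieved, relevant):.2f}")
```

In this made-up search, half of what came back was relevant and one relevant record was missed. That gap is exactly what the user experiences as extra noise to sift through or as missing information.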

Conclusion
As a person who entered the LIP field with the goal of dedicating my career to the
free flow of information in the context of social impact, I did not have IRS studies
front and center in my mind when I entered SJSU. As I proceeded toward
earning my degree (and in my professional work as a Young Adult Librarian), however, I
have learned how critical these systems are to the realization of even my most lofty
objectives. Through my work developing an IRS of my own, teaching and
evaluating existing systems (such as CLCD), and understanding both traditional and new
system measurement techniques (including SEI), I have developed a high degree of
competency with the infrastructure required to ensure the information flow and exchange
necessary for libraries to serve their purposes in society.

References

Beall, J. (2010, February 11). The importance of word-sense disambiguation in online
information retrieval. LISNews. Retrieved from
http://lisnews.org/importancewordsensedisambiguationonlineinformationretrieval

Meadow, C., Boyce, B., Kraft, D., & Barry, C. (2007). Text information retrieval systems.
London, UK: Academic Press.

Schatkun, M. (2009). Hi LIBR 202 students. Retrieved May 15, 2009 from the San Jose
State University ANGEL website for LIBR 202-03-04, Spring 2009 site:
https://liffey.sjsu.edu/
