
Internet Watch

Ganging up on Information Overload


Al Borchers, Jon Herlocker, Joseph Konstan, and John Riedl, University of Minnesota

Effort, of course, can be reduced via automation. While collaborative filtering is not necessarily effortless, it requires a relatively small amount of effort on the part of the user and provides very individualized recommendations. The collaborative filtering systems that we discuss here each offer a high degree of personalization, but each system takes a different approach to automation, attempting to find the best trade-off between the amount of work the users must put into the system and the perceived value and benefits they receive in return.

TAPESTRY
Collaborative filtering research began in the early 1990s at Xerox PARC in response to the overwhelming number of e-mail messages within PARC, which numbered far more than could be easily managed by mailing lists and keyword filtering. The Tapestry system enabled users to add annotations to messages. Two databases stored the incoming stream of documents and the linked annotation records. A sophisticated query system allowed users to browse for messages based on both their content and annotations. Users could set up standing filter queries that would watch the document stream and annotation records, finding documents that matched the query at any time, present or future. For instance, a user could ask for all messages about collaborative filtering rated excellent by a superior. Only when the message was rated excellent would it be selected and forwarded to the user.

Tapestry was the first step in automating recommendation-sharing among friends and colleagues. It capitalized on the idea that humans working with computers could be more effective information filters than computers or humans working alone. People understand and judge information in ways that current computer systems cannot, largely because people can more readily determine quality as well as content. Because users needed to know whose recommendations to follow, Tapestry worked best in a small community of people who already knew each other.
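The standing-query idea can be illustrated with a small Python sketch. This is not Tapestry's actual query language or data model: the `Message`, `Annotation`, and `StandingQuery` names, the keyword-match rule, and the set of trusted annotators are all assumptions made for illustration. The point it shows is that a message is forwarded only once both conditions hold, whichever arrives first, the document or its annotation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: int
    text: str


@dataclass
class Annotation:
    msg_id: int
    author: str
    rating: str  # free-form label such as "excellent" (hypothetical)


class StandingQuery:
    """Hypothetical standing filter: keyword in the text AND a given
    rating from a trusted annotator."""

    def __init__(self, keyword, rating, trusted_authors):
        self.keyword = keyword.lower()
        self.rating = rating
        self.trusted_authors = set(trusted_authors)
        self.messages = {}     # msg_id -> Message (the document store)
        self.endorsed = set()  # msg_ids with a qualifying annotation
        self.delivered = []    # msg_ids forwarded to the user, in order

    def add_message(self, message):
        self.messages[message.msg_id] = message
        self._check(message.msg_id)

    def add_annotation(self, annotation):
        if (annotation.rating == self.rating
                and annotation.author in self.trusted_authors):
            self.endorsed.add(annotation.msg_id)
            self._check(annotation.msg_id)

    def _check(self, msg_id):
        # Re-evaluate the query whenever either stream changes.
        message = self.messages.get(msg_id)
        if (message is not None
                and msg_id in self.endorsed
                and self.keyword in message.text.lower()
                and msg_id not in self.delivered):
            self.delivered.append(msg_id)


query = StandingQuery("collaborative filtering", "excellent", {"supervisor"})
query.add_message(Message(1, "Notes on collaborative filtering"))
query.add_annotation(Annotation(1, "supervisor", "excellent"))
print(query.delivered)
```

Keeping the query object alive and re-checking it on every new message or annotation is what makes it "standing": a match can occur long after the message first arrived.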

Humans have always addressed the high cost of finding information by sharing it, inventing oral traditions, written language, and the Web as information-sharing tools. The printing press, broadcast media, and most recently the Internet have all changed the nature of the information problem. Information is no longer scarce. Indeed, there is far too much of it for any one person to review, let alone organize. Instead of being starved for information, we find ourselves overloaded.

When information is abundant, the knowledge of which information is useful and valuable matters most. We all use our network of family, friends, and colleagues to recommend movies, books, cars, and news articles. Collaborative filtering technology automates the process of sharing opinions on the relevance and quality of information.

Collaborative filtering is one technique among many information filtering techniques that range from unfiltered to personalized and from effortless to laborious, as illustrated by the chart shown as Figure 1. Libraries or the Web are good examples of unfiltered information sources. E-mail directed to one recipient is a good example of a filtered information source. A best-seller list requires little effort for the user, but provides the same recommendations to all users, so it is in the upper left of the chart. Filters based on demographics, such as age, sex, or marital status, require some effort from the user in providing the demographics, and provide some level of personal filtering, so they are near the middle of the chart. Collaborative filtering requires relatively little effort from the user, and provides individually targeted recommendations, so it is in the upper right of the chart.

Editor: Ron Vetter, University of North Carolina at Wilmington, Mathematical Sciences Dept., 601 South College Rd., Wilmington, NC 28403; voice (910) 962-3671, fax (910) 962-7107; vetter@cms.uncwil.edu


Computer

USENET AND GROUPLENS


Usenet, one of the earliest and largest bulletin board systems, was originally a valuable source of information. But as the number of users grew, the system became increasingly overloaded. It reached the point where most users found only a few useful articles in a group filled with dozens or even hundreds of articles a day. The GroupLens system, started at the University of Minnesota in 1992, attempts to make Usenet useful again by providing personalized predictions on the quality of the messages. (GroupLens has undergone dramatic changes, including being commercialized by Net Perceptions. This column discusses the GroupLens Research system.)

Because there are differences among user tastes, the GroupLens system asks users to rate articles on a one-to-five scale. GroupLens then collects and compares these ratings to find users sharing similar tastes. If, for example, you need a prediction for an unread article, GroupLens would see how the other users sharing your tastes had rated the article. If they liked it, chances are you will too, so GroupLens gives that article a high ranking.

GroupLens extends the Tapestry model in several ways. The small Tapestry communities were limited to reading and evaluating only a relatively small set of messages. A large community was needed to generate recommendations across a large stream of information, like Usenet news. GroupLens created a large virtual community where users could share recommendations without actually knowing each other. Surprisingly, the virtual community of GroupLens allowed personalization at the same time it assured privacy and anonymity. You did not need to know the identity of those you correlated with to gain the benefit of their recommendations, unlike Tapestry where the benefits came directly from your personal relationships with recommenders.

Conceptually, GroupLens works by computing a correlation distance between each pair of users.
For example, users who are close to user Jane, according to the distance function, form a neighborhood for Jane. GroupLens uses the opinions of the users in Jane's neighborhood to form predictions about her interests. The opinions are weighted according to how close each member of the neighborhood is to Jane. GroupLens shows that predictions from an automated recommender system can be meaningful to users. Predictions generated by the GroupLens engine correlate well with user ratings and are more accurate than average ratings. Highly rated articles are more likely to be read and rated, which means that users are more likely to rate articles so that the system can better understand their interests.

[Figure 1 chart: the vertical axis runs from Manual (bottom) to Automatic (top); the horizontal axis runs from Impersonal (left) to Personal (right). Techniques plotted: best-seller list, New York Times critic, Consumer Reports, demographics, implicit collaborative filtering, explicit collaborative filtering, Tapestry, Web surfing, library research, word of mouth.]

Figure 1. Information retrieval techniques. The vertical dimension indicates how difficult it is for the end user to access the filtered information, while the horizontal dimension indicates the level of personalization. Filters based on demographics require some effort from the end user and provide some level of personal filtering, so would be placed near the middle of the chart. Automated collaborative filtering requires relatively little effort from the end user and provides individually targeted recommendations, so it would be placed in the upper right corner of the chart.
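The neighborhood idea described for GroupLens can be sketched in a few lines of Python. The column does not give the formula, so this sketch assumes a common formulation: Pearson correlation over co-rated articles as the closeness measure, and a prediction equal to the user's own mean rating plus the correlation-weighted average of each neighbor's deviation from that neighbor's mean. The function names and the tiny ratings table are hypothetical.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0  # too little overlap to correlate
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    den = sqrt(var_a * var_b)
    return num / den if den else 0.0


def predict(ratings, user, item):
    """Predict `user`'s rating of `item` from correlated neighbors."""
    own = ratings[user]
    user_mean = sum(own.values()) / len(own)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = pearson(own, theirs)
        other_mean = sum(theirs.values()) / len(theirs)
        # Neighbors who agree with the user pull the prediction toward
        # their opinion; negatively correlated users push it away.
        num += weight * (theirs[item] - other_mean)
        den += abs(weight)
    if den == 0:
        return user_mean  # no usable neighbors: fall back to the mean
    return user_mean + num / den


# Hypothetical ratings on the one-to-five scale.
ratings = {
    "jane": {"a1": 5, "a2": 1, "a3": 4},
    "ken": {"a1": 5, "a2": 2, "a3": 4, "a4": 5},
    "lee": {"a1": 1, "a2": 5, "a3": 2, "a4": 1},
}
print(predict(ratings, "jane", "a4"))
```

In this toy data Ken agrees with Jane and liked article a4, while Lee disagrees with her and disliked it, so both push the prediction above Jane's average rating. A production system would also limit the neighborhood size and clamp predictions to the rating scale, which this sketch omits.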

RINGO AND VIDEO RECOMMENDER


In the mid-1990s other systems experimented with variations on the GroupLens model and algorithm. Upendra Shardanand and Pattie Maes developed Ringo, an e-mail and Web system that recommends music. They compared the GroupLens algorithm with others based on different statistical measures of similarity and another based on similarities among the music CDs rather than among users.

Their research verified that predictions improve as more ratings are collected.

Video Recommender, which makes recommendations on movies, found a middle ground on the trade-off between lots of work and lots of value (the Tapestry model) and no work and little value (ratings by movie critics). In exchange for submitting ratings on a selected set of movies, the system generates personalized predictions that are more accurate than critic recommendations. Video Recommender's predictions have a 0.62 correlation coefficient, while movie critics achieve only a 0.22 correlation coefficient.

Both Ringo and Video Recommender show that collaborative filtering can apply to all media, even domains like music and movies where computer-based content analysis is not yet possible. These systems showed collaborative filtering allows serendipity where content-based systems might not. If you've shown interest only in country-western music, for example, a content filter would only recommend more country-western. In a collaborative recommender system, however, users whose interests correlate with yours on country-western might lead you to discover blues albums of interest.

Ringo and Video Recommender also extend the virtual community to a real connected community by allowing users to post comments for others to read and by revealing e-mail addresses of users who have volunteered to reveal their identities. Users wanted to get to know others who shared their tastes and even requested a Video Recommender singles club. The knowledge derived from such clubs made users more confident in the recommendations they received.

April 1998

To Read More about Collaborative Filtering


D. Goldberg et al., "Using Collaborative Filtering to Weave an Information Tapestry," Comm. ACM, Dec. 1992, pp. 61-70.
P. Resnick et al., "GroupLens: An Open Architecture for Collaborative Filtering of Netnews," Proc. CSCW '94, ACM Press, New York, 1994, pp. 175-186.
J. Konstan et al., "GroupLens: Collaborative Filtering for Usenet News," Comm. ACM, Mar. 1997, pp. 77-87.
D.A. Maltz and K. Ehrlich, "Pointing the Way: Active Collaborative Filtering," Proc. CHI '95, ACM Press, New York, 1995, pp. 202-209.
W. Hill et al., "Recommending and Evaluating Choices in a Virtual Community of Use," Proc. CHI '95, ACM Press, New York, 1995, pp. 194-201.
U. Shardanand and P. Maes, "Social Information Filtering: Algorithms for Automating 'Word of Mouth,'" Proc. CHI '95, ACM Press, New York, 1995, pp. 210-217.

LOTUS
Lotus developed an active collaborative filtering system that revived the Tapestry model. Lotus researchers believed people could always give more relevant recommendations than any computed function, so they chose to ask for more work from the users in exchange for better predictions about user interests. Built in Lotus Notes, the system made it easy to send pointers to Web pages. Pointers could include hypertext links and annotations explaining the content, context, and relevance of the document. One pointer, for example, might read: "Sally, you should definitely see this page on collaborative filtering. - Jane." In the Lotus system, pointers could be sent to groups or individuals or published for all to see.

Lotus found a striking division between those who would provide information and those who would use it. In the system Lotus implemented, one user was responsible for 80 percent of the pointers. These information mediators can help ensure the quality of information, helping other users grow to trust their recommendations. As in Tapestry, in small social workgroups information mediators may generate enough value to justify spending much of their time mediating information. In large anonymous groups, information mediation may require shared work from a larger community.

GroupLens continues to evolve at the University of Minnesota and we are experimenting with new techniques to help people find information that is of value to them. We've found that time spent reading is a fairly accurate measure of a user's rating for an article. Future GroupLens systems, then, will use time measurements to gather implicit ratings and to build predictions from those ratings. Users can of course immediately see the benefits of such a system, which requires little extra work to personalize their information needs.

Collaborative filtering can also incorporate agent technology through filtering robots. Filterbots can automatically rate new articles as they appear by using different content analysis algorithms. The first human raters to see these articles will already see predictions, personalized by their correlations with the various filterbots. New prediction algorithms will likely help counter the sparsity problem, where users have not rated enough items in common to correlate, and the scalability problem, where huge numbers of users and items require overwhelming computing resources.
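As a rough illustration of the two ideas above, the following Python sketch shows one way reading time might be turned into an implicit rating, and one trivial filterbot. Both are assumptions made for illustration, not the GroupLens design: the reading speed, the linear mapping onto the one-to-five scale, and the length-based scoring rule are all hypothetical.

```python
def implicit_rating(seconds_read, word_count, words_per_second=4.0):
    """Map reading time to a 1-5 rating (hypothetical mapping).

    The ratio compares the time actually spent with the time a full
    read would take at an assumed reading speed.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    expected = word_count / words_per_second
    ratio = min(seconds_read / expected, 1.0)  # lingering adds nothing
    return 1 + round(ratio * 4)                # ratio 0.0 -> 1, 1.0 -> 5


def length_filterbot(text, ideal_words=300):
    """Toy filterbot: rate an article 1-5 by closeness to an ideal length.

    A real filterbot would use a more meaningful content analysis; the
    length rule is only a stand-in.
    """
    words = len(text.split())
    distance = abs(words - ideal_words) / ideal_words
    return max(1, 5 - round(distance * 4))


# A 400-word article skimmed for 50 seconds, and a bot rating a short post.
print(implicit_rating(50, 400))
print(length_filterbot("short post " * 20))
```

The filterbot's output would be stored like any other user's ratings, so a reader who tends to agree with a given bot would correlate with it and receive its opinion on articles no human has rated yet.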

Al Borchers is a visiting faculty member and postdoctoral researcher at the University of Minnesota. He is developing collaborative filtering algorithms to power the next GroupLens system.

Jon Herlocker is a PhD student at the University of Minnesota, researching algorithmic issues in collaborative filtering and ways to measure the effectiveness of recommender systems.

Joseph Konstan is an assistant professor of computer science and engineering at the University of Minnesota. He also serves as consulting scientist for Net Perceptions, a company that he cofounded to commercialize collaborative filtering.

John Riedl is an associate professor of computer science and engineering at the University of Minnesota. He is also chief technical officer of Net Perceptions and the cocreator of GroupLens.

Contact the authors at {borchers,herlocke,konstan,riedl}@cs.umn.edu.

Web Recommender Systems


To experiment with recommender systems, you don't have to wait. There are already some online. Here are a few good examples:
http://www.wisewire.com
http://www.amazon.com
http://www.moviefinder.com
http://www.movielens.umn.edu
http://www.cdnow.com
http://www.bignote.com

Acknowledgements

The authors gratefully acknowledge the contributions of GroupLens cofounder Paul Resnick, Hack Week participants Dave Maltz and Brad Miller, all the members of the GroupLens Research team, and the support of the National Science Foundation under grant IRI-9613960.
