
Search Engines

Tefko Saracevic

Definition
Search
COMPUTING (transitive verb) to examine a computer file, disk, database, or network for particular information

Engine
something that supplies the driving force or energy to a movement, system, or trend

Search Engine
a computer program that searches for particular keywords and returns a list of documents in which they were found, especially a commercial service that scans documents on the Internet

Brief History
The very first tool used for searching was Archie, created in 1990. Aliweb came next in 1993; it relied on index files submitted by site administrators rather than on a crawler. WebCrawler and Lycos followed in 1994.


How Search Engines Work

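[Diagram: the crawler collects pages (URL1, URL2, URL3, URL4) from the Web, the indexer processes them, and the result is stored in the search engine database. A user's query ("Eggs?") is answered from the database with a list of results ranked by relevance score: 90%, 81%, 40%, 10%.]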
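The idea of returning results ordered by a relevance score can be sketched with a toy scorer. This is only an illustration under stated assumptions, not how any real engine ranks: the `DATABASE` contents, the `score` and `search` functions, and the "share of matching words" measure are all made up for this example.

```python
# Toy relevance ranking. DATABASE is a hypothetical stand-in for the
# search engine database (URL -> page text); it is not real data.
DATABASE = {
    "URL1": "eggs eggs and more eggs",
    "URL2": "all about eggo waffles",
    "URL3": "cooking eggs with ham",
    "URL4": "browser settings",
}

def score(query, text):
    """Return the share of words in `text` that match a query term (0.0 to 1.0)."""
    terms = set(query.lower().split())
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in terms)
    return hits / len(words)

def search(query, database=DATABASE):
    """Return (url, score) pairs with a non-zero score, best match first."""
    scored = [(url, score(query, text)) for url, text in database.items()]
    matches = [(url, s) for url, s in scored if s > 0]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)

for url, s in search("eggs"):
    print(f"{url} - {s:.0%}")
```

Real engines combine many more signals (term position, link structure, freshness); the point here is only that each document receives a score and the list is sorted by it.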

Ways of Searching
Keyword searching
Refined searching
Relevancy rankings
Information on meta tags
Concept-based searching


A Few Search Engines


AltaVista (www.altavista.com)
Excite (www.excite.com)
Infoseek (www.go.com)
Lycos (www.lycos.com)
HotBot (www.hotbot.com)
Yahoo (www.yahoo.com)
Google (www.google.com)


Web Crawler
Creates a copy of all visited pages for later processing by a search engine.
Can also be used to automate maintenance tasks on a website, such as checking links or validating HTML code.


Can be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for spam).
For a number of reasons, crawlers cover only a fraction of the Web; they do not cover the invisible web.
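The crawling behaviour described above can be sketched as a breadth-first walk over links. This is a minimal sketch, assuming a tiny in-memory `FAKE_WEB` dictionary instead of real HTTP requests; `FAKE_WEB`, `LinkExtractor`, and `crawl` are hypothetical names for this example. Link extraction uses Python's standard `html.parser` module.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real crawler would fetch over HTTP.
FAKE_WEB = {
    "URL1": '<a href="URL2">two</a> <a href="URL3">three</a>',
    "URL2": '<a href="URL1">back</a>',
    "URL3": '<a href="URL4">four</a>',
    "URL4": "no links here",
    "URL5": "never linked from anywhere, so never found",
}

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed, web=FAKE_WEB, max_pages=100):
    """Breadth-first crawl from `seed`; return {url: page copy} for visited pages."""
    copies = {}
    queue = deque([seed])
    while queue and len(copies) < max_pages:
        url = queue.popleft()
        if url in copies or url not in web:
            continue  # already visited, or not reachable
        copies[url] = web[url]  # keep a copy for later processing
        parser = LinkExtractor()
        parser.feed(copies[url])
        queue.extend(parser.links)
    return copies

pages = crawl("URL1")
print(sorted(pages))
```

In this toy web, `URL5` is never copied because no visited page links to it, which is one simple reason a crawler covers only part of the Web.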


Indexing
Search engine Indexing collects, parses, and stores data to facilitate fast and accurate information retrieval. The purpose of storing an index is to optimize speed and performance in finding relevant documents for a search query. Without an index, the search engine would scan every document in the corpus, which would require considerable time and computing power.
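A common data structure for this purpose is the inverted index, which maps each word to the documents containing it. The following is a minimal sketch under that assumption; the `CORPUS` contents and the `build_index` and `lookup` helpers are hypothetical and leave out steps such as stemming, stop-word removal, and ranking.

```python
from collections import defaultdict

# Hypothetical corpus: document id -> text.
CORPUS = {
    1: "search engines crawl the web",
    2: "an index makes search fast",
    3: "crawlers copy web pages",
}

def build_index(corpus):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def lookup(index, query):
    """Return ids of documents containing every query word (AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_index(CORPUS)
print(lookup(index, "search"))
print(lookup(index, "web crawl"))
```

A query touches only the index entries for its own words, so the cost depends on how many documents contain those words rather than on the size of the whole corpus.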


Elaboration: similarities, differences

All search engines have these basic parts in common,
BUT the actual processes and methods they use are based on various algorithms, and they differ.
Most are proprietary, with details kept mostly secret, but based on well-known principles from information retrieval or classification.
To some extent Google is an exception: they published their method.


Case of Google
Developed by Sergey Brin and Lawrence Page while students at Stanford.
In the beginning it ran on Stanford computers.

The basic approach is described in their famous paper, "The Anatomy of a Large-Scale Hypertextual Web Search Engine":
well written, in simple language, and it has their pictures
in the acknowledgements they cite support from NSF's Digital Library Initiative, i.e. initially, Google came out of government-sponsored research
they describe their method, PageRank, based on ranking hyperlinks as in citation indexing
"We chose our system name, Google, because it is a common spelling of googol, or 10^100"
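The link-ranking idea behind PageRank can be illustrated with a simplified power iteration. This is a sketch under stated assumptions, not Google's implementation: it uses the commonly taught normalized variant in which ranks sum to 1, the `LINKS` graph is hypothetical, and the handling of pages without outgoing links is one of several possible choices. The damping value 0.85 is the one the paper says is usually used.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to; every
    link target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its damped rank evenly among its links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page web; C is linked to by every other page.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(LINKS)
print(max(ranks, key=ranks.get))
```

As in citation indexing, a page ranks highly when many pages, or highly ranked pages, link to it; here `C` comes out on top and `D`, which nothing links to, comes out last.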

Coverage Differences
no engine covers more than a fraction of the WWW
estimates: none more than 16%
hard (even impossible) to discern & compare coverage, but they differ substantially in what they cover

In addition:
many national search engines
    own coverage, orientation, governance
many specialized or domain search engines
    own coverage geared to subject of interest
many comprehensive sources independent


Advantages of Search Engines

Search vast databases
Very easy to use
Sophisticated searching often available
Normally global

Limitations
The automated method of collecting information is rather crude.
Information may be out of context.
May return out-of-date sites.


Search engines are also frequently victims of spamdexing: the use of techniques that push rankings higher than they deserve. Methods typically include text-based as well as link-based techniques.


Search Engine Optimization (SEO)

SEO is one of the key Web marketing activities.
It is a part of search engine marketing (SEM): SEM = SEO + PPC (pay per click).
When a user searches on Google, some ads are displayed on the right side under the Sponsored Links section; these are called pay-per-click ads.


Thank you
