Friday, September 24, 2010

Components of Search Engine

Broadly, search engines fall into 2 main categories:
a). Crawler-based search engines
b). Human-powered directories
Any crawler-based search engine is made up of 3 basic components:
a). Crawler or Spider
b). Index
c). Search engine software
These components work one after another to get a page listed in the search results. Search engines find websites in 2 ways:
1. By accepting listings submitted by webmasters
2. By crawlers that roam the internet, storing links and information about each page they visit
Once a site is found by the search engine, crawlers scan the entire site. While scanning, the crawler visits a web page, reads it and then follows links to other pages within the site. Major search engines like Google, Yahoo and MSN run multiple crawlers simultaneously.
An early version of Google, as described by its founders in 1998, could crawl over 100 pages per second at peak using 4 spiders, generating around 600 KB of data each second.
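To make the "visit a page, read it, follow its links" loop concrete, here is a minimal sketch in Python. It is only an illustration, not how any real search engine is built: the `PAGES` dictionary is a made-up, in-memory stand-in for a website (a real crawler would fetch pages over HTTP), and the names `LinkExtractor` and `crawl` are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical in-memory "site": URL -> HTML. A real crawler would fetch over HTTP.
PAGES = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/products.html">Products</a>',
    "http://example.com/about.html": '<a href="/">Home</a> <a href="http://other.example.org/">Partner</a>',
    "http://example.com/products.html": '<a href="/about.html">About</a>',
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Breadth-first crawl that stays on the start URL's host."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}          # URLs already queued, so no page is visited twice
    visited = []                # URLs actually read, in visiting order
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:        # page could not be fetched
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)   # resolve relative links against the page URL
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


crawled = crawl("http://example.com/", PAGES.get)
print(crawled)
```

Starting from the home page, the sketch reaches the two inner pages through their links and skips the link to the other site, because it only follows links within the site it started on.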
The indexing program runs after the crawler. Once a web page is crawled, it has to be transferred to the database. The index contains a copy of each web page scanned by the crawler. If a web page changes, the index is updated with the new information. It is very important that your pages are added to the index: until a page is indexed, it is not available to people searching with the search engine.
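A rough sketch of such an index in Python might look like the following. It stores a copy of each page, as described above, and additionally keeps a word-to-pages lookup table (an "inverted index"), which is a common technique but an assumption here, since nothing above says how any particular engine stores its data. The names `Index`, `tokenize`, `add` and `lookup` are made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class Index:
    """Hypothetical index: a copy of every page plus a word -> pages lookup."""

    def __init__(self):
        self.pages = {}                    # url -> stored copy of the page text
        self.postings = defaultdict(set)   # word -> set of urls containing it

    def add(self, url, text):
        """Add a page, or update it if the crawler has seen it before."""
        self.remove(url)
        self.pages[url] = text
        for word in set(tokenize(text)):
            self.postings[word].add(url)

    def remove(self, url):
        """Drop a page and every word entry that pointed to it."""
        old_text = self.pages.pop(url, None)
        if old_text is None:
            return
        for word in set(tokenize(old_text)):
            self.postings[word].discard(url)
            if not self.postings[word]:
                del self.postings[word]

    def lookup(self, word):
        """Return the urls of all indexed pages containing the word."""
        return set(self.postings.get(word.lower(), set()))


index = Index()
index.add("http://example.com/a.html", "Red shoes")
index.add("http://example.com/b.html", "Blue shoes")
before = index.lookup("shoes")

# The page changes, so the index entry is replaced with the new information.
index.add("http://example.com/a.html", "Green hats")
after = index.lookup("shoes")
```

After the update, a search for "shoes" no longer finds the changed page, which is the behaviour the paragraph above describes: the index always reflects the latest copy the crawler delivered.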
The search engine software handles the task of producing relevant listings. It searches the entire database, i.e. the indexed pages, and matches them against the search query. It then ranks and lists the most relevant matches. How these listings are ordered depends on how the search engine software is programmed: it delivers listings according to what it believes the most relevant content is.
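The match-then-rank step can be sketched in a few lines of Python. The scoring rule used here, counting how often the query words appear on a page, is a deliberately simple assumption for illustration; real search engines combine many more signals and do not publish their exact formulas. The function name `search` and the sample pages are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, pages, limit=10):
    """Return urls of matching pages, most relevant first.

    Relevance here is just the number of times the query words occur
    on the page, which is an assumption made for this sketch.
    """
    terms = tokenize(query)
    results = []
    for url, text in pages.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:                       # keep only pages that match at all
            results.append((score, url))
    # Highest score first; ties are broken alphabetically by url.
    results.sort(key=lambda item: (-item[0], item[1]))
    return [url for score, url in results[:limit]]


# Hypothetical indexed pages: url -> page text.
INDEXED_PAGES = {
    "http://example.com/shoes.html": "Running shoes and walking shoes for every runner",
    "http://example.com/hats.html": "Sun hats and winter hats",
    "http://example.com/sale.html": "Shoes and hats on sale",
}

hits = search("running shoes", INDEXED_PAGES)
print(hits)
```

Here the shoes page comes first because it mentions the query words three times, the sale page second with one mention, and the hats page is left out because it does not match. Changing the scoring rule changes the order, which is why different search engines list different results for the same search.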

1 comment:

Basically, search engines are made up of 3 components:
1) Spider
2) Index
3) Search engine software
The Google search engine uses 4 spiders, which crawl more than 100 pages per sec.
