Thursday, April 12, 2012

I, robot: How do search engine spiders and robots work?

Some internet surfers still hold on to the mistaken belief that actual people visit each and every website and manually enter it into the search engine’s database. Imagine if this were true! With billions of websites on the internet, most of them adding fresh content all the time, it would take thousands of people to do the work performed by search engine spiders and robots – and even then they would not be as efficient or as thorough.

Search engine spiders and robots are pieces of software with a single aim: to seek out content on the internet, within each and every individual web page out there. These tools play a vital role in how effectively search engines operate.

Search engine spiders and robots visit websites, gather the information they need to determine the nature and content of each site, and then add that data to the search engine’s index. They follow links from one website to another so that they can keep gathering information continuously and, in effect, indefinitely. The ultimate goal of search engine spiders and robots is to compile a comprehensive and valuable database that can deliver the most relevant results for visitors’ search queries.
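
As an illustration, here is a minimal sketch of that fetch-extract-follow cycle, written with Python’s standard library. The address and the names used (PageScanner, crawl_once) are purely illustrative, not part of any real search engine’s code.

    # A minimal sketch of the crawl cycle: fetch a page, then pull out
    # its visible text and its outgoing links.
    from urllib.request import urlopen
    from html.parser import HTMLParser

    class PageScanner(HTMLParser):
        """Collects visible text and outgoing links from one page."""
        def __init__(self):
            super().__init__()
            self.text = []
            self.links = []

        def handle_data(self, data):
            if data.strip():
                self.text.append(data.strip())

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl_once(url):
        # Fetch the page and hand its HTML to the scanner.
        html = urlopen(url).read().decode("utf-8", errors="ignore")
        scanner = PageScanner()
        scanner.feed(html)
        return " ".join(scanner.text), scanner.links

    # The text would go into the index; the links join the crawl queue.
    content, outgoing_links = crawl_once("https://example.com/")

A production spider would add error handling, politeness delays and much more, but the basic download-extract-follow loop is the same.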

But how exactly do search engine spiders and robots work?

The whole process begins when a web page is submitted to a search engine. The submitted URL is added to the queue of websites that the search engine spider will visit. Submission is optional, though, because most spiders can find a page on their own if other websites link to it. This is why it is a good idea to build reciprocal links with other websites: you enhance the link popularity of your site and gain links from other sites that cover the same topic.
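
To make the queue idea concrete, here is one way such a crawl frontier could be managed, assuming a helper like the crawl_once() sketch above that returns the links found on a page; the seed address is hypothetical.

    # Sketch of a crawl queue: submitted URLs seed it, and links
    # discovered on each crawled page are appended to it.
    from collections import deque

    def crawl_frontier(submitted_urls, fetch_links):
        # fetch_links(url) is assumed to return the links found on that page.
        queue = deque(submitted_urls)
        seen = set(submitted_urls)
        while queue:
            url = queue.popleft()
            yield url                      # this page gets crawled now
            for link in fetch_links(url):
                if link not in seen:       # skip pages already queued
                    seen.add(link)
                    queue.append(link)

    # Example: seed the queue with one submitted page.
    # for page in crawl_frontier(["https://example.com/"], my_fetch_links): ...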

When the search engine spider visits a website, it first checks whether a robots.txt file exists. This file tells the robot which areas of the site are off limits to its probe – for example, directories that are of no use to search engines. All search engine bots look for this text file, so it is a good idea to include one even if it is blank.
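
For example, Python’s standard urllib.robotparser module can read such a file and answer whether a given page may be crawled; the rules and URLs below are made up for illustration.

    # Checking robots.txt rules before crawling a page.
    from urllib.robotparser import RobotFileParser

    rules = RobotFileParser()
    # A spider would normally fetch https://example.com/robots.txt;
    # here the same rules are supplied as lines for illustration.
    rules.parse([
        "User-agent: *",
        "Disallow: /cgi-bin/",
        "Disallow: /private/",
    ])

    print(rules.can_fetch("*", "https://example.com/private/page.html"))  # False
    print(rules.can_fetch("*", "https://example.com/articles/seo.html"))  # True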

The robots list and store all of the links found on a page and they follow each link to its destination website or page.
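
One small but important detail is that many links are written as relative addresses, so the robot has to resolve them against the page they were found on before storing them. A quick sketch, with made-up addresses:

    # Resolve relative links to absolute URLs and store them per page.
    from urllib.parse import urljoin

    def store_links(page_url, raw_hrefs, link_store):
        for href in raw_hrefs:
            absolute = urljoin(page_url, href)   # e.g. "/about" -> full URL
            link_store.setdefault(page_url, set()).add(absolute)
        return link_store

    links = store_links(
        "https://example.com/index.html",
        ["/about", "contact.html", "https://other-site.com/"],
        {},
    )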

The robots then send all of this information back to the search engine, which compiles the data received from all the bots and builds the search engine database. This is where search engine engineers come in: they write the algorithms used to evaluate and score the information the bots have compiled. Once all of the information has been added to the database, it becomes available to visitors making search queries on the search engine.
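
A toy version of that indexing step might look like the following: the text gathered by the bots is folded into an inverted index, and a query returns pages ranked by how many of the query words they contain. Real scoring algorithms are far more sophisticated; this only illustrates the idea.

    from collections import defaultdict

    def build_index(crawled_pages):
        # crawled_pages maps a URL to the text the spider extracted from it.
        index = defaultdict(set)
        for url, text in crawled_pages.items():
            for word in text.lower().split():
                index[word].add(url)
        return index

    def search(index, query):
        scores = defaultdict(int)
        for word in query.lower().split():
            for url in index.get(word, ()):
                scores[url] += 1               # one point per matching word
        return sorted(scores, key=scores.get, reverse=True)

    index = build_index({
        "https://example.com/spiders": "how search engine spiders work",
        "https://example.com/links": "building reciprocal links",
    })
    print(search(index, "search engine spiders"))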
