Monday, 18 April 2011

Latent Semantic Indexing and Search Engine Optimization (SEO)

By Jose Nuñez

The closest search engines have come to an actual application of this technology so far is known as "Associative Indexing". It is put into effect through stemming, the indexing of words on the basis of their uninflected roots (plurals, adverbs, and adjectival forms are reduced to simple noun and verb forms before indexing).
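To make the stemming idea concrete, here is a minimal sketch in Python. The names `naive_stem` and `build_index` and the tiny suffix list are assumptions made for illustration only; production stemmers apply many more rules, and nothing here reflects how any particular search engine works.

```python
def naive_stem(word):
    """Toy suffix stripper (hypothetical); real stemmers use far richer rules."""
    word = word.lower()
    for suffix in ("ing", "ly", "es", "s"):
        # Only strip when at least three characters of the root remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(naive_stem(token), set()).add(doc_id)
    return index


docs = {1: "search engines index links", 2: "linking pages quickly"}
index = build_index(docs)
print(sorted(index["link"]))  # [1, 2]
```

Because "links" and "linking" are both reduced to "link", a search for either form finds both documents.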

Latent Semantic Analysis (LSA) is a technique in natural language processing, in particular in vectorial semantics, invented in 1990 [1] by Scott Deerwester, Susan Dumais, George Furnas, Thomas Landauer, and Richard Harshman. In the context of its application to information retrieval, it is sometimes called Latent Semantic Indexing (LSI).

Here are some quick facts about Latent Semantic Indexing:
1. LSI is 30% more effective than popular word-matching methods.
2. LSI uses a fully automatic statistical method (Singular Value Decomposition).
3. It is very effective in cross-language retrieval.
5. LSI can retrieve relevant information that does not contain query words.
6. It finds more relevant information than other methods.

Latent Semantic Indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole to see which other documents contain some of the same words. LSI considers documents that have many words in common to be semantically close, and ones that have few words in common to be semantically distant. This method correlates surprisingly well with how a human being, looking at the content, would classify the same documents.
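The idea can be sketched with a truncated Singular Value Decomposition on a tiny term-document matrix. The vocabulary, the three documents, the choice of k = 2, and the helper names (`term_vector`, `fold_in`, `cosine`) are all assumptions made up for this illustration. It uses NumPy and is not a description of how any search engine implements LSI.

```python
import numpy as np

# Hypothetical toy vocabulary and collection.
terms = ["car", "automobile", "engine", "flower", "petal"]
docs = [
    "car engine",
    "automobile engine",
    "flower petal",
]


def term_vector(text):
    """Count how often each vocabulary term occurs in the text."""
    words = text.split()
    return np.array([words.count(t) for t in terms], dtype=float)


# Term-document matrix: one row per term, one column per document.
A = np.column_stack([term_vector(d) for d in docs])

# Truncated SVD: keep only the k largest singular values.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k = U[:, :k], s[:k]
doc_coords = Vt[:k, :].T  # one k-dimensional row per document


def fold_in(text):
    """Project a query into the same k-dimensional space as the documents."""
    return term_vector(text) @ U_k / s_k


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


query = "car"
# Plain word matching: compare raw term counts.
raw_scores = [cosine(term_vector(query), A[:, j]) for j in range(len(docs))]
# LSI: compare in the reduced space.
lsi_scores = [cosine(fold_in(query), doc_coords[j]) for j in range(len(docs))]

for text, raw, lsi in zip(docs, raw_scores, lsi_scores):
    print(f"{text!r}: raw={raw:.2f} lsi={lsi:.2f}")
```

In this toy collection, plain word matching scores "automobile engine" at zero for the query "car" because they share no words. In the reduced space the two vehicle documents collapse onto the same direction, so that document scores close to 1, while "flower petal" stays near zero. This matches the claim in the list above that LSI can retrieve relevant documents that do not contain the query words.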
