Saturday, 23 July 2011

Latent Semantic Analysis (LSA) and Search Engines (SEO)

By Jose Nuñez

Latent Semantic Analysis (LSA) is applied to millions of web pages so that search engines can learn which words are related and which noun concepts relate to one another. Search engines consider related terms and recognize which terms frequently occur together, whether on the same page or in close enough proximity. LSA is therefore used mainly for language modeling, among other applications.
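To make the idea of "terms that frequently occur together" concrete, here is a minimal sketch that counts how many pages each pair of words shares. The three sample pages and the function name `cooccurrence_counts` are invented for illustration. This is only the raw counting step, not LSA itself and not how any search engine actually does it.

```python
from collections import Counter
from itertools import combinations

# Toy "pages"; the text is made up for illustration only.
pages = [
    "cheap flights to paris",
    "paris hotels and cheap flights",
    "python tutorial for beginners",
]

def cooccurrence_counts(docs):
    """Count how many documents each unordered word pair appears in together."""
    counts = Counter()
    for doc in docs:
        # Sort the unique words so each pair is always stored in the same order.
        words = sorted(set(doc.split()))
        counts.update(combinations(words, 2))
    return counts

counts = cooccurrence_counts(pages)
print(counts[("cheap", "flights")])  # these two words share two pages
print(counts[("paris", "python")])   # these two words never share a page
```

Pairs with high counts are candidates for "related terms". LSA goes further by also relating words that never appear on the same page but share common neighbors.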

Part of this process involves looking at the copy on a page, or the text included in its links, and examining how they are related. Latent Semantic Analysis (LSA) is based on the well-known singular value decomposition (SVD) theorem from matrix algebra, applied to text. That is why some of the semantic analysis done at the page-content level may also be done on the linkage data.
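As a rough illustration of SVD applied to text, the sketch below builds a small term-document matrix and decomposes it with NumPy. The four sample documents, the choice of raw word counts as matrix entries, and the value `k = 2` are all assumptions made for this example. Real systems usually weight the counts (for instance with tf-idf) and work with far larger, sparse matrices.

```python
import numpy as np

# Toy documents, invented for illustration only.
docs = [
    "cheap flights paris",
    "paris hotels cheap flights",
    "python tutorial beginners",
    "python code tutorial",
]

vocab = sorted({w for d in docs for w in d.split()})

# Term-document matrix: one row per term, one column per document,
# each entry is the raw count of the term in that document.
A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)

# Thin SVD: A = U @ diag(s) @ Vt, singular values sorted largest first.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the k largest singular values to get the reduced "concept space".
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(A.shape)     # (number of terms, number of documents)
print(s.round(3))  # singular values, largest first
```

`A_k` is the best rank-`k` approximation of `A` in the least-squares sense. Dropping the small singular values is what merges words with similar usage patterns into shared concepts.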

LSA represents the meaning of each word as a vector, which makes it possible to calculate word similarity. It has been very effective for that purpose and is still in use. In this application, text is modeled linearly. LSA is slow, because it uses a matrix method called singular value decomposition to create the concept space, and that is computationally expensive on large collections. It also addresses only semantic similarity, not ranking, which is the SEO priority.
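A minimal sketch of word similarity in the concept space follows. Each word vector is taken as the word's row of U scaled by the top `k` singular values, and two words are compared with cosine similarity. The helper names `lsa_word_vectors` and `cosine`, the sample documents, and `k = 2` are assumptions for illustration only.

```python
import numpy as np

def lsa_word_vectors(docs, k=2):
    """Return {word: k-dimensional vector} from a truncated SVD of raw counts."""
    vocab = sorted({w for d in docs for w in d.split()})
    A = np.array([[d.split().count(w) for d in docs] for w in vocab], dtype=float)
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    # Scale each of the first k columns of U by its singular value.
    vectors = U[:, :k] * s[:k]
    return dict(zip(vocab, vectors))

def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy documents, invented for illustration only.
docs = [
    "cheap flights paris",
    "paris hotels cheap flights",
    "python tutorial beginners",
    "python code tutorial",
]

vecs = lsa_word_vectors(docs, k=2)

# "beginners" and "code" never share a document, but both occur with
# "python" and "tutorial", so their concept-space vectors align.
print(cosine(vecs["beginners"], vecs["code"]))  # close to 1.0
print(cosine(vecs["paris"], vecs["python"]))    # close to 0.0
```

The first result shows the "latent" part of LSA: two words can come out as similar even when they are never seen together, as long as they keep the same company.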

Scientific SEOs have a similar goal. They try to discover which words and phrases are most semantically linked to a given keyword phrase, so that when search engines crawl the web, they find that the links to particular pages, and the content within those pages, are semantically related to other information already in their database. In conclusion, LSA calculates a measure of similarity between words based on their occurrence patterns across documents: how often words appear in the same context, or together with the same set of common elements.
