SEMANTIC WEB: Latent Semantic Indexing has changed the web for good!

Posted in International SEO, Internet Marketing, Technical SEO

By Mary
(Note: see the 2016 updates added at the bottom of the article.)

What does semantic mean? Semantics is the study of meaning in communication.

So, what then is Semantic Search?

The Semantic Web, part of the new Web 3.0 technology, is being used by the search engines. Latent semantic indexing allows a search engine to determine what a page is about beyond literally matching the text of the search query.

This used to be a minor factor for Google, but all that has changed. Semantic search is now at the core of its search engine technology.

Semantic technology is a human-software architecture that identifies associations and concepts related to a query. It allows the robots to read not only the keywords but also the similarity between them and the links within a site, to make sense of what a page is about.

Another way to picture this is a cube, where all these factors need to be moved around to compose a perfect six-sided match. It’s the same with semantic search, which takes a 3-dimensional approach. The bot no longer looks at one side of your page; it reads the meaning of the entire site, looking closely at how the contextual elements are composed to determine where the site stands as a whole. For instance, if you search for “web design”, a semantic search engine might retrieve documents containing the words “HTML”, “Dreamweaver” and “video training”, even if the phrase “web design” is not found in the source document.


A semantic search engine automatically identifies the concepts structuring the text content.

“For example, if you search for ‘principles of physics’, our algorithms understand that ‘angular momentum,’ ‘special relativity,’ ‘big bang’ and ‘quantum mechanics’ are related terms that could help you find what you need,” wrote Ori Allon, technical lead of Google’s Search Quality team, and Ken Wilder, team engineer at the company’s Snippets project on their blog.

In January of this year, during Google’s fourth-quarter earnings conference call, CEO Eric Schmidt touched briefly on this topic, hinting that the company is getting more serious about semantic search technology. “Wouldn’t it be nice if Google understood the meaning of your phrase, rather than just the words that are in the phrase? We have [done] a lot of discoveries in that area that are going to roll out [soon],” Schmidt said.

Microsoft is working on this as well. Last year it acquired Powerset, one of the companies developing and perfecting semantic search engines, in order to improve its Web search engine with semantic search technology.

Most recently, as of July 19th, 2010, Metaweb is part of Google, so very soon Google will have tools for webmasters to plug their sites into entities.


What are Entities?

An entity is a single person, place or thing. One thing can be named in different ways and, vice versa, one word can mean many things. By identifying millions of entities and the names other sites use to refer to them, Metaweb (now Google) maps single words to entities and shows how they are related.
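To make the idea concrete, here is a minimal sketch of an entity map in Python. It is not Metaweb’s or Google’s actual data model; the entity IDs, fields and sample names below are invented for illustration. It only shows the two situations described above: several names pointing to one entity, and one name pointing to several entities.

```python
from collections import defaultdict

# Hypothetical entity records; the IDs and fields are invented for this sketch.
ENTITIES = {
    "ent:nyc": {"type": "place", "names": ["New York City", "NYC", "Big Apple"]},
    "ent:jaguar_animal": {"type": "animal", "names": ["jaguar"]},
    "ent:jaguar_cars": {"type": "organization", "names": ["Jaguar", "Jaguar Cars"]},
}

def build_name_index(entities):
    """Map each lower-cased name to the set of entity IDs that use it."""
    index = defaultdict(set)
    for entity_id, record in entities.items():
        for name in record["names"]:
            index[name.lower()].add(entity_id)
    return index

NAME_INDEX = build_name_index(ENTITIES)

def lookup(name):
    """Return the sorted list of entity IDs a name may refer to."""
    return sorted(NAME_INDEX.get(name.lower(), set()))

# Different names, same entity.
print(lookup("NYC"), lookup("Big Apple"))
# One name, several candidate entities (context would be needed to choose).
print(lookup("jaguar"))
```

A real system would also need context (the other words on the page, the links pointing to it) to decide which candidate entity is meant.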

Entities are smarter than words. “It’s a collaborated process that involves the online community,” Metaweb explains on their blog. Wait, “community”? Does this sound like Google is trying to put SEO out of business? You be the judge of that. Watch this video explaining the entire process.


Where does this leave current SEO tactics?

For now, let’s think of search engines as a 5-year-old child, and try to talk to it in terms it will understand. Keywords are still a key on-page element. It’s good to do your keyword research and use those keywords in your pages, but also try to use variations and SYNONYMS of the same keyword.

Other Recommendations

-Use independent topics instead of terms!

-Use close variants. These include singular/plural forms, misspellings, abbreviations and acronyms, and stemmings (like “floor” and “flooring”). Synonyms (like “quick” and “fast”) and related searches (like “flowers” and “tulips”) are not considered close variants.

The illustration below outlines the relative reach of different keyword match type strategies. As it shows, modified broad match keywords match more searches than the equivalent phrase match keyword, but fewer searches than the equivalent broad match keyword. Match behavior also depends on the specific words you modify. For example, the keyword formal +shoes will match the search “evening shoes,” but the keyword +formal +shoes will not.
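The “formal +shoes” example can be sketched as a toy matcher in Python. This is a simplified illustration, not how Google actually matches keywords: the RELATED table, the crude plural handling and the function names are assumptions made up for this example. It assumes that a word marked with “+” must appear in the search (or as a close variant), while an unmarked word may also match through a related term.

```python
# Hypothetical table of related terms; real match behavior is far richer.
RELATED = {"formal": {"evening", "dress"}}

def close_variant(a, b):
    """Very crude close-variant check: equal after stripping a plural 's'."""
    def norm(word):
        word = word.lower()
        return word[:-1] if word.endswith("s") else word
    return norm(a) == norm(b)

def matches(keyword, search):
    """Return True if every keyword token is satisfied by the search text."""
    search_words = search.lower().split()
    for token in keyword.lower().split():
        required = token.startswith("+")   # modified broad match term
        word = token.lstrip("+")
        ok = any(close_variant(word, s) for s in search_words)
        if not ok and not required:
            # Unmodified terms may also match related words or synonyms.
            ok = any(s in RELATED.get(word, set()) for s in search_words)
        if not ok:
            return False
    return True

print(matches("formal +shoes", "evening shoes"))    # True
print(matches("+formal +shoes", "evening shoes"))   # False
print(matches("+formal +shoes", "formal shoe"))     # True
```

The more words you mark with “+”, the fewer searches the keyword can match, which is the reach trade-off described above.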


IMPORTANT: Keywords that have become more important than the title tag (believe it or not).

-Use keywords that trigger action or interaction with the user and/or provide the user with something valuable, such as: “for example,” “an example of this,” “a study conducted by,” “here is a demo,” “comment on this” (this would be for blogs), “read white paper,” “join,” “poll(s),” “news,” “related articles,” “follow us,” “case study,” and so on.

If you’ve noticed, some of these keywords describe tools that create on-the-fly elements, otherwise called real-time SEO [Read more about real-time SEO.]. Google is reading these real-time content generators as important factors, even more important than title tags.


For web development

Build sites based on entities, which is a matter of plugging your site into an entity database
(a tool will soon be available in Google Webmaster Tools).

Update:

To learn how to turn your keywords into smart keywords (themes) and plug these into entities (databases), check our 2016 article about Understanding Intent & Concepts Themes (SEO Smart Keyword Research)

I suggest using blogs instead of regular sites. For example, WordPress can run a site just as well as any other website technology, and the upside is that Google is ranking sites with social factors and freshness in them. Blogs are a perfect fit for that. A blog can be designed and developed to look like a full-blown website, with the advantage of all the widgets you can add to it.

A case study on this is a recent multiple local listing I did for a client. We set up a blog specifically to rank for one of his offices in Florida. I themed about 20% of it with related (not relevant, how about that) local mentions and close variant keywords. In less than two days, we were ranking in third position on Google organic. On the other hand, it took about two weeks to show up on Google Maps. In other words, the blog ranked faster than the submission we made to Google Maps.

Another option is to use SCRM. CRM (customer relationship management) is evolving into SCRM (Social CRM). More than a tool, Social CRM is a shift driven by behavior and interaction rather than by technology. We can see from the diagram below what this evolution means to both the company and the customer. It can be confusing for some web developers to understand that applying social elements and using CRM systems is no longer a one-way street. Websites nowadays need to adapt to serve the new generation of social customers/users.


Learn the LSA modeling method

“Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the contextual-usage meaning of words by statistical computations applied to a large corpus of text,” explains U.C.

It is essential that you understand the LSA modeling methods before using semantic applications on a website. A good way to start learning LSA is by checking the study conducted by the University of Colorado on LSA.

 

Apply the SVD to a term document matrix:

– The intermediate dimensions correspond to topics

– Terms that usually occur together get bundled (synonyms)

– Terms having several meanings get assigned to several topics (polysemes)

What do we need?

– Relate single terms to topics

– Relate documents to topics

– Relate query terms to topics
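The steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is installed; the five toy documents, the choice of k = 2 topics and the helper names (bag_of_words, fold_in, cosine) are made up for this example and do not come from any search engine.

```python
import numpy as np

# Toy corpus (assumption): three web design documents, two physics documents.
docs = [
    "web design html css",
    "html css dreamweaver",
    "web design dreamweaver",
    "physics quantum momentum",
    "quantum momentum relativity",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

def bag_of_words(text):
    """Count vector over the vocabulary; unknown words are ignored."""
    vec = np.zeros(len(vocab))
    for w in text.split():
        if w in index:
            vec[index[w]] += 1.0
    return vec

# Term-document matrix: one row per term, one column per document.
A = np.column_stack([bag_of_words(d) for d in docs])

# Apply the SVD and keep only the k strongest dimensions (the "topics").
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

term_topics = U_k * s_k   # relate single terms to topics (one row per term)
doc_topics = Vt_k.T       # relate documents to topics (one row per document)

def fold_in(text):
    """Relate query terms to topics: project the query into topic space."""
    return bag_of_words(text) @ U_k / s_k

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = fold_in("web design")
sims = [cosine(query, d) for d in doc_topics]
for text, sim in zip(docs, sims):
    print(f"{sim:+.2f}  {text}")
```

In this toy corpus, the query “web design” lands right next to the document “html css dreamweaver” in topic space even though they share no words, while the two physics documents score close to zero.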

Note: It’s good to remember that LSI is only one of many techniques in NLP (Natural Language Processing).

Update (February 06, 2016)

I wrote this post on semantic search six years ago, and it was somewhat ahead of its time, in that semantic search was a very new concept that not many had even heard of. SEOs and Digital Masters who knew the old school of doing things got lost in the many Google algorithm updates that followed (Panda, Penguin, EMD, Page Layout, Hummingbird, and others).


For more up-to-date data, algoroo.com is a great tool for tracking Google algorithm updates.

To this day, in 2016, many online marketers and technical SEOs are still trying to figure out what all these Google updates mean. I still recommend understanding semantic search, since it is at the core of Google’s goal and vision. If you understand what semantic search is and what it can do, you are more likely to predict where Google is heading.

This post (from 2010) was so far ahead of its time that I kept it on my blog instead of submitting it to a major publication. I sort of regret that, because I was so correct, and it was a good piece of SEO intel to share.

I will continue to add to this post, so it can contain more application than theory. 

In 2012, two years after I wrote my post, Google introduced the Knowledge Graph, an intelligent model that understands real-world entities and their relationships to one another: things, not strings. In other words, for years the Google search engine was essentially about matching keywords. With semantic search, Google now tries to understand the meaning and intent of your search query and serve the search results that best answer it. For example, the phrase “red carpet” could have many meanings. It could be the name of a restaurant or an actual red carpet, but also a movie, a song, or a hotel.

The Next Generation Of Search

Named-entity recognition (NER) (also known as entity identification, entity chunking and entity extraction) is a subtask of information extraction that seeks to locate and classify elements in text into pre-defined categories such as the names of persons, organizations, locations, expressions of times, quantities, …read more
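As a rough illustration of what “locate and classify” means, here is a toy NER sketch in Python. It is only a lookup against a small hand-made list plus a pattern for years; the GAZETTEER contents and the tag_entities helper are assumptions for this example. Production NER systems use trained statistical models rather than fixed lists.

```python
import re

# Hand-made lookup list (assumption); names are taken from this article.
GAZETTEER = {
    "PERSON": ["Eric Schmidt", "Amit Singhal"],
    "ORGANIZATION": ["Google", "Microsoft", "Metaweb", "Powerset"],
    "LOCATION": ["Florida", "Spain"],
}
# Very rough pattern for a four-digit year, standing in for time expressions.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")

def tag_entities(text):
    """Return (surface text, category) pairs in order of appearance."""
    spans = []
    for label, names in GAZETTEER.items():
        for name in names:
            for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
                spans.append((m.start(), m.group(0), label))
    for m in YEAR.finditer(text):
        spans.append((m.start(), m.group(0), "DATE"))
    return [(surface, label) for _, surface, label in sorted(spans)]

sentence = "Google acquired Metaweb in 2010; Microsoft had already acquired Powerset."
for surface, label in tag_entities(sentence):
    print(f"{label:<13} {surface}")
```

The output is the same sentence turned into structured data: which organizations and dates it mentions, and where.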

Understanding Intent & Concepts:

Intent:
The easy way to explain intent is with a quote from Amit Singhal, Google’s former Senior Vice President (he left Google this week): “people communicate with each other by conversation, not by typing keywords – and we’ve been hard at work to make Google understand and answer your questions more like people do.” What they have been working on is delivering search results based on user intent.

Concepts:
Concepts are based on Information Retrieval (IR). The term IR was introduced by Calvin Mooers in 1951, who defined it this way:

“Information retrieval is the name for the process or method whereby a prospective user of information is able to convert his need for information into an actual list of citations to documents in storage containing information useful to him. It is the finding or discovery process with respect to stored information. It is another, more general, name for the production of a demand bibliography. Information retrieval embraces the intellectual aspects of the description of information and its specification for search, and also whatever systems, technique, or machines that are employed to carry out the operation. Information retrieval is crucial to documentation and organization of knowledge”. (Mooers, 1951, p. 25).

Example of N.E.R

[Figure: Example of NER, before and after]

 

Read more about user Intent and Concept themes

Again, provide valuable and relevant content, but mix it with the new semantic tactics. This is the new SEO.

You can also read this article on how Google Explains Social Links Are Better Than Link Building. It is one of the many ways semantic search has changed the web for good.

Related news and articles:

Update (February 6, 2016) NER part of the New Keyword Strategy

 

More about the term Semantic Search

______________________________________________________________________________

 

Semantic search

From Wikipedia, the free encyclopedia

Semantic search seeks to improve search accuracy by understanding searcher intent and the contextual meaning of terms as they appear in the searchable dataspace, whether on the Web or within a closed system, to generate more relevant results. Author Seth Grimes lists “11 approaches that join semantics to search”[1], and Hildebrand et al. [2] provide an overview that lists semantic search systems and identifies other uses of semantics in the search process.

Guha et al.[3] distinguish two major forms of search: Navigational and Research. In navigational search, the user is using the search engine as a navigation tool to navigate to a particular intended document. Semantic Search is not applicable to navigational searches. In Research Search, the user provides the search engine with a phrase which is intended to denote an object about which the user is trying to gather/research information. There is no particular document which the user knows about that s/he is trying to get to. Rather, the user is trying to locate a number of documents which together will give him/her the information s/he is trying to find. Semantic Search lends itself well here.

Rather than using ranking algorithms such as Google’s PageRank to predict relevancy, Semantic Search uses semantics, or the science of meaning in language, to produce highly relevant search results. In most cases, the goal is to deliver the information queried by a user rather than have a user sort through a list of loosely related keyword results.

Other authors primarily regard semantic search as a set of techniques for retrieving knowledge from richly structured data sources like ontologies as found on the Semantic Web [4]. Such technologies enable the formal articulation of domain knowledge at a high level of expressiveness and could enable the user to specify his intent in more detail at query time.



By Mary

Head of Integrated Digital Marketing and International Outreach
Maris Pozo is an Integrated Digital Marketer and member of the American Marketing Association, with over 9 years of experience on the agency side, and has worked with clients such as HP, LATimes, and Monster.com in bilingual markets in Spain and the U.S.


  • This information is very good! didn’t know about semantic. Thx.

  • samirthukral

    hi
    Thank you for u'r post

    Samir Thukral

  • samirthukral

    Hai

    Nice post

  • AVV

    Thats interesting!

  • AVV

    Nice article Mary, thanks