The Semantic Web is here. Those who take advantage of Semantic Technologies to build a Semantic SEO strategy are seeing staggering results. In a research paper put together with the team at WordLift and presented at SEMANTiCS 2017, we documented that structured data is compelling from a digital marketing standpoint.
For instance, in our analysis of the design-focused website freeyork.org, after three months of using structured data on their WordPress website we saw the following improvements:
- +12.13% new users
- +18.47% organic traffic
- 2.4x increase in page views
- +13.75% session duration
In other words, many still think of Semantic Technologies as belonging to the future, when in reality quite a few players in the digital marketing space are already taking advantage of them.
Semantic SEO is a new and powerful way to make your content strategy more effective. In this article/guide I will explain from scratch what Semantic SEO is and why it’s important.
Why Semantic SEO?
In a nutshell, search engines need context to understand a query properly and to fetch relevant results for it. Contexts are built using words, expressions, and other combinations of words and links as they appear in bodies of knowledge such as encyclopedias and large corpora of text.
Semantic SEO is a marketing technique that improves the traffic of a website by providing meaningful data that can unambiguously answer a specific search intent. It is also a way to create clusters of content that are semantically grouped into topics rather than keywords. In a famous Google patent on context-vectors, an example with the word “horse” is provided.
The patent looks at how the same word can have different meanings: a “horse” is an animal for an equestrian, a working tool for a carpenter, and sports equipment for a gymnast. In Semantic SEO, much as on Wikipedia, content is cataloged and organized around each context.
Before we dive into how to use Semantic SEO, we need to talk about how important it is, and where it originated.
In February 2009, Sir Tim Berners-Lee, the founding father of the web, gave a TED talk in Long Beach, Calif. He spoke of the formation of a new web, built on top of the existing one: a Semantic Web based on open linked data.
Now the Semantic Web is here, and its technologies are available to digital marketers to make their SEO strategy more effective. How did we get here? Let’s take a few steps back.
The Old Web
Today well over a billion websites comprise the web.
http://t.co/D9pwMXuZOa recently passed a billion websites by their count….
— Tim Berners-Lee (@timberners_lee) September 16, 2014
In less than a decade, the number of websites exploded, and the web came to comprise millions of pages. Berners-Lee had figured out he could connect web pages with what we all know today as hypertext. However, surfing the web was still limited because you could only go from one page to the next through links: the effort it took to find what you were looking for was massive.
That is why many set out to find a way to search through those pages and return specific content in response to queries.
This idea led to the creation of PageRank, the foundation of Google: an algorithm that could rank pages on the web based on the popularity of each page.
The more quality backlinks a website received, the higher it could rank in the SERP. Backlinks are still the backbone of the web. However, on that backbone a new web blossomed.
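To make the idea of ranking by backlinks more concrete, here is a minimal sketch of a PageRank-style calculation. It is a simplified textbook version, not Google's production algorithm: the tiny link graph, the page names, and the number of iterations are all made up for illustration, and 0.85 is the damping factor commonly used in descriptions of the method.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: "home" receives the most backlinks.
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph "home" ends up with the highest score because three pages link to it, while "orphan", which nobody links to, keeps only the baseline share.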
The New Web
In 2012, futurist Ray Kurzweil arrived at Google with one mission: make search engines understand human language. Following that quest, Google updated its algorithm in 2013 with Hummingbird, and later, in 2015, AI became a major factor for search with RankBrain.
It was a revolution. In fact, even though Google looks at more than 200 factors to assess the ranking of a page, it also uses Artificial Intelligence to rank those pages. In other words, Google looks more and more at the intention behind a user query based on the context rather than keywords.
For instance, if I type in the search box “french fries” I may be looking for something to eat or just the story behind the name. Of course, if I do this search at 8 a.m., I’m probably more interested in learning the story behind the food.
If I do the same search at 8 p.m., I may be looking for something to eat for dinner. But how does the search engine know what the context is? It reads human language through Natural Language Processing (NLP). Before we dive into it, how does NLP work?
The Power of Natural Language Processing
When I type “moon distance” in Google’s search box, this is what I get:
You may think this is simple keyword matching, but it is not.
In fact, if I ask “How far is the moon?”
I get the same answer:
Google’s ability to understand language goes further. If I search “moon distance in meters,” this is what I get:
In short, Google knows I’m referring to the same thing and gives me the proper answer.
But if Google is no longer looking at keywords (at least for specific queries), where is it getting the results? Google learns through topics. In the context of the Semantic Web those concepts are called entities. Those entities are organized in Giant Graphs. In fact, on May 16, 2012, Google announced the use of a massive Knowledge Graph.
In other words, it is a knowledge base to provide more useful and relevant results to searches using a semantic-search technique.
I’m going to reconstruct how a knowledge graph works, by starting from an entity. But really, what is an entity?
What is an Entity?
According to Wikipedia:
An entity is something that exists as itself, as a subject or as an object, actually or potentially, concretely or abstractly, physically or not.
In the context of the Semantic Web, an entity is much more than that:
In the Semantic Web an entity is the “thing” described in a document. An entity helps computers understand everything you know about a person, an organization or a place mentioned in a document. All these facts are organized in statements known as triples that are expressed in the form of subject, predicate, and object.
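As a rough illustration of how facts about an entity can be stored as triples, here is a small sketch using plain Python tuples. The predicate names (`is_a`, `known_as`, `gave_talk_at`) are labels I made up for this example rather than terms from a real vocabulary, and the facts are the ones mentioned earlier in this guide.

```python
from collections import namedtuple

# A triple is a statement in the form (subject, predicate, object).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical predicate names, chosen only for readability.
triples = [
    Triple("Tim Berners-Lee", "is_a", "Person"),
    Triple("Tim Berners-Lee", "known_as", "founding father of the web"),
    Triple("Tim Berners-Lee", "gave_talk_at", "TED"),
    Triple("TED", "is_a", "Organization"),
]


def facts_about(entity, statements):
    """Return every (predicate, object) pair whose subject is `entity`."""
    return [(t.predicate, t.object) for t in statements if t.subject == entity]


for predicate, obj in facts_about("Tim Berners-Lee", triples):
    print(f"Tim Berners-Lee -> {predicate} -> {obj}")
```

Because the object of one triple ("TED") is also the subject of another, the statements link up into a graph, which is the basic idea behind a knowledge graph.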
Why are Entities Way More Effective than Keywords?
For three simple reasons. Entities are:
Through entities you create meaningful relationships that can be read, understood, and interpreted by search engines. That is what Semantic SEO allows you to achieve. Entities, in the context of the Semantic Web, are really data points that computers can use to analyze and interpret human language.
Let’s take this one step at a time.
How do entities gain context?
Metadata: Data About Data
In its most basic definition, metadata is just data about data.
The concept of metadata is not new. In fact, librarians have been using it for a long time to discover and manage documents. Imagine that for each document you specify the author, date, book length, and so on. That is all metadata that helps classify a book and, therefore, makes it easier to find later on.
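Here is a minimal sketch of the librarian example: each book is described by a small metadata record, and those records are what make the book findable later. The titles, authors, and field names are placeholders invented for this illustration.

```python
# Each record is metadata: data that describes a book, not the book itself.
# All values below are invented placeholders.
catalog = [
    {"title": "Example Book A", "author": "Jane Doe", "year": 2009, "pages": 320},
    {"title": "Example Book B", "author": "John Roe", "year": 2012, "pages": 180},
    {"title": "Example Book C", "author": "Jane Doe", "year": 2015, "pages": 250},
]


def find_documents(records, **criteria):
    """Return the records whose metadata matches every given field."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


by_jane = find_documents(catalog, author="Jane Doe")
print([book["title"] for book in by_jane])
```

The lookup only works because every record uses the same field names, which is exactly why a shared set of rules matters.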
To work appropriately, metadata has to follow a logic of classification that everyone understands. In short, there must be a set of rules that everyone can follow to make the system work. It is like a language, where grammatical rules are arbitrarily selected to create a standard. Ontologies are the foundation of metadata.
The simplest form of an ontology is a vocabulary. The vocabulary that today makes Semantic SEO possible is called Schema.org.
Schema Markup: The Gold Standard of the Semantic Web
To the question “What is Schema.org?” this is what Google says:
As we saw, the pillar of the Semantic Web is a linked open vocabulary shared across different websites, driven by an open community and usable alongside other open vocabularies and ontologies. Just as a language can hardly exist without standard grammatical rules, the Semantic Web wouldn’t be here without a gold standard. Schema.org is that gold standard.
In fact, out of all the competing standards that existed, Schema.org was the first linked open vocabulary that was introduced for a business-driven purpose (helping search engines organize the web and improve the quality of their results).
There are today 617 open vocabularies in the linked data world, and they can be combined to organize and structure different knowledge domains.
In terms of SEO, Schema.org, having been created by the search engines themselves, is the most useful.
By adding schema markup to web pages, content is interlinked with data using standard linked vocabularies like schema.org and becomes more accessible.
What is Structured Data?
Structured data is a standardized format for providing information about a page and classifying that content on the page; for example, on a recipe page, what are the ingredients, the cooking time, the temperature, the calories, and so on.
Structured data that uses Schema.org as a reference vocabulary can be embedded in web pages using three formats:
Imagine a book available in three different formats: ebook, paperback, and hardcover. Each has a different weight, size, and so on, but the content is the same. The same goes for Schema.org.
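To give a feel for what this looks like in practice, here is a small sketch that builds structured data for a recipe page in JSON-LD, one of the formats Schema.org markup can take. The recipe itself (name, ingredients, cooking time, calories) is invented for illustration; the property names `recipeIngredient`, `cookTime`, and `nutrition` come from the Schema.org `Recipe` type. Temperature is left out of this sketch.

```python
import json

# Invented example recipe described with the Schema.org "Recipe" type.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Example French Fries",
    "recipeIngredient": [
        "4 potatoes",
        "2 tablespoons olive oil",
        "1 teaspoon salt",
    ],
    # Durations use the ISO 8601 format: PT30M means 30 minutes.
    "cookTime": "PT30M",
    "nutrition": {
        "@type": "NutritionInformation",
        "calories": "300 calories",
    },
}

# Serialize the data and wrap it in the script tag used to embed JSON-LD in HTML.
json_ld = json.dumps(recipe, indent=2)
script_tag = '<script type="application/ld+json">\n' + json_ld + "\n</script>"
print(script_tag)
```

The printed snippet is what would be placed in the HTML of the recipe page so that a search engine can read the ingredients and cooking time as data rather than guess them from the text.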