Searching for truth: we ask an expert to explain how search engines have evolved over the years...

Dylan Yates, SEO Manager at Blue Array, takes a look at how search engines have evolved over the last few years to understand the semantic meaning behind search queries, and dives into some of the key factors that help them to do this.

Since the launch of Google’s Hummingbird algorithm in 2013, understanding the semantic meaning of words has played a key role in how content is ranked for organic search.

Not only is Artificial Intelligence now utilised by Google in the form of RankBrain, but the ability of a computer to draw semantic meaning from phrases is also helping search engines accurately solve long-tail, previously unseen queries.

With the rise of Voice Search, and the hunt for ‘One True Answer’, never has the need for search engines to clearly understand the precise semantic meaning behind a user’s query been so important.

What is Semantics?

Semantic search relies on the ability of the algorithm to draw commonality between entities (people, places and things). This way, the search engine can better understand the meaning of a word it hasn’t come across before.

The relationship of this new word to surrounding words (context), and the progressive sequence of searches a user makes when pursuing an answer are two of the main ingredients used by Google.

Dictionary.com describes Semantics as a form of linguistics. It defines semantics as:

a) The study of meaning.
b) The study of linguistic development by classifying and examining changes in meaning and form.

There is a wealth of commonly used English expressions that are spelt alike but express different meanings. The intention behind a user’s query may be unclear if one of these is used in isolation.

‘Native’ is a good example of this; typed into a search engine as a single word, the user’s intent is obscure, though we might assume they’re looking for a definition. ‘Native’ is also the name of a OneRepublic album, as well as being part of the name of Native Instruments, a brand of electronic music production software.

If unsure about the user’s intent, search engines will often display a range of results, including documents that fulfill queries for all of the above. Their understanding of the word would be greatly improved by a longer tail search query (e.g. ‘native instruments’), or by the web page that the user selects from the range presented to them. Google is using these signals, as well as many others, to help understand the semantic meaning of words.

Factors that help search engines understand the meaning of a word may include the following:

1) Personalisation, and historical data: searches performed and documents previously accessed by the user
2) Sequential searches (a sequence of ‘Penguin’, followed by ‘Penguin Links’ indicates an interest in SEO, not ornithology)
3) Location (the answer for a search such as ‘weather now’ relies on the user’s locality)
4) The device on which the search is made (Google automatically gives searches on mobile devices local intent)
5) Different spellings, or variations of a word (‘World Cup Winner’ and ‘World Cup Winners’ searches have different intent, despite the two strings differing by just one letter, ‘s’)
6) The time of day a search is performed (evening searches for ‘places to eat’ might yield different results to the same search in the morning)
7) Global search history (trending news topics might cause an algorithm to produce a different result for ‘hurricane’ depending on current news)
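To make the second factor more concrete, here is a minimal Python sketch of how earlier searches in a session might tip an ambiguous query towards one meaning. It is a toy illustration only, not a description of how Google works: the SENSES table, its word lists and the guess_sense function are all invented for this example.

```python
# Hypothetical sense inventory: each candidate meaning of an ambiguous
# query is described by a handful of context words (illustrative only).
SENSES = {
    "penguin": {
        "seo": {"links", "algorithm", "update", "google", "ranking"},
        "bird": {"antarctica", "species", "emperor", "colony", "habitat"},
    },
}


def guess_sense(query, session_history):
    """Return the sense whose context words overlap most with the session.

    Returns None when the query is unknown or no sense overlaps at all,
    mirroring the case where an engine would show a mixed set of results.
    """
    senses = SENSES.get(query.lower())
    if not senses:
        return None
    # Collect every word the user has typed in the session so far.
    seen = set()
    for past_query in session_history:
        seen.update(past_query.lower().split())
    # Score each sense by how many of its context words were seen.
    scores = {name: len(words & seen) for name, words in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("penguin", ["penguin links"]))           # seo
print(guess_sense("penguin", ["emperor penguin colony"]))  # bird
print(guess_sense("penguin", []))                          # None
```

A real system would weigh many of the signals listed above together; counting word overlap is simply the smallest version of the idea that shows how context can shift an interpretation.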

Understanding content in context

Google also uses many signals within a piece of copy to help it understand the context in which it should be placed. Filling your pages with “keywords” is no longer an option; a document has to be understood as a whole piece, with external signals that support its meaning.

External and on-page factors that may help search engines decipher the context of a piece of content include the following:

1) Relationships between entities in a document (the proximity of words to each other)
2) The relevancy of any outwardly-linked domains
3) The relationship between URL and searcher in response to their query (which results they click, or ignore)
4) Advertisements presented on the domain in response to the query
5) Anchor text in the in-linking URLs
6) The domain on which the document is hosted (other content around the site)
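The first factor, the proximity of words to each other, can be illustrated with a simple co-occurrence count. The Python sketch below is a rough approximation under stated assumptions: the cooccurrences function, the window size and the sample sentence are all made up for this example, and real search engines use far richer models.

```python
from collections import Counter


def cooccurrences(text, target, window=3):
    """Count the words that appear within `window` tokens of `target`."""
    # Naive tokenisation: split on whitespace, strip basic punctuation.
    tokens = [t.strip(".,;:!?'\"()").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Made-up sample copy, used purely for illustration.
doc = (
    "Native Instruments sells music production software. "
    "Producers use Native Instruments software."
)
print(cooccurrences(doc, "native", window=2).most_common(1))
```

In this sample, the word that most often sits near ‘native’ is ‘instruments’, which is the kind of evidence that would point towards the software brand rather than a dictionary definition.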

As SEOs, this means we need a holistic approach to creating content, thinking about the way we link to external sources, and the way URLs link to us. The better we present the theme of a document, the easier it is for search engines to understand the types of queries for which it should be returned.

‘Truth’ in search: Does an improved semantic understanding of words get search engines any closer to providing users quick, single-sentence answers?

The query ‘who discovered Eminem’ might lead to a number of answers. Was it Dr Dre, the rap-producer mogul who signed Eminem to his Aftermath Records label in 1998? Was it Dean Geistlinger, the Interscope intern who saw Eminem at the Los Angeles Rap Olympics before taking a copy of his ‘Slim Shady EP’ to give to Dr Dre? Perhaps it was the Bass Brothers, a production duo from Detroit who helped Eminem to produce the ‘Slim Shady EP’ he was carrying that day at the Rap Olympics?

A search on Google (as of 20th October 2017) for this query produces a Featured Snippet box, in which biography.com is credited with providing (in Google’s eyes) the most accurate answer:



Biography.com comes to the conclusion that Dr Dre was the man to be credited with the discovery. Just a couple of results further down, however, and this answer is contradicted by popular Hip Hop news site HNHH, with the assertion that in fact it was the intern Dean Geistlinger who discovered the rap mega star.

The point is, there is no clear answer to this question. Even if posed to humans, it would produce varying points of view and it’s unclear anyone could offer a ‘definitive’ answer. Often this is the case for real life situations, where black and white distinctions are rarely valid answers. This makes Google’s search for “One True Answer” problematic.

The future for semantic search?

Semantic search has advanced in leaps and bounds in the four-and-a-bit years since Hummingbird was rolled out. Generally, search engines are now very good at providing a list of documents in which at least one result will satisfy a user’s query.

Search engines are still a long way, however, from answering questions in one self-contained snippet. Perhaps this is the unavoidable nature of truth, and something a machine will never be able to replicate with the same success as a human (if humans, even, are capable of this).

One thing, though, is clear: SERP features such as Featured Snippets are still at an algorithmically immature stage, and will need a lot of work before we can rely on their validity in the same way we’ve come to rely on the traditional array of 10 blue links.

ABOUT BLUE ARRAY

Blue Array was founded in 2015 as a specialist rather than generalist agency focused completely on the discipline of SEO. So different is the model, they had to make up a whole new word to describe themselves. Part agency, part consultancy: a ‘Consulgency’. In just a couple of short years the team has grown from two to fifteen, with an enviable list of clients from big brands such as Time Inc, Netmums and carwow to smaller startups such as Lexoo and RiseArt. The team is led by founder Simon Schnieders and Head of SEO Sean Butcher.

October 2017