
RankBrain: Everything We Know About Google’s AI Algorithm

Google has handed us a new piece of the SEO puzzle: RankBrain, a new ranking signal built on machine learning and artificial intelligence.

Bloomberg broke the story that not only has Google introduced RankBrain, but that it has already been using it for some time in the search results.

We also reached out to Google, and they responded to The SEM Post with plenty of answers about how this works and what it impacts.  Jack Clark, the author of the Bloomberg article, has also shared details beyond the article that give additional insight into the new AI.

What is RankBrain?

RankBrain is an artificial intelligence system Google uses to serve better search results, particularly for the 15% of daily search queries that Google has never seen before.

The Bloomberg article describes it in simpler terms:

RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities — called vectors — that the computer can understand. If RankBrain sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.

It is only one of the pieces of Google’s search algorithm, but it is a fairly significant one.
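
To make the idea of "similar meaning" concrete, here is a minimal sketch of how closeness between word vectors can be measured with cosine similarity. It is not Google's implementation: the three-dimensional vectors, the `EMBEDDINGS` table and the `most_similar` helper are all invented for illustration, and real systems learn much larger vectors from large amounts of text.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
# Real systems learn vectors with many more dimensions from text.
EMBEDDINGS = {
    "car":        [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "banana":     [0.00, 0.20, 0.90],
    "fruit":      [0.05, 0.30, 0.85],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(term, embeddings=EMBEDDINGS):
    """Return (word, score) for the known word closest in meaning to `term`."""
    target = embeddings[term]
    candidates = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != term
    ]
    return max(candidates, key=lambda pair: pair[1])


print(most_similar("car")[0])     # prints "automobile"
print(most_similar("banana")[0])  # prints "fruit"
```

In this toy table, "car" lands next to "automobile" simply because their vectors point in nearly the same direction, which is the kind of guess the Bloomberg description refers to.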

When It Went Live

The exact date this went live is not known, although it was earlier this year.  Bloomberg reports it as being active for the past few months.

I asked Google and they did not have a more specific date to share, just that “it was rolled out gradually starting early in 2015.”

There is a big difference between “a few months” and “early in 2015” for those trying to work out whether any previous ranking changes could be attributed to RankBrain.  We have noticed clear changes to Google’s core ranking algo multiple times, but we can’t be sure whether any of those updates can be attributed to the RankBrain launch.

As Google doesn’t generally comment on core ranking changes, unless we get a more specific time frame, it will be difficult to narrow down which of the changes could be attributed to RankBrain and then reverse engineer the changes we saw.

That said, given how strong a ranking signal it is, it is almost certain that at least one of the “updates” we saw was actually RankBrain rolling out.

Covers All Languages

This is not just for English queries: Google confirmed to The SEM Post that they are using RankBrain on all languages.

This is especially important because it shows RankBrain can be applied regardless of the language used.

Third Most Important Signal

According to the Bloomberg article, this is the third most important signal in the Google ranking algorithm, so that is pretty significant.

I also reached out to Google, and a Google spokesperson shared that “It is one of hundreds of signals, but a significant one.”

It also stands to reason that, because it does well with the 15% of never-before-seen search queries, it will be a larger ranking signal for those queries.

The Google spokesperson also said “It’s especially helpful on long-tail queries, such as the 15% never seen before each day.”

And no, Google did not tell Clark the first or second most important signals, although many can make a pretty educated guess at this.

Used on a Large Set of Queries

One of the big misunderstandings is that RankBrain is just used on the 15% of new-to-Google search queries.  But it is used much more.

While this technology is particularly good at handling the 15% of never-before-seen searches, it is being used on a large percentage of Google’s search queries.

Not Restricted to Types of Search Queries

Some were wondering whether RankBrain might be more useful for certain types of searches – beyond the 15% new ones – and whether Google would skew towards using it for those specific types of queries.  But this is not the case.

The Google spokesperson said it is “not limited to any particular set of queries.”

It also means SEOs can’t reverse engineer it, to see if it is applied more for certain market areas or topics.

Not Continually Learning

Asked on Twitter whether it is continually learning, Gary Illyes said it is not.  So it doesn’t evolve on the fly with each search query.

RankBrain Updates

Clark shared on Twitter that it is periodically “re-trained”.

When I asked Google about the frequency of updates, they said they will update as needed.  “We’ll keep experimenting with and testing new models, and we’ll make updates as we come up with models that do a better job than the existing one,” the Google spokesperson said.  “That could be about refreshing the data or developing new neural net architectures.”

We could see this continue to evolve, but it will be hard to translate current or new RankBrain signals into SEO.

What Does This Mean for SEO?

Whenever Google makes a change to their algo, there is always a chorus of “SEO is dead” followed up by many articles about it.  But Gary Illyes confirms that “SEO magic” still works.

And while we have definitely seen algo changes, attributed to the usual “updates to one of our hundreds of ranking signals”, SEO is still alive and well even with the introduction of RankBrain.

Thought Vectors

Here is where it gets really interesting.  According to Jack Clark, it is “converting words and phrases into vectors.”

The Hinton referred to is Professor Geoff Hinton, who has many accomplishments in artificial neural networks.  He is a professor at the University of Toronto and, after his company DNNresearch Inc was acquired by Google, also became a Distinguished Researcher at Google.

Word2vec Connection

Many are speculating that Word2vec is the basis for RankBrain.  Clark posted additional comments on the Word2vec connection on Hacker News.

They wouldn’t explicitly confirm that it is word2vec, but everything we discussed indicated it’s likely doing something roughly equivalent to word2vec, and is also doing similar conversions for sequences which is likely connected to Sequence to Sequence learning (PDF: http://papers.nips.cc/paper/5346-sequence-to-sequence-learni…). It also links to Geoff Hinton’s stuff on Thought Vectors which implicitly involves word2vec.

When I asked Google if it was based on Word2vec, the Google spokesperson said “It’s related to word2vec in that it uses ’embeddings’ — looking at phrases in high-dimensional space to learn how they’re related to one another.”

Converting Words and Phrases Into Vectors

From a technical aspect, RankBrain is converting words and phrases into vectors, which can then be used for deep learning.

Hinton gave a keynote lecture on deep learning at The Royal Society in which he talks about these connections.

The implications of this for document processing are very important.

If we can convert a sentence into a vector that captures the meaning of the sentence, then Google can do much better searches.  They can search based on what is being said in a document.

Also, if you can convert each sentence in a document into a vector, you can then take that sequence of vectors and try and model why you get this vector after you get these vectors.  That’s called reasoning, that’s natural reasoning, and that was kind of the core of good old fashioned AI and something they could never do because natural reasoning is a complicated business, and logic isn’t a very good model of it. 

Here we can say, well, look, if we can read every English document on the web, and turn each sentence into a thought vector, we’ve got plenty of data for training a system that can reason like people do. Now, you might not want to reason like people do on the web, but at least we can see what they would think.

So I think what is going to happen over the next few years is this ability to turn these sentences into thought vectors is going to rapidly change the level that we can understand documents.
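
The idea from the lecture of turning a sentence into a vector and then searching on meaning can be sketched roughly as follows. This is an illustration under simplifying assumptions, not how RankBrain or thought vectors are actually built: the two-dimensional `WORD_VECTORS` are made up, and averaging word vectors is a crude stand-in for the learned sentence encoders Hinton describes.

```python
import math

# Hypothetical 2-dimensional word vectors, invented for illustration only.
WORD_VECTORS = {
    "cheap": [0.90, 0.10], "affordable": [0.85, 0.15],
    "flights": [0.20, 0.90], "airfare": [0.25, 0.85],
    "banana": [-0.90, 0.20], "bread": [-0.85, 0.00], "recipe": [-0.80, 0.10],
}


def sentence_vector(text):
    """Average the vectors of known words (a crude stand-in for a learned encoder)."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return None
    dims = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_documents(query, documents):
    """Order documents by how close their sentence vector is to the query's."""
    query_vec = sentence_vector(query)
    if query_vec is None:
        return []
    scored = []
    for doc in documents:
        doc_vec = sentence_vector(doc)
        if doc_vec is not None:
            scored.append((cosine(query_vec, doc_vec), doc))
    return [doc for _, doc in sorted(scored, reverse=True)]


docs = ["banana bread recipe", "affordable airfare"]
print(rank_documents("cheap flights", docs)[0])  # prints "affordable airfare"
```

Note that the top result shares no words with the query. It ranks first only because its vector sits close to the query's vector, which is the sense in which a search can be based on what is being said rather than on matching keywords.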

He also talks about the current scaling issues.

To understand at human levels, we are probably going to need human level resources, and we’ve got trillions of connection and the biggest neural net we run so far have at most a few billion connections, so we are a few orders of magnitude off still.   But I’m sure the hardware people will help us out.

Possible RankBrain Patents

Bill Slawski has already written about the possible connections with RankBrain and patents, including one that specifically deals with how Google can replace search terms within a query.  The patent is “Using concepts as contexts for query term substitutions,” filed in 2012 but published August 2015.

For those looking to learn as much about RankBrain as possible, Slawski’s analysis, “Investigating Google RankBrain and Query Term Substitutions,” is well worth the read, as is the patent itself.

Deep Diving Into Deep Learning

For those who really want to dive deeper into this, in addition to Hinton’s lecture mentioned above, there are several papers related to deep learning and thought vectors.

Deep Learning, Nature, LeCun, Y., Bengio, Y. and Hinton, G. E. (PDF)

Distilling the knowledge in a neural network, Hinton, G. E., Vinyals, O., and Dean, J. (PDF)

Spam

This definitely raises the question of whether or not an AI is smarter at detecting spam, or if it can prevent itself from serving spam that the rest of Google’s core ranking algo fails to catch.

Game Changer?

Is this a game changer?  It definitely changes how Google sees and handles searches, even if it went largely unnoticed by the SEO community.  I am sure Google will test dialing this up and down as a ranking signal, if it hasn’t already, especially as it continues to update with new data or models.

Update: Want to know what the industry thinks about RankBrain? 9 industry experts weigh in.

Jennifer Slegg

Founder & Editor at The SEM Post
Jennifer Slegg is a longtime speaker and expert in search engine marketing, working in the industry for almost 20 years. When she isn't sitting at her desk writing and working, she can be found grabbing a latte at her local Starbucks or planning her next trip to Disneyland. She regularly speaks at Pubcon, SMX, State of Search, Brighton SEO and more, and has been presenting at conferences for over a decade.