This is a guide to surviving RankBrain. I created it, in part, because there’s an amazing amount of misinformation about RankBrain. And the truth is there is nothing you can do to optimize for RankBrain.
I’m not saying RankBrain isn’t interesting or important. I love learning about how search works whether it helps me in my work or not. What I am saying is that there are no tactics to employ based on our understanding of RankBrain.
So if you’re looking for optimization strategies, beware of the clickbait RankBrain content being pumped out by fly-by-night operators and impression-hungry publishers.
You Can’t Optimize For RankBrain
I’m going to start out with this simple statement to ensure as many people as possible read, understand and retain this fact.
You can’t optimize for RankBrain.
You’ll read a lot of posts to the contrary. Sometimes they’re just flat-out wrong, sometimes they’re using RankBrain as a vehicle to advocate for SEO best practices, and sometimes they’re just connecting dots that aren’t there.
Read on if you want proof that RankBrain optimization is a fool’s errand and you should instead focus on other vastly more effective strategies and tactics.
What Is RankBrain?
RankBrain is a deep learning algorithm developed by Google to help improve search results. Deep learning is a form of machine learning and can be classified somewhere on the Artificial Intelligence (AI) spectrum.
I think of deep learning as a form of machine learning where the algorithm can adapt and learn without further human involvement. One of the more interesting demonstrations of deep learning was the identification of cats (among other things) in YouTube thumbnails (pdf).
How Does RankBrain Work?
Knowing how RankBrain works is important because it determines whether you can optimize for it or not. Despite what you might read, there are only a handful of good sources of information about RankBrain.
Greg Corrado
The first is the October 26, 2015 Bloomberg RankBrain announcement, which included statements and summaries of a chat with Google Senior Research Scientist Greg Corrado.
RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities — called vectors — that the computer can understand. If RankBrain sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.
This makes it pretty clear that RankBrain uses vectors to better understand complex language.
Word2Vec is most often referenced when talking about vectors. And it should be noted that Jeff Dean, Greg Corrado and many others were part of this effort. You’ll see these same names pop up time and again surrounding vectors and deep learning.
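To make the idea of "guessing similar words" concrete, here is a minimal Python sketch of how word vectors can be compared. It is only an illustration of the general technique (cosine similarity between embeddings), not a description of what RankBrain actually does. The vectors in `TOY_VECTORS` are invented for this example, and the function names are hypothetical.

```python
import math

# Invented 3-dimensional vectors, for illustration only. Real embeddings
# (such as those produced by Word2Vec) are learned from large text corpora
# and typically have hundreds of dimensions.
TOY_VECTORS = {
    "walkthrough": [0.9, 0.1, 0.0],
    "guide": [0.8, 0.2, 0.1],
    "tutorial": [0.7, 0.3, 0.0],
    "banana": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_words(word, vectors, top_n=2):
    """Return the top_n other words whose vectors point in the most similar direction."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:top_n]]


print(nearest_words("walkthrough", TOY_VECTORS))
```

With these made-up numbers, "guide" and "tutorial" come out as the closest neighbors of "walkthrough" while "banana" does not. That is the whole trick: words used in similar contexts end up with vectors that point in similar directions.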
I wrote a bit about vectors in my post on Hummingbird. In particular I like the quote from a 2013 Jeff Dean interview.
I think we will have a much better handle on text understanding, as well. You see the very slightest glimmer of that in word vectors, and what we’d like to get to where we have higher level understanding than just words. If we could get to the point where we understand sentences, that will really be quite powerful. So if two sentences mean the same thing but are written very differently, and we are able to tell that, that would be really powerful. Because then you do sort of understand the text at some level because you can paraphrase it.
I was really intrigued by the idea of Google knowing that two different sentences meant the same thing. And they’ve made a fair amount of progress in this regard with research around paragraph vectors (pdf).
It’s difficult to say exactly what type of vector analysis RankBrain employs. I think it’s safe to say it’s a variable-length vector analysis and leave it at that.
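For a rough feel of how two differently worded, differently sized chunks of text can be compared, here is a small Python sketch. It uses the crudest possible approach, averaging word vectors into one fixed-size vector per sentence. This is an assumption made for illustration: the paragraph vector research learns a dedicated vector for each passage rather than averaging, and nothing here reflects how RankBrain is built. The vectors and function names are invented.

```python
import math

# Invented 3-dimensional word vectors, for illustration only.
TOY_VECTORS = {
    "cheap": [0.9, 0.1, 0.0],
    "inexpensive": [0.8, 0.2, 0.0],
    "flights": [0.1, 0.9, 0.0],
    "airfare": [0.2, 0.8, 0.0],
    "tickets": [0.3, 0.7, 0.0],
    "banana": [0.0, 0.1, 0.9],
    "bread": [0.1, 0.0, 0.8],
    "recipe": [0.0, 0.2, 0.9],
}


def sentence_vector(text, vectors):
    """Average the vectors of the known words, so any text length maps to one fixed-size vector."""
    known = [vectors[word] for word in text.lower().split() if word in vectors]
    if not known:
        raise ValueError("no known words in text")
    dimensions = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dimensions)]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


short = sentence_vector("cheap flights", TOY_VECTORS)
longer = sentence_vector("inexpensive airfare tickets", TOY_VECTORS)
unrelated = sentence_vector("banana bread recipe", TOY_VECTORS)

print(cosine_similarity(short, longer))     # close to 1: similar meaning
print(cosine_similarity(short, unrelated))  # close to 0: different meaning
```

The two travel phrases share no words and have different lengths, yet their vectors land close together. That is the kind of "same meaning, different wording" match Dean describes in the quote above.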
So what else did we learn from the Corrado interview? Later in the piece there are statements about how much Google relies on RankBrain.
The system helps Mountain View, California-based Google deal with the 15 percent of queries a day it gets which its systems have never seen before, he said.
That’s pretty clear. RankBrain is primarily used for queries not previously seen by Google, though it seems likely that its reach may have grown based on the initial success.
Unfortunately the next statement has caused a whole bunch of consternation.
RankBrain is one of the “hundreds” of signals that go into an algorithm that determines what results appear on a Google search page and where they are ranked, Corrado said. In the few months it has been deployed, RankBrain has become the third-most important signal contributing to the result of a search query, he said.
This provoked the all-too-typical reactions from the SEO community. #theskyisfalling The fact is we don’t know how Google is measuring ‘importance’ nor do we understand whether it’s for just that 15 percent or for all queries.
Andrey Lipattsev
To underscore the ‘third-most important’ signal boondoggle we have statements by Andrey Lipattsev, Search Quality Senior Strategist at Google, in a Q&A with Ammon Johns and others.
In short, RankBrain might have been ‘called upon’ in many queries but may not have materially impacted results.
Or if you’re getting technical, RankBrain might not have caused a reordering of results. So ‘importance’ might have been measured by frequency and not impact.
Later on you’ll find that RankBrain has access to a subset of signals so RankBrain could function more like a meta signal. It kind of feels like comparing apples and oranges.
But more importantly, why does it matter? What will you do differently knowing it’s the third most important signal?
Gary Illyes
Another source of RankBrain information is from statements by Gary Illyes in conversation with Eric Enge. In particular, Gary has been able to provide some examples of RankBrain in action.
I mean, if you think about, for example, a query like, “Can you get a 100 percent score on Super Mario without a walk-through?” This could be an actual query that we receive. And there is a negative term there that is very hard to catch with the regular systems that we had, and in fact our old query parsers actually ignored the “without” part.
And RankBrain did an amazing job catching that and actually instructing our retrieval systems to get the right results.
Gary’s statements lend clear support to the idea that RankBrain helps Google to better understand complex natural language queries.
Paul Haahr
Perhaps the most interesting statements about RankBrain were made by Paul Haahr, a Google Ranking Engineer, at SMX West during his How Google Works: A Google Ranking Engineer’s Story presentation and Q&A.
I was lucky enough to see this presentation live and it is perhaps the best and most revealing look at Google search. (Seriously, if you haven’t watched this you should turn in your SEO card now.)
It’s in the Q&A that Haahr discusses RankBrain.
RankBrain gets to see some subset of the signals and it’s a machine learning or deep learning system that has its own ideas about how you combine signals and understand documents.
I think we understand how it works but we don’t understand what it’s doing exactly.
It uses a lot of the stuff that we’ve published on deep learning. There’s some work that goes by Word2Vec or word embeddings that is one layer of what RankBrain is doing. It actually plugs into one of the boxes, one of the late post retrieval boxes that I showed before.
Danny then asks about how RankBrain might work to ascertain document quality or authority.
This is all a function of the training data that it gets. It sees not just web pages but it sees queries and other signals so it can judge based on stuff like that.
These statements are by far the most important because they provide a plethora of information. First and foremost Haahr states that RankBrain plugs in late post-retrieval.
This is an important distinction because it means that RankBrain doesn’t rewrite the query before Google goes looking for results but instead does so afterwards.
So Google retrieves results using the raw query but then RankBrain might rewrite the query or interpret it differently in an effort to select and reorder the results for that query.
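The difference between rewriting a query before retrieval and reinterpreting it afterwards is easier to see in a toy two-stage pipeline. The Python sketch below is purely hypothetical: the function names, the scoring rule and the hand-written "interpretation" of the query are all assumptions made for illustration, and none of it reflects Google's actual systems. It borrows the Super Mario example from Gary Illyes above.

```python
def retrieve(raw_query, documents):
    """Stage 1: return every document sharing at least one term with the raw query."""
    query_terms = set(raw_query.lower().split())
    return [doc for doc in documents if query_terms & set(doc.lower().split())]


def rerank(candidates, wanted_terms, unwanted_terms):
    """Stage 2: reorder the already-retrieved candidates using an interpretation of the query.

    The interpretation is hand-coded here; a learned system would derive it itself.
    """
    def score(doc):
        words = set(doc.lower().split())
        # Reward wanted terms and penalize terms the searcher asked to avoid.
        return len(words & wanted_terms) - 2 * len(words & unwanted_terms)

    return sorted(candidates, key=score, reverse=True)


documents = [
    "super mario 100 percent walkthrough",
    "super mario 100 percent tips no guide needed",
    "banana bread recipe",
]

raw_query = "super mario 100 percent without walkthrough"

# Stage 1 uses the raw query as typed, so both Mario pages come back
# and the negation ("without") plays no role yet.
candidates = retrieve(raw_query, documents)

# Stage 2 applies the interpretation: the searcher does NOT want a walkthrough.
results = rerank(
    candidates,
    wanted_terms={"super", "mario", "100", "percent"},
    unwanted_terms={"walkthrough"},
)

print(results)
```

The set of candidate pages never changes in the second stage. Only their order does, which is why a post-retrieval system can only select from and reorder what the raw query already pulled in.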
In addition, Haahr makes it clear that RankBrain has access to a subset of signals and the query. As I mentioned this makes RankBrain feel more like a meta-signal instead of a stand-alone signal.
What we don’t know are the exact signals that make up that subset. Many will take this statement to theorize that it uses link data or click data or any number of other signals. The fact is we have no idea which signals RankBrain has access to, what weight RankBrain might give them, or whether they’re used evenly across all queries.
The inability to know the variables makes any type of regression analysis of RankBrain a non-starter.
Of course there’s also the statement that they don’t know what RankBrain is doing. That’s because RankBrain is a deep learning algorithm performing unsupervised learning. It’s creating its own rules.
More to the point, if a Google Ranking Engineer doesn’t know what RankBrain is doing, do you think that anyone outside of Google suddenly understands it better? The answer is no.
You Can’t Optimize For RankBrain
You can’t optimize for RankBrain based on what we know about what it is and how it works. At its core RankBrain is about better understanding of language, whether that’s within documents or queries.
So what can you do differently based on this knowledge?
Google is looking at the words, sentences and paragraphs and turning them into mathematical vectors. It’s trying to assign meaning to that chunk of text so it can better match it to complex query syntax.
The only thing you can do is to improve your writing so that Google can better understand the meaning of your content. But that’s not really optimizing for RankBrain; that’s just doing proper SEO and delivering a better user experience (UX).
By improving your writing and making it clearer you’ll wind up earning more links and, over time, be seen as an authority on that topic. So you’ll be covered no matter what other signals RankBrain is using.
The one thing you shouldn’t do is think that RankBrain will figure out your poor writing or that you now have the license to, like, write super conversationally you know. Strong writing matters more now than it ever has before.
TL;DR
RankBrain is a deep learning algorithm that plugs in post-retrieval and relies on variable-length text vectors and other signals to make better sense of complex natural language queries. While fascinating, there is nothing one can do to specifically optimize for RankBrain.