Google is retiring ‘noindex’ and other unsupported robots.txt code


In a new post on its official Webmaster Central Blog, Google announced that it is retiring the code that handles unsupported and unpublished rules in the robots.txt file, which websites use to instruct search engines how to crawl and index their pages. The announcement follows an earlier blog post revealing that Google's production robots.txt parser is being open-sourced.


“In the interest of maintaining a healthy ecosystem and preparing for potential future open source releases, we’re retiring all code that handles unsupported and unpublished rules (such as noindex) on September 1, 2019.”


The pending change means that websites currently using a noindex directive in their robots.txt file need to find and implement an alternative before September 1st. Google recommends several alternatives (the first, second, and fourth are illustrated in the sketch after this list):

  1. Use the noindex directive in robots meta tags (Google calls this the most effective way to prevent URLs from being indexed)
  2. Return a 404 or 410 HTTP status code to indicate the page no longer exists and should be removed from Google’s index
  3. Password-protect pages; content behind a login is generally dropped from the index unless it is marked up as paywalled or subscription content
  4. Use Disallow in robots.txt to prevent the page from being crawled, which usually means the content on that page won’t be indexed
  5. Use Google’s Search Console Remove URL tool
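
As a rough illustration of the first, second, and fourth alternatives, here is a minimal sketch using only Python’s standard library. The paths (/private-page, /old-page, /private/) and the port are hypothetical examples; the robots meta tag, its HTTP-header equivalent X-Robots-Tag, and the 410 status are the standard mechanisms, not code taken from Google’s post.

```python
# Minimal sketch of noindex alternatives: meta tag / X-Robots-Tag header,
# a 410 Gone response, and a robots.txt Disallow rule.
from http.server import BaseHTTPRequestHandler, HTTPServer

ROBOTS_TXT = (
    b"User-agent: *\n"
    b"Disallow: /private/\n"  # blocks crawling; robots.txt 'noindex' lines are no longer honored
)

NOINDEX_PAGE = (
    b"<!doctype html>\n"
    b"<html><head>\n"
    b'<meta name="robots" content="noindex">\n'  # alternative 1: robots meta tag
    b"</head><body>Not for the index</body></html>\n"
)

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/robots.txt":
            # Alternative 4: Disallow the section you don't want crawled
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(ROBOTS_TXT)
        elif self.path == "/private-page":
            # Alternative 1: noindex via the meta tag plus the equivalent response header
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("X-Robots-Tag", "noindex")
            self.end_headers()
            self.wfile.write(NOINDEX_PAGE)
        elif self.path == "/old-page":
            # Alternative 2: 410 Gone signals the page should drop out of the index
            self.send_response(410)
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), Handler).serve_forever()
```

Any one of these signals is enough on its own; the point is simply to move the noindex instruction out of robots.txt and into the page, its HTTP response, or a Disallow rule before the September 1st cutoff.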

Ultimately, this change is simply Google cleaning up how robots.txt is used so that there is a more uniform set of rules to follow. It shouldn’t have any negative effect on your website or its ranking, as long as you implement an alternative method for flagging which pages you don’t want indexed.




