Google to Stop Supporting the Unofficial Noindex Robots.txt Directive

Ryuzaki · Jul 2, 2019

Source: https://webmasters.googleblog.com/2019/07/a-note-on-unsupported-rules-in-robotstxt.html

Heads up. There's some bad information floating around about using 'noindex' in your robots.txt file. Lots of people do it, and while it was never an officially supported directive for that file, Google would sometimes respect it, as would other crawlers.

In an attempt to get everyone on the same page with robots.txt, Google is open sourcing its parser for it.

This coincides with them completely dropping support for noindex in robots.txt. So if this is you, you're going to need to use one of the other acceptable methods to keep a page out of the index:

1. Noindex in robots meta tags
2. 404 and 410 HTTP status codes
3. Password protection
4. Disallow in robots.txt
5. Search Console Remove URL tool

But note, #3 and #4 only control crawling, not indexing, so your page can still end up in the index if it's still discoverable. #5 leaves the page indexed and only stops it from showing in the SERPs, so it's just for vanity. None of those three is sufficient on its own to noindex a page. #1 and #2 are the only workable methods for publicly discoverable pages.
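If you want to spot check whether a page is actually sending one of the two workable signals, here's a rough sketch in Python. The names `RobotsMetaParser` and `noindex_signal` are made up for illustration, and it assumes you've already fetched the status code, response headers, and HTML yourself. It also checks the `X-Robots-Tag` response header, which is the HTTP header form of the robots meta tag. It's a simplification: it ignores user-agent-specific header values and assumes the header name is cased exactly as shown.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def noindex_signal(status, headers, html):
    """Return the signal keeping this page out of the index, or None if there is none.

    Hypothetical helper: `headers` is a plain dict of response headers.
    """
    # Method #2: the page is gone as far as crawlers are concerned.
    if status in (404, 410):
        return f"status {status}"
    # Method #1, header form: X-Robots-Tag: noindex
    header = headers.get("X-Robots-Tag", "")
    if "noindex" in [d.strip().lower() for d in header.split(",")]:
        return "X-Robots-Tag header"
    # Method #1, HTML form: <meta name="robots" content="noindex">
    parser = RobotsMetaParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        return "robots meta tag"
    return None
```

Keep in mind a crawler has to be able to fetch the page to see a meta tag or header, so a page that is also disallowed in robots.txt won't have its noindex read.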

Why does this matter? Panda. Thin pages you meant to keep out of the index can end up counting against your site's quality.

Ryuzaki · Jul 29, 2019

Final reminder:

Google is now sending out Search Console notices if you're using noindex within robots.txt. It was never officially supported, and now it's flat out not supported. You're going to be a sad camper if this is you and you ignore these warnings.

I know many avoid all Google products on their sites so I want to get the message out there. Beware!
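If you're not on Search Console and won't get the notice, you can check your own robots.txt by hand. Here's a minimal sketch in Python, assuming you've already got the file contents as a string. The function name `find_noindex_rules` and the sample file are made up for illustration, and it only does a simple line-by-line match rather than behaving like Google's actual parser.

```python
def find_noindex_rules(robots_txt):
    """Return (line_number, line) pairs for every 'noindex:' rule in a robots.txt body."""
    hits = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        key, sep, _value = line.partition(":")
        if sep and key.strip().lower() == "noindex":
            hits.append((number, line))
    return hits


# Hypothetical robots.txt contents for demonstration.
sample = """\
User-agent: *
Disallow: /private/
Noindex: /thin-page/
# noindex: /commented-out/
"""

print(find_noindex_rules(sample))  # [(3, 'Noindex: /thin-page/')]
```

Anything it flags needs to move to one of the supported methods.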
