Funky Robots.txt Issue

mackem

Not sure if anyone can help out with this one, but for some reason the following is being added to my robots.txt file despite my not setting it anywhere in my WordPress install:

Code:
Disallow: /wp-content/uploads/

The site is hosted on SiteGround, runs the Yoast SEO plugin, and goes through Cloudflare.

Bit at a loss as to what's causing this, so any help is appreciated!
 
You may not realize this, but the /wp-content/uploads/ folder holds size variations of each image. So if you upload an image, WordPress will create a thumbnail version, keep the original, and add 2-3 other cropped versions. You DO NOT want that folder indexed - someone is saving your website from getting mass-indexed with a ton of low-quality links that would impact your overall “Domain Authority”.

However, people should be allowed to make their own mistakes if they really want to. Freedom. So to check whether it’s Cloudflare, edit your /etc/hosts file so that when your computer goes to your domain it goes straight to your server’s IP address, bypassing Cloudflare. Then check the robots.txt file: if the entry is gone, which I doubt, then it’s Cloudflare; if it’s still there, it’s coming from your server. There is very little chance Cloudflare is doing this level of edit, but nothing surprises me anymore.
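If you'd rather not touch /etc/hosts, the same check can be sketched in Python: connect to the origin IP directly, but still present the real hostname for TLS and the Host header, then compare that against what the public URL serves. This is only a rough sketch. The function names are made up for illustration, and it assumes the origin answers HTTPS on port 443 with a certificate your machine trusts (a Cloudflare-issued origin certificate would fail verification here).

```python
import http.client
import socket
import ssl


def fetch_via_origin(origin_ip, hostname, path="/robots.txt", timeout=10.0):
    """Fetch `path` straight from origin_ip, skipping DNS (and so the CDN).

    SNI and the Host header still carry `hostname`, so the server can pick
    the right virtual host. Returns (status_code, body_text).
    """
    context = ssl.create_default_context()
    raw = socket.create_connection((origin_ip, 443), timeout=timeout)
    try:
        tls = context.wrap_socket(raw, server_hostname=hostname)
    except Exception:
        raw.close()
        raise
    conn = http.client.HTTPSConnection(hostname, timeout=timeout, context=context)
    # Hand http.client an already-connected socket so it never resolves
    # `hostname` itself; the name is only used for the Host header.
    conn.sock = tls
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8", "replace")
    finally:
        conn.close()


def lines_only_at_edge(origin_text, edge_text):
    """Return non-empty lines served publicly that the origin never sent."""
    origin_lines = {line.strip() for line in origin_text.splitlines()}
    return [
        line.strip()
        for line in edge_text.splitlines()
        if line.strip() and line.strip() not in origin_lines
    ]
```

Feed `lines_only_at_edge` the body from `fetch_via_origin` and the body you get from the normal public URL. If the list comes back empty, whatever is adding the Disallow line lives on the server side (plugin, theme or host) and not at Cloudflare.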

Most likely it’s your plugin. Disable the plugin and check whether the robots file goes back to normal. It could also be another plugin, like an image-specific one. You guys really should be staying away from all these plugins. The more plugins, the worse the problems you’ll have down the road.

I’ve also seen instances of hosting companies dynamically editing robots.txt files to protect their bandwidth from “uneducated” users.
 
Thanks @CCarter. For some reason, it looks like CloudFlare was caching the robots.txt file when I was making changes.

Taking on board your comments, I've kept this as disallowed for Googlebot but added an exception for Twitter so that the cards show up properly on the feed (the reason I was looking into this).
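For anyone wanting to do the same, a setup like that might look roughly like the robots.txt text in this sketch, which uses Python's built-in urllib.robotparser to sanity-check who is allowed where. The rules themselves are an assumption (the actual file isn't posted in the thread), `Twitterbot` is the user-agent token Twitter's card crawler identifies as, and the example.com image URL is made up. robotparser's matching is simpler than what Google does itself, so treat it as a quick check only.

```python
from urllib.robotparser import RobotFileParser

# Assumed rules: everyone is kept out of uploads except Twitter's crawler.
ROBOTS_TXT = """\
User-agent: Twitterbot
Allow: /wp-content/uploads/

User-agent: *
Disallow: /wp-content/uploads/
"""

# Made-up image URL, for illustration only.
IMAGE = "https://example.com/wp-content/uploads/2018/09/card.jpg"


def can_crawl(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Twitterbot", "Googlebot"):
        print(agent, can_crawl(ROBOTS_TXT, agent, IMAGE))
```

With these rules Twitterbot gets through to the image while Googlebot falls under the catch-all group and is blocked from the uploads folder only.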
 
How do you go about deindexing the uploads folder? Just noticed it is indexed on one of my sites.
 
CCarter said: "You DO NOT want that folder indexed"
True, you don't want this full directory indexed, but you don't want to block Google from accessing images that you use within your content either. Otherwise Google will not discover and index your images. If you request the uploads directory itself (/wp-content/uploads/) rather than a full image path, it should return a 403, which keeps the directory listing from being indexed.

Google should only be able to discover images that you have linked to within your content. It will never see the other versions, because with the 403 it has no way to crawl that directory.

"how do you go about deindexing the uploads folder? Just noticed it is indexed on one of my sites"

Can you view this directory yourself if you go to example.com/wp-content/uploads/? If you can, you may want to look into getting this to return a 403.
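A quick way to test that without a browser is to request the bare directory and look at the status code. Here is a minimal Python sketch; example.com is a stand-in, the function names are my own, and the "Index of /" check assumes an Apache or nginx style auto-generated listing page.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return (status_code, body_text) for a GET, including 4xx/5xx replies."""
    request = urllib.request.Request(url, headers={"User-Agent": "uploads-check/0.1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # urllib raises on 403/404; the status code is what we are after.
        return error.code, error.read().decode("utf-8", "replace")


def listing_is_exposed(status, body):
    """True when the server answered 200 with what looks like a file listing."""
    return status == 200 and "index of /" in body.lower()


# Example (needs network; example.com is a placeholder):
# status, body = fetch_status("https://example.com/wp-content/uploads/")
# print(status, listing_is_exposed(status, body))
```

If it reports an exposed listing, on Apache-based hosts the usual fix is turning off directory indexes (`Options -Indexes` in .htaccess), after which the bare directory should answer 403 while the individual image URLs keep working.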

For the Cloudflare issue, they likely didn't add it, but they do sometimes cache it. You might have a plugin adding it. Under Page Rules I have the Cache Level for robots.txt set to Bypass.
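To see whether a given response actually came out of Cloudflare's cache, you can look at the `CF-Cache-Status` response header. A small Python sketch of that check follows; the function name is made up, and which status values count as "cached" is my reading of Cloudflare's documented values, so double-check it against their docs.

```python
# Status values that indicate the body was served from Cloudflare's cache
# (assumption based on Cloudflare's documented CF-Cache-Status values).
CACHED_STATUSES = {"HIT", "STALE", "REVALIDATED", "UPDATING"}


def served_from_cloudflare_cache(headers):
    """True if the CF-Cache-Status header says the body came from cache.

    `headers` is any mapping of response header names to values; header
    names are compared case-insensitively.
    """
    status = ""
    for name, value in headers.items():
        if name.lower() == "cf-cache-status":
            status = value.strip().upper()
    return status in CACHED_STATUSES
```

With a bypass rule in place for robots.txt you would expect `BYPASS` or `DYNAMIC` there rather than `HIT`, so edits to the file show up straight away.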
 