Content Pruning - Did You See an Increase in SERP Visibility?

bernard

BuSo Pro
Joined
Dec 31, 2016
Messages
835
Likes
630
Degree
3
I'm also just now seeing some really solid jumps in the rankings. I have NOT pruned, because I only have around 60 posts anyway, but I've done a lot of EAT stuff, on site and off site.
I just saw particular ranking success on "cheap widget", which is a product feed page combining 10+ feeds with 500+ products. I'm the only one to do so. Only 300-400 words of content. Search intent reigns supreme. Cheap = the user wants a selection of choices ranked by price. Could Google really be smart enough to recognize a product comparison search engine?
 

Stones

BuSo Pro
Joined
Nov 3, 2018
Messages
56
Likes
36
Degree
0
This seemed like the best place to post this rather than a new thread.

Big Established Site

I came across a large fashion site that has been around for approx 10 years, with current Ahrefs monthly traffic of over a million and 5-6k referring domains. It looks like they upped their game in 2019 with increased inbound links and a large amount of pruning/optimising content.

Standard Stuff / On A Large Scale

Articles have been consolidated into better articles, removing the ugly URLs (year/date/month). But due to the nature of the fashion niche, a lot of articles were cut entirely, presumably no longer relevant or fashionable. The site has been cut down significantly and stands at about 9,000 pages in the index ATM.

301'd To NoIndex

The interesting thing about what they've done is that everything is 301'd to an inner page named "Archived Articles", with a short paragraph explaining that things change, and that rather than serve up an old article, here is the best of their recent content, etc. The page then links to categories/profitable affiliate articles. Think a custom 404 page for SEO.

But this site has marked the page noindex/follow.

Any Thoughts?

The almost 400 linking domains are pointing at an unrelated page, although the whole domain is pretty niched down. What's the angle with noindex?

The current orthodoxy is that noindex will lead to (an effective) nofollow tag after some time (if you believe this).

I'd be interested to hear anyone's thoughts on this.
 

Ryuzaki

女性以上のお金
Staff member
BuSo Pro
Digital Strategist
Joined
Sep 3, 2014
Messages
3,974
Likes
7,585
Degree
8
The current orthodoxy is that noindex will lead to (an effective) nofollow tag after some time (if you believe this).
This was one of those stupid, vague, cryptic statements that another Google employee debunked on the same day, but the debunking never outpaces the #fakenews. John Mueller sometimes plays the same old ego game Matt Cutts used to play: "I know something you don't know, not because I'm smarter, but because I have access. But if I state it like a wise old sage where you have to unravel the riddle, then you'll think I'm smart."

What John Mueller meant by this is that if a page is noindex'd then it won't ever be crawled again, so the links won't ever be followed again.

Gary Illyes came in and rained on his parade by fully explaining it. If a page has internal links leading to it, external links leading to it, menu links leading to paginated links, etc... if it's easily crawlable, it will be crawled. And the links won't ever be nofollow'd. The only way John's scenario would ever play out is if your noindex page was completely orphaned from the rest of the web.

You can definitely have pages be noindex with followed links. That's actually the default status. And if Google didn't include all of this in their link graph, they would have a very inaccurate and broken link graph. And if they didn't crawl noindex pages they wouldn't discover a lot of the web pages out there to begin with.

I like that solution the fashion site did, though. Like you said, it's like a custom 404 for SEO to distribute the page rank juice where they want it to go. Clever.
 

Stones

BuSo Pro
Joined
Nov 3, 2018
Messages
56
Likes
36
Degree
0
@Ryuzaki, I missed the Gary Illyes statement on this, and it was only when I saw that Yoast had changed how they handle archives that I put some faith in it. Yoast...
 
Joined
Mar 15, 2018
Messages
49
Likes
10
Degree
0
After deleting content and submitting a sitemap containing the deleted URLs, Google Search Console shows them as errors. I created the sitemap since @Ryuzaki suggested this might accelerate the deindexing of the deleted content. Should I just ignore the errors, or is there anything I can do to let Google know this is on purpose?
 

Ryuzaki

女性以上のお金
Staff member
BuSo Pro
Digital Strategist
Joined
Sep 3, 2014
Messages
3,974
Likes
7,585
Degree
8
@F5K7, they are technically "errors." Google expected content it knew about and got a 404 error instead. It's only flagged as an error because you have them in a sitemap. Once you remove that sitemap it'll move to the gray tab instead of the red tab. This is fine and not really an error; it's just where they place them in the Coverage Report.

After starting this thread I ended up talking about this issue in various places on the forum. Here are two posts you'll find of interest:
You asked what you can do to make Google know it's on purpose. You can throw a 410 error instead of a 404. You can think of a 404 error as "Missing" and a 410 as "Gone" (as in, yes, there was content here and we purposefully have removed it).

I never did the 410 method simply because I didn't want to deal with writing code for it. But with the 404 you may have to get the pages crawled a couple of times before Google says "okay, we get it, it's gone and we'll deindex it." Most pages will drop out pretty quickly and the final stragglers will take months.

There were a few times, since I had so many URLs, that I ended up filtering the temporary sitemap down to what was left so I could get a fresh look in the Coverage Report. This also helped get those URLs crawled again.
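That sitemap filtering is easy to script. A minimal sketch in Python (the sitemap and "still indexed" URL set are made-up examples; in practice you'd feed in your real temporary sitemap and whatever the Coverage Report still shows, with file I/O added):

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def filter_sitemap(sitemap_xml, still_indexed):
    """Drop <url> entries whose <loc> has already been deindexed."""
    root = ET.fromstring(sitemap_xml)
    for url in list(root):
        loc = url.find(NS + "loc").text.strip()
        if loc not in still_indexed:
            root.remove(url)
    return ET.tostring(root, encoding="unicode")

# Made-up example sitemap with one URL already gone from the index
sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/still-indexed-post/</loc></url>"
    "<url><loc>https://example.com/already-deindexed-post/</loc></url>"
    "</urlset>"
)
print(filter_sitemap(sitemap, {"https://example.com/still-indexed-post/"}))
```

Resubmitting the shrunken sitemap gives Google a fresh, smaller list to recrawl, which is what makes the stragglers show up clearly in the report.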

_____

UPDATE
I never really updated what happened with this mini-project, but after removing the 147 low quality posts, fixing a few indexation errors with categories, and then fixing around 700 blank URLs being indexed due to robots.txt, I got a full recovery as far as I can tell. I became convinced it was a Panda problem and treated it as such, and it took about 11 months after the fixes were deployed to finally pop back up in the SERPs:

 

secretagentdad

Keyword Sheeter - The Bestest Keyword Tool
BuSo Pro
Joined
Jan 29, 2015
Messages
265
Likes
304
Degree
1
UPDATE
I never really updated what happened with this mini-project, but removing the 147 low quality posts, fixing a few indexation errors with categories, and then fixing around 700 blank URLs being indexed due to robots.txt... I got a full recovery as far as I can tell so far. I became convinced it was a Panda problem and treated it as such and it took about 11 months after the fixes were deployed to finally pop back up in the SERPs:

I feel like you deserve some kinda congratulatory trophy for defeating a black and white animal very publicly.
 
Joined
Mar 15, 2018
Messages
49
Likes
10
Degree
0
I never did the 410 method simply because I didn't want to deal with writing code for it. But with the 404 you may have to get the pages crawled a couple of times before Google says "okay, we get it, it's gone and we'll deindex it." Most pages will drop out pretty quickly and the final stragglers will take months.
What do you think about serving a 410 for all 404 pages? That would be pretty easy to do and I don't see the downside.

Alternatively there is a plugin that lets you serve 410s for posts that are in trash.

What's the easiest way to serve 410s for a list of URLs?
 

Ryuzaki

女性以上のお金
Staff member
BuSo Pro
Digital Strategist
Joined
Sep 3, 2014
Messages
3,974
Likes
7,585
Degree
8
@F5K7, that 410's for Trashed Posts sounds like an easy solution for this job. Once a post is deindexed you can perma-delete it, then ultimately delete the plugin once they're all gone.

Otherwise you'd probably need to use PHP to change the HTTP headers or do it within your .htaccess file. With the huge lists of URLs many of us are working with, that's kind of prohibitive, or at least a time waster. I like that "if post is in trash, then 410" method.
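For reference, the .htaccess route is only one line per URL on Apache, since mod_alias's `Redirect` directive accepts `gone` as a status (the paths below are made-up examples, not from this thread):

```apache
# Serve "410 Gone" for individual pruned URLs (example paths)
Redirect gone /2014/05/old-thin-post/
Redirect gone /another-pruned-post/

# Or 410 a whole pruned section at once with a pattern
RedirectMatch gone ^/tag/
```

The per-URL lines are still tedious to generate for thousands of pages, which is why the trash-based plugin approach is attractive, but for a few dozen URLs this is quick.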

What do you think about serving a 410 for all 404 pages? That would be pretty easy to do and I don't see the downside.
I wouldn't do this. 404 errors happen ALL the time. The amount I have showing up in the Search Console Coverage Report is insane, and that's just the ones Google is deciding to report. People link to me in crazy broken ways, they make up URLs hoping a page exists, etc.

What I don't want to do is serve a 410 error there. There's zero risk of those getting indexed because they go to my 404 page, and the last thing I want to do is imply there was ever any content purposefully at those busted URLs, which a 410 does imply. "It was here, now it's not" versus the 404's "never was here, maybe it was, dunno, it's not here now, someone made a mistake."
 
Joined
Aug 8, 2018
Messages
12
Likes
9
Degree
0
I'm not sure if this was mentioned already, but I was just doing something similar due to an affiliate link cloaking plugin. The old Search Console has a URL Removal Tool. I used it today and the URLs were out of the index in just a few hours. The downside is that it may only be temporary (90 days). You can also do all URLs that start with a specific prefix in one shot.
 

Ryuzaki

女性以上のお金
Staff member
BuSo Pro
Digital Strategist
Joined
Sep 3, 2014
Messages
3,974
Likes
7,585
Degree
8
I'm not sure if this was mentioned already, but I was just doing something similar due to an affiliate link cloaking plugin. The old Search Console has a URL Removal Tool. I used it today and the URLs were out of the index in just a few hours. The downside is that it may only be temporary (90 days). You can also do all URLs that start with a specific prefix in one shot.
Yeah but the problem is this only hides them from the SERPs and doesn't remove them from the index. It's how I prolonged my own problem, trying this a couple times. You have to actually get them deindexed if you want to recover from any negative Panda effects, even if they're hidden from view in the SERPs from the URL Removal Tool.
 
Joined
Aug 8, 2018
Messages
12
Likes
9
Degree
0
Ah ok. I'll be sure to get them recrawled in the mean time now that the problem should be fixed. Thanks.
 
Joined
Mar 15, 2018
Messages
49
Likes
10
Degree
0
Anyone got an idea why Google won't drop domains out of the index that were moved with a 301?

I've checked several domains I have redirected, some of them years ago, but they are still indexed with apparently all of their pages. I indicated the move in Search Console. I wonder if this also affects the number of pages Google has indexed for a site and the overall website quality rating.
 

Ryuzaki

女性以上のお金
Staff member
BuSo Pro
Digital Strategist
Joined
Sep 3, 2014
Messages
3,974
Likes
7,585
Degree
8
@F5K7, because sometimes Google thinks it's more relevant to show the SERP results of the original domain when something related to that domain is searched. The users are still redirected where you want them, but Google is showing results that earn a click and leave a more satisfied searcher. It helps the searcher understand that a change has occurred with the website, versus showing them a result for a website they never searched for.

Like if the searcher looked for "Tom's BBQ Recipe" but you 301'd it to Jerry's site, it makes more sense to display Tom's site than Jerry's, even though they land on Jerry's site ultimately.
 
Joined
Mar 15, 2018
Messages
49
Likes
10
Degree
0
What do you think about installing a plugin that redirects any 404s to closely related pages, if available?

If you delete huge amounts of content, there's a good chance that the links you discredited as irrelevant add up. This way you wouldn't lose any backlinks. Obviously that's only a solution for big sites where individual 301 redirects aren't feasible.
 
Joined
Sep 3, 2015
Messages
395
Likes
184
Degree
1
What do you think about installing a plugin that redirects any 404s to closely related pages, if available?

If you delete huge amounts of content, there's a good chance that the links you discredited as irrelevant add up. This way you wouldn't lose any backlinks. Obviously that's only a solution for big sites where individual 301 redirects aren't feasible.
I use the Redirection plugin to monitor my 404s, and every few days I check and manually redirect anything to the correct (or next best) content.
 
Joined
Apr 12, 2019
Messages
136
Likes
216
Degree
1
What do you think about installing a plugin that redirects any 404s to closely related pages, if available?

If you delete huge amounts of content, there's a good chance that the links you discredited as irrelevant add up. This way you wouldn't lose any backlinks. Obviously that's only a solution for big sites where individual 301 redirects aren't feasible.
I wouldn't trust a plugin to determine which page is most relevant. I prefer to set up the redirect right when I delete the post. Rarely do I have a deleted page 404; there's usually something semi-relevant that I can 301 it to.

But if you go the 404 route, @Darth's idea is nice.
 
Joined
Sep 17, 2014
Messages
454
Likes
311
Degree
2
@F5K7, there is a plugin for WordPress, Redirection by John Godley, that's trusted and supported; it was just updated 4 days ago. If you're worried about your .htaccess file getting too long and bloated, this does PHP redirects. You still have to create each one individually (is that your concern?), but it puts the redirects in the database instead of .htaccess.
 
Joined
Mar 15, 2018
Messages
49
Likes
10
Degree
0
Thanks for all the replies.

I use the redirection plugin to monitor my 404s and every few days check and manually redirect anything to the correct (or next best) content.
That's a nice idea; however, I'm more concerned about links that I missed or discredited while pruning 80% of the content (10,000+ pages).

I wouldn't trust a plugin to determine which page is most relevant. I prefer to setup the redirect right when I delete the post.
This is unfortunately not feasible for the amount of pages I am dealing with.

If your worried about your .htaccess file getting too long and bloated, this does PHP redirects. You still have to create each one individually (is that your concern?)
It's more that I don't want to go through 10,000+ pages and redirect them manually. None of them have links above a certain DR; I kept those pages, of course. But there are many natural contextual links that I had to ignore when deleting because of low DR, otherwise I couldn't have pruned properly. Now I think having those low-DR links pointing at 404s is less beneficial than using this plugin, which actually does decent redirects.
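At that scale, one alternative worth sketching: generate the redirect map offline by fuzzy-matching each deleted slug against the slugs you kept, then eyeball the output before importing it into a redirect plugin, so a human (not a runtime plugin) has the final say. A rough Python sketch using the standard library's `difflib` (the slug lists below are placeholders, not from this thread):

```python
from difflib import get_close_matches

def build_redirect_map(deleted_slugs, kept_slugs, cutoff=0.6):
    """Map each deleted slug to its closest surviving slug, if any passes the cutoff."""
    mapping = {}
    for slug in deleted_slugs:
        match = get_close_matches(slug, kept_slugs, n=1, cutoff=cutoff)
        if match:
            mapping[slug] = match[0]  # candidate 301 target; review by hand
        # no close match: leave that URL to 404/410 instead
    return mapping

# Placeholder slugs for illustration
print(build_redirect_map(
    ["red-dress-trends-2015", "weird-orphan-page"],
    ["red-dress-trends", "blue-shoes-guide"],
))
```

Anything below the cutoff stays unmapped on purpose; a 404 is better than a 301 to an irrelevant page.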
 
Joined
Oct 14, 2014
Messages
78
Likes
59
Degree
0
This was one of those stupid, vague, cryptic statements that another Google employee debunked on the same day, but the debunking never outpaces the #fakenews. John Mueller sometimes plays the same old ego game Matt Cutts used to play: "I know something you don't know, not because I'm smarter, but because I have access. But if I state it like a wise old sage where you have to unravel the riddle, then you'll think I'm smart."
Oh god yes, this. So much this.

As for @Stones's story, I've done something similar with a site of mine that had a ton of old links going to all sorts of thin inner content pages. I 301'd each to an "archive" page, and placed thumbnails and descriptions linking to the main categories, along with a few manually selected featured/popular posts. I didn't noindex, of course. This was done a few years ago and has worked very well. Juice and weight seem to flow as intended, and pretty much all new articles rank without having to do anything else.

This was an expired domain I took over, btw. Like, dropped, repurposed, re-regged.