Content Pruning - Did You See an Increase in SERP Visibility?

Joined Dec 31, 2016
I'm also just now seeing some really solid jumps in the rankings. I have NOT pruned, because I only have 60+ posts anyway, but I've done a lot of E-A-T work, on-site and off-site.
I just saw particular ranking success on "cheap widget", which is a product feed page combining 10+ feeds with 500+ products. I'm the only one doing that, and the page only has 300-400 words of content. Search intent reigns supreme: "cheap" means the user wants a selection of choices ranked by price. Could Google really be smart enough to recognize a product comparison search engine?
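To give a clearer picture of the mechanics, the page basically just merges the feeds and orders everything cheapest-first. A rough sketch of the idea (hypothetical feed files and field names, not my exact setup):

PHP:
// Hypothetical merchant feed exports; the real page pulls from 10+ feeds
$feeds = array('merchant_a.json', 'merchant_b.json');
$products = array();
foreach ($feeds as $feed) {
    $items = json_decode(file_get_contents($feed), true);
    $products = array_merge($products, is_array($items) ? $items : array());
}
// "Cheap" intent = show the lowest prices first
usort($products, function ($a, $b) {
    return $a['price'] <=> $b['price'];
});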
 

Stones

This seemed like the best place to post this rather than a new thread.

Big Established Site

I came across a large fashion site that has been around for approximately 10 years, with current Ahrefs monthly traffic of over a million and 5-6k referring domains. It looks like they upped their game in 2019 with increased inbound links and a large amount of pruning/optimising content.

Standard Stuff / On A Large Scale

Articles have been consolidated into better articles, removing the ugly URLs (/year/month/day/). But due to the nature of the fashion niche, a lot of articles were simply cut, presumably because they were no longer relevant or fashionable. The site has been cut down significantly and stands at about 9,000 pages in the index at the moment.

301'd To NoIndex

The interesting thing about what they've done is that everything is 301'd to an inner page named "Archived Articles", with a short paragraph explaining that things change, and that rather than serving up an old article, here is the best of their recent content, etc. The page then links to categories and profitable affiliate articles. Think of it as a custom 404 page built for SEO.

But this site has marked the page noindex/follow.
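I obviously haven't seen their implementation, but on a WordPress stack you could approximate the whole setup with something like this (the old URLs and the page slug below are made up for illustration):

PHP:
// Hypothetical pruned URLs - every removed article funnels into the hub page
add_action('template_redirect', function () {
    $pruned = array('/2013/04/12/peplum-trends/', '/2014/09/01/festival-looks/');
    $path   = trailingslashit(wp_parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));
    if (in_array($path, $pruned, true)) {
        wp_redirect(home_url('/archived-articles/'), 301);
        exit;
    }
});

// The hub itself stays out of the index but its links stay followed
add_action('wp_head', function () {
    if (is_page('archived-articles')) {
        echo '<meta name="robots" content="noindex,follow">' . "\n";
    }
});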

Any Thoughts?

The almost 400 linking domains are now being funneled into an unrelated page, but the whole domain is pretty niched down. What's the angle with noindex?

The current orthodoxy is that noindex will lead to (an effective) nofollow tag after some time (if you believe this).

I'd be interested to hear anyone's thoughts on this.
 

Ryuzaki

The current orthodoxy is that noindex will lead to (an effective) nofollow tag after some time (if you believe this).
This was one of those stupid, vague, cryptic statements that another Google employee debunked on the same day, but the debunking never outpaces the #fakenews. John Mueller sometimes plays the same old ego game Matt Cutts used to play: "I know something you don't know, not because I'm smarter, but because I have access. But if I state it like a wise old sage where you have to unravel the riddle, then you'll think I'm smart."

What John Mueller meant by this is that if a page is noindex'd then it won't ever be crawled again, so the links won't ever be followed again.

Gary Illyes came in and rained on his parade by fully explaining it. If a page has internal links leading to it, external links leading to it, menu links leading to paginated links, etc... if it's easily crawlable, it will be crawled. And the links won't ever be nofollow'd. The only way John's scenario would ever play out is if your noindex page was completely orphaned from the rest of the web.

You can definitely have pages be noindex with followed links. That's actually the default status. And if Google didn't include all of this in their link graph, they would have a very inaccurate and broken link graph. And if they didn't crawl noindex pages they wouldn't discover a lot of the web pages out there to begin with.
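To make it concrete, a plain noindex tag already leaves the links followed; spelling out the "follow" changes nothing. A minimal WordPress sketch of the kind of tag the fashion site is using (only adding nofollow would tell Google to ignore the links too):

PHP:
// noindex with followed links - the "follow" just makes the default explicit
add_action('wp_head', function () {
    echo '<meta name="robots" content="noindex, follow">' . "\n";
});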

I like that solution the fashion site did, though. Like you said, it's like a custom 404 for SEO to distribute the page rank juice where they want it to go. Clever.
 

Stones

@Ryuzaki, I missed the Gary Illyes statement on this, and it was only when I saw that Yoast had changed how they handle archives that I put some faith in it. Yoast...
 
F5K7
After deleting content and submitting a sitemap containing the deleted URLs, Google Search Console shows them as errors. I created the sitemap because @Ryuzaki suggested it might accelerate the deindexing of deleted content. Should I just ignore the errors, or is there anything I can do to let Google know this is intentional?
 

Ryuzaki

@F5K7, they are technically "errors." Google expected content it knew about and got a 404 error instead. It's an error only because you have them in a sitemap. Once you remove that sitemap it'll move to the gray tab instead of the red tab. This is fine and not really an error, it's just where they place them in the Coverage Report.

After starting this thread I ended up talking about this issue in various places on the forum. Here are two posts you'll find of interest:
You asked what you can do to make Google know it's on purpose. You can throw a 410 error instead of a 404. You can think of a 404 error as "Missing" and a 410 as "Gone" (as in, yes, there was content here and we purposefully have removed it).

I never did the 410 method simply because I didn't want to deal with writing code for it. But with the 404 you may have to get the pages crawled a couple of times before Google says "okay, we get it, it's gone and we'll deindex it." Most pages will drop out pretty quickly and the final stragglers will take months.

There were a few times, since I had so many URLs, that I ended up filtering the temporary sitemap down to what was left so I could get a fresh look in the Coverage Report. This also helped get those URLs crawled again.
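If anyone wants to replicate the temporary sitemap trick, it's nothing fancy: just a flat XML list of the removed URLs that you submit in Search Console and delete once everything has dropped out. A rough sketch, with made-up URLs and filename:

PHP:
// Hypothetical list of deleted URLs - swap in your real ones
$removed = array(
    'https://example.com/thin-post-1/',
    'https://example.com/thin-post-2/',
);
$xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
$xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
foreach ($removed as $url) {
    $xml .= '  <url><loc>' . htmlspecialchars($url) . '</loc></url>' . "\n";
}
$xml .= '</urlset>' . "\n";
// Upload this file and submit it in Search Console, then remove it later
file_put_contents('deleted-urls-sitemap.xml', $xml);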

_____

UPDATE
I never really updated what happened with this mini-project, but after removing the 147 low-quality posts, fixing a few indexation errors with categories, and fixing around 700 blank URLs being indexed due to robots.txt, I got a full recovery as far as I can tell. I became convinced it was a Panda problem and treated it as such, and it took about 11 months after the fixes were deployed to finally pop back up in the SERPs.

 

secretagentdad

I became convinced it was a Panda problem and treated it as such, and it took about 11 months after the fixes were deployed to finally pop back up in the SERPs.

I feel like you deserve some kinda congratulatory trophy for defeating a black and white animal very publicly.
 
F5K7
I never did the 410 method simply because I didn't want to deal with writing code for it. But with the 404 you may have to get the pages crawled a couple of times before Google says "okay, we get it, it's gone and we'll deindex it." Most pages will drop out pretty quickly and the final stragglers will take months.
What do you think about serving a 410 for all 404 pages? That would be pretty easy to do, and I don't see the downside.

Alternatively there is a plugin that lets you serve 410s for posts that are in trash.

What's the easiest way to serve 410s for a list of URLs?
 

Ryuzaki

@F5K7, that 410s-for-trashed-posts plugin sounds like an easy solution for this job. Once a post is deindexed you can perma-delete it, then ultimately delete the plugin once they're all gone.

Otherwise you'd probably need to use PHP to change the HTTP headers or do it within your .htaccess file. With the huge lists of URLs many of us are working with, doing it that way is prohibitive, or at least a time waster. I like that "if post is in trash, then 410" method.
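That said, if you did want to handle a hand-picked list in PHP, I never actually built this, but a rough sketch would look something like the following (the paths are made up). Anything not on the list still falls through to the normal 404:

PHP:
// Hypothetical list of pruned paths
add_action('template_redirect', function () {
    $gone = array('/some-deleted-post/', '/another-deleted-post/');
    $path = trailingslashit(wp_parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));
    if (in_array($path, $gone, true)) {
        status_header(410);          // "Gone" instead of the default 404
        nocache_headers();
        include get_404_template();  // visitors still see the theme's 404 page
        exit;
    }
});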

What do you think about serving a 410 for all 404 pages? That would be pretty easy to do, and I don't see the downside.
I wouldn't do this. 404 errors happen ALL the time. The amount I have showing up in the Search Console Coverage Report is insane, and that's just the ones Google is deciding to report. People link to me in crazy broken ways, they make up URLs hoping a page exists, etc.

What I don't want to do is serve a 410 error there. There's zero risk of those URLs getting indexed because they go to my 404 page, and the last thing I want to do is imply there was ever any content purposefully at those busted URLs, which a 410 does imply. It's "it was here, now it's not" versus the 404's "it never was here, or maybe it was, dunno, it's not here now, someone made a mistake."
 
Joined Aug 8, 2018
I'm not sure if this was mentioned already, but I was just doing something similar due to an affiliate link cloaking plugin. The old Search Console has a URL removal tool. I used it today and the URLs were out of the index in just a few hours. The downside is that it may only be temporary (90 days). You can do all URLs that start with a specific prefix in one shot, too.
 

Ryuzaki

I'm not sure if this was mentioned already, but I was just doing something similar due to an affiliate link cloaking plugin. The old Search Console has a URL removal tool. I used it today and the URLs were out of the index in just a few hours. The downside is that it may only be temporary (90 days). You can do all URLs that start with a specific prefix in one shot, too.
Yeah but the problem is this only hides them from the SERPs and doesn't remove them from the index. It's how I prolonged my own problem, trying this a couple times. You have to actually get them deindexed if you want to recover from any negative Panda effects, even if they're hidden from view in the SERPs from the URL Removal Tool.
 
Joined Aug 8, 2018
Ah, OK. I'll be sure to get them recrawled in the meantime now that the problem should be fixed. Thanks.
 
F5K7
Anyone got an idea why Google won't drop domains out of the index that were moved with a 301?

I've checked several domains I redirected, some of them years ago, but they're still indexed with apparently all of their pages. I indicated the move in Search Console. I wonder if this also affects the number of pages Google has indexed for a site and the overall website quality rating.
 

Ryuzaki

@F5K7, because sometimes Google thinks it's more relevant to show the SERP results of the original domain when something related to that domain is searched. The users are still redirected where you want them, but Google is showing results that get the click and leave the searcher more satisfied. It helps the searcher understand that a change has occurred with the website, versus showing them a result for a website they never searched for.

Like if the searcher looked for "Tom's BBQ Recipe" but you 301'd it to Jerry's site, it makes more sense to display Tom's result than Jerry's, even though they ultimately land on Jerry's site.