Index or Noindex for Category and Archive pages in WP?

Joined
Dec 15, 2020
Messages
6
Likes
2
Points
0
For a niche site with 5 categories, should I set the Category and Archive pages to NOINDEX in WP via .htaccess? What are the recommendations?
 

Ryuzaki

お前はもう死んでいる
Moderator
Joined
Sep 3, 2014
Messages
4,823
Likes
9,297
Points
9
To be very clear, Google's answer is "no". Keep it all indexed.

My personal answer is "yes, but with a caveat," with the caveat being that I keep the first page of the pagination indexed, and I set /page/2/ and onward to noindex.

You won't have access to the logic you need to pull that off in the .htaccess file anyways. You'll want to use functions.php to detect the templates or page types and then insert the robots meta tag into the <head> by hooking into the wp_head action that wp_head() fires.

I absolutely don't use tags and would noindex them if I did because they get so out of hand. You'll have a million tags with one post each, dragging down your site's quality score because so many of those thin pages are being indexed.

And that's why I noindex the paginated pages of archives in general. They're all duplicate content (excerpts) from posts. It's okay to have them, but I do everything possible to keep my quality (Panda) score as high as possible. On /page/1/, the only one I keep indexed, I add custom, static content so it's not all duplicated from across my site.

On a side note, some people will set "noindex, nofollow" so all links on the page are nofollowed. I don't do that. Just because my paginated pages are set to noindex does not mean that I don't want Google crawling them for discovery and passing page rank through the links. I just don't want them indexed. Big difference here.

That's what I do, and unless you're interested in getting into the nitty-gritty of it (creating a child theme and a new functions.php, figuring out how to code it to work the way you want), the answer is no. Just let it be indexed, as long as there are enough posts in there that they aren't basically empty "trash" pages you wouldn't want indexed.
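For anyone weighing up the coding effort, here is a minimal sketch of the decision logic described in this post, written in Python purely for illustration. It is not WordPress code: `robots_directive`, `robots_meta_tag`, and the page-type strings are hypothetical names, and in a real theme the same checks would presumably be made in functions.php with WordPress conditional tags such as is_tag() and is_paged(), with the output hooked into wp_head.

```python
def robots_directive(page_type: str, page_number: int = 1) -> str:
    """Return the robots meta content for one page under the policy above.

    page_type and page_number are hypothetical inputs for this sketch; in
    WordPress they would come from the theme's own conditional checks.
    """
    if page_type == "tag":
        # Tag archives are kept out of the index, but their links are followed.
        return "noindex, follow"
    if page_type in {"category", "author", "date"} and page_number >= 2:
        # /page/2/ and onward: not indexed, still crawled for discovery.
        return "noindex, follow"
    # Posts, static pages, and the first page of an archive stay indexable.
    return "index, follow"


def robots_meta_tag(page_type: str, page_number: int = 1) -> str:
    """Build the tag that would be printed inside <head>."""
    content = robots_directive(page_type, page_number)
    return f'<meta name="robots" content="{content}">'
```

Under these assumptions, page 1 of a category comes back as "index, follow", while /page/2/ and any tag archive come back as "noindex, follow", so the links on them are still crawled.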
 
Joined
Sep 3, 2015
Messages
502
Likes
237
Points
1
This topic has come up again - https://www.seroundtable.com/google-on-pagination-indexing-31018.html

One commenter points out that if you noindex archives, then Google will eventually stop visiting and seeing that some of your older (and potentially still highly relevant and valuable) pages are linked. They would then become orphaned in Google's eyes and drop?

John also said that it's OK to just noindex...
 
Joined
Mar 27, 2015
Messages
427
Likes
566
Points
2
Noindex, follow, and load the category pages with plenty of posts per page so the pagination depth does not get crazy and posts do not get buried as they age.
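To put rough numbers on the depth point, here is a small Python sketch. The post counts and per-page settings are made-up examples, and `pagination_depth` is a hypothetical helper, not anything WordPress provides.

```python
import math


def pagination_depth(total_posts: int, posts_per_page: int) -> int:
    """Number of paginated archive pages needed to list every post."""
    return math.ceil(total_posts / posts_per_page)


# Hypothetical category with 200 posts.
shallow = pagination_depth(200, 50)  # 4 pages
deep = pagination_depth(200, 10)     # 20 pages
```

With 50 posts per page the oldest post in that example sits 4 pages deep; with 10 per page it sits 20 pages deep.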
 

Ryuzaki

Quote: "...then Google will eventually stop visiting and seeing that some of your older (and potentially still highly relevant and valuable) pages are linked. They would then become orphaned in Google's eyes and drop?"
Google won't visit through your sitemap or your list of indexed pages if it's set to noindex. But they'll visit the first page of the category and then crawl through the rest through the pagination.

Your other posts won't be orphaned because page rank flows through posts that aren't in the index, too. If they're crawlable, they aren't orphaned.

Plus you're likely interlinking anyways. Every important post is going to have at least one internal link to it. You could even go through the effort to make sure every post gets one internal link.

But nothing will become orphaned or suffer crawling problems because the posts will be in the index (a crawling origin point) and will be in your sitemap (a crawling origin point). Not to mention many will have at least one backlink, too.
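The orphaning argument can be sketched as a reachability check over a toy link graph. This is only an illustration in Python: the URLs are invented, `reachable` is a hypothetical helper, and the model assumes the crawler keeps following links on noindexed pages, which is the premise of this post rather than a guarantee.

```python
from collections import deque


def reachable(links, start_points):
    """Breadth-first walk over internal links from the given crawl origins."""
    seen = set(start_points)
    queue = deque(start_points)
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: /news/page/2/ is noindexed but still links to posts.
links = {
    "/": ["/news/"],
    "/news/": ["/new-post/", "/news/page/2/"],
    "/news/page/2/": ["/old-post/"],
}
posts = {"/new-post/", "/old-post/", "/unlinked-post/"}

# Crawl starting only from the homepage.
orphans_from_home = posts - reachable(links, ["/"])

# Crawl starting from the homepage plus a sitemap entry for every post.
orphans_with_sitemap = posts - reachable(links, ["/"] + sorted(posts))
```

In this toy graph the old post is still reached through the noindexed /news/page/2/, and the only post left out when crawling from the homepage is the one with no internal links at all. Adding the sitemap as a crawl origin covers that one too.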
 
Joined
Sep 3, 2015
Quote: "Google won't visit through your sitemap or your list of indexed pages if it's set to noindex. But they'll visit the first page of the category and then crawl through the rest through the pagination.

Your other posts won't be orphaned because page rank flows through posts that aren't in the index, too. If they're crawlable, they aren't orphaned."
I don't quite get why indexing the first page of the archives is necessary. If that is enough to trigger /news/2/ etc. getting crawled, then why wouldn't a link to /news/ from the homepage be enough to get /news/ crawled (despite being noindex) and followed through to /news/2/ etc., achieving the same thing?
 

Ryuzaki

Quote: "I don't quite get why indexing the first page of the archives is necessary? If that is enough to trigger /news/2/ etc getting crawled, then why wouldn't a link to /news/ from the HP be enough to trigger /news/ being crawled (despite being noindex) and followed to /news/2/ etc and achieving the same thing?"
You could say that about any page, that links from the homepage would get them crawled. The reason is that we do want the archives indexed and surfacing in Google. We can take those first paginated pages and add content and extra design to them (which isn't shown on /page/2/ and beyond) to help them be unique and beefier.

Because they're usually linked in the top navigation and sometimes the bottom, and are sitewide links, Google sees them as important to our sites' hierarchies and navigational experiences, and they also collect a lot of page rank. If you take good enough care of these pages, they can rank for nice keywords or a lot of lower-volume ones.

I try to strike a balance between keeping pages with zero unique content out of the index (even though Google expects them, understands them, and says they're fine) and giving Google a first archive page that's high quality enough to surface when it wants to.

The quick answer would be "organic traffic".

I think that this whole thing can be avoided by just letting your paged pages (/category/3/) be indexed. It's how the web works and Google gets it. But I don't trust that those pages won't hurt your quality score, like with Panda. It makes sense if you think about them saying it's fine in the context of something you or I might do.

But then you think about the guys where each new post goes into a category and gets 25 new tags, tags that will never get used again. A site might have 50 posts and 1,250 tag pages, an author archive with 5 pages, 10 categories, and a bunch of monthly date archives. Let's call that 1,300 archive pages of zero quality vs. 50 posts of quality.

That's how quickly that can get out of hand, and I don't for one minute believe that Google is going to let it slide without Panda knocking at their door.
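The numbers in that example add up as follows. This is just the arithmetic from the post in Python; the monthly archive count is not given above, so the 35 here is an assumption chosen to match the rounded total of 1,300.

```python
posts = 50
tags_per_post = 25                  # each tag used once and never again
tag_pages = posts * tags_per_post   # 1,250 single-post tag archives

author_pages = 5
category_pages = 10
monthly_archives = 35               # assumption: the post only says "a bunch"

archive_pages = tag_pages + author_pages + category_pages + monthly_archives
archives_per_post = archive_pages / posts
```

That works out to 1,300 archive pages against 50 posts, or 26 thin archive pages for every real post.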