Big site structure

I'm starting a new site that (I hope) is going to be pretty big content-wise.

The structure is something that I want to get right from the beginning because I know it can be a pain in the ass to fix once your site is big.

Looking at some big sites I got some ideas and this is what I think is best:

  • A virtual silo for the posts
  • A physical silo for the categories

Examples:

Post about Terrier Dogs

mysite.com/something-something-terrier-dogs

At the same time, this post would be in 3 categories:

mysite.com/Animals/Pets/Dogs

I would mark up the post with schema breadcrumbs and at the bottom/top some visible tags for easy navigation

mysite.com/animals would have the most recent posts but also some fixed links to all the subcategories (in this case > pets). Some static and unique content too if possible.

I think that this gives me flexibility in case that I need to add categories/subcategories or change some posts around without breaking anything and with minimal redirects in case something needs to be changed

This also gives the user a good structure to navigate the site.

What do you think?

Thanks!
 
I would advise against posts/categories if you use WordPress. They're made for blogging and regularly updated content, and the standard category archives are usually date based and not visually effective.

Use pages and custom posts. You can have parent-child pages just like categories. It's really not an issue anymore, with Wordpress at least, because you can easily drag-and-drop grid like archive functions into your pages.

With pages, your "parent" pages are also your category pages, except you are not limited to these bad practises of posts/categories. You can have your parent pages be both skyscraper, information content while still having an "archive" section for navigation and indexing purposes.
 
I think that looks good. At the end of the day, I don't think it matters too much any more. I like the hardcoded hierarchy of .com/folder/sub-folder/post-name. But as you've pointed out, there's the huge benefit of being able to move posts around to different categories when you keep a flat hierarchy at the .com/post-name level, without disrupting a single thing.

I like the user to see the sub-folders in the URL because it implies the site is bigger and they can easily navigate by editing the URL too, but there's the negative of URLs getting longer. Short ones are nice too. I don't buy that the benefits of short URLs are that huge, but there is one in regards to the ease of users sharing and pasting them around.

I think you'd be perfectly fine doing it the way you've laid out.
 
I chose category because it's easy to select it when you create a new post.

I create a new post about Terrier dogs, and put it under the "Dogs" category, which by hierarchy is under Animals/Pets

So every category page is filled with that new post. I don't know if you can do that with pages.

I would create beautiful category pages, not the ugly archive page style.

Category pages are in fact harder to style individually, but it can be done via template files named category-{slug}.php or category-{id}.php

Or something via ACF https://www.advancedcustomfields.com/resources/adding-fields-taxonomy-term/

Most of them would fall under some standard layout, I don't want a specific layout for each one.
 
Yeah, I know this exact issue. I thought about it and changed my structure from category to page anyway.

You can have both of course too, the only thing I am saying is, for money pages and the most important static content, it's best to have it in pages. I'd rather add tags to pages than deal with posts.

Consider it SEO wise too, if you have category pages, your breadcrumbs run up to your category page, which indeed you can style, but it isn't meant as a major landing page/money page. Instead, if you have pages with breadcrumbs, then your parent pages will show for every subpage in breadcrumbs and seo value will run up to your parent page, which IS a landing page.

Fundamentally, I don't like using posts because posts are designed as date based taxonomies and it shows in a variety of ways, the category page being one. Pages are meant to be static.

Posts can still be used for news items and the like, but I really think Pages are the better choice long term in Wordpress for your static content.

Anyway, I understand that with Wordpress you can hack your way around.

This guy agrees:

https://nichesiteproject.com/wordpress-silos/#
 
You can add anything you want to a category page by editing the Category Description:

TiqODiO.jpg

Drop this in your functions.php:
Code:
remove_filter('pre_term_description', 'wp_filter_kses');
By doing that, you stop category descriptions from stripping out HTML. Now you can build a post right there for each category to serve as your "parent page."

From there, just drop this at the top of your category.php:
Code:
<?php echo category_description(); ?>

And it'll all display.

That's just as easy (perhaps easier) as going the Page route and having to edit in a custom loop and pagination to handle the archives, or manually adding each and every child page you create.

The reason I'm voting for this method is because DarkRed said it's going to be a big site. Automating and removing as much friction as possible is going to save an enormous amount of time later once he hits scale.

Even if you make a loop to loop through child pages, you lose all the benefits of it being a "static page" where you can order the posts in the loop however you want. You can still achieve this using sticky posts if you desired. Or you can drop internal links out of the content portion to the posts that matter the most (better since it's contextual at that point).
 
Ryu, I agree for scale, it is easier, but why then do so many recommend the page route?

You can add categories to pages too though.

In your meta you can have your "seo breadcrumbs" (note for SEO, I would say this is fairly important), which link to your upper skyscraper content and landing pages. Then below it you can have a "Category: Pets", which links to a normal or edited category page. That's assuming you add categories to pages, which isn't there by default.

More important question: outside user and admin ease of use, are there any legit SEO tests done on pages vs posts on larger sites?
 
Automating and removing as much friction as possible is going to save an enormous amount of time later once he hits scale.
I wanted to highlight this. It's EXTREMELY important to consider, up front, when planning a Wordpress site that you think will reach significant scale.

@ragnar, I totally get where you're coming from. I've been down that path. The concerns you have are real. The real issue has to do with the standard behavior of taxonomies in Wordpress. Typically a set of paginated archives. Those archives, by default, show duplicate posts between category and sub-category. At scale, that is considerably less than ideal.

URL Structure
These days, for most content sites, I tend to prefer the structure @DarkRed laid out:
  • Posts -> Virtual Silo
  • Categories -> Physical Silo
It works well, and makes it considerably easier to deal with re-categorization in the future.

That won't be ideal for every site though. To some degree, it will be dependent on the type of site and niche. For example, if much of the taxonomy is going to remain static and have little reason to change, a pure physical silo (/category/sub-category/post/) might work better. One example might be geographic taxonomies:
/illinois/chicago/real-estate-resource-guide/
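
To make the trade-off concrete, here's a rough sketch in plain PHP (no WordPress involved). The function name, the example.com domain and the slugs are all made up for illustration. The only point is that a flat post URL ignores the category path, while a nested one changes, and needs a redirect, whenever the post moves to another category.

```php
<?php
// Hypothetical helper, not a WordPress function.
// $physical_silo = true nests the post under its category path;
// false keeps the post at the root (virtual silo).
function post_url( string $base, array $category_path, string $slug, bool $physical_silo ): string {
    $segments = $physical_silo
        ? array_merge( $category_path, array( $slug ) )
        : array( $slug );

    return rtrim( $base, '/' ) . '/' . implode( '/', $segments ) . '/';
}

$base = 'https://example.com';
$slug = 'terrier-dogs';

// Re-categorize the post from animals/pets/dogs to animals/working-dogs.
$before = array( 'animals', 'pets', 'dogs' );
$after  = array( 'animals', 'working-dogs' );

// Virtual silo: the URL is identical before and after, so no redirect is needed.
$flat_changed = post_url( $base, $before, $slug, false ) !== post_url( $base, $after, $slug, false );

// Physical silo: the URL changes, so the old one needs a 301.
$nested_changed = post_url( $base, $before, $slug, true ) !== post_url( $base, $after, $slug, true );

echo 'Flat URL changed: ' . ( $flat_changed ? 'yes' : 'no' ) . "\n";
echo 'Nested URL changed: ' . ( $nested_changed ? 'yes' : 'no' ) . "\n";
```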


To get to Ryuzaki's point, let me offer an alternative. Ask yourself, what are we really trying to achieve here? What we're really looking for is:
  1. Good internal link structure
  2. Dedupe posts between taxonomies
  3. Flexibility for taxonomy changes
  4. Good UX
If we think about it, the standard category & post structure, combined with virtual silos, achieves the majority of this. The important thing is, it does this by default. Breadcrumbs help account for #1. Virtual silos account for #3. Various default widgets and related posts features begin to account for some of #4, though they can be improved.

How To Dedupe Posts from Hierarchical Category Archives
Let me offer an option that can potentially address number two. The caveat is, I have not used this function on a site at scale yet, so use at your own risk. So far, on several small sites, it's working fantastic for me.

  1. First, set only one category per post. Whether it's parent or child, only select 1. This is key.

  2. Second, make sure your parent category has at least one post selected. If there isn't, this may not work, or may give you some weird errors.

  3. Next, we need to optimize the posts query for de-duplication. We can use the "pre_get_posts" hook to achieve this in many cases.

  4. Then we use the "get_queried_object" function to work with the Wordpress category object.

  5. To account for hierarchical taxonomy, the "get_term_children" function helps us grab a list of subcategories.

  6. We then use the "category__not_in" parameter to filter out posts in our subcategories. We're almost there!

  7. Lastly, we just need to use an "add_filter" call to attach our dedupe function to the "pre_get_posts" hook ("add_action" works the same way here, since it's technically an action), and you're done!

Here's what the completed code looks like. I'd recommend adding this to your functions.php. If you add it elsewhere, like directly in template files, it may not work and might even cause some errors, so be forewarned!
PHP:
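// Remove subcategory posts from parent category archive
function dedupe_child_categories( $query ) {

    // Only touch the main front-end query on category archives,
    // not admin screens, widgets or other secondary loops
    if ( is_admin() || ! $query->is_main_query() || ! $query->is_category() ) {
        return;
    }

    // Because of the conditional this grabs the category object
    $queried_object = $query->get_queried_object();
    if ( ! $queried_object instanceof WP_Term ) {
        return;
    }

    // Recursively grabs all subcategory IDs and throws them in an array
    $child_categories = get_term_children( $queried_object->term_id, 'category' );

    // Removes subcategory posts from main category archive
    if ( ! is_wp_error( $child_categories ) && ! empty( $child_categories ) ) {
        $query->set( 'category__not_in', $child_categories );
    }
}
// pre_get_posts is an action, so the return value is ignored;
// add_action() would behave the same way here
add_filter( 'pre_get_posts', 'dedupe_child_categories' );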

Consider that Posts, Taxonomies, and Hierarchical Taxonomies are a fundamental aspect of Wordpress. As such, they just work by default, and they require nothing extra. This is good, and it keeps things simple, which is preferable when you grow a site like crazy.

There are some really cool things that can be done with custom post types. I've also seen people do similar things with Wordpress Page types. The thing is, though, most times I see people use those types, they inevitably try to recreate ~80-90% of the functionality of standard WP Taxonomy archives. Really think about that.

Handling Pagination
Pagination is easy to deal with. If you want to kill all pagination, like I do, just set your archives to display 9999 posts or whatever. You may also have a theme that needs its own settings for post count as well. Then be diligent about categorization. Like if you have 100 posts in a category, it's probably a big category, and could benefit from subcategories of other relevant subtopics. That can also have the up-side of better UX. Most users aren't going to like looking at 100 different things at once. They probably only care about a handful of specific topics at any given time.
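
If you'd rather set that in code than in Settings > Reading, something like this minimal sketch should do it. The function name is a placeholder, and the function_exists() guards are only there so the file also loads outside WordPress; in a real functions.php you can call is_admin() and add_action() directly.

```php
<?php
// Sketch only: lift the per-page limit on category archives.
// $query is expected to be the WP_Query object passed by pre_get_posts.
function mysite_category_page_size( $query ) {
    // Leave admin screens alone.
    if ( function_exists( 'is_admin' ) && is_admin() ) {
        return;
    }

    // Only the main query on a category archive, not widgets or related-post loops.
    if ( ! $query->is_main_query() || ! $query->is_category() ) {
        return;
    }

    // Same idea as "9999 posts": one long page instead of /page/2/, /page/3/, ...
    $query->set( 'posts_per_page', 9999 );
}

if ( function_exists( 'add_action' ) ) {
    add_action( 'pre_get_posts', 'mysite_category_page_size' );
}
```

Keep in mind a theme with its own post count setting may still override this, as mentioned above.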

Internal Linking Subcategories
The only thing I haven't addressed so far is internal linking for the subcategories themselves. I don't have the code in front of me right now, but there are multiple ways to achieve this. One option would be modifying your archive template, and adding another content block. If it was me, I'd do something like a query for subcategories, then loop over them and return a list on page.

Then you could turn that into something cool, like a card grid of links to subcategories within that parent category. You can also do loops like this to even return a few posts for each, to strengthen the content of the parent category. That can get tricky though, based on the queries you're making, filtering, and exactly where you have that code.
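
Here's a rough, untested-at-scale sketch of the "query for subcategories, then loop over them" idea. The function names and CSS classes are made up. I've split the rendering from the WordPress lookup so the markup part can be tried on its own; it uses plain htmlspecialchars() where a theme would normally use esc_url() and esc_html().

```php
<?php
// Pure rendering helper: takes rows shaped like [ 'name' => ..., 'url' => ... ].
function mysite_render_subcategory_cards( array $cards ): string {
    if ( empty( $cards ) ) {
        return '';
    }

    $html = '<ul class="subcategory-cards">';
    foreach ( $cards as $card ) {
        // double_encode = false so names already stored as "&amp;" aren't escaped twice.
        $html .= sprintf(
            '<li class="subcategory-card"><a href="%s">%s</a></li>',
            htmlspecialchars( $card['url'], ENT_QUOTES, 'UTF-8', false ),
            htmlspecialchars( $card['name'], ENT_QUOTES, 'UTF-8', false )
        );
    }

    return $html . '</ul>';
}

// WordPress glue: direct children of the category being viewed.
// Meant to be called from category.php, e.g. echo mysite_subcategory_cards();
function mysite_subcategory_cards(): string {
    $cards    = array();
    $children = get_categories( array(
        'parent'     => get_queried_object_id(), // first level only, not grandchildren
        'hide_empty' => false,
    ) );

    foreach ( $children as $child ) {
        $cards[] = array(
            'name' => $child->name,
            'url'  => get_category_link( $child->term_id ),
        );
    }

    return mysite_render_subcategory_cards( $cards );
}
```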

Intermediate Category Pages
The point I'd like to impress on people is to think about higher level category pages in a different way. In many cases, they are very broad categories. Think about what a user landing in that category might actually be looking for. Say the category is "Personal Finance". Does anyone think a user is going to enjoy seeing a list of dozens of posts, all on random subtopics within personal finance? They don't even know where to start.

Instead, what would work better in that case is an intermediate category page. In essence, for higher-level category pages, it might be better for the content to focus on listing the subcategories as the primary CTAs, versus prioritizing individual content pieces. This helps the user figure out where to go. So on that Personal Finance category page, we might want to have the majority of the page be a card grid of subcategories, like this:
  • Personal Finance
    • Retirement Planning
    • Estate Planning
    • Banking and Loans
    • Credit & Debt
    • Financial Software
See what I mean? Hit a page like that, and it's actually pleasant to find what you're looking for! These things make a huge difference, and they're also the things most don't bother to DO anything about. On massive sites, these are the types of strategies that can deliver multi-million dollar ROIs when implemented correctly.

Here's a fantastic example of what I'm talking about with intermediate category pages:

ecommerce-sub-category-pages-05-chemistdirect-macys-818a244a91a3e9b987d0db55a19b3a70.jpg

Courtesy Baymard Institute

See what I mean? At the end of the day, standard WP posts and taxonomies have almost all of the nuts and bolts to do what we need. The real issue is simply the presentation and all taxonomies being treated relatively the same. All we need is to be able to choose a different presentation at different levels.
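
One way to get that "different presentation at different levels" is to swap the template based on whether a category has children. This is just a sketch under assumptions: the function names are placeholders, and category-hub.php is a hypothetical template file you would add to your theme yourself. The add_filter() call is wrapped in function_exists() only so the snippet also loads outside WordPress.

```php
<?php
// Decide the layout from the list of subcategory IDs: parents become hubs.
function mysite_category_layout( array $child_ids ): string {
    return count( $child_ids ) > 0 ? 'hub' : 'listing';
}

// WordPress glue: use a hub template for categories that have subcategories.
function mysite_category_template( $template ) {
    $children = get_term_children( get_queried_object_id(), 'category' );
    if ( is_wp_error( $children ) ) {
        return $template;
    }

    if ( 'hub' === mysite_category_layout( $children ) ) {
        // locate_template() returns an empty string if the file doesn't exist.
        $hub = locate_template( 'category-hub.php' );
        if ( $hub ) {
            return $hub;
        }
    }

    // Leaf categories (and themes without the hub file) keep the normal archive.
    return $template;
}

if ( function_exists( 'add_filter' ) ) {
    add_filter( 'category_template', 'mysite_category_template' );
}
```

The hub template would then show the subcategory grid first, and the regular archive template stays in charge of the leaf categories.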
 
Wow thanks so much!

For me, it doesn't matter that much that in a category you get the posts from the subcategories. After all, they are related. But it's interesting to have that option.

The intermediate category pages is a great idea. I also wanted to have links to subcategories, but this is much better.

It's great for the user and it looks professional. It shouldn't be too difficult. We can assign a featured image with ACF for each category and then use them from the php templates.

I managed to get JSON breadcrumbs for posts using the category hierarchy
Disclaimer: this was just trial and error, I'm no expert in php so it's probably a mess, but it works :D

PHP:
if ( is_single() ) {

  // The last category assigned to the post is used as the deepest breadcrumb level
  $categories    = get_the_category();
  $last_category = end( $categories );
  $post_link     = get_permalink( get_queried_object_id() );

  // Position 1 is always the homepage
  $items = array(
    array(
      '@type'    => 'ListItem',
      'position' => 1,
      'item'     => array( '@id' => get_site_url(), 'name' => get_bloginfo( 'name' ) ),
    ),
  );

  if ( $last_category ) {
    // Ancestor IDs come back nearest-first, so reverse them and add the category itself
    $trail   = array_reverse( get_ancestors( $last_category->term_id, 'category' ) );
    $trail[] = $last_category->term_id;

    foreach ( $trail as $term_id ) {
      $items[] = array(
        '@type'    => 'ListItem',
        'position' => count( $items ) + 1,
        'item'     => array( '@id' => get_category_link( $term_id ), 'name' => get_cat_name( $term_id ) ),
      );
    }
  }

  // The post itself is the last crumb; its name is built from the URL slug
  $items[] = array(
    '@type'    => 'ListItem',
    'position' => count( $items ) + 1,
    'item'     => array(
      '@id'  => $post_link,
      'name' => ucwords( str_replace( '-', ' ', basename( $post_link ) ) ),
    ),
  );

  // wp_json_encode() handles quoting and escaping, so names with quotes can't break the markup
  echo '<script type="application/ld+json">' . wp_json_encode( array(
    '@context'        => 'http://schema.org',
    '@type'           => 'BreadcrumbList',
    'itemListElement' => $items,
  ) ) . '</script>';
}

I'm using this to display it on the posts for navigation purposes:

PHP:
function category_display() {

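  $categories = get_the_category();
  if ( empty( $categories ) ) {
    return; // nothing to show if the post has no category
  }
  // end() needs a real variable, so don't wrap the array in array_values()
  $last_category = end( $categories );
  $get_cat_parents = rtrim(get_category_parents($last_category->term_id, true, ','),',');
  $cat_parents = explode(',',$get_cat_parents);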

echo '<p class="category-display-title">SOME TITLE:</p><div class="category-display-container">';
  foreach ( $cat_parents as $category ) {

      echo '<div class="category-display">'. $category  .'</div>' ;
  }

  echo '</div>';

}
 
Very interesting discussion.

I am fairly sure standard category pages are outright bad for SEO, which is why @turbin3 mentions eliminating the dupe content. That's very important, but imo not really enough. There needs to be unique content on these pages.

The problem with using these category pages as hubs is that Google in the past did not like these pages. I know because I tried to turn them into landing pages on a large(r) site with disastrous results. Some years ago Google strongly disliked Wordpress archive pages. This was a direct response to the fact that Wordpress tag pages (and category to lesser degree), used to rank really well, so Google went the other direction and seemingly banned Wordpress archive pages from the SERPS. Which led to "best practice" of noindex, follow-ing your WP archive pages.

Did it change? Probably.

I think I will strike a compromise. I will keep my money page landers on pages and use their parent-child hierarchy as breadcrumbs there. I think breadcrumbs are very important SEO wise. Not only because they are above the fold links, but because they are topically extremely important. Click around in Google for yourself on images for example. I search for "pets" and go to images, then I get "birds", "healthy pets", "cats" etc. on top. These are Google topical groupings. Google can do stuff like that now by itself. So naturally I want that maximum topical grouping on my main info/money silo. I have "Pets/Dogs/Labrador/Training", that's going to be linked (breadcrumbs), grouped, highly topical content groups and Google can literally go hierarchically down the breadcrumbs and see that it is a content grouping like it would use itself. That's got to work. Now, on the other hand, if I have "Animals/Pets/Cats/FunnyMemePosts", then I would probably not need that, outside of a "Category description" with some content and the kind of stickypost/subcategory tactic described above.
 
I think that what Google didn't like was the typical category/archive page with all the posts and the excerpts. But the objective is not to rank these pages. If it happens then great.

The main goal for me is link juice flow (no orphan pages or pages too far away from incoming links/homepage) and UX (great for the user)

This is a good example: https://www.androidauthority.com/apps/ (built on wordpress)

Go to some post in that category. It's a virtual silo. And all the posts have a link back to the main category.

I would do what turbin said and display the subcategories first.

Also, they don't have a breadcrumb markup
 
@ragnar, my question to you is: what is the fundamental difference between a post and a page? My own answer would be "there is no difference." They both pull content from the database and display it.

The same goes for category pages. There's no intended use for them. It's 100% in how people use them. Non-developers are trusting theme developers to use them appropriately, but theme developers aren't SEO's. I'd venture a guess that 95% of themes don't even use the category description capabilities. And then if people are multi-categorizing posts, that's their own fault. And if people aren't no-indexing /page/2/ - page/99/, that's their own fault. They can't lean on theme developers to do everything for them. That's partially what plugins are for.

One of my category pages, with one paragraph of static content, was ranking top 5 for a pretty voluminous two word phrase. I wrote a post for that term and now it ranks, but the point being that category pages can definitely rank, largely because they end up being so juicy. If you put some good content on them they can 100% rank and hold ranks.

There's no difference between a page and a post other than those labels given to them, and the same goes for taxonomy pages versus a page that you encode your own loop into. They're the exact same, and a loop is a loop. Google doesn't know what tool you used to generate the page if all things are kept equal. It entirely falls back to how it's being used.
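
For the no-indexing of /page/2/ and up mentioned above, a plugin is the usual route, but a bare-bones sketch could look like this. It assumes WordPress 5.7 or newer, where the wp_robots filter is available; the helper name is made up, and the function_exists() guard is only there so the snippet loads outside WordPress too.

```php
<?php
// Pure helper: add noindex (but keep follow) when the view is page 2+ of an archive.
function mysite_paged_robots( array $robots, bool $is_paged_archive ): array {
    if ( $is_paged_archive ) {
        $robots['noindex'] = true;
        $robots['follow']  = true;
    }

    return $robots;
}

// WordPress glue: hook the helper into the robots meta tag output.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'wp_robots', function ( array $robots ) {
        // is_paged() is true on /page/2/ and beyond; page 1 stays indexable.
        return mysite_paged_robots( $robots, is_archive() && is_paged() );
    } );
}
```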
 
And if people aren't no-indexing /page/2/ - page/99/, that's their own fault.
I suppose you're indirectly advocating noindexing,following paginated results?

If that's the case, what do you make of what John Mueller said in that Hangout (sauce: https://www.seroundtable.com/google-long-term-noindex-follow-24990.html)

So it's kind of tricky with noindex. Which which I think is something somewhat of a misconception in general with a the SEO community. In that with a noindex and follow it's still the case that we see the noindex. And in the first step we say okay you don't want this page shown in the search results. We'll still keep it in our index, we just won't show it and then we can follow those links.

But if we see the noindex there for longer than we think this this page really doesn't want to be used in search so we will remove it completely. And then we won't follow the links anyway. So in noindex and follow is essentially kind of the same as a noindex, nofollow. There's no really big difference there in the long run.

EDIT: Say you have a large site, with hundreds of posts in each category, to the point where it wouldn't make sense not to have pagination.
 
Yeah, I've seen that. It's a good thing to keep in mind. I don't put stock in that comment, though. I think it's off the cuff nonsense he's known to spit, and makes no sense on their side to do that either.

The real risk is filling up the SERPs with 100's or 1000's of taxonomy pages that offer no value that increase your risk of tripping the Panda filter.

John Mueller said a thing in a casual setting and followed it up with vague crap on Twitter like "it depends" versus the actual Panda algorithm change we watched destroy companies.

What does suck, is if that's true then we're losing page rank all over the place. What remains true is they don't need that to crawl if you have a sitemap. In this case, it's contradictory stuff that doesn't serve Google positively at all.

A nice thing you can take out of it is confirmation that they do index everything they find to maintain an accurate link graph, so dofollow on noindexed pages can help with ranking. For them to change that for no good reason is to damage their link graph, which the entirety of their search engine depends on.

Maybe the best move is to make the taxonomy pages show a lot more posts than we think is good, like up to 50 versus 10 each, or whatever, and try to reduce the number of /page/2's as possible.
 
Really interesting stuff.

I was thinking of this too. I don't like the idea of noindex.

Maybe the best move is to make the taxonomy pages show a lot more posts than we think is good, like up to 50 versus 10 each, or whatever, and try to reduce the number of /page/2's as possible.

I like this idea. But I would do something a bit different.

I would put a limit to the posts per category. Like 30-50 maybe, Something that looks decent for the user and not an endless scrolling.

But I would combine this with links to the first level of subcategories (Pets would link to dogs/cats but not to Bulldogs. Dogs category would link to Bulldogs)

That way it's almost impossible to have orphan posts, unless you write 50+ posts about really specific topics. And if it happens, I would just create more categories.

Yes, a post about bulldogs would not receive a link from the pets category, because it's likely that you have 50+ posts on "Pets". Or even from the dog category.

But it will always get a link from the Bulldog category, which will get a link from the Dogs category.

Of course, this would be combined with interlinking besides the category hierarchy.
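
To sanity check that reasoning, here's a tiny sketch in plain PHP (no WordPress needed). The function name and the sample data are invented for illustration, and it assumes one category per post. It lists the posts that fall past the per-category limit and so get no link from their own category page.

```php
<?php
// Hypothetical check: which posts fall off the first page of their own category?
// $posts_by_category maps a category slug to its post slugs, newest first.
function mysite_posts_missing_category_link( array $posts_by_category, int $limit ): array {
    $missing = array();

    foreach ( $posts_by_category as $category => $slugs ) {
        // Everything past the first $limit posts gets no link from the category page.
        foreach ( array_slice( $slugs, $limit ) as $slug ) {
            $missing[] = $category . '/' . $slug;
        }
    }

    return $missing;
}

// Made-up data with a tiny limit of 2 so the effect is visible.
$site = array(
    'bulldogs' => array( 'bulldog-diet', 'bulldog-training', 'bulldog-health' ),
    'dogs'     => array( 'dog-food-basics' ),
);

print_r( mysite_posts_missing_category_link( $site, 2 ) );
```

Anything that shows up in that list is a sign the category is ready to be split into more subcategories.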
 