Inflated Keyword Volume: Ahrefs vs. GKP

Ahrefs KW volumes are significantly bigger on some keywords than the numbers that I get from GKP (through Keywords Everywhere). I've seen gaps as big as:
  • Ahrefs volume: 2300
  • GKP volume: 400
The quick comparisons I've done against my Google Search Console data show that GKP data (through Keywords Everywhere) is usually accurate, at least when looking at the specific search term itself and ignoring the long tails. Everything is filtered for U.S. volume only.

So what gives? Is Ahrefs' clickstream data integration messing with their data? Or do you think that Ahrefs is more accurate, and my conclusion is wrong?

Most importantly - what do you use for your keyword volumes?
 
From memory I would agree with you, I was also under the impression that Ahrefs has much larger numbers than Keywords Everywhere. But when I just compared a dozen keywords, it was actually the other way around, albeit not with such big gaps as in your example.

But that could also just be a coincidence. I guess you would need to compare a much larger dataset to draw any strong conclusions.
 
@eliquid, the SERPWoo volume data is pulled from Google though. You would have a much better perspective on which data is more accurate, Keyword Planner or Ahrefs (clickstream).

The fact that Ahrefs gives segmented data for singular and plural and word order always led me to assume it was better data.

<Edit> on second read, you gave a pretty clear answer. Google keyword volume is unreliable.
 
Creepstream doesn’t improve the data.

I don’t get why everyone’s using it. It’s marketing bullshit. You can make a way better model without it.

The most accurate for long tails are models based on older, pre-2015 data. The background noise levels and Google's methodology changed for the worse.
 
@secretagentdad I was hoping you'd chip in on this. So you'd recommend working with the 2015 GKP data instead - that seems to be the lesser evil? Or how would you approach this from a beginner's perspective?

Because we need some kind of data to base our decisions off, right? How else can you prioritize which content to go for? So even though Google's numbers are inaccurate, they're the best we've got?
 
Both of them are unreliable.

However, people want volumes. Or at least some idea of volume.

As a business person, you have to give the market what it wants. While I know GKP, Ahrefs, SEMRush, and (insert any tool) are all inaccurate, I also realize you can't change people and you have to give them what they want too.

If we didn't include volume in SERPWoo, people would go ape shit.

It's kinda like eating pizza for dinner. No one needs a fork to eat pizza. Come on. But if a restaurant didn't offer them people would go ape shit and ask for them all the time. Most people can't be changed to think otherwise or realize the difference in why.

Sometimes you have to give them what they want even though you know yourself they don't need it or it's not correct.

If any reps from those other services think their volume data is correct, they can choke on a hot dog.
 
@Poplanu
I prefer it and think it's very useful for content gap mining in evergreen niches.

You lose out on new stuff completely. You're not gonna identify a hot new product using old data.
If you're looking for new trends to jump on, just use Google or social tag counting tools.
 
Ahrefs' data is definitely inflated.

Since it's obvious that accuracy can't be had in the keyword volume world, what you have to focus on is precision. What that means is, since nobody is giving you real deal, on-target numbers, you need to make sure they're at least hitting the target in the same spot each time.

Because if a provider's numbers are at least consistent, then with some experience you can figure out how inaccurate they are. You could come up with a model like:
  • 10 - 500 volume: underreports by 50%
  • 501 - 3,000 volume: overreports by 25%
  • 3,001 - 20,000 volume: overreports by 33%
  • 20,001+ volume: overreports by 50%
If you have a large enough site with a big enough keyword spread, you can figure out what's what. You can understand the volume that comes in an entire basket if you dominate a set of queries. Most of the time you can remove long-tail queries and get back down to the parent term.
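
A minimal sketch of how a band model like the one above could be applied. The percentages are the illustrative ones from the list, not measured values. It also assumes that "overreports by 25%" means the reported number is 1.25 times the real one, and "underreports by 50%" means it is half the real one. The names `BANDS` and `corrected_volume` are made up for this example.

```python
# Each entry: (upper bound of the reported-volume band, assumed reporting bias).
# A bias of -0.50 means the tool shows half the real volume;
# a bias of +0.25 means it shows 1.25x the real volume.
# These figures are the hypothetical ones from the post, not real measurements.
BANDS = [
    (500, -0.50),
    (3_000, 0.25),
    (20_000, 0.33),
    (float("inf"), 0.50),
]


def corrected_volume(reported: int) -> float:
    """Undo the assumed bias for the band the reported volume falls into."""
    if reported < 10:
        # The model in the post starts at 10, so leave tiny volumes untouched.
        return float(reported)
    for upper, bias in BANDS:
        if reported <= upper:
            return reported / (1 + bias)
    raise ValueError("unreachable: the last band has no upper bound")


if __name__ == "__main__":
    for reported in (400, 2_300, 30_000):
        print(reported, "->", round(corrected_volume(reported)))
```

The point is not the specific numbers but that a consistent provider lets you keep one small table like this per tool and reuse it.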

It works the same for Ahrefs' total organic traffic, SEMRush's graph, etc. You can figure out how far off they are in specific traffic ranges and get a more realistic idea of how your competitors are doing. They all underreport fairly drastically until a certain range, where they get more accurate, and then they drift off again, increasingly so.
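
One way to work out "how far off they are" per range, sketched under assumptions: you have pairs of (tool-reported number, number you actually observed on your own site), and you take the median ratio inside each band. The band edges, the function names, and the sample pairs are all hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical band edges on the reported value (same idea as the list above).
BAND_EDGES = [500, 3_000, 20_000]


def band_index(reported: float) -> int:
    """Return 0 for the lowest band, len(BAND_EDGES) for the open-ended top band."""
    for i, edge in enumerate(BAND_EDGES):
        if reported <= edge:
            return i
    return len(BAND_EDGES)


def bias_by_band(pairs):
    """Median over/under-reporting per band.

    pairs: iterable of (reported, observed) numbers.
    Returns {band_index: bias}, where +0.25 means the tool reads 25% high
    and -0.50 means it reads 50% low.
    """
    ratios = defaultdict(list)
    for reported, observed in pairs:
        if observed > 0:  # skip rows with nothing observed to compare against
            ratios[band_index(reported)].append(reported / observed)
    return {band: median(values) - 1 for band, values in sorted(ratios.items())}


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the input.
    sample = [(200, 400), (300, 600), (1_250, 1_000), (2_500, 2_000)]
    print(bias_by_band(sample))
```

The median is used here instead of the mean so that one wild keyword does not drag the whole band. That is a design choice for the sketch, not something anyone in the thread prescribed.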

The game is to choose a provider and learn how far off they are. The most accurate that I've seen so far is SERPWoo, and I'm not just blowing smoke up their butts. I don't know where their data comes from but it's far more realistic.

For example, Ahrefs will report a keyword I'm #1 for at 140,000 volume, while SERPWoo will report it at 18,000. They may not be accurate either, but they're a lot closer to reality. The problem there is that you can't just grab volumes from them without tracking first (though I think you could work around it inefficiently with their Keyword Finder module, which doesn't work the way people think. The way it actually works is gold).

This goes for all metrics. Moz's DA is stupid until you hit about DA 30. Ahrefs' KD is off pretty drastically in my opinion, especially on low KDs.

This data has to be used to inform you to help you make decisions, otherwise we'd drown in data. But it can't be used to make decisions for you. You have to add another layer of work on top if you want a semblance of accuracy. Fortunately, precision covers what we need most of the time.
 
Fantastic replies. This is some of the best information I've ever seen on the topic. Nobody is talking about where the keyword data comes from, and its reliability. And nobody seems to be asking the question, either.

I mean the common advice is still:

"Go to Ahrefs, filter KD 0-10, sort by volume".



Eubanks recently recommended doing high-level topic research using only clickstream data from Ahrefs. I'm even more skeptical about that.

I wouldn't be surprised if some guru makes a post about this in 1-2 months.
 
I've tried to rely on the various providers' volume data as a guideline and, quite frankly, I've been burned once or twice. As of late, I've been thinking the best bet may be to rely (more) on a combination of Google Search Console's figures and a PPC campaign (at the outset) to get tangible real-world figures. The latter admittedly requires burning through some money upfront, but it's better than the alternative, which (at least in my case) was spending years targeting various keywords only to discover that one or two of them don't generate any volume through organic traffic.
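
A rough sketch of how a PPC test could be turned into a volume figure, assuming you have the campaign's impressions and its search impression share (impressions received divided by the impressions the ads were eligible for). Dividing one by the other gives an estimate of eligible impressions for the period. That is only a proxy for search volume, since eligibility depends on match type, targeting, and budget. The function name is made up for this example.

```python
def estimated_eligible_impressions(impressions: int, impression_share: float) -> float:
    """Estimate how many impressions the ad was eligible for.

    impression_share is a fraction between 0 (exclusive) and 1 (inclusive),
    e.g. 0.5 for a 50% search impression share.
    """
    if not 0 < impression_share <= 1:
        raise ValueError("impression_share must be in the range (0, 1]")
    return impressions / impression_share


if __name__ == "__main__":
    # Hypothetical campaign: 250 impressions at a 50% impression share.
    print(estimated_eligible_impressions(250, 0.5))
```

Run on an exact-match keyword over a full month, a figure like this gives you something concrete to hold the tool volumes against.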
 
the serpwoo volume data is pulled from Google though.

We've never stated that. In fact we've gone to great lengths to not reveal our sources. We have more than 4 sources for keyword volume. We can't always catch some of the latest trends, but eventually we get it all.

I don't know what those other guys do, but when Moz and Ahrefs jumped to "clickstream" data and Google hid keyword volume, every blogger on earth started contacting us asking for our sources.

They went to great lengths and signed up as users simply to ask for our sources. We never revealed them. Several days later there were all these bloggers outing the sources of various tools, and most of them stated "SERPWoo wouldn't say their sources". It's not rocket surgery as to why: we KNEW they were going to out them. Why else would bloggers contact us?

Now that all the competitors' sources are outed and pretty much drained to the bone, we're still here. The short-term thinking of bloggers eventually killed off several leftover sources. It's why experienced SEOs do not sit around blogging about their latest findings, and why Blackhats no longer publish their findings: so people like Glen Allsopp and gang don't create blog posts that then destroy the methods being used.

It's all a game of musical chairs, it's all about where you are standing when the music stops spinning... I just watch when the teacher is about to hit the pause button.
 
I'm not going to say much, but what @Ryuzaki said about precision is exactly what I have been doing for years personally. Plus some extra touches.

I want to know what a keyword or a niche of keywords is doing volume-wise? I have been doing exactly this for a decade +.

I want to know some numbers for PPC for a client? I have been doing exactly this for a decade +.

You just got to get out here and use your brain a bit.

And yes, people hound us about where SERPWoo gets its data from. Ha. Sorry, no more IP theft is happening, folks.

Not only do we have multiple sources, we have our own experience to modify the data from those sources so it reflects what we know to be true about those numbers when it comes to seasonality, inflation, and more. Other people who build tools but don't rank sites can't possibly have the info me and Carter have, because it's experience from the 2 decades we each put in. How can you know this info if you don't rank sites or don't rank big ass sites and don't do big ass PPC?

You can't.

On top of that, you gotta be taking notes on all this, all the time. Not very many people are doing big ass things, and those that do aren't taking well-defined notes to later use in stuff like this either.

Are we accurate? I think so, but that's because we actually put our brains in on these numbers.

We can't get it right 100% of the time for everything, but for the large majority of research we are a lot closer I bet.

Why?

Because I know I had to stake my life on it before. To feed my kids and pay my bills.

Some developer or lifelong corporate marketer on a salary at [insert our competitors] doesn't have that to know it.

Thanks
Jason

 
I've tried to rely on the various providers' volume data as a guideline and, quite frankly, I've been burned once or twice. As of late, I've been thinking the best bet may be to rely (more) on a combination of Google Search Console's figures and a PPC campaign (at the outset) to get tangible real-world figures. The latter admittedly requires burning through some money upfront, but it's better than the alternative, which (at least in my case) was spending years targeting various keywords only to discover that one or two of them don't generate any volume through organic traffic.

Yes, that seems like a good model in the long run, and one that I know many affiliate SEOs use.

As for me, I don't trust Google Keyword Tool data much at all. I've seen keywords listed at 10-100 that in reality get 2,000-3,000 once you factor in the long tail, while some listed at 1-10K barely scrape 1K. This is of course all due to intent, long tail, and rich results. I feel like I've got a pretty good idea now though. You can judge by stuff like "people also search" and the like.
 