Building a price tracker

As I'm starting to generate some traffic to my product database (from this thread), I'd like to add a price tracker. Just a basic overview of prices so users can find the cheapest place to buy the product. I like this model because it looks more legit/authoritative and actually helps users, rather than just adding an Amazon link to each product (and it helps reduce dependency on Amazon alone).

PCPartPicker's price overview is a good example of what I'm trying to build. Users can select their country (there will just be a few at first) and see the shops relevant to them.

Does anyone have any experience with building a price tracker? I'm not sure what the best way is to go about this. My website runs on WordPress, so I could store the data in that database, in custom tables, or in an external database and pull it in from there. I'm most familiar with JavaScript & PHP, and some basic SQL, so if I could leverage that knowledge that would be best.
 
So first thing, you'll need to parse a few prices from the main websites.

This is where it breaks already. You need to monitor the scrapers in case anything changes on those sites.
Also, product naming is a horror in itself; almost no two shops name items the same way.
 
I can see how scraping many websites for many products could be tricky. But some of the biggest websites have APIs you can access, and use standardized codes for each product, so you can be sure they are the same. Also I don't have thousands of products in my database so I feel it should be a bit more manageable.

Or am I overestimating how many shops have an API, or at least an API that is as easily accessible as Amazon's?
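To make the "standardized codes" idea concrete, here is a minimal sketch of matching offers from different shops on a normalized GTIN (the family that covers EAN and UPC barcodes). It assumes the shops expose such a code at all, which not every API does; the shop names, prices, and function names are made up for illustration. The check digit is the standard GS1 mod-10 one.

```python
from collections import defaultdict


def gtin_check_digit(payload: str) -> int:
    """GS1 mod-10 check digit for the digits that precede it."""
    total = 0
    # Walk from the rightmost payload digit; weights alternate 3, 1, 3, ...
    for i, ch in enumerate(reversed(payload)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return (10 - total % 10) % 10


def normalize_gtin(code: str):
    """Return the code as a zero-padded 14-digit GTIN, or None if invalid."""
    digits = code.replace(" ", "").replace("-", "")
    if not digits.isdigit() or len(digits) not in (8, 12, 13, 14):
        return None
    if gtin_check_digit(digits[:-1]) != int(digits[-1]):
        return None
    return digits.zfill(14)


def group_offers(offers):
    """Group (shop, code, price) tuples by normalized GTIN."""
    grouped = defaultdict(list)
    for shop, code, price in offers:
        gtin = normalize_gtin(code)
        if gtin is not None:
            grouped[gtin].append((shop, price))
    return dict(grouped)


# Hypothetical offers: the same item listed as UPC-A in one shop
# and as EAN-13 (with a leading zero) in another.
offers = [
    ("shop_a", "036000291452", 4.99),
    ("shop_b", "0036000291452", 4.49),
    ("shop_c", "036000291453", 3.99),  # bad check digit, so it is skipped
]
grouped = group_offers(offers)
```

Padding to 14 digits means a UPC and its EAN-13 form end up under the same key, so naming differences between shops stop mattering for any product that carries a barcode.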
 
Here are a few things I found when I was looking to build a price tracker:

Python:

Code:
https://github.com/ponyriders/django-amazon-price-monitor

You could try reverse engineering a plugin like this one, for example, or it may even support the shops you need already, so you can get up and running with an MVP.


Code:
https://codecanyon.net/item/content-egg-all-in-one-plugin-for-affiliate-comparison-deal-sites/19195707

and they have another plugin:

Code:
https://codecanyon.net/item/affiliate-egg-niche-affiliate-marketing-wordpress-plugin/21852757

I'm not saying a plugin is the best way to do it, just that it would be a good idea to look at their code (you can find nulled versions for free) and see how they implemented it, and then try to build your own solution.
 
I imagine it's easiest to store tables like this:

product_names: ID, product name, manufacturer ID
1, 3M tape, 1

product_store: ID, product ID, store ID, store's own product ID
1, 1, 1, asin-123-4234-123

stores: ID, name, ..., API key
1, amazon, ..., 324223-543542-234234-324234

prices: ID, timestamp, product_store ID, price, currency ID, available (boolean)
1, 2019-08-31 12:00:04, 1, 14, 1, 1

currencies: ID, ISO code
1, USD
2, EUR

currencies_values: currency ID from, currency ID to, value
1, 2, 0.8967


Then run the currency and product feeds from crontab, and map the pretty product name in your store to the foreign product ID (ASIN etc.).


Just my 2 cents after 2 minutes of thinking.

Then it's more normalized too, i.e. if you rename 3M tape to 3M tape transparent 10mm, it's all good.
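Here is a rough sketch of that layout, using Python's built-in sqlite3 purely as a stand-in for whatever database you end up with. The table and column names are my own adaptation of the list above, the second shop and its price are invented, and I left the API key column out of the sketch. The identity exchange rates (USD to USD, EUR to EUR) are also my addition so that one join handles every currency.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, manufacturer_id INTEGER);
CREATE TABLE stores (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_store (
    id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES products(id),
    store_id INTEGER REFERENCES stores(id),
    store_product_ref TEXT  -- the shop's own ID, e.g. an ASIN
);
CREATE TABLE currencies (id INTEGER PRIMARY KEY, iso_code TEXT);
CREATE TABLE currency_rates (from_id INTEGER, to_id INTEGER, rate REAL);
CREATE TABLE prices (
    id INTEGER PRIMARY KEY,
    fetched_at TEXT,
    product_store_id INTEGER REFERENCES product_store(id),
    price REAL,
    currency_id INTEGER REFERENCES currencies(id),
    available INTEGER
);

INSERT INTO products VALUES (1, '3M tape', 1);
INSERT INTO stores VALUES (1, 'amazon'), (2, 'other_shop');
INSERT INTO product_store VALUES (1, 1, 1, 'asin-123-4234-123'), (2, 1, 2, 'sku-999');
INSERT INTO currencies VALUES (1, 'USD'), (2, 'EUR');
INSERT INTO currency_rates VALUES (1, 2, 0.8967), (1, 1, 1.0), (2, 2, 1.0);
INSERT INTO prices VALUES
    (1, '2019-08-30 12:00:00', 1, 10, 1, 1),    -- older Amazon price, superseded
    (2, '2019-08-31 12:00:04', 1, 14, 1, 1),    -- latest Amazon price, in USD
    (3, '2019-08-31 12:00:09', 2, 12.0, 2, 1);  -- latest other_shop price, in EUR
""")

# Latest available price per shop for one product, converted to one currency,
# cheapest first.
QUERY = """
SELECT s.name, ROUND(p.price * r.rate, 2) AS converted
FROM prices p
JOIN product_store ps ON ps.id = p.product_store_id
JOIN stores s ON s.id = ps.store_id
JOIN currency_rates r ON r.from_id = p.currency_id AND r.to_id = ?
WHERE ps.product_id = ?
  AND p.available = 1
  AND p.fetched_at = (
      SELECT MAX(p2.fetched_at) FROM prices p2
      WHERE p2.product_store_id = p.product_store_id
  )
ORDER BY converted ASC
"""
rows = conn.execute(QUERY, (2, 1)).fetchall()  # currency 2 = EUR, product 1
for store_name, converted in rows:
    print(store_name, converted)
```

Because prices only ever get appended, you keep the price history for free, and the subquery on `fetched_at` picks out the current price per shop.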
 
Did some more research and I also found this plugin (a bit similar to what @built linked):
Code:
https://www.datafeedr.com/

They have a very large network covering many affiliate networks and shops, and an API for displaying the data in your own way. The advantage would be that you could get prices from many different shops through one system. It also gives access to small and medium-sized shops that don't have an API themselves and would otherwise have to be scraped. I just have to see if they actually have all the data available in practice, and if I can make it work.
 
Depending on how much time and effort you want to sink into this price tracker, the most common method is to use a web scraping framework to handle the data fetching. The framework needs to be able to handle JavaScript-heavy websites, and to get around bot protection measures through methods like proxy rotation and user interaction simulation (for the heavily protected sites).

A common scraping framework for this would be Scrapy, which was open sourced by Scrapinghub and is written in Python. It handles JS-heavy sites through a rendering plugin called Splash. There are also plugins available for proxy rotation, and the entire framework is battle tested.

For the issue that darkzerothree mentioned:
This is where it breaks already. You need to monitor the scrapers in case anything changes on those sites.
The Scrapy solution for this is called spider contracts, which let you check whether the data format has changed.
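Scrapy's contracts are written as annotations in a callback's docstring (things like `@url`, `@returns` and `@scrapes`), so I won't try to reproduce them from memory here. Below is only a plain-Python sketch of the same idea, assuming a scraped item is a dict: check that each item still has the fields you expect, so a layout change shows up as an error instead of silently bad prices. The field names and the sample items are hypothetical.

```python
# Hypothetical item shape: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "price": float, "currency": str}


def validate_item(item: dict) -> list:
    """Return a list of problems; an empty list means the item looks sane."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        value = item.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
        elif not isinstance(value, expected_type):
            problems.append(f"wrong type for {field}: {type(value).__name__}")
    price = item.get("price")
    if isinstance(price, float) and price <= 0:
        problems.append("price is not positive")
    return problems


good = {"name": "3M tape", "price": 14.0, "currency": "USD"}
# After a site redesign the selector returns raw text and the currency is gone.
broken = {"name": "3M tape", "price": "14,00 €"}

for item in (good, broken):
    issues = validate_item(item)
    print("OK" if not issues else issues)
```

Running a check like this on a handful of known product pages after every crawl is a cheap way to notice that a shop changed its markup.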

One thing you would have to implement yourself is the scheduling of the spiders. This can be done simply with cron jobs and a hacky Python script, but it would be quite fragile. A crawler admin called Crawlab was recently open sourced and posted on Hacker News, though I have not tried it out personally.

For your data storage backend, Scrapy is very versatile: you can use whatever interface and storage mechanism you want, or even post the data directly to your WordPress site.
 
Update: I'm nearly done building the first version of the price tracker. Currently I'm using two sources: the service I mentioned before, and the Amazon product API. The two APIs work nearly identically, which made it easier to set up.

Both the DF service and Amazon API have pretty good documentation and you can essentially just copy the entire request, make sure it uses your parameters (API keys, and a list of products you want prices/data for), and you've got a successful response very quickly. For someone who hasn't really worked with APIs before, it was not at all hard to get started with.

I ended up using custom database tables in WP itself (networks, stores, and prices), which get updated whenever a new request is made. This took more time because I had never made database tables myself before, but once you know how, it is not too hard. Currently they are still updated with a 'Fetch Prices' button on the admin page; automating it to run once daily is the next step.
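For anyone wondering what "tables get updated whenever a new request is made" could look like, here is a minimal sketch. It is not the actual WordPress code: it uses Python with sqlite3 as a stand-in, a single simplified prices table, and a stub in place of the real API call. All names and the response shape are assumptions.

```python
import sqlite3
from datetime import datetime, timezone


def fetch_prices_stub(product_refs):
    """Hypothetical stand-in for an API call; returns one offer per ref."""
    return [
        {"store": "amazon", "product_ref": ref, "price": 14.0, "currency": "USD"}
        for ref in product_refs
    ]


def update_prices(conn, fetch, product_refs, now=None):
    """Fetch offers and overwrite the stored price for each (store, product) pair."""
    now = now or datetime.now(timezone.utc)
    offers = fetch(product_refs)
    rows = [
        (o["store"], o["product_ref"], o["price"], o["currency"], now.isoformat())
        for o in offers
    ]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT OR REPLACE INTO prices "
            "(store, product_ref, price, currency, updated_at) VALUES (?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (store TEXT, product_ref TEXT, price REAL, currency TEXT, "
    "updated_at TEXT, PRIMARY KEY (store, product_ref))"
)
update_prices(conn, fetch_prices_stub, ["asin-123-4234-123"])
# Running it again replaces the existing row instead of adding a duplicate.
update_prices(conn, fetch_prices_stub, ["asin-123-4234-123"])
```

Since the update is safe to repeat, the same function can sit behind a manual button now and behind a daily scheduled job later without any changes.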

Advantages of my setup (so far):
  • Easy to get accurate, up-to-date data from multiple shops
  • Data fetching is handled in two API calls
  • Tracking links are automatically generated (you do need to insert your tracking IDs still, but you can make a function to do this automatically)
Downsides/Challenges (so far):
  • Not every online shop is available
  • Still need to register with many different affiliate networks, and apply to each store separately (not needed to get the data, but you won't make money without a tracking ID)
  • To use the AMZ Product API you need to have 3 sales already, so you can't start from scratch
    • This goes for each region separately, so because I have had sales in the US and Germany, I can access prices for those Amazon sites, but not UK prices because my UK AMZ affiliate account was inactive
Once I've got it live on my server I'll have to see how many people click on it, and whether there are stores people really miss. Might have to consider @FIREman 's solution for potential missing stores.
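On the point about inserting tracking IDs with a function: a minimal sketch of what that could look like. Amazon affiliate links carry the ID in a `tag` query parameter; the parameter for the second shop, and both tracking IDs, are made up, so check each network's documentation for the real ones.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Hypothetical mapping of store -> (query parameter, tracking ID).
TRACKING = {
    "amazon": ("tag", "mysite-20"),
    "example_shop": ("ref", "mysite"),
}


def add_tracking(url: str, store: str) -> str:
    """Return the URL with the store's tracking parameter set; unknown stores pass through."""
    params = TRACKING.get(store)
    if params is None:
        return url
    key, value = params
    parts = urlparse(url)
    # Drop any tracking parameter already present, keep everything else.
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != key
    ]
    query.append((key, value))
    return urlunparse(parts._replace(query=urlencode(query)))


print(add_tracking("https://www.amazon.com/dp/B000000000?th=1", "amazon"))
```

Doing this when the link is rendered, rather than when the price is stored, means a changed tracking ID never requires touching the database.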
 