How to scrape domains from Google reverse image search?

By putting an image from one of his sites into Google reverse image search, I've found a guy who has hundreds of sites ranking in easy niches.

Does anyone know how to scrape all the domains from there?

Thanks Guys!
 
I'm a bit confused by this question, since you can get into Google reverse image search by having proper on-page SEO: using the img alt tag, an image title tag, and a caption for the image (within the same table cell or div layer the image is in).

Scraping all the images in Google reverse image search is essentially scraping all images from websites with proper on-page SEO.

But to answer your direct question, you can get started with CasperJS and a working knowledge of XPath, which @turbin3 did an excellent write-up on here: X Marks The Spot - Using XPath to Scrape Your Way to Sucess

You can actually use any programming language that allows you to find the XPath of an HTML element. I should warn you, though, in case you were trying to scrape the image itself: the images in Google Image search are not pulled from the source site, but are in fact embedded using the data URI scheme (base64).

Basically this:

Code:


when entered into your browser's address bar, displays a picture of Natalie Dormer that is not hosted anywhere:

image.jpg


(Example above is hosted at postimg.org)

In fact, now that I think about it, it might be easier to scrape the images that way and then convert the base64 data into image files.
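
To illustrate the conversion mentioned above, here is a minimal sketch in Python of turning a data URI into raw bytes. The function name `decode_data_uri` and the sample payload are made up for illustration; the payload is just the first bytes of a GIF header, not a real image from Google.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(uri):
    """Split a data URI into (media_type, raw_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is the header, the rest is the payload.
    header, sep, payload = uri[5:].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    parts = header.split(";")
    media_type = parts[0] or "text/plain"
    if "base64" in parts[1:]:
        return media_type, base64.b64decode(payload)
    # Payloads without the base64 flag are percent-encoded.
    return media_type, unquote_to_bytes(payload)


# Hypothetical sample: a tiny made-up payload, not a real image.
sample = "data:image/gif;base64," + base64.b64encode(b"GIF89a").decode("ascii")
media_type, raw = decode_data_uri(sample)
print(media_type, len(raw))
```

Writing `raw` to a file opened in binary mode (`"wb"`) would give you the image file, assuming the payload really is a complete image.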
 
Sorry for not being clear enough: I'm looking for a way to scrape the results from a Google reverse image search.

In my country you are required to put your contact information (real name, address, email, etc.) on a certain page of your website. To dodge spammers, many people use an image with these details on it. Almost always, they use the same image on every website they have.

By taking this image and putting it into Google reverse image search, I'm able to see all of their websites (that are indexed).
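
Once a results page has been saved (by whatever scraper you end up using), pulling the domains out of it could look roughly like the sketch below. It uses Python's standard-library `html.parser` instead of XPath, simply collects every link on the page, and makes no assumption about Google's actual markup. The names `LinkCollector` and `extract_domains`, the `skip_suffixes` parameter, and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_domains(html, skip_suffixes=("google.com",)):
    """Return the sorted, de-duplicated hostnames linked from `html`."""
    parser = LinkCollector()
    parser.feed(html)
    domains = set()
    for href in parser.hrefs:
        host = urlsplit(href).hostname  # None for relative links
        if not host:
            continue
        host = host.lower()
        if host.startswith("www."):
            host = host[4:]
        # Drop the search engine's own links (assumed suffix list).
        if any(host == s or host.endswith("." + s) for s in skip_suffixes):
            continue
        domains.add(host)
    return sorted(domains)


# Made-up sample page, not real search results.
sample_html = """
<a href="https://www.cheap-garden-sheds.example/imprint">result</a>
<a href="http://best-dog-leash.example/">result</a>
<a href="https://www.google.com/preferences">settings</a>
<a href="/search?q=next">next</a>
"""
print(extract_domains(sample_html))
```

On a real results page you would probably want to narrow this down to the result links only, which is where an XPath expression would come in.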

This guy in particular is using >80% EMDs, so by finding all of his websites, I'm basically getting a list of keywords. I then only have to figure out whether each site ranks (and how high) for the associated keyword to spot a few low-hanging fruits, as I've outranked some of his sites with little effort in the past (he builds all of his sites with the exact same layout).
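
As a rough illustration of the EMD-to-keyword step, the sketch below guesses a keyword from a domain name. It assumes hyphenated domains with no subdomain other than `www`; an unhyphenated EMD would need a word list to split, which is not attempted here. The function name `emd_to_keyword` and the example domains are made up.

```python
def emd_to_keyword(domain):
    """Guess the target keyword of an exact-match domain (naive sketch)."""
    host = domain.lower().strip(".")
    if host.startswith("www."):
        host = host[4:]
    # Assumption: the first label is the name, everything after it is the TLD.
    name = host.split(".")[0]
    return name.replace("-", " ")


# Hypothetical example domains.
for d in ["www.cheap-garden-sheds.de", "Best-Dog-Leash.com"]:
    print(d, "->", emd_to_keyword(d))
```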
 
CasperJS and XPath are still definitely the way to go. CasperJS's first lesson is scraping Google. You may need some proxies and will need something like PHP/Python/Perl to control the scrapers.

Or are you looking for an out-of-the-box, easier-to-use solution like ubot? Kimono might also be worth checking out. Here is a nice-looking Moz guide on how to use Kimono to scrape Google: Using Kimono Labs to Scrape the Web for Free (Please note I haven't read this, but I still suspect all the answers you are looking for will require multiple proxies).

Also, "MAYBE", Scrapebox might work; it looks like there is a guide for it: Scrapebox: Google Images Scraper
 
I talked to someone not too long ago who was using this API for Google Reverse Image Search. I haven't tried it myself, but I had it bookmarked. You could check if it's still active/working. They were doing a similar process for outreach for an image-heavy site they were a part of.

If that doesn't work, you can always pay for the TinEye API, which is just pricier. Using an API would be the way to go here though, and it would only cost around $100-200 for a cheap dev to turn it into a tool for you.
 
Good stuff, I will check it out and see what works for me.
 