Recommend a simple Google scraper script in Python with proxy support?

bernard

Recommend a simple Google scraper script in Python?

Proxy support would be nice but not necessary.
 
Although this isn't the simplest, what I recommend and use myself is the Scrapy framework for Python. It's wonderful to work with. I put together a tutorial on Scrapy a while back. In hindsight, though, please forgive the walls of text. ;-)

That framework takes a bit of setup, but it's well worth it. It's easy to add proxies and even do things like set up your bots to rotate user agents, which I detailed in the tutorial.
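
To give a rough idea of what rotating proxies and user agents can look like in Scrapy, here's a minimal sketch of a downloader middleware. This is not the code from the tutorial, just one common way to do it. The class name, the user agent strings, and the proxy addresses are all placeholders I made up for illustration. It relies on Scrapy's `process_request(request, spider)` hook and on the `proxy` key in `request.meta`, which Scrapy's built-in proxy middleware reads.

```python
import random

# Placeholder values for illustration only; swap in your own lists.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/2.0",
]
PROXIES = [
    "http://127.0.0.1:8080",
    "http://127.0.0.1:8081",
]


class RotatingProxyUserAgentMiddleware:
    """Hypothetical downloader middleware: picks a random user agent
    and a random proxy for every outgoing request."""

    def __init__(self, user_agents=USER_AGENTS, proxies=PROXIES):
        self.user_agents = list(user_agents)
        self.proxies = list(proxies)

    def process_request(self, request, spider):
        # Set the header before Scrapy's default user agent is applied.
        request.headers["User-Agent"] = random.choice(self.user_agents)
        # Scrapy's built-in HttpProxyMiddleware reads request.meta["proxy"].
        if self.proxies:
            request.meta["proxy"] = random.choice(self.proxies)
        # Returning None tells Scrapy to keep processing the request.
        return None
```

To turn it on, you'd add the class to `DOWNLOADER_MIDDLEWARES` in your project's `settings.py`, for example `{"myproject.middlewares.RotatingProxyUserAgentMiddleware": 400}`. The module path there is an assumption about your project layout, and the priority of 400 is just a value low enough to run ahead of the built-in user agent and proxy middlewares.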

If you just want something as simple and lightweight as possible, I have several recommendations. Here are a few Python libraries that are excellent for various scraping uses, without resorting to something more complex like Scrapy:

First off, the Requests library is awesome. It's a great solution for starting out, as it's very complete and has excellent documentation. It can even handle proxies, different content types, SSL, and many other things. This library gets used a TON by people since it's so nice to work with. I highly recommend checking it out.
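
Here's a minimal sketch of fetching a page through a proxy with Requests. The helper names `build_proxies` and `fetch`, the default user agent string, and the proxy address in the usage note are placeholders of mine, not anything official. The only Requests pieces used are `requests.get` with its `headers`, `proxies`, and `timeout` arguments, plus `raise_for_status()`.

```python
import requests


def build_proxies(proxy_url):
    """Return the proxies mapping Requests expects, or None for a direct connection."""
    if not proxy_url:
        return None
    # Route both plain and TLS traffic through the same proxy.
    return {"http": proxy_url, "https": proxy_url}


def fetch(url, proxy_url=None,
          user_agent="Mozilla/5.0 (compatible; example-bot/0.1)",
          timeout=10):
    """Download a page and return its HTML as text.

    Raises requests.HTTPError on 4xx/5xx responses.
    """
    response = requests.get(
        url,
        headers={"User-Agent": user_agent},
        proxies=build_proxies(proxy_url),
        timeout=timeout,
    )
    response.raise_for_status()
    return response.text
```

You'd call it with something like `html = fetch("https://example.com", proxy_url="http://127.0.0.1:8080")`, where the proxy address is whatever your provider gives you. Leave `proxy_url` off to connect directly.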

Beautiful Soup's claim to fame is making it really easy to grab only the content you want off a page, without getting too crazy with code. Here's an example of how to use Beautiful Soup to grab content from pages and save it to a CSV. There are easier ways with less code, but that was just one.
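
In case it helps, here's a small self-contained sketch of the same idea: pull a few fields out of some HTML with Beautiful Soup and turn them into CSV. It is not the linked example, just an illustration. The sample HTML, the `title` class, and the function names are made up. It parses an inline string so it runs without any network access.

```python
import csv
import io

from bs4 import BeautifulSoup

# Made-up markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
  <h2 class="title"><a href="/post/1">First post</a></h2>
  <h2 class="title"><a href="/post/2">Second post</a></h2>
  <p>Unrelated paragraph.</p>
</body></html>
"""


def extract_titles(html):
    """Return a list of (title, url) tuples for every h2.title link."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for heading in soup.find_all("h2", class_="title"):
        link = heading.find("a")
        if link is None:
            continue  # Skip headings that don't wrap a link.
        rows.append((link.get_text(strip=True), link.get("href", "")))
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["title", "url"])
    writer.writerows(rows)
    return buffer.getvalue()
```

To save to disk instead, write the result of `rows_to_csv(extract_titles(html))` to a file, or hand `csv.writer` a file opened with `newline=""`.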
 