I can't scrape shit (yet) - please advise

So I've been doing some research, and it appears that good scraping skills would be useful for a variety of reasons. I don't have any yet and am wondering about the best way to get some. For now, I have been learning iMacros (I cannot afford uBot) and am working on a small project with it (scraping a forum for topics). I want to know what's the best bang for my buck language-wise. Will any language do, or should I just learn Python/SQL?

Also, any resources/tutorials that you found helpful when you started out scraping would be appreciated.

Thanks for all of your help,
gcomt
 
IMO it would be easier and faster for you to just buy software that will scrape for you vs. learning a computer language just to make a scraper.
 
Better to use tools.
Check the "Scraper" extension for Chrome.
If you want to scrape Google, look at ScrapeBox.

When someone learns to program, they think they're learning a language, so they fret about which language to learn, but actually they're learning to program. The language you're learning is how to speak to computers. Think of stuff like PHP, Python, etc. as dialects: they all speak the same language, just in different accents.
 
This is true, to a certain extent. But start with PHP and you're going to pick up a ton of bad habits that will fuck you over in the long run. Python and Ruby are both great starter languages IMO.
 
I have used iMacros for scraping basic web content, but like others have said, it's best to learn to program using one of the aforementioned languages (Python, Perl, etc.). It'll pay dividends in the long run, as it's a transferable skill.

I can help with any iMacros questions, just PM me.
 
Might be worth sharing what it is you're planning to scrape. Chances are there is already a bit of kit out there that can do it for you.
 
Hire someone to handle it for you. Scraping projects are small and, depending on the project, should only cost between 2-400$. The time and effort it takes to learn how to code is not worth the money; spend the time learning a more valuable skill as a marketer.
 
Right now, the only thing I am (trying to) scrape is forum topics, to generate ideas for content. Copying and pasting all the topics in a forum takes forever, so I was going to scrape the topic titles and URLs, which hasn't been too big of a problem so far. The real issue is I'm broke as shit right now, so I have to figure out where I can invest my time to get the best bang for my buck.

I don't actually have a laundry list of things I need to scrape or even want to scrape, but I keep reading it's a good skill to have, so I figure I'd better plan on learning it in the future. My plan is to use iMacros until I outgrow it, and then move on to using a real language and database or hiring someone to do it.
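
For anyone who wants to see what the "topic titles and URLs" job might look like in Python, here is a minimal sketch using only the standard library. It assumes the forum marks each thread link with a CSS class; the class name `topic-title`, the sample HTML, and `forum.example.com` are all made up for illustration, so check the real page source before relying on any of it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TopicLinkParser(HTMLParser):
    """Collects (title, url) pairs from <a> tags that carry a given CSS class."""

    def __init__(self, base_url, link_class="topic-title"):
        super().__init__()
        self.base_url = base_url
        self.link_class = link_class  # hypothetical class name, not from any real forum
        self.topics = []
        self._href = None   # href of the topic link currently being read, if any
        self._text = []     # text chunks seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.link_class in classes and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace and make the URL absolute.
            title = " ".join("".join(self._text).split())
            self.topics.append((title, urljoin(self.base_url, self._href)))
            self._href = None


def extract_topics(html, base_url, link_class="topic-title"):
    """Return a list of (title, absolute_url) tuples found in the HTML."""
    parser = TopicLinkParser(base_url, link_class)
    parser.feed(html)
    parser.close()
    return parser.topics


# Invented sample markup standing in for a saved forum index page.
SAMPLE_HTML = """
<ul>
  <li><a class="topic-title" href="/threads/first-topic.1/">How do I start scraping?</a></li>
  <li><a class="topic-title" href="/threads/second-topic.2/">Best  tools for
      beginners</a></li>
  <li><a class="nav" href="/members/">Members</a></li>
</ul>
"""

if __name__ == "__main__":
    for title, url in extract_topics(SAMPLE_HTML, "https://forum.example.com/"):
        print(title, url, sep="\t")
```

The sketch parses HTML you already have (for example a page saved from the browser); fetching pages and following pagination would be separate steps.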
 
Would using scrapebox or Gscraper not achieve this for you?

site:example.com inurl:(forum OR threads) intitle:keyword (if intitle is needed)

You can then use something like URL Profiler to get each URL's page title, links, social signals, etc. to gauge popularity.

Just a thought.
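
If you end up running that kind of query for several sites or keywords, a small helper can build the strings for you. This is only a sketch: the function names and the `path_hints` parameter are invented here, and it just assembles the query text and a search URL without fetching anything.

```python
from urllib.parse import urlencode


def build_forum_query(site, keyword=None, path_hints=("forum", "threads")):
    """Assemble a query like: site:example.com inurl:(forum OR threads) intitle:keyword"""
    parts = [f"site:{site}", "inurl:(" + " OR ".join(path_hints) + ")"]
    if keyword:  # intitle is optional
        parts.append(f"intitle:{keyword}")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Turn a query string into a URL you can paste into the browser."""
    return base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_forum_query("example.com", "keyword")
    print(query)
    print(search_url(query))
```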
 
It's a good skill to have, because parsing web page data is a thing you have to do often, but I wouldn't bother scraping for ideas. I'd take a list of niches that interest you and just use Google to find forums, so master some advanced search operators.

Inside those forums I would just find the most popular non-general boards and look at what's going on in there. I bet 30 mins of doing that would produce enough topics to keep you busy for a little while.
 
I already have the niche. It's a niche-specific forum, so I wanted to see what people were talking about, question- and topic-wise.
 
Checked out kimono labs, it's perfect for the URLs, thanks.
 
Sounds like you are doing work, so props for that. You can tweak work to be smarter or better. You can't tweak a "bunch of nothing."

I scrape with Python. Started out just like you: broke and doing this stuff in my off hours. Keep at it. Once you figure out how to do something, turn that into currency in exchange for knowledge. Hopefully, you have enough to eat and a roof over your head. Working hungry from a library sucks. Did that once too.

Head up and hang on.
 
Scraped the forum using Kimono Labs. It was exactly what I was looking for. I only ran into a few small hiccups, fewer than with iMacros. Now I have a bunch of topics I have to sort through by hand, since there's no way to tell what's useful and what's not.
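
One rough way to take a first pass at that pile before sorting by hand is to flag titles that look like questions and count which words keep coming up. The sketch below is just that, a rough heuristic: the question-word list, the stopword list, and the sample titles are all assumptions made up for illustration, and it won't judge whether a topic is actually useful.

```python
import re
from collections import Counter

# Assumed lists for illustration; tune them for your niche.
QUESTION_STARTS = ("how", "what", "why", "which", "where", "when", "who",
                   "can", "should", "is", "are", "does", "do")
STOPWORDS = {"the", "and", "for", "with", "how", "what", "why", "you",
             "your", "this", "that", "from", "are", "can", "does"}


def looks_like_question(title):
    """True if the title ends with '?' or starts with a common question word."""
    text = title.strip().lower()
    if text.endswith("?"):
        return True
    words = re.findall(r"[a-z']+", text)
    return bool(words) and words[0] in QUESTION_STARTS


def common_words(titles, top_n=10):
    """Count non-stopword words (3+ letters) across all titles."""
    counts = Counter()
    for title in titles:
        for word in re.findall(r"[a-z']+", title.lower()):
            if word not in STOPWORDS and len(word) > 2:
                counts[word] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Invented titles standing in for a scraped list.
    titles = [
        "How do I start scraping?",
        "Best tools for beginners",
        "Scraping forum topics with iMacros",
        "Is Python worth learning",
    ]
    for t in titles:
        if looks_like_question(t):
            print("QUESTION:", t)
    print(common_words(titles, top_n=3))
```

Questions tend to map straight onto content ideas, and the word counts give a quick feel for what the board talks about most.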
 
Sounds like you don't even need a scraper, since you're just crawling forums. There are plenty of off-the-shelf spidering programs that can pull the URL and title of every page on a site.
 
If you need some custom work, just contact me! I am happy to help you out if it is not THAT complicated!
 
If you don't fancy getting into code, there are alternatives. uBot is very cool, and a lot of pre-made stuff (bots) is available. There is also ZennoPoster, but I find uBot a little easier on the mind. You can't beat programming, but these are good tools if programming isn't your thing.

Basic examples:
Grabbing all GoDaddy auction URLs with certain parameters and saving them to a file, then checking metrics for PBN sites.
Bulk checking expired domains for availability and saving the results to a file.
Exporting all the content from a directory (Yelp etc.) to give your local lead gen site some content.

Other than that, you can crawl sites using Screaming Frog, Xenu, or 80legs (if you want loads of data). It all depends on what you want.
 