

This is a guest post from the folks over at Intoli, one of the awesome companies providing Scrapy commercial support and longtime Scrapy fans.

The Steam game store is home to more than ten thousand games and just shy of four million user-submitted reviews. While all kinds of Steam data are available either through official APIs or other bulk-downloadable data dumps, I could not find a way to download the full review dataset.[1] If you want to perform your own analysis of Steam reviews, you therefore have to extract them yourself. Here's what some of the fields we are interested in look like on the page.

Doing so can be tricky if scraping is not your primary concern, however. Even for a well-designed and well-documented project like Scrapy (my favorite Python scraper), there is a definite gap between the getting started guide and a larger project dealing with realistic pitfalls. My goal in this guide is to help scraping beginners bridge that gap.

If you are only interested in using the completed scraper, you can head directly to the companion GitHub repository. What follows is a step-by-step guide explaining how to build up the code in that repository, but you should be able to jump directly into any section you're interested in.
