Recent Posts

A guide to scraping historical snapshots of webpages from the Archive.org Wayback Machine.

The full code for the completed scraper can be found in the companion repository on GitHub.

Introduction

I wouldn’t really consider web scraping one of my hobbies, but I guess I do a lot of it. Many of the things that I work on require me to get my hands on data that isn’t available any other way. I need to do static analysis of games for Intoli, so I scrape the Google Play Store to find new ones and download the APKs.

An analysis of which stories are removed from the front page of Hacker News due to moderator intervention.

A data analysis of how many deaths the daylight saving time (DST) transition causes through tired driving.

A data-driven exploration of how the Hacker News ranking algorithm works.
