Quick and Dirty PDF Table Extraction
10 Oct 2014
Recently, I had to do a quick, dirty, and un-automated extraction of some table data from a PDF. I needed to take a single table out of a financial statement and export it to CSV.
While you can find plenty of tutorials on how to automate this process, I didn’t really find a whole lot on doing it just once for a specific PDF. So I wrote a quick tutorial.
Actually, calling this a tutorial is probably a stretch. It’s mainly a link to a great piece of open source software called Tabula and a recommendation to use it. They do a good job of telling you how to install it on their web page, and even provide a cool Popcorn Maker walkthrough on how to use it. I’m embedding it below. Enjoy!