PDF Table Extractor - a repository holding a revisable version of the pdf2table code from http://ieg.ifs.tuwien.ac.at/projects/pdf2table/ by Burcu Yildiz
Note: Modified with some bug fixes and enhancements
For Windows:
- Place the pdftohtml.exe file in the same directory as pdf2table
For macOS:
- Download the tar.gz version of pdftohtml, go into its main folder, and type "make".
- Copy the built "pdftohtml" file into your /usr/local/bin folder so that it can be used from anywhere. (Copying this file requires the root password on your computer.)
To convert a PDF:
- Run pdf2table.jar.
- Wait for output.xml to be written to disk (check the output directory to confirm).
- Run xsltproc csv_view.xsl output.xml
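The last two steps can be scripted. The following is a minimal Python sketch, not part of pdf2table itself: it polls until output.xml exists, then runs the same xsltproc command and saves what it prints. The helper names wait_for_file and xml_to_csv, the output file name output.csv, and the timeout values are assumptions made for illustration; the location of output.xml depends on the output directory you chose.

```python
import subprocess
import time
from pathlib import Path


def wait_for_file(path, timeout=60.0, poll=0.5):
    """Return True once `path` exists and is non-empty, False on timeout."""
    path = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        if path.is_file() and path.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def xml_to_csv(xml_path, xsl_path="csv_view.xsl", csv_path="output.csv"):
    """Run `xsltproc <xsl> <xml>` and save its standard output to csv_path.

    csv_path is a hypothetical name; the README only says to run xsltproc.
    """
    result = subprocess.run(
        ["xsltproc", str(xsl_path), str(xml_path)],
        check=True,           # raise CalledProcessError if xsltproc fails
        capture_output=True,  # collect the transformed text from stdout
        text=True,
    )
    Path(csv_path).write_text(result.stdout)
    return csv_path
```

After running pdf2table.jar, you could call wait_for_file("output.xml") and, if it returns True, xml_to_csv("output.xml"). A non-empty file may still be in the middle of being written, so treat the check as a convenience rather than a guarantee.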
Copyright 2005, 2005 Burcu Yildiz
Contact: burcu.yildiz@gmail.com
pdf2table is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
pdf2table is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with pdf2table. If not, see <http://www.gnu.org/licenses/>.
pdftohtml = http://pdftohtml.sourceforge.net