This script is for Slackware 14.2 only and may be outdated.

SlackBuilds Repository

python-pdfminer (20140328)

PDFMiner is a tool for extracting information from PDF documents. Unlike
other PDF-related tools, it focuses entirely on getting and analyzing
text data. PDFMiner allows one to obtain the exact location of text in a
page, as well as other information such as fonts or lines. It includes a
PDF converter that can transform PDF files into other text formats (such
as HTML). It has an extensible PDF parser that can be used for purposes
other than text analysis.

PDFMiner comes with two handy tools: pdf2txt.py and dumppdf.py.

pdf2txt.py

pdf2txt.py extracts the text contents of a PDF file. It cannot recognize
text drawn as images. It also extracts text locations, font names and
sizes, and writing direction. A password is required for
password-protected PDF documents, and no text can be extracted from a
PDF document that does not grant extraction permission.
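
As a rough illustration of how pdf2txt.py might be driven from a script,
the sketch below only assembles an argument list and runs nothing. It
assumes the tool is installed on the PATH under the name pdf2txt.py and
uses the -o (output file), -t (output type), -P (password) and -p (page
numbers) options from the upstream usage message; check them against the
installed version. The helper name build_pdf2txt_command is hypothetical
and not part of PDFMiner.

```python
def build_pdf2txt_command(pdf_path, output_path=None, output_type=None,
                          password=None, pages=None):
    """Return an argument list for pdf2txt.py; nothing is executed here.

    Hypothetical helper: the flags below are assumed from the upstream
    pdf2txt.py usage message and should be checked locally.
    """
    cmd = ["pdf2txt.py"]
    if output_path:
        cmd += ["-o", output_path]       # write to a file instead of stdout
    if output_type:
        cmd += ["-t", output_type]       # e.g. "text", "html", "xml", "tag"
    if password:
        cmd += ["-P", password]          # needed for password-protected PDFs
    if pages:
        # page numbers are passed as one comma-separated value
        cmd += ["-p", ",".join(str(p) for p in pages)]
    cmd.append(pdf_path)
    return cmd
```

The resulting list can be handed to subprocess.call() once the package
is installed, for example
subprocess.call(build_pdf2txt_command("doc.pdf", output_path="doc.html", output_type="html")).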

dumppdf.py

dumppdf.py dumps the internal contents of a PDF file in pseudo-XML
format. This program is primarily for debugging purposes, but it's also
possible to extract some meaningful contents (e.g. images).
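
A similar sketch for dumppdf.py is given below, again building the
command without running it. It assumes the script is on the PATH as
dumppdf.py and that the -a (dump all objects), -T (dump the table of
contents), -P (password) and -i (object IDs) options behave as in the
upstream usage message. The helper name build_dumppdf_command is
hypothetical.

```python
def build_dumppdf_command(pdf_path, all_objects=False, toc=False,
                          password=None, object_ids=None):
    """Return an argument list for dumppdf.py; nothing is executed here.

    Hypothetical helper: the flags below are assumed from the upstream
    dumppdf.py usage message and should be checked locally.
    """
    cmd = ["dumppdf.py"]
    if all_objects:
        cmd.append("-a")                 # dump every object in the file
    if toc:
        cmd.append("-T")                 # dump the table of contents
    if password:
        cmd += ["-P", password]          # needed for password-protected PDFs
    if object_ids:
        # object IDs are passed as one comma-separated value
        cmd += ["-i", ",".join(str(i) for i in object_ids)]
    cmd.append(pdf_path)
    return cmd
```

By default the dump goes to standard output, so a caller would normally
capture or redirect it when running the command.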

Maintained by: Brenton Earl
Keywords: pdf,parse,analyze,extract,dump
ChangeLog: python-pdfminer

Homepage:
https://euske.github.io/pdfminer/index.html

Source Downloads:
pdfminer-20140328.tar.gz (dfe3eb1b7b7017ab514aad6751a7c2ea)
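
The value in parentheses is the MD5 checksum of the source tarball. One
way to check a downloaded copy against it is sketched below using
Python's standard hashlib module. The function names and the local file
path are illustrative assumptions, not part of the SlackBuild.

```python
import hashlib

# MD5 checksum listed above for pdfminer-20140328.tar.gz
EXPECTED_MD5 = "dfe3eb1b7b7017ab514aad6751a7c2ea"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the listed checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_listed_md5("pdfminer-20140328.tar.gz") should be
True for an intact download placed in the current directory.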

Download SlackBuild:
python-pdfminer.tar.gz
python-pdfminer.tar.gz.asc (FAQ)

(the SlackBuild does not include the source)

Individual Files:

• README
• python-pdfminer.SlackBuild
• python-pdfminer.info
• slack-desc

Validated for Slackware 14.2
