Saturday, November 15, 2008

Howto Convert PDF to TXT in Ubuntu Linux

The pdftotext utility converts Portable Document Format (PDF) files to plain text.

$ sudo apt-get install poppler-utils

usage:

$ pdftotext abc.pdf xyz.txt

$ pdftotext -l 5 abc.pdf xyz.txt (convert the first 5 pages; -l sets the last page to convert)

$ pdftotext -f 5 abc.pdf xyz.txt (convert from page 5 to the end; -f sets the first page to convert)

$ pdftotext -upw 'password' abc.pdf xyz.txt (for a password-protected PDF; -upw supplies the user password)
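
The -f and -l options can also be combined to pull out a range of pages. Below is a minimal sketch of a small shell function that wraps this; the name pdf_pages and its argument order are made up for illustration and are not part of poppler-utils.

```shell
# pdf_pages is a hypothetical helper, not part of poppler-utils.
# Usage: pdf_pages FIRST LAST input.pdf output.txt
pdf_pages() {
    if [ "$#" -ne 4 ]; then
        echo "usage: pdf_pages FIRST LAST input.pdf output.txt" >&2
        return 1
    fi
    # -f sets the first page to convert, -l sets the last page to convert
    pdftotext -f "$1" -l "$2" "$3" "$4"
}
```

For example, after pasting the function into a shell, pdf_pages 3 7 abc.pdf xyz.txt should convert pages 3 through 7 of abc.pdf into xyz.txt.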

11 comments:

Unknown said...

Nice and useful thanks :D

Siva said...

such a way how I can convert text to pdf files..
thanks..
Siva

Siva said...

such a way how I can convert text to pdf files..
thanks..
Siva

faber said...

thanks

Faber

Anonymous said...

Doesn't work for me. It gave out a text file with two matrix-like characters.

Liaofan said...

Thank you so much.
How about converting many PDFs into one single text?

Johanes said...

Thanks a lot - at least something to fresh up my old brain :-) didn't remember the right command.

@liaofan:
How about appending to the same file with a simple loop? This assumes the PDF files are in the same directory.

for i in 1.pdf 2.pdf 3.pdf
do
pdftotext "$i" - >> out.txt
done

Pardeep said...

Thanks a lot...........

Very useful.........

Pardeep said...

Thanks a lot...........

Very much useful..........

Pardeep

Pardeep said...

Thanks a lot.........

Bonteruel said...

Great and useful program.Thumbs up!!