Command Line Tools

PDFBox comes with a series of command line utilities. They are available as Windows binaries and as standard Java applications.

See the Dependencies page for instructions on how to set your classpath in order to run PDFBox tools as Java applications.

To run them as Windows applications, you need to have the .NET framework installed and add %PDFBOX_HOME%\bin to your path.

ConvertColorspace

This application converts the colors in a PDF from one colorspace to another, for example all RGB colors to CMYK colors. Currently it only converts text and vector graphics; images are not converted.

usage: java -jar pdfbox-app-x.y.z.jar ConvertColorspace [OPTIONS] <inputfile> <outputfile>

Command Line Parameter    Description
-password                 The password to the PDF document.
-equiv                    Color equivalent to use for conversion.
inputfile                 The PDF file to convert.
outputfile                The file to save the converted document to. Must be different from the input file.

The format for the color equivalent described above is

Colorspace(values)=Colorspace(values)

For example:

RGB(255,0,0)=CMYK(0,99,100,0)

RGB values are integers between 0 and 255.

CMYK values are integers between 0 and 100.

The -equiv option can be used as many times as necessary.
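
The color equivalent format described above can be checked with a small parser. The following is only a sketch: the class ColorEquivalent and its methods are hypothetical names chosen for illustration and are not part of PDFBox. It assumes that only the RGB (three values) and CMYK (four values) colorspaces are used, as in the example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical parser for the -equiv syntax, not PDFBox's own code.
class ColorEquivalent {
    // Matches text such as "RGB(255,0,0)=CMYK(0,99,100,0)".
    private static final Pattern FORMAT = Pattern.compile(
            "(RGB|CMYK)\\(([0-9,\\s]+)\\)=(RGB|CMYK)\\(([0-9,\\s]+)\\)");

    final String fromSpace;
    final int[] fromValues;
    final String toSpace;
    final int[] toValues;

    private ColorEquivalent(String fromSpace, int[] fromValues,
                            String toSpace, int[] toValues) {
        this.fromSpace = fromSpace;
        this.fromValues = fromValues;
        this.toSpace = toSpace;
        this.toValues = toValues;
    }

    // Parses one color equivalent; throws IllegalArgumentException if it is malformed.
    static ColorEquivalent parse(String text) {
        Matcher m = FORMAT.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException(
                    "Expected Colorspace(values)=Colorspace(values) but got: " + text);
        }
        return new ColorEquivalent(
                m.group(1), components(m.group(1), m.group(2)),
                m.group(3), components(m.group(3), m.group(4)));
    }

    // Splits the comma-separated values and checks their count and range.
    private static int[] components(String space, String list) {
        boolean rgb = space.equals("RGB");
        int expected = rgb ? 3 : 4;   // RGB has 3 components, CMYK has 4
        int max = rgb ? 255 : 100;    // RGB: 0..255, CMYK: 0..100
        String[] parts = list.split(",");
        if (parts.length != expected) {
            throw new IllegalArgumentException(
                    space + " needs " + expected + " values: " + list);
        }
        int[] values = new int[expected];
        for (int i = 0; i < expected; i++) {
            values[i] = Integer.parseInt(parts[i].trim());
            if (values[i] > max) {
                throw new IllegalArgumentException(
                        space + " value out of range 0.." + max + ": " + values[i]);
            }
        }
        return values;
    }

    @Override
    public String toString() {
        return fromSpace + Arrays.toString(fromValues)
                + " -> " + toSpace + Arrays.toString(toValues);
    }

    public static void main(String[] args) {
        // Prints: RGB[255, 0, 0] -> CMYK[0, 99, 100, 0]
        System.out.println(parse("RGB(255,0,0)=CMYK(0,99,100,0)"));
    }
}
```

Validating each equivalent before running the conversion makes it easier to spot a mistyped value when many -equiv options are passed at once.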

Decrypt

This application will decrypt a PDF document.

NOTE: You must have the owner password to decrypt the document!

usage: java -jar pdfbox-app-x.y.z.jar Decrypt [OPTIONS] <inputfile> [outputfile]

Command Line Parameter    Description
-password                 Password to the PDF or certificate in keystore.
-keyStore                 Path to keystore that holds certificate to decrypt the document. This is only required if the document is encrypted with a certificate, otherwise only the password is required.
-alias                    The alias to the certificate in the keystore.
inputfile                 The PDF file to decrypt.
outputfile                The file to save the decrypted document to. If left blank then it will be the same as the input file.

Encrypt

This application will encrypt a PDF document.

usage: java -jar pdfbox-app-x.y.z.jar Encrypt [OPTIONS] <inputfile> [outputfile]

Command Line Parameter        Default             Description
-O                                                The owner password to the PDF, ignored if -certFile is specified.
-U                                                The user password to the PDF, ignored if -certFile is specified.
-certFile                                         Path to X.509 cert file.
-canAssemble                  true                Set the assemble permission.
-canExtractContent            true                Set the extraction permission.
-canExtractForAccessibility   true                Set the extraction for accessibility permission.
-canFillInForm                true                Set the fill in form permission.
-canModify                    true                Set the modify permission.
-canModifyAnnotations         true                Set the modify annotations permission.
-canPrint                     true                Set the print permission.
-canPrintDegraded             true                Set the print degraded permission.
-keyLength                    40                  The number of bits for the encryption key.
inputfile                                         The PDF file to encrypt.
outputfile                                        The file to save the encrypted document to. If left blank then it will be the same as the input file.

ExtractText

This application will extract all text from the given PDF document.

usage: java -jar pdfbox-app-x.y.z.jar ExtractText [OPTIONS] <inputfile> [Text file]

Command Line Parameter        Default             Description
-password                                         The password to the PDF document.
-encoding                     default encoding    The encoding type of the text file, e.g. ISO-8859-1, UTF-8, UTF-16BE.
-console                      false               Send text to console instead of file.
-html                         false               Output in HTML format instead of raw text.
-sort                         false               Sort the text before writing.
-ignoreBeads                  false               Disables the separation by beads.
-force                        false               Enables PDFBox to ignore corrupt objects.
-debug                        false               Enables debug output about the time consumption of every stage.
-startPage                    1                   The first page to extract, one based.
-endPage                      Integer.MAX_VALUE   The last page to extract, one based.
-nonSeq                       false               Use the new non-sequential parser.

Overlay

This application will overlay one document with the content of another document.

usage: java -jar pdfbox-app-x.y.z.jar Overlay <overlay.pdf> <document.pdf> <result.pdf>

If the overlay document contains more than one page, its pages are applied to the document in order and repeated as needed. For example, if the document has 10 pages and the overlay has 2 pages, the pages are paired as follows:

Document: 1234567890
Overlay:  1212121212
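
The pairing of document pages and overlay pages described above is a simple modulo calculation. The sketch below only illustrates that arithmetic; the class OverlayOrder and the method overlayPageFor are hypothetical names and do not come from PDFBox.

```java
// Illustrative only: shows the page pairing described above, not PDFBox's Overlay code.
class OverlayOrder {
    // Returns the 1-based overlay page applied to the given 1-based document page.
    static int overlayPageFor(int documentPage, int overlayPageCount) {
        if (documentPage < 1 || overlayPageCount < 1) {
            throw new IllegalArgumentException("Page numbers and counts start at 1");
        }
        // Cycle through the overlay pages: 1, 2, ..., n, 1, 2, ...
        return (documentPage - 1) % overlayPageCount + 1;
    }

    public static void main(String[] args) {
        StringBuilder order = new StringBuilder();
        for (int page = 1; page <= 10; page++) {
            order.append(overlayPageFor(page, 2));
        }
        // Prints: Overlay: 1212121212
        System.out.println("Overlay: " + order);
    }
}
```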

PrintPDF

This application will send a PDF document to the printer.

You must have the correct permissions to print the document!

usage: java -jar pdfbox-app-x.y.z.jar PrintPDF [OPTIONS] <inputfile>

Command Line Parameter    Description
-password                 The password to decrypt the PDF.
-silentPrint              Print the PDF without prompting for a printer.
inputfile                 The PDF file to print.

PDFDebugger

This application opens an existing PDF document and lets you analyze and inspect its internal structure.

usage: java -jar pdfbox-app-x.y.z.jar PDFDebugger [OPTIONS] [inputfile]

Command Line Parameter        Default             Description
-password                                         The password to the PDF document.
-nonSeq                       false               Use the new non-sequential parser.
inputfile                                         The name of an optional PDF file to open.

PDFMerger

This application will take a list of PDF documents and merge them, saving the result in a new document.

usage: java -jar pdfbox-app-x.y.z.jar PDFMerger <Source PDF files (2..n)> <Target PDF file>

PDFReader

An application to read PDF documents. It provides Acrobat Reader-like functionality.

usage: java -jar pdfbox-app-x.y.z.jar PDFReader [OPTIONS] [PDF file]

Command Line Parameter        Default             Description
-password                                         The password to the PDF document.
-nonSeq                       false               Use the new non-sequential parser.
PDF file                                          The name of an optional PDF file to open.

PDFSplit

This application will take an existing PDF document and split it into a number of other documents.

usage: java -jar pdfbox-app-x.y.z.jar PDFSplit [OPTIONS] <PDF file>

Command Line Parameter        Default             Description
-password                                         The password to the PDF document.
-split                                            Number of pages in each part of the split PDF.
-startPage                                        The page to start at.
-endPage                                          The page to stop at.
-nonSeq                       false               Use the new non-sequential parser.

Examples:

  • PDFSplit -split 2 sample_with_13_pages.pdf splits the PDF into pieces of 2 pages each, except the last, which contains only 1 page.
  • PDFSplit -startPage 5 sample_with_13_pages.pdf produces a PDF containing all pages of the source PDF starting at page 5.
  • PDFSplit -startPage 5 -endPage 10 sample_with_13_pages.pdf produces a PDF containing pages 5 to 10 of the source PDF.
  • PDFSplit -split 2 -startPage 5 -endPage 10 sample_with_13_pages.pdf produces 3 PDFs of 2 pages each, covering pages 5 to 10 of the source PDF.
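
The page arithmetic behind the first and last examples above can be sketched as follows. This is only an illustration of how the page ranges work out, not the PDFSplit implementation; the class SplitRanges and the method ranges are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: computes which pages land in each part, not PDFBox's splitter.
class SplitRanges {
    // Returns "first-last" page ranges (1-based, inclusive) for each resulting part.
    static List<String> ranges(int startPage, int endPage, int pagesPerPart) {
        if (startPage < 1 || endPage < startPage || pagesPerPart < 1) {
            throw new IllegalArgumentException("Invalid page range or part size");
        }
        List<String> parts = new ArrayList<>();
        // long arithmetic avoids overflow when endPage is close to Integer.MAX_VALUE
        for (long first = startPage; first <= endPage; first += pagesPerPart) {
            long last = Math.min(first + pagesPerPart - 1, endPage);
            parts.add(first + "-" + last);
        }
        return parts;
    }

    public static void main(String[] args) {
        // -split 2 on a 13-page document
        // Prints: [1-2, 3-4, 5-6, 7-8, 9-10, 11-12, 13-13]
        System.out.println(ranges(1, 13, 2));
        // -split 2 -startPage 5 -endPage 10
        // Prints: [5-6, 7-8, 9-10]
        System.out.println(ranges(5, 10, 2));
    }
}
```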

PDFToImage

This application will create an image for every page in the PDF document.

usage: java -jar pdfbox-app-x.y.z.jar PDFToImage [OPTIONS] <PDF file>

Command Line Parameter        Default                 Description
-password                                             The password to the PDF document.
-imageType                    jpg                     The image type to write to. Currently only jpg or png.
-outputPrefix                 Name of PDF document    The prefix to the image file.
-startPage                    1                       The first page to convert, one based.
-endPage                      Integer.MAX_VALUE       The last page to convert, one based.
-nonSeq                       false                   Use the new non-sequential parser.

TextToPDF

This application will create a PDF document from a text file.

usage: java -jar pdfbox-app-x.y.z.jar TextToPDF [OPTIONS] <outputfile> <textfile>

Command Line Parameter        Default             Description
-standardFont                 Helvetica           The font to use for the text. Either this or -ttf should be specified but not both.
-ttf                                              The TTF font to use for the text. Either this or -standardFont should be specified but not both.
-fontSize                     10                  The size of the font to use.

The following font names can be used for the -standardFont parameter:

  • Courier
  • Courier-Bold
  • Courier-Oblique
  • Courier-BoldOblique
  • Helvetica
  • Helvetica-Bold
  • Helvetica-Oblique
  • Helvetica-BoldOblique
  • Symbol
  • Times-Bold
  • Times-Roman
  • Times-Italic
  • Times-BoldItalic
  • ZapfDingbats

WriteDecodedDoc

An application to decompress PDF documents.

usage: java -jar pdfbox-app-x.y.z.jar WriteDecodedDoc [OPTIONS] <input-file> <output-file>

Command Line Parameter        Default             Description
-password                                         The password to the PDF document.
-nonSeq                       false               Use the new non-sequential parser.
input-file                                        The PDF file to decompress.
output-file                                       The destination PDF file.