Language: Java
Data / PDF
With the increasing use of PDF for reports, invoices, and legal documents, Java developers needed robust libraries to handle PDFs. iText, initially released in 2000, and PDFBox, an Apache project, provide comprehensive APIs to work with PDF files without relying on external applications. They are widely used in enterprise systems for automated PDF generation and processing.
iText and Apache PDFBox are Java libraries for creating, manipulating, and reading PDF documents. They allow developers to generate dynamic PDFs, extract content, fill forms, merge or split documents, and manage metadata programmatically.
iText:
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itext7-core</artifactId>
<version>7.2.5</version>
<type>pom</type>
</dependency>
PDFBox:
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>3.0.0</version>
</dependency>
Gradle:
implementation 'com.itextpdf:itext7-core:7.2.5'
implementation 'org.apache.pdfbox:pdfbox:3.0.0'
iText and PDFBox provide APIs to create new PDFs, read existing documents, manipulate pages, extract text and metadata, encrypt/decrypt files, and fill PDF forms. iText 7 is dual-licensed under the AGPL and a commercial license, while PDFBox is fully open source under the Apache License 2.0.
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Paragraph;
PdfWriter writer = new PdfWriter("example.pdf");
PdfDocument pdf = new PdfDocument(writer);
Document document = new Document(pdf);
document.add(new Paragraph("Hello, iText PDF!"));
document.close();
Creates a new PDF file with a single paragraph using iText.
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import java.io.File;
// PDFBox 3.x opens documents through Loader.loadPDF(); PDDocument.load() from 2.x was removed.
try (PDDocument document = Loader.loadPDF(new File("example.pdf"))) {
    String text = new PDFTextStripper().getText(document);
    System.out.println(text);
}
Reads and prints all text from an existing PDF using PDFBox.
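Extraction does not have to cover the whole document. The sketch below limits PDFTextStripper to a page range; the class name PageRangeText, the helper extract, and the file name example.pdf are illustrative assumptions, and it targets the PDFBox 3.x API declared above.

```java
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

import java.io.File;
import java.io.IOException;

public class PageRangeText {

    /** Extracts text from pages firstPage..lastPage (1-based, inclusive). */
    static String extract(File pdf, int firstPage, int lastPage) throws IOException {
        // try-with-resources closes the document even if extraction fails
        try (PDDocument document = Loader.loadPDF(pdf)) {
            PDFTextStripper stripper = new PDFTextStripper();
            stripper.setStartPage(firstPage);
            stripper.setEndPage(lastPage);
            return stripper.getText(document);
        }
    }

    public static void main(String[] args) throws IOException {
        // "example.pdf" is a placeholder path
        System.out.println(extract(new File("example.pdf"), 1, 2));
    }
}
```

Restricting the range is mainly useful for long reports where only a cover page or summary section is needed.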
import org.apache.pdfbox.multipdf.PDFMergerUtility;
PDFMergerUtility merger = new PDFMergerUtility();
merger.addSource("file1.pdf");
merger.addSource("file2.pdf");
merger.setDestinationFileName("merged.pdf");
merger.mergeDocuments(null);
Combines multiple PDF files into a single document.
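Splitting is the reverse of merging. As a minimal sketch, PDFBox's Splitter can break a document into one file per page; the class name SplitPdf, the helper splitIntoPages, and the page-N.pdf naming scheme are assumptions made for illustration.

```java
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class SplitPdf {

    /** Writes each page of the source PDF to its own file and returns the page count. */
    static int splitIntoPages(File source, File outputDir) throws IOException {
        try (PDDocument document = Loader.loadPDF(source)) {
            // By default the Splitter produces one document per page.
            List<PDDocument> pages = new Splitter().split(document);
            try {
                for (int i = 0; i < pages.size(); i++) {
                    pages.get(i).save(new File(outputDir, "page-" + (i + 1) + ".pdf"));
                }
            } finally {
                // The split documents hold resources of their own and must be closed too.
                for (PDDocument page : pages) {
                    page.close();
                }
            }
            return pages.size();
        }
    }
}
```

The pages are saved while the source document is still open, since the split documents may share data with it.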
import com.itextpdf.layout.element.Image;
import com.itextpdf.io.image.ImageDataFactory;
Image img = new Image(ImageDataFactory.create("logo.png"));
document.add(img);
Adds an image to a PDF document using iText.
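import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;
// document is an open PDDocument; getAcroForm() and getField() return null when nothing matches.
PDAcroForm form = document.getDocumentCatalog().getAcroForm();
PDField field = (form != null) ? form.getField("name") : null;
if (field != null) {
    field.setValue("John Doe");
}
document.save("filled_form.pdf");
Fills a field of an existing PDF form programmatically using PDFBox.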
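Filling a form requires knowing its field names, which are often not visible in the rendered PDF. The following sketch lists every field with its current value using PDFBox; the class name ListFormFields, the helper fieldValues, and the file name form.pdf are hypothetical.

```java
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;

import java.io.File;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class ListFormFields {

    /** Returns field name -> current value for every field in the document's AcroForm. */
    static Map<String, String> fieldValues(PDDocument document) {
        Map<String, String> values = new LinkedHashMap<>();
        PDAcroForm form = document.getDocumentCatalog().getAcroForm();
        if (form == null) {
            return values; // the PDF has no interactive form
        }
        // getFieldTree() walks nested fields as well as top-level ones.
        for (PDField field : form.getFieldTree()) {
            values.put(field.getFullyQualifiedName(), field.getValueAsString());
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // "form.pdf" is a placeholder path
        try (PDDocument document = Loader.loadPDF(new File("form.pdf"))) {
            fieldValues(document).forEach((name, value) -> System.out.println(name + " = " + value));
        }
    }
}
```

The fully qualified names printed here are the ones to pass to getField() when setting values.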
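WriterProperties props = new WriterProperties().setStandardEncryption("userpass".getBytes(), "ownerpass".getBytes(),
        EncryptionConstants.ALLOW_PRINTING, EncryptionConstants.ENCRYPTION_AES_128);
PdfWriter writer = new PdfWriter("protected.pdf", props);
Adds password protection and AES-128 encryption using iText. In iText 7, encryption is configured through WriterProperties (in com.itextpdf.kernel.pdf, like EncryptionConstants) when the PdfWriter is created; here the user password permits printing only.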
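Always close PDDocument or Document objects to release resources, preferably with try-with-resources.
For large PDFs, avoid holding the whole document in main memory; PDFBox, for example, can buffer to temporary files instead.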
Validate PDF inputs when extracting text or filling forms.
Check iText's licensing before using it in a closed-source or commercial application: the AGPL's copyleft terms apply to the whole library unless you hold a commercial license.
Use PDFBox for fully open-source projects and simpler PDF manipulation tasks.
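One cheap form of input validation is to check the file signature before handing a file to either library. The sketch below uses only the Java standard library (Java 11+ for readNBytes); the class name PdfHeaderCheck and the helper looksLikePdf are illustrative. The check is deliberately strict: some lenient parsers accept files with stray bytes before the marker, so a failed check is a reason to reject or inspect the file, not proof that it is unreadable.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class PdfHeaderCheck {

    // PDF files begin with the ASCII marker "%PDF-" followed by a version number.
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true when the file starts with the "%PDF-" marker. */
    static boolean looksLikePdf(Path file) throws IOException {
        // try-with-resources closes the stream even if reading fails
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = in.readNBytes(MAGIC.length); // shorter if the file is tiny
            return Arrays.equals(head, MAGIC);
        }
    }
}
```

A passing check says nothing about whether the rest of the file is well-formed, so parsing errors from the library still need to be handled.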