PDFBox Extracting Image

In this section, we will learn how to extract an image from an existing PDF document by rendering one of its pages. The PDFBox library provides the PDFRenderer class, which renders a page of a PDF document into an AWT BufferedImage.

Follow the steps below to extract an image from an existing PDF document:

Load Existing PDF Document

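We can load an existing PDF document by using the static load() method of the PDDocument class. This method accepts a File object as a parameter. Because it is static, we invoke it on the class name, PDDocument. (This applies to PDFBox 2.x; in PDFBox 3.x, PDDocument.load() was replaced by Loader.loadPDF().)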

File file = new File("Path of Document"); 
PDDocument doc = PDDocument.load(file); 

Instantiate the PDFRenderer class

The PDFRenderer class renders a PDF document into an AWT BufferedImage. Its constructor takes the document object as a parameter, as in the following code.

PDFRenderer renderer = new PDFRenderer(doc);

Render Image

The renderImage() method of the PDFRenderer class renders a particular page as an image. We pass it the zero-based index of the page to be rendered, so index 0 is the first page and index 2 is the third.

BufferedImage image = renderer.renderImage(2);
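Because renderImage() takes a zero-based index while PDF viewers show page numbers starting at 1, an off-by-one error is easy to make. The following is a minimal sketch of a conversion helper. The class PageIndexes and the method toPageIndex are hypothetical names introduced for illustration and are not part of PDFBox. The sketch uses only the JDK, and the page count is hard-coded where a real program would likely call doc.getNumberOfPages().

```java
// Hypothetical helper, not part of PDFBox: converts a 1-based page number
// (as shown in a PDF viewer) to the 0-based index that renderImage() expects.
final class PageIndexes {

    private PageIndexes() {
    }

    static int toPageIndex(int pageNumber, int pageCount) {
        // Reject page numbers that do not exist in the document.
        if (pageNumber < 1 || pageNumber > pageCount) {
            throw new IllegalArgumentException(
                    "Page " + pageNumber + " is outside 1.." + pageCount);
        }
        return pageNumber - 1;
    }

    public static void main(String[] args) {
        // Assumption: a three-page document; a real program would ask the document.
        int pageCount = 3;
        System.out.println(toPageIndex(3, pageCount)); // prints 2
    }
}
```

Validating the page number before rendering gives a clearer error message than letting the renderer fail on a page that does not exist.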

Writing the Image to a File

We can write the rendered image to a file using the write() method of the ImageIO class. We need to pass three parameters to this method:

The rendered image object.
String representing the type of the image (jpg or png).
File object to which we need to save the extracted image.

This can be shown in the following code:

ImageIO.write(image, "JPEG", new File("/eclipse-workspace/my_image.jpeg"));
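ImageIO.write() returns false, rather than throwing an exception, when no writer is registered for the requested format, so it can be worth checking the return value. The sketch below assumes a hypothetical helper named writeImage and uses a blank BufferedImage as a stand-in for the image returned by renderImage(). It writes to a temporary file so that it runs with the JDK alone.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ImageWriterSketch {

    private ImageWriterSketch() {
    }

    // Hypothetical helper: writes the image and fails loudly if ImageIO
    // has no writer for the requested format.
    static void writeImage(BufferedImage image, String format, File target) throws IOException {
        boolean written = ImageIO.write(image, format, target);
        if (!written) {
            throw new IOException("No ImageIO writer found for format: " + format);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the BufferedImage that renderImage() would return.
        BufferedImage image = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        File target = File.createTempFile("page", ".png");
        writeImage(image, "png", target);
        System.out.println("Wrote " + target.length() + " bytes to " + target);
    }
}
```

In the PDF program, the same helper could be called with the rendered page image and the desired output file in place of the stand-in values.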

Close Document

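After completing the task, we need to close the PDDocument object by using the close() method. PDDocument implements Closeable, so a try-with-resources statement can also close it automatically, even if an exception is thrown.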

Example:

Suppose we have a PDF document and want to extract one of its pages as an image by using the PDFBox library in a Java program.

Java Program

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

public class ExtractImage {

    public static void main(String[] args) throws IOException {

        // Loading an existing document
        File file = new File("/eclipse-workspace/blank.pdf");

        // try-with-resources closes the document, even if rendering fails
        try (PDDocument doc = PDDocument.load(file)) {

            // Instantiating the PDFRenderer class
            PDFRenderer renderer = new PDFRenderer(doc);

            // Rendering a page as an image; the index is zero-based,
            // so 2 is the third page and the document needs at least three pages
            BufferedImage image = renderer.renderImage(2);

            // Writing the image to a file
            ImageIO.write(image, "JPEG", new File("/eclipse-workspace/my_image.jpeg"));

            System.out.println("Image created successfully.");
        }
    }
}

Output:

After successful execution, the above program prints the following message and creates the image file at the given path.

Image created successfully.