
PDFBox Extracting Image

In this section, we will learn how to extract an image from an existing PDF document. The PDFBox library provides the PDFRenderer class, which renders pages of a PDF document into an AWT BufferedImage.

Follow the steps below to extract an image from an existing PDF document:

Load Existing PDF Document

We can load an existing PDF document by using the static load() method of the PDDocument class. This method accepts a File object as a parameter and, because it is static, is invoked using the class name PDDocument.
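
As a minimal sketch of this step, assuming PDFBox 2.x is on the classpath (in PDFBox 3.x the equivalent call is Loader.loadPDF), the following loads a document and reports its page count. The class name LoadPdfExample, the helper countPages, and the default file name sample.pdf are illustrative assumptions, not part of the library.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;

class LoadPdfExample {

    // Loads the PDF file, reads its page count, and closes the document again.
    static int countPages(File pdfFile) throws IOException {
        // PDFBox 2.x: static load() invoked on the class name PDDocument.
        PDDocument document = PDDocument.load(pdfFile);
        try {
            return document.getNumberOfPages();
        } finally {
            // Always release the document, even if reading it failed.
            document.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real PDF location as the first argument.
        File pdfFile = new File(args.length > 0 ? args[0] : "sample.pdf");
        System.out.println("Pages: " + countPages(pdfFile));
    }
}
```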

Instantiate the PDFRenderer class

The PDFRenderer class renders a PDF document into an AWT BufferedImage. Its constructor takes the loaded document object as its parameter.

Render Image

The renderImage() method of the PDFRenderer class renders a particular page as an image. We need to pass it the zero-based index of the page that is to be rendered.
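
The two steps above, creating the renderer and rendering a page, can be sketched together as follows. This is an illustration under assumptions: the class name RenderPageExample and the helper renderPage are hypothetical, and the demo renders a blank in-memory page so that it runs without an input file.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.rendering.PDFRenderer;

class RenderPageExample {

    // Renders one page (zero-based index) of an already opened document.
    static BufferedImage renderPage(PDDocument document, int pageIndex) throws IOException {
        // The renderer is bound to the document passed to its constructor.
        PDFRenderer renderer = new PDFRenderer(document);
        // Renders at the default resolution; index 0 is the first page.
        return renderer.renderImage(pageIndex);
    }

    public static void main(String[] args) throws IOException {
        // Blank in-memory document used only to keep the sketch self-contained.
        try (PDDocument document = new PDDocument()) {
            document.addPage(new PDPage());
            BufferedImage image = renderPage(document, 0);
            System.out.println("Rendered " + image.getWidth() + " x " + image.getHeight() + " pixels");
        }
    }
}
```

For sharper output, PDFRenderer also offers renderImageWithDPI(pageIndex, dpi), which takes the resolution as a second argument.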

Writing the Image to a File

We can write the rendered image to a file using the ImageIO.write() method. This method takes three parameters:

  1. The rendered image object.
  2. A string naming the image format (for example, jpg or png).
  3. The File object to which we need to save the extracted image.
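
The three parameters listed above map directly onto a call to javax.imageio.ImageIO.write. The sketch below uses only the JDK; a plain BufferedImage stands in for the rendered page, and the class name WriteImageExample, the helper save, and the output name page.png are illustrative assumptions.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

class WriteImageExample {

    // Writes the image in the given format; returns false if no writer supports that format.
    static boolean save(BufferedImage image, String format, File target) throws IOException {
        return ImageIO.write(image, format, target);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image returned by the renderer.
        BufferedImage image = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        // Hypothetical output path in the working directory.
        File target = new File("page.png");
        if (!save(image, "png", target)) {
            throw new IOException("No ImageIO writer found for format png");
        }
        System.out.println("Saved " + target.getAbsolutePath());
    }
}
```

Checking the boolean result is worthwhile because ImageIO.write reports an unsupported format by returning false rather than by throwing an exception.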

Close Document

After completing the task, we need to close the PDDocument object by using the close() method.
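
Because PDDocument implements Closeable, a try-with-resources block is one way to make sure close() is called even when an exception occurs. The sketch below is an illustration only: the class name CloseDocumentExample and the helper pagesOfBlankDocument are hypothetical, and a blank in-memory document is used so that no input file is needed.

```java
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

class CloseDocumentExample {

    // Builds a one-page document and returns its page count.
    static int pagesOfBlankDocument() throws IOException {
        // close() is invoked automatically when the try block exits.
        try (PDDocument document = new PDDocument()) {
            document.addPage(new PDPage());
            return document.getNumberOfPages();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Pages: " + pagesOfBlankDocument());
    }
}
```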

Example

In this example, we take an existing PDF document and save one of its pages as an image by using the PDFBox library in a Java program.

Java Program
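
The listing below is a minimal sketch of a complete program that follows the steps above, assuming PDFBox 2.x on the classpath (PDFBox 3.x loads documents with Loader.loadPDF instead). The class name PdfPageToImage, the helper convert, and the default file names sample.pdf and page.png are illustrative assumptions; page index 0 selects the first page.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

class PdfPageToImage {

    // Renders one page of the PDF file and saves it as a PNG image.
    static void convert(File pdfFile, int pageIndex, File imageFile) throws IOException {
        // Load the document; try-with-resources closes it when the block exits.
        try (PDDocument document = PDDocument.load(pdfFile)) {
            // Instantiate the renderer for this document.
            PDFRenderer renderer = new PDFRenderer(document);
            // Render the requested page (zero-based index).
            BufferedImage image = renderer.renderImage(pageIndex);
            // Write the image to the target file in PNG format.
            if (!ImageIO.write(image, "png", imageFile)) {
                throw new IOException("No ImageIO writer found for format png");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default paths; override them with command-line arguments.
        File pdfFile = new File(args.length > 0 ? args[0] : "sample.pdf");
        File imageFile = new File(args.length > 1 ? args[1] : "page.png");
        convert(pdfFile, 0, imageFile);
        System.out.println("Image created: " + imageFile.getAbsolutePath());
    }
}
```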

Output:

After successful execution, the program saves the rendered page as an image file at the specified path.

To verify the result, open the saved image file in an image viewer.




