
PDFBox Working with Metadata

A PDF document has many properties, which provide metadata about the document. Because some fields are optional, there is no guarantee that every PDF file contains all the metadata we need.

A PDF document can contain the following properties:

File Name: The name of the file.
Title: The title of the PDF document.
Author: The name of the document's author.
Subject: The subject of the document.
Application: The application that created the document.
Keywords: The keywords by which the document can be searched.
Created: The date on which the document was created.
Modified: The date on which the document was last modified.
Producer: The name of the producer of the document.

PDFBox provides the PDDocumentInformation class for working with document properties. This class has a set of setter and getter methods: the setters assign values to document properties, and the getters retrieve those values.

Working with Setter Methods

The important setter methods of the PDDocumentInformation class are as follows:

  1. setAuthor(String author): Sets the author name of the PDF document.
  2. setTitle(String title): Sets the title of the PDF document.
  3. setCreator(String creator): Sets the creator of the PDF document.
  4. setSubject(String subject): Sets the subject of the PDF document.
  5. setKeywords(String keywords): Sets the keywords of the PDF document.
  6. setCreationDate(Calendar date): Sets the creation date of the PDF document.
  7. setModificationDate(Calendar date): Sets the modification date of the PDF document.

Example:

This example shows how to add properties such as Author, Title, Date, and Subject to a PDF document.
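The setter methods listed above can be sketched as follows. This is a minimal illustration rather than the original program: it assumes PDFBox 2.x or later on the classpath, and the class name SetDocumentProperties, the helper buildInformation(), the property values, and the output file doc_properties.pdf are all made up for the example.

```java
import java.io.File;
import java.io.IOException;
import java.util.Calendar;
import java.util.GregorianCalendar;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;
import org.apache.pdfbox.pdmodel.PDPage;

public class SetDocumentProperties {

    // Builds an information object holding sample (made-up) property values.
    static PDDocumentInformation buildInformation() {
        PDDocumentInformation info = new PDDocumentInformation();
        info.setAuthor("Example Author");
        info.setTitle("Sample Document");
        info.setCreator("Example Application");
        info.setSubject("Setting PDF properties");
        info.setKeywords("sample, pdfbox, metadata");

        // Calendar months are zero-based, so use the named constant.
        Calendar created = new GregorianCalendar(2024, Calendar.JANUARY, 15);
        info.setCreationDate(created);
        info.setModificationDate(Calendar.getInstance());
        return info;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the document even if saving fails.
        try (PDDocument document = new PDDocument()) {
            // One blank page so the saved file can be opened in a viewer.
            document.addPage(new PDPage());
            document.setDocumentInformation(buildInformation());
            // Hypothetical output path, relative to the working directory.
            document.save(new File("doc_properties.pdf"));
        }
        System.out.println("Properties added to doc_properties.pdf");
    }
}
```

Building the information object in a separate method keeps the property values apart from the file handling, so the same values could be applied to a document loaded from disk instead of a newly created one.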

Output:

After successful execution, the program stores the given properties in the PDF document.



Working with Getter Methods

The important getter methods of the PDDocumentInformation class are as follows:

  1. getAuthor(): Retrieves the author name of the PDF document.
  2. getTitle(): Retrieves the title of the PDF document.
  3. getCreator(): Retrieves the creator of the PDF document.
  4. getSubject(): Retrieves the subject of the PDF document.
  5. getKeywords(): Retrieves the keywords of the PDF document.
  6. getCreationDate(): Retrieves the creation date of the PDF document.
  7. getModificationDate(): Retrieves the modification date of the PDF document.

Example:

This example shows how to retrieve properties such as Author, Title, Date, and Subject from a PDF document.
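The getter methods listed above can be sketched as follows. This is a minimal illustration rather than the original program: it assumes PDFBox 2.x or later, and the class name GetDocumentProperties and the helpers text(), date(), and describe() are made up for the example. To stay self-contained it reads the properties of a document created in memory instead of a file on disk. Since optional properties may be missing, the helpers replace a null result with a placeholder.

```java
import java.io.IOException;
import java.util.Calendar;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentInformation;

public class GetDocumentProperties {

    // The getters return null for properties the document does not define.
    static String text(String value) {
        return value == null ? "(not set)" : value;
    }

    static String date(Calendar value) {
        return value == null ? "(not set)" : value.getTime().toString();
    }

    // Collects the properties into one printable block of text.
    static String describe(PDDocumentInformation info) {
        return "Author: " + text(info.getAuthor()) + "\n"
             + "Title: " + text(info.getTitle()) + "\n"
             + "Creator: " + text(info.getCreator()) + "\n"
             + "Subject: " + text(info.getSubject()) + "\n"
             + "Keywords: " + text(info.getKeywords()) + "\n"
             + "Created: " + date(info.getCreationDate()) + "\n"
             + "Modified: " + date(info.getModificationDate());
    }

    public static void main(String[] args) throws IOException {
        try (PDDocument document = new PDDocument()) {
            PDDocumentInformation info = document.getDocumentInformation();
            // Sample values so that some properties have something to show.
            info.setTitle("Sample Document");
            info.setAuthor("Example Author");
            System.out.println(describe(info));
        }
    }
}
```

For an existing file, the document would instead be obtained with PDDocument.load(file) in PDFBox 2.x or Loader.loadPDF(file) in PDFBox 3.x; the calls on PDDocumentInformation stay the same.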

Output:

After successful execution, the program retrieves the attributes of the PDF document and displays them.

