Apache POI Architecture
Apache POI is made up of several components that together form a working system.
For example, POIFS reads and writes OLE 2 compound documents, the container used by the binary (non-XML) Microsoft Office formats, while HSSF reads and writes Excel spreadsheets in the 97-2003 format.
POIFS is the oldest and most stable part of POI. It supports both reading and writing. It is a pure-Java port of the OLE 2 Compound Document Format. All POI components for the (non-XML) Microsoft Office formats ultimately rely on it.
The HSSF component is used to read and write the Microsoft Excel 97 (-2003) file format in Java. XSSF is used to read and write the Microsoft Excel XML (2007+) file format (OOXML) in Java. SS is a package that provides read and write capability for both formats through a common API.
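The split between HSSF and XSSF follows the container format: Excel 97-2003 files are OLE 2 compound documents (the format POIFS implements), while OOXML files are ZIP packages. The sketch below illustrates that distinction with the JDK alone by checking the leading bytes of a file. It does not use POI, and the class, enum, and method names are hypothetical, chosen only for this illustration. A ZIP header only shows that a file may be OOXML, not that it is a spreadsheet.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Apache POI: classifies a file header
// by the container format that the POI components are built around.
final class OfficeContainerSniffer {

    enum Container { OLE2, ZIP, UNKNOWN }

    // First 8 bytes of every OLE 2 compound document.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Local file header signature at the start of a ZIP archive.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 3, 4 };

    // Returns the container type suggested by the first bytes of a file.
    static Container detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Container.OLE2;   // binary formats: HSSF, HWPF, HSLF, ...
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Container.ZIP;    // possibly OOXML: XSSF and related components
        }
        return Container.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
            && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

In practice the first eight bytes of a file would be read from an InputStream and passed to detect; code written against the SS package can then stay the same whichever concrete format sits behind it.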
HWPF is used to handle the Microsoft Word 97 (-2003) file format in Java. It supports reading and has limited write capabilities.
HSLF is used to handle Microsoft PowerPoint 97(-2003) file format in Java. It provides read and write capabilities.
HDGF is our port of the Microsoft Visio 97(-2003) file format to pure Java. It currently only supports reading at a very low level, and simple text extraction.
HPBF is used to handle the Microsoft Publisher 98(-2007) file format in Java. It currently only supports reading at a low level for around half of the file parts, and simple text extraction.
HMEF is our port of the Microsoft TNEF (Transport Neutral Encapsulation Format) file format to pure Java. TNEF is sometimes used by Outlook for encoding the message, and will typically come through as winmail.dat. HMEF currently only supports reading at a low level, but we hope to add text and attachment extraction.
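To make the winmail.dat case concrete, here is a minimal sketch that checks whether a byte array starts with the TNEF signature, the 32-bit value 0x223E9F78 stored in little-endian order. It uses only the JDK and is not how HMEF itself is invoked; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of Apache POI: recognises a TNEF stream
// (such as a winmail.dat attachment) by its leading signature.
final class TnefSniffer {

    // TNEF streams begin with this 32-bit signature, written little-endian.
    static final int TNEF_SIGNATURE = 0x223E9F78;

    static boolean looksLikeTnef(byte[] header) {
        if (header.length < 4) {
            return false;
        }
        // Assemble the first four bytes as a little-endian int.
        int value = (header[0] & 0xFF)
                  | (header[1] & 0xFF) << 8
                  | (header[2] & 0xFF) << 16
                  | (header[3] & 0xFF) << 24;
        return value == TNEF_SIGNATURE;
    }
}
```

A check like this only identifies the format; extracting the message body or attachments is the job of a TNEF parser such as HMEF.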
HSMF handles the Microsoft Outlook message (MSG) file format in Java. It currently supports only some of the textual content of MSG files, and some attachments.
Following are the components of POI with their Maven artifactId.