
Tika XML File Extraction

To extract data from an XML file, Tika provides the XMLParser class. This class extracts both content and metadata from an XML file. It is located in the org.apache.tika.parser.xml package.

The constructor and methods of this class are listed in the tables below.

Tika XMLParser Constructor

Constructor | Description
public XMLParser() | Creates an instance of the class.

Tika XMLParser Methods

Method | Description
public Set<MediaType> getSupportedTypes(ParseContext context) | Returns the set of media types supported by this parser.
public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException | Parses a document stream into a sequence of XHTML SAX events.
protected ContentHandler getContentHandler(ContentHandler handler, Metadata metadata, ParseContext context) | Returns the content handler used to process the XML content.
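
The parse method delivers its result as SAX events to a ContentHandler instead of returning a string. The following sketch illustrates that event-driven style using only the JDK's SAX parser, not Tika itself. The class name SaxTextSketch, the method collectText, and the sample XML (element names borrowed from a typical servlet descriptor) are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTextSketch {

    /** Collects the non-blank text found directly inside each element. */
    static List<String> collectText(InputStream xml) throws Exception {
        final List<String> values = new ArrayList<>();
        final StringBuilder current = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // A new element starts: discard whitespace seen between tags.
                current.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so buffer it.
                current.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String text = current.toString().trim();
                if (!text.isEmpty()) {
                    values.add(text);
                }
                current.setLength(0);
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input, not the original web.xml.
        String xml =
              "<servlet>"
            + "  <servlet-name>default</servlet-name>"
            + "  <init-param>"
            + "    <param-name>debug</param-name>"
            + "    <param-value>0</param-value>"
            + "  </init-param>"
            + "</servlet>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(collectText(in)); // prints [default, debug, 0]
    }
}
```

Tika's XMLParser works on the same principle: the handler passed to parse receives the events, and a handler such as BodyContentHandler accumulates the text for later retrieval.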

Tika XML File Extraction Example

In this example, we extract content and metadata from an XML file.
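
A minimal sketch of such an extraction might look as follows. It assumes the Tika parser libraries are on the classpath and that a file named web.xml exists in the working directory. The class name TikaXmlExample and the helper method extract are illustrative names, not part of Tika.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.xml.XMLParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;

public class TikaXmlExample {

    /** Parses the XML stream, fills the given metadata, and returns the text content. */
    static String extract(InputStream stream, Metadata metadata)
            throws IOException, SAXException, TikaException {
        // BodyContentHandler accumulates the text of the XHTML body events.
        BodyContentHandler handler = new BodyContentHandler();
        new XMLParser().parse(stream, handler, metadata, new ParseContext());
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        Metadata metadata = new Metadata();
        String content;
        // Assumed input path: web.xml in the working directory.
        try (InputStream stream = new FileInputStream("web.xml")) {
            content = extract(stream, metadata);
        }

        System.out.println("Document Content: " + content);
        System.out.println("Document Metadata:");
        for (String name : metadata.names()) {
            System.out.println(name + ":   " + metadata.get(name));
        }
    }
}
```

The parsing is kept in a separate method so it can be reused with any InputStream, not only a file.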

The input is our XML file, web.xml.

Output:

Document Content: 
         default
         org.apache.catalina.servlets.DefaultServlet
         
             debug
             0
        
         
             listings
             false
        
         1
    

Document Metadata:
Content-Type:   application/xml




