Tika Component Stack

Tika consists of four components that form a component stack. A diagram is shown below to illustrate the position of each component and how the components interact with each other.

Tika-Core

It is the base component on which the other three components are built. It provides the following things.
Tika-Parsers

It contains the Tika wrappers for different parsing libraries. It also provides implementations of a generic Parser interface. tika-parsers provides all the classes and methods required for parsing text and metadata.

Tika-App

It is an application that provides the command-line and graphical user interfaces of Tika. It sits on top of tika-parsers. We can run it from the command line, and it shows a window into which we can drag a file. It then displays the extracted content and metadata of the dragged file. To work with it, we can download it from the official Tika site. It is a JAR file, so we can execute it with the java command.

Tika-Bundle

It is one of Tika's four components and is used to provide an Open Services Gateway initiative (OSGi) bundle. It allows Tika to be included in an OSGi environment. OSGi is a software component model that helps to develop component-based applications in Java. It is similar to JavaBeans and supports a modular approach to software development. The tika-bundle package was created because recent Tika deployments needed to include the full Tika stack (ideally, tika-app).
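
The layering described above (a core contract, parser wrappers that implement it, and an application on top) can be sketched as a small toy model. This is only an illustration of the idea and does not use Tika's real API: the names ComponentStackSketch, SimpleParser, PlainTextParser, CsvParser, and extract are all hypothetical, and the real org.apache.tika.parser.Parser interface is considerably richer than the simplified contract shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ComponentStackSketch {

    // "Core" layer (hypothetical): the generic contract every parser wrapper implements.
    interface SimpleParser {
        boolean supports(String fileName);
        String parse(byte[] content, Map<String, String> metadata);
    }

    // "Parsers" layer (hypothetical): one wrapper per format behind the same interface.
    static class PlainTextParser implements SimpleParser {
        public boolean supports(String fileName) {
            return fileName.toLowerCase().endsWith(".txt");
        }

        public String parse(byte[] content, Map<String, String> metadata) {
            metadata.put("Content-Type", "text/plain");
            return new String(content, StandardCharsets.UTF_8);
        }
    }

    static class CsvParser implements SimpleParser {
        public boolean supports(String fileName) {
            return fileName.toLowerCase().endsWith(".csv");
        }

        public String parse(byte[] content, Map<String, String> metadata) {
            metadata.put("Content-Type", "text/csv");
            // Turn the comma-separated cells into space-separated text.
            return new String(content, StandardCharsets.UTF_8).replace(',', ' ');
        }
    }

    private static final List<SimpleParser> PARSERS =
            List.of(new PlainTextParser(), new CsvParser());

    // "App" layer (hypothetical): picks a matching parser, returns the text,
    // and fills the supplied metadata map as a side effect.
    public static String extract(String fileName, byte[] content, Map<String, String> metadata) {
        for (SimpleParser parser : PARSERS) {
            if (parser.supports(fileName)) {
                return parser.parse(content, metadata);
            }
        }
        throw new IllegalArgumentException("No parser available for " + fileName);
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        byte[] content = "a,b,c".getBytes(StandardCharsets.UTF_8);
        System.out.println(extract("notes.csv", content, metadata));
        System.out.println(metadata);
    }
}
```

The point of the design is that the top layer depends only on the shared interface, so a wrapper for a new format can be added without changing the code that calls it.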