Centralizing File Delimited Metadata
In this section, we will learn how to centralize File Delimited metadata in Talend Studio for Data Integration.
Before going further, let us first understand why File Delimited metadata is used.
It defines the properties of the tFileInputDelimited and tFileOutputDelimited components, which read data from and write data to delimited files.
Centralizing this metadata in the Repository lets us define it once and reuse it across components and Jobs.
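The benefit of a single shared definition can be sketched outside Talend in plain Python. This is only an analogy, not Talend code: the CUSTOMER_FILE dictionary, its keys, the column names, and the sample data are all hypothetical stand-ins for a centralized File Delimited entry that both a reader and a writer consume.

```python
import csv
import io

# Hypothetical stand-in for one centralized File Delimited definition.
CUSTOMER_FILE = {
    "field_separator": ";",
    "row_separator": "\n",
    "columns": ["id", "name", "city"],
}

def read_rows(text, meta):
    """Parse delimited text into dictionaries keyed by the shared column names."""
    reader = csv.reader(io.StringIO(text), delimiter=meta["field_separator"])
    return [dict(zip(meta["columns"], row)) for row in reader if row]

def write_rows(rows, meta):
    """Serialize dictionaries back to delimited text using the same definition."""
    out = io.StringIO()
    writer = csv.writer(
        out,
        delimiter=meta["field_separator"],
        lineterminator=meta["row_separator"],
    )
    for row in rows:
        writer.writerow([row[col] for col in meta["columns"]])
    return out.getvalue()

if __name__ == "__main__":
    sample = "1;Asha;Pune\n2;Ravi;Delhi\n"  # made-up sample data
    rows = read_rows(sample, CUSTOMER_FILE)
    print(rows)
    print(write_rows(rows, CUSTOMER_FILE))
```

Because the reader and the writer both take the same definition, changing the separator or the column list in one place changes both, which is the idea behind keeping the metadata in the Repository.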
Note: Creating a file schema is very similar for all types of file connections: Delimited, Regex, XML, Positional, or LDIF.
To create a File Delimited connection from the beginning, go to:
Repository → Metadata → File Delimited
Note: To centralize a file connection that is already defined in a Job, go to the Basic settings view of the relevant component, with its Property Type set to Built-In, and open the file metadata setup window from there.
The New Delimited File window opens, where the file connection and its schema are defined in four steps.
Step1: Define General Properties
In the first step, we fill in the general details: Name, which is a mandatory field, and optionally the Purpose and Description fields.
The Version and Status fields of a repository item can also be managed in the Project Settings dialog box.
Click the Select button next to the Path field to choose a folder under the File delimited node to hold the newly created file connection.
Note: We cannot select a folder while editing an existing connection, but we can drag and drop the connection into a new folder whenever we want.
After filling in all the general properties, click the Next button.
Step2: Define File Path and Format
In the next step, we click the Browse button to load the file from the local system.
For example, we select the custmore.txt file from our system.
Step3: Define File Parsing Parameters
In this step, we can change the settings according to our needs.
In the File Settings area, we can set the Encoding type and the Field and Row Separators.
In the Rows To Skip area, we can specify how many Header and Footer rows to ignore.
We can also select the Limit checkbox and specify the maximum number of rows to read in the Limit Of Rows area.
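How these settings interact can be illustrated with a small Python sketch. It is not how Talend implements parsing; the function name parse_delimited, its parameter names, and the sample bytes are assumptions chosen to mirror the fields above, and the naive split ignores quoted fields.

```python
def parse_delimited(raw, encoding="utf-8", field_sep=";", row_sep="\n",
                    header=0, footer=0, limit=None):
    """Decode bytes, split into rows and fields, then apply skip and limit settings."""
    text = raw.decode(encoding)                       # Encoding
    rows = [line for line in text.split(row_sep) if line != ""]  # Row Separator
    rows = rows[header:]                              # skip Header rows
    if footer:
        rows = rows[:-footer]                         # skip Footer rows
    if limit is not None:
        rows = rows[:limit]                           # Limit of rows
    return [row.split(field_sep) for row in rows]     # Field Separator

if __name__ == "__main__":
    # Made-up sample: one heading row, three data rows, one trailing summary row.
    raw = b"id;name\n1;Asha\n2;Ravi\n3;Mina\nTOTAL;3\n"
    print(parse_delimited(raw, header=1, footer=1, limit=2))
```

In this sketch the header and footer rows are dropped first and the limit is applied to what remains, so the limit counts data rows only.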
To see the impact of the new settings, look at the File Preview panel, and check the Set heading row as column names box to use the first parsed row as the labels for the schema columns.
Note that the number of header rows to be skipped is then increased by 1.
To apply the settings and view the result in the viewer, click the Refresh Preview button.
After that, click on the Next button.
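The effect of the heading-row option described above, including the header count going up by 1, can be sketched as follows. The preview function, its parameters, and the placeholder names Column0, Column1 are assumptions made for illustration and are not taken from Talend's implementation.

```python
def preview(lines, field_sep=";", header=0, heading_as_names=False):
    """Return (column_names, data_rows, effective_header) for a simple preview."""
    remaining = lines[header:]
    if not remaining:
        return [], [], header
    first = remaining[0].split(field_sep)
    if heading_as_names:
        # The heading row becomes the labels, so one more row counts as header.
        names = first
        header += 1
        remaining = remaining[1:]
    else:
        # Assumed placeholder labels when no heading row is used.
        names = [f"Column{i}" for i in range(len(first))]
    data = [line.split(field_sep) for line in remaining]
    return names, data, header

if __name__ == "__main__":
    lines = ["id;name", "1;Asha", "2;Ravi"]  # made-up sample rows
    print(preview(lines))
    print(preview(lines, heading_as_names=True))
```

With the option off, the heading row shows up as ordinary data; with it on, that row supplies the labels and no longer appears among the data rows.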
Step4: Check and Customize the File Schema
In the last step, we will check and customize the File schema.
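What a customized schema does to the parsed rows can be pictured with a short Python sketch. The Column class, its fields (name, py_type, nullable), and the sample columns are hypothetical; the schema editor in Talend exposes its own set of column properties, and this only illustrates the general idea of typed, nullable columns.

```python
from dataclasses import dataclass

@dataclass
class Column:
    """Hypothetical schema column: a label, a target type, and a nullable flag."""
    name: str
    py_type: type = str
    nullable: bool = True

def apply_schema(row, schema):
    """Convert one list of string fields into a typed record, enforcing nullability."""
    if len(row) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(row)}")
    record = {}
    for value, col in zip(row, schema):
        if value == "":
            if not col.nullable:
                raise ValueError(f"column {col.name!r} must not be empty")
            record[col.name] = None
        else:
            record[col.name] = col.py_type(value)
    return record

if __name__ == "__main__":
    # Made-up schema and rows for illustration only.
    schema = [Column("id", int, nullable=False), Column("name"), Column("balance", float)]
    print(apply_schema(["7", "Asha", "12.5"], schema))
    print(apply_schema(["8", "", ""], schema))
```

Checking the schema at this stage matters because every component that later reuses the metadata inherits the same column names and types.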
The newly created metadata can then be seen in Talend Studio under:
Repository → Metadata → File Delimited → customer_Metadata
To reuse the metadata in a new component or an existing component, simply drag the file connection or schema from the Repository's Metadata node and drop it onto the design workspace.
For modifying the existing File connection:
For adding a new schema to an existing File connection: