Centralizing Positional Metadata
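In this section, we will learn how to centralize File Positional metadata in Talend Studio for Data Integration.
Before going further, let us first understand what positional files are: text files in which every column occupies a fixed number of characters, so fields are identified by their position in the row rather than by a delimiter.
To read data from or write data to a positional file across several jobs, we centralize its metadata in the repository.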
To describe the properties of tFileInputPositional, tFileOutputPositional, and tFileInputMSPositional components, we must use the File Positional Metadata.
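To make the idea of a positional file concrete, here is a minimal Python sketch of how one fixed-width row could be produced. It only illustrates the file format and is not Talend code; the function name format_record, the sample values, and the column widths are all assumptions chosen for the example.

```python
def format_record(values, widths):
    """Pad (or truncate) each value so it fills exactly its column width."""
    return "".join(str(v).ljust(w)[:w] for v, w in zip(values, widths))


# Hypothetical employee row with column widths 5, 6 and 8.
row = format_record(["42", "Anna", "Sales"], [5, 6, 8])
print(repr(row))  # '42   Anna  Sales   '
```

Every row built this way has the same length, which is why marker positions are enough to recover the columns later.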
To create a File Positional connection from scratch:
Repository → Metadata → File Positional
Note: To centralize a File Positional connection from within a job, go to the Basic settings view of the relevant component, with its Property Type set to Built-In, and open the file metadata setup window from there.
The New Positional File window then opens, where both the file connection and the schema definition are completed in four steps:
Step1: Defining General Properties
In the first step, we fill in the necessary details: Name, which is a mandatory field, and optionally Purpose and Description if we want to be more specific.
We can also manage the Version and Status fields of a repository item in the Project Settings dialog box.
Click the Select button next to the Path field to select a folder under the File Positional node that will hold the newly created file connection.
Note: We cannot select a folder while editing an existing connection, but we can drag and drop the connection to a new folder whenever we want.
After filling in all the general properties, click the Next button.
Step2: Defining File path and Format
In the next step, we click the Browse button to locate our file on the local system.
For example, we will select the Employee info.txt File from our system.
To define the file's column properties, click in the file preview and set markers against the ruler; the orange arrow lets us change a marker's position.
Once the markers are set, the Field Separator and Marker Position fields are filled in automatically.
Field Separator: This field shows the length of each column of the loaded file, that is, the number of characters between consecutive separators.
[*]: The asterisk stands for all the remaining characters on the row, starting from the previous marker position. We can edit the figures to delimit the columns correctly.
Marker Position: This field shows the exact position of each marker on the ruler. We can edit the figures to set the positions accurately.
To move a marker, hold the arrow and drag it to the new position.
To remove a marker, hold the arrow and drag it towards the ruler until an (x) icon appears.
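The relationship between the Field Separator and Marker Position values can be sketched in a few lines of Python. This is an illustration of the idea rather than Talend's own implementation: the pattern "5,4,5,*", the sample row, and the helper names parse_widths, marker_positions, and split_row are assumptions made for the example.

```python
from itertools import accumulate


def parse_widths(field_separator):
    """Turn a pattern such as "5,4,5,*" into ([5, 4, 5], True)."""
    parts = [p.strip() for p in field_separator.split(",")]
    takes_rest = parts[-1] == "*"  # "*" means: all remaining characters
    if takes_rest:
        parts = parts[:-1]
    return [int(p) for p in parts], takes_rest


def marker_positions(field_separator):
    """Each marker sits at the running total of the column widths."""
    widths, _ = parse_widths(field_separator)
    return list(accumulate(widths))


def split_row(row, field_separator):
    """Cut one row into columns according to the width pattern."""
    widths, takes_rest = parse_widths(field_separator)
    fields, start = [], 0
    for width in widths:
        fields.append(row[start:start + width])
        start += width
    if takes_rest:
        fields.append(row[start:])  # from the last marker to the end of the row
    return fields


print(marker_positions("5,4,5,*"))                  # [5, 9, 14]
print(split_row("00042AnnaSmithSales", "5,4,5,*"))  # ['00042', 'Anna', 'Smith', 'Sales']
```

Under these assumptions, editing a width shifts every marker after it, while the final asterisk column always absorbs whatever is left on the row.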
Step3: Defining File Parsing Parameters
In this step, we define the file parsing parameters so that the file schema is retrieved properly.
The preview section displays the file columns according to the marker positions.
To see the impact of a new setting, look at the File Preview panel. Check the Set heading row as column names box to use the first parsed row as the labels of the schema columns.
Note that the number of header rows to be skipped is then increased by 1.
To see the result in the viewer, click the Refresh Preview button.
After that, click on the Next button.
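The effect of using the heading row as column names can be sketched as follows. This is a simplified Python model under stated assumptions, not Talend's parser: the helper names slice_row and parse_rows, the sample rows, and the fallback labels Column0, Column1, and so on are chosen only for illustration.

```python
def slice_row(row, marker_positions):
    """Cut a row at the given marker positions and trim the padding."""
    bounds = [0, *marker_positions, None]  # None = up to the end of the row
    return [row[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def parse_rows(lines, marker_positions, header_rows=0, heading_as_names=False):
    """Return one dict per data row, keyed by column label."""
    if heading_as_names:
        # The first parsed row supplies the labels, so one more
        # header row has to be skipped before the data starts.
        names = slice_row(lines[header_rows], marker_positions)
        header_rows += 1
    else:
        # Assumed generic labels when no heading row is used.
        names = [f"Column{i}" for i in range(len(marker_positions) + 1)]
    return [
        dict(zip(names, slice_row(line, marker_positions)))
        for line in lines[header_rows:]
    ]


lines = ["ID   NameDept ", "00042AnnaSales"]
rows = parse_rows(lines, [5, 9], heading_as_names=True)
print(rows)  # [{'ID': '00042', 'Name': 'Anna', 'Dept': 'Sales'}]
```

Without the option, the same heading line would be parsed as an ordinary data row, which is why the header count goes up by 1 when the box is checked.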
Step4: Checking and customizing the File schema
In the last step, we check and customize the file schema.
To see the newly created metadata in Talend Studio:
Repository → Metadata → File positional → Employee
To reuse the metadata in a new or an existing component, simply drag the file connection or schema from the repository's Metadata node and drop it onto the design workspace.
For modifying the existing File connection:
For adding a new schema to an existing File connection: