Centralizing File Delimited Metadata

In this section, we will learn how to centralize File Delimited metadata in Talend Studio for the Data Integration platform.

Before going further, let us first understand why we use File Delimited metadata.

It defines the properties of the tFileInputDelimited and tFileOutputDelimited components, which read data from and write data to delimited files.

Centralizing this metadata in the Repository lets us define it once and then reuse it.

Note: Creating the file schema is very similar for all types of file connections, such as Delimited, Regex, XML, Positional, or LDIF.

To create a File Delimited connection from scratch:

  • Go to the Repository panel, then move to Metadata.
  • Expand Metadata, right-click File Delimited, and select the Create File delimited option.

Repository → Metadata → File Delimited


Note: The file metadata setup window can also be opened from within a Job, from the Basic settings view of the relevant component with its Property Type set to Built-In.

The New Delimited File window opens, where the file connection and schema definitions are completed in four steps:

  • Define General properties
  • Defining File path and Format
  • Define File Parsing Parameters
  • Checking and customizing the File Schema

Step1: Define General Properties

In the first step, we fill in the necessary details: Name, which is a mandatory field, and optionally Purpose and Description.

We can also manage the Version and Status fields of a repository item in the Project Settings dialog box.

Click the Select button next to the Path field to choose a folder under the File delimited node to hold the newly created file connection.

Note: We cannot select a folder while editing an existing connection, but we can drag and drop the connection into a new folder at any time.

After filling in the general properties, click the Next button.


Step2: Defining File path and Format

In the next step, we click the Browse button to load the file from the local system.

For example, we will select the custmore.txt File from our system.

  • Select the Format, that is, the operating system on which the file was created. Here, we select Windows from the drop-down list.
  • If the drop-down list does not include a suitable format, ignore this field.
  • The File viewer gives an instant preview of the loaded file.
  • Click the Next button to proceed.

Step3: Define File Parsing Parameters

In this step, we can change the parsing settings according to our needs.

  • Here, we set the Header value to 1.
  • In the File Settings pane, set the Field Separator to Custom ANSI and enter "|" in the custom ANSI field.
  • After that, click the Next button.
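
As a rough illustration of what a custom "|" field separator means for the data, the sketch below splits one line in plain Java. It is not the code Talend generates; the class name PipeSplitSketch, the method splitFields, and the sample line are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative only: not the code Talend generates for tFileInputDelimited.
class PipeSplitSketch {

    // Split one line on a literal separator.
    // The -1 limit keeps trailing empty fields instead of dropping them.
    static String[] splitFields(String line, String separator) {
        return line.split(Pattern.quote(separator), -1);
    }

    public static void main(String[] args) {
        // Hypothetical sample row with an empty last field
        String[] fields = splitFields("1|Alice|Paris|", "|");
        System.out.println(Arrays.toString(fields));
    }
}
```

String.split treats its argument as a regular expression, and "|" is a regex metacharacter, so the sketch quotes the separator with Pattern.quote before splitting.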

In the File Settings area, we can set the Encoding, the Field Separator, and the Row Separator.

  • Depending on our file type, we can select CSV or Delimited.
  • If we select CSV, we can set the Escape Char and Text Enclosure. If we choose Delimited, as we do here, both options are unavailable.
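
To make the role of the Text Enclosure and Escape Char options more concrete, here is a minimal Java sketch of a line parser that honors both. It only illustrates the idea and is not Talend's CSV parser; the class EnclosureSketch and the method parseLine are hypothetical, and the sketch assumes that the escape and enclosure characters are different.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified parser, not Talend's CSV implementation.
class EnclosureSketch {

    // Split a line on `sep`. Text between two `enclosure` characters is taken
    // literally, and `escape` means "keep the next character as-is".
    // Assumption: `escape` and `enclosure` are different characters.
    static List<String> parseLine(String line, char sep, char enclosure, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inEnclosure = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == escape && i + 1 < line.length()) {
                current.append(line.charAt(++i)); // escaped character is kept literally
            } else if (c == enclosure) {
                inEnclosure = !inEnclosure;       // enclosure marks are not part of the value
            } else if (c == sep && !inEnclosure) {
                fields.add(current.toString());   // separator outside an enclosure ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());           // last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical sample: the comma inside the quotes belongs to the value
        System.out.println(parseLine("1,\"Smith, John\",NY", ',', '"', '\\'));
    }
}
```

Without an enclosure or escape character, as with the Delimited option, every occurrence of the separator ends a field.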

In the Rows To Skip section, we can specify the Header and Footer parameters.

  • If the file contains footer information, set the number of footer lines to be ignored.
  • If the file preview shows a header message, exclude the header from parsing by setting the number of header rows to skip.

In the Limit Of Rows section, we can select the Limit checkbox and specify the desired number of rows.
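
The combined effect of the Header, Footer, and Limit values can be sketched in a few lines of Java. This is only an illustration under the assumption that the limit is counted after header and footer lines are dropped; the class RowsToSkipSketch, the method selectRows, and the sample lines are hypothetical.

```java
import java.util.List;

// Illustrative only: shows which lines remain after header, footer, and limit are applied.
class RowsToSkipSketch {

    // Drop `header` leading lines and `footer` trailing lines, then keep at most
    // `limit` rows. A negative limit means "no limit".
    static List<String> selectRows(List<String> lines, int header, int footer, int limit) {
        int from = Math.min(header, lines.size());
        int to = Math.max(from, lines.size() - footer);
        if (limit >= 0) {
            to = Math.min(to, from + limit);
        }
        return lines.subList(from, to);
    }

    public static void main(String[] args) {
        // Hypothetical file content
        List<String> lines = List.of(
            "id|name",        // header line
            "1|Alice",
            "2|Bob",
            "3|Carol",
            "3 rows total");  // footer line
        System.out.println(selectRows(lines, 1, 1, 2));
    }
}
```

With a header of 1, a footer of 1, and a limit of 2, only the first two data rows of the sample are kept.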

To view the impact of the new settings, look at the File Preview panel. Check the Set heading row as column names box to use the first parsed row as the labels of the schema columns.

Note that the number of header rows to be skipped is then increased by 1.
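
The following Java sketch shows why the header count grows by one: the heading row is consumed as column labels, so it no longer counts as data. It is a simplified illustration, not Talend's implementation; the class HeadingRowSketch, the method readWithHeading, and the sample lines are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative only: maps each data row to label -> value using the heading row.
class HeadingRowSketch {

    // The first line after `header` skipped lines provides the column labels.
    // Data therefore starts at index header + 1, one line later than without labels.
    static List<Map<String, String>> readWithHeading(List<String> lines, int header, String sep) {
        String quoted = Pattern.quote(sep); // treat the separator literally, not as a regex
        List<Map<String, String>> rows = new ArrayList<>();
        if (header >= lines.size()) {
            return rows;                    // nothing left to use as a heading row
        }
        String[] labels = lines.get(header).split(quoted, -1);
        for (String line : lines.subList(header + 1, lines.size())) {
            String[] values = line.split(quoted, -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < labels.length && i < values.length; i++) {
                row.put(labels[i], values[i]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical file content: no lines skipped before the heading row
        List<String> lines = List.of("id|name", "1|Alice", "2|Bob");
        System.out.println(readWithHeading(lines, 0, "|"));
    }
}
```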


To see the result of the new settings in the viewer, click the Refresh Preview button.

After that, click the Next button.

Step4: Checking and customizing the File schema

In the last step, we will check and customize the File schema.

  • To customize the file schema, check that the data type in the Type column is correct for each column.
  • In the Description of the Schema section, we can rename the columns to match those in the actual file.
  • If the delimited file that the schema is based on has changed, use the Guess button to generate the schema again. Note that the Guess feature does not retain any changes we made when customizing the schema.
  • Click the Finish button.
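
Checking the Type column amounts to asking which type fits every value in a column. The Java sketch below shows one simple way to do that for a handful of sample values. It is not the algorithm behind Talend's Guess button, which this section does not describe; the class TypeGuessSketch, the method guessType, and the three candidate types are assumptions made for this example.

```java
import java.util.List;

// Illustrative only: a naive type check, not Talend's Guess feature.
class TypeGuessSketch {

    // Return the narrowest of "Integer", "Double", or "String" that fits all samples.
    static String guessType(List<String> samples) {
        if (samples.isEmpty()) {
            return "String"; // nothing to infer from
        }
        boolean allInt = true;
        boolean allDouble = true;
        for (String s : samples) {
            String v = s.trim();
            if (!isInteger(v)) allInt = false;
            if (!isDouble(v)) allDouble = false;
        }
        if (allInt) return "Integer";
        if (allDouble) return "Double";
        return "String";
    }

    static boolean isInteger(String v) {
        try {
            Integer.parseInt(v);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    static boolean isDouble(String v) {
        try {
            Double.parseDouble(v);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical column samples
        System.out.println(guessType(List.of("1", "2", "30")));
        System.out.println(guessType(List.of("1", "2.5")));
        System.out.println(guessType(List.of("1", "abc")));
    }
}
```

A check like this only sees the sampled rows, which is one reason to review each guessed type by hand before clicking Finish.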

To see the newly created metadata in Talend Studio:

  • Go to the Repository panel, then move to Metadata.
  • After that, expand the File Delimited node.

Repository → Metadata → File Delimited → customer_Metadata


To reuse the metadata in a new or existing component, simply drag the file connection or schema from the Repository's Metadata node and drop it onto the design workspace.

To modify an existing file connection:

  • Go to the Repository's Metadata node.
  • After that, expand File delimited, right-click the customer_Metadata schema, and select Edit File Delimited.

To add a new schema to an existing file connection:

  • Go to the Repository panel, and right-click the File delimited connection under Metadata.
  • Select Retrieve Schema from the popup menu.



