How to validate a CSV header

This post describes an easy way to check in a Talend job whether the column header of a .csv file matches the expected columns. You can download the sample file HowToCheckCsvHeader.csv and use it to build your own job.

1. Add a tFileList component to the canvas

  • Select the directory that contains the .csv file
  • Add a Filemask that matches the .csv file

Talend tFileList
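For readers who want to see what this step amounts to outside Talend, here is a minimal plain-Java sketch of listing the files in a directory that match a mask. It is not the code Talend generates; the class name FileListSketch, the method listFiles and the "*.csv" mask in the usage note are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Talend: roughly what a directory + Filemask selects.
class FileListSketch {

    // Returns the regular files in 'directory' whose name matches the glob 'fileMask'.
    static List<String> listFiles(Path directory, String fileMask) throws IOException {
        List<String> paths = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory, fileMask)) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    paths.add(candidate.toString());
                }
            }
        }
        // Directory streams have no guaranteed order, so sort for a stable result.
        Collections.sort(paths);
        return paths;
    }
}
```

Calling FileListSketch.listFiles(directory, "*.csv") would return one path per matching file, which plays the role of the current file path that the next component reads on each iteration.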

2. Connect a tFileInputDelimited to the tFileList with an Iterate link

  • Set the File name/Stream to ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"))
  • Set the Row Separator to "\r\n"
  • Set the Field Separator to ";"
  • Set the Limit to 1 so that only the first row, which is the header of the .csv file, is read

Talend ValidateHeader tFileInputDelimited

  • Edit the schema so that it contains one column for each column of the .csv file

Talend ValidateHeader Schema
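The settings above boil down to "read only the first row and split it on the field separator". A rough plain-Java sketch of that idea follows. The class name HeaderReaderSketch, the method readHeader and the UTF-8 encoding are assumptions for illustration; the component itself is configured in the Talend UI, not with this code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: reads the header row of a delimited file.
class HeaderReaderSketch {

    // Reads the first line of 'csvFile' and splits it on 'fieldSeparator' (for example ";").
    static List<String> readHeader(Path csvFile, String fieldSeparator) throws IOException {
        // Assumption: the file is UTF-8 encoded.
        try (BufferedReader reader = Files.newBufferedReader(csvFile, StandardCharsets.UTF_8)) {
            // readLine() stops at \n, \r or \r\n and drops the line terminator.
            String firstLine = reader.readLine();
            if (firstLine == null) {
                return Collections.emptyList(); // empty file, no header
            }
            // Pattern.quote treats the separator literally; -1 keeps trailing empty columns.
            return Arrays.asList(firstLine.split(Pattern.quote(fieldSeparator), -1));
        }
    }
}
```

Stopping after the first line is the equivalent of setting the Limit to 1: the data rows are never read, only the header.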

3. Add a tFixedFlowInput component to the canvas

  • Edit the schema so that it has the same columns as the tFileInputDelimited
  • Select Mode “Use Inline Table”
  • Press the + button to add a row and enter the expected header names A/B/C/D/E/F

Talend ValidateHeader tFixedFlowInput

4. Add a tUnite to your canvas

  • Click Sync columns on the Component tab
  • Connect the tFileInputDelimited and the tFixedFlowInput to the tUnite with a Main row connection (the order doesn’t matter in this case)

5. Add a tUniqueRow

  • Click Sync columns on the Component tab
  • Mark all columns as key attributes

Talend ValidateHeader tUniqueRow
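The idea behind steps 3 to 5 is to put the header that was read and the expected header into one flow and count how many distinct rows remain. A minimal plain-Java sketch of that logic is shown below. The names HeaderMatchSketch, distinctRowCount and headerIsValid are made up for illustration, and the comparison here is exact and case-sensitive, which may differ from how the components in your job are configured.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Talend: the unite-then-deduplicate idea in plain Java.
class HeaderMatchSketch {

    // Unites both rows into one flow and returns how many distinct rows are left.
    static int distinctRowCount(List<String> actualHeader, List<String> expectedHeader) {
        List<List<String>> united = new ArrayList<>();
        united.add(actualHeader);   // row coming from the file
        united.add(expectedHeader); // row coming from the fixed input
        // Every column is part of the key: two rows are duplicates only if all values match.
        Set<List<String>> unique = new LinkedHashSet<>(united);
        return unique.size();
    }

    // One remaining row means both headers were identical.
    static boolean headerIsValid(List<String> actualHeader, List<String> expectedHeader) {
        return distinctRowCount(actualHeader, expectedHeader) == 1;
    }
}
```

With two identical headers the count is 1; as soon as one column name differs, both rows survive and the count is 2. That count is what the Run if conditions in the later step test.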

6. Add a tHashOutput

  • If the tHash components are not available, see this HowTo
  • Click Sync columns on the Component tab

7. Add two tJava components to the canvas

  • Connect the tHashOutput to both tJava components with a Run if trigger
  • Click on If (order: 1) and add the following condition on the Component tab: ((Integer)globalMap.get("tHashOutput_1_NB_LINE")) == 1 (only one unique row is left, so the header matches)
  • Click on If (order: 2) and add the following condition on the Component tab: ((Integer)globalMap.get("tHashOutput_1_NB_LINE")) > 1 (more than one unique row is left, so the header differs)
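To make the two conditions easier to follow, here is a small standalone sketch that evaluates them against a plain map standing in for Talend's globalMap. The class name RunIfSketch, the method evaluate and the message texts are assumptions for illustration; in the real job each tJava would simply contain its own print statement.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two Run if branches; not code generated by Talend.
class RunIfSketch {

    // Mirrors the two conditions on tHashOutput_1_NB_LINE and returns the message to print.
    static String evaluate(Map<String, Object> globalMap) {
        Integer nbLine = (Integer) globalMap.get("tHashOutput_1_NB_LINE");
        if (nbLine != null && nbLine == 1) {
            return "Header is valid";   // If (order: 1)
        }
        if (nbLine != null && nbLine > 1) {
            return "Header is invalid"; // If (order: 2)
        }
        return "Row count not available"; // neither condition is true
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tHashOutput_1_NB_LINE", 1); // simulated value for a matching header
        System.out.println(evaluate(globalMap));
    }
}
```

Note that neither condition is true when no row count is available, so in that case neither tJava would run.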

8. Run your job

When you run the job for the first time, the file passes the header validation.
If you change a column name in the tFixedFlowInput component and run the job again, the validation fails.
In this example we only print a message, but you can trigger any other follow-up action instead.

Talend ValidateHeader Job
