How to split files depending on column values (2)

A visitor of TalendHowTo asked me if there’s a way to split a file into separate files based on a column value without using a subjob. My answer: of course there is!

Let’s start building right away:

1. Create a new job

  • Name your new job “splitFiles2”

2. Add the tFixedFlowInput component to your canvas

  • Edit the schema
  • Add the city column as a String
  • Add the value column as a String

Split csv files schema

  • Set the row and field separator
  • Add the content

Split csv files tfixedflowinput

Content:

Los Angeles;1
Amsterdam;1
Rome;1
Barcelona;1
London;1
Berlin;1
Detroit;1
Amsterdam;2
Amsterdam;3
Los Angeles;2

3. Add the tFlowToIterate to your canvas

  • Connect the tFixedFlowInput to your tFlowToIterate component (Row –> Main)
  • Add the keys and values as shown below (the key values are set as global variables)

Split files tFlowToIterate
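To make the "global variables" part a bit more concrete, here is a minimal plain-Java sketch of what happens on each iteration. It is not the code Talend generates. The `globalMap` below is a simple stand-in for the job's global map, and the key names `city` and `value` are my assumption (use whatever keys you entered in the tFlowToIterate settings).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FlowToIterateSketch {
    // Stand-in for the job-wide map that holds the global variables.
    static final Map<String, Object> globalMap = new HashMap<>();

    // Mimics one iteration: the current row's columns are stored under
    // the assumed keys "city" and "value".
    static void publishRow(String city, String value) {
        globalMap.put("city", city);
        globalMap.put("value", value);
    }

    public static void main(String[] args) {
        List<String[]> rows = List.of(
                new String[] {"Los Angeles", "1"},
                new String[] {"Amsterdam", "1"});

        for (String[] row : rows) {
            publishRow(row[0], row[1]);

            // Components after the Iterate link read the values back with a cast.
            String city = (String) globalMap.get("city");
            String value = (String) globalMap.get("value");
            System.out.println(city + ";" + value);
        }
    }
}
```

The point is that every incoming row overwrites the same keys, so whatever runs after the Iterate link always sees the current row only.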


4. Add a second tFixedFlowInput to your canvas

  • Add the tFixedFlowInput component next to the tFlowToIterate component
  • Connect the tFlowToIterate component to the second tFixedFlowInput component (Row –> Iterate)
  • Add the schema

Split files tFixedFlowInput2

  • Add the columns and values to your basic settings (the values are global variables that were set by the tFlowToIterate component in the previous step)

Split files tFixedFlowInput2

5. Add the tFileOutputDelimited to the canvas of your job

  • Enter the filename (replace the YourProfile part)
  • Enter the row and field separator (if you use \n as a row separator the tFileOutputDelimited will create one row instead of several rows)
  • Check the append option
  • Press “Sync columns”
  • Connect the tFixedFlowInput component to the tFileOutputDelimited component (Row –> Main)

Split files tFileOutputDelimited2
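Since the screenshot may not be readable everywhere, here is a rough sketch of how a file name per city can be built from the global variable. The key name `city` and the base folder are assumptions on my side (the folder is purely hypothetical, use your own), and `fileNameFor` is just a helper for this illustration, not a Talend function.

```java
import java.util.HashMap;
import java.util.Map;

class FileNameSketch {
    // Stand-in for the job-wide map filled by tFlowToIterate.
    static final Map<String, Object> globalMap = new HashMap<>();

    // Builds "<baseDir>/<city>.txt" from the city stored under the assumed key "city".
    static String fileNameFor(String baseDir) {
        return baseDir + "/" + ((String) globalMap.get("city")) + ".txt";
    }

    public static void main(String[] args) {
        globalMap.put("city", "Amsterdam");
        // Hypothetical folder; replace the YourProfile part with your own profile name.
        System.out.println(fileNameFor("C:/Users/YourProfile/out"));
    }
}
```

In the component itself you would type the concatenation directly into the File Name field, because that field accepts a Java expression.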

6. Run your job

You will see that the following files are created now:

Split csv files result

You will also see that the Amsterdam.txt file contains 3 rows:


7. splitFiles2 job

This is what your splitFiles2 job should look like:

Job SplitFiles2
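If you want to double-check the result outside of Talend, the logic of the whole job can be sketched in a few lines of plain Java. This is only an illustration under my own assumptions: the class and method names are made up, the files go to a temporary folder, and each file simply receives the complete "city;value" row.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class SplitByCity {
    // Appends each "city;value" row to <outDir>/<city>.txt, creating files on demand.
    static void split(List<String> rows, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        for (String row : rows) {
            String city = row.split(";", 2)[0];
            Path target = outDir.resolve(city + ".txt");
            // APPEND plays the role of the "append" checkbox: without it,
            // every new row for the same city would replace the previous one.
            Files.writeString(target, row + System.lineSeparator(), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> rows = List.of(
                "Los Angeles;1", "Amsterdam;1", "Rome;1", "Barcelona;1", "London;1",
                "Berlin;1", "Detroit;1", "Amsterdam;2", "Amsterdam;3", "Los Angeles;2");

        Path outDir = Files.createTempDirectory("splitFiles2");
        split(rows, outDir);

        // Number of rows that ended up in Amsterdam.txt
        System.out.println(Files.readAllLines(outDir.resolve("Amsterdam.txt")).size());
    }
}
```

The file is opened again for every row, which is exactly why the append option matters in step 5.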

Of course, it’s also possible to accomplish this by using a subjob.

If you have any questions, just leave a comment!
