How to built a JSON string with arrays in Talend

In this “how to” you are going to learn how to build a job that produces a JSON string that can be stored to disk or sent to a REST API, for example.

I have used the tJSONDoc components from the Talend Exchange for this “how to” because Talend Open Studio doesn’t provide an easy component to create JSON output with arrays. If you don’t know how to install these components you can read this article first.

These great components are built by Jan Lolling and he has also written the documentation.

1. JSON string

This is the JSON string that we are going to built in this “how to”:

JSON code

2. Create a new job

First you have to create a new job:

  • Right click job design
  • Select “Create job” and give your job a name

3. Add input data with the tFixedFlowInput component

For this example we are going to use the tFixedFlowInput component as a datasource, but you can use all kinds of datasources like a .csv or a database.
To complete this step make sure that the tHashInput and tHashOutput components are enabled in your studio. When they are disabled please follow the steps of this article first.

3.1. Customer data

The first step is to create the dataset with the customer data. You are going to load this data into a tHashOutput component so you can keep using the data in your job without reloading each time you need it:

  • Add a tFixedFlowInput component to the canvas
  • Add the schema below to the component (Component tab -> Basic settings -> Edit schema)

Schema tFixedFlowInput

  • Set the mode to “Use Inline Table”
  • Add the data below

  • Add a tHashOutput component to the canvas next to the tFixedFlowInput
  • Connect the tFixedFlowInput component to the tHashOutput component (Row -> Main) and sync your columns if needed
  • What I always do is adding some HTML code to the labelformat setting at the view page. Example: __UNIQUE_NAME__<br><b>Input customer data</b>

CustomerData

3.2. Order data

Now you have to add the order data to your job. Just put it under your customer data.

  • Add a second tFixedFlowInput component to the canvas
  • Add the schema to this component (Component tab -> Basic settings -> Edit schema)

tFixedFlowInput schema orders

  • Set the mode to “Use Inline Table”
  • Add the data below

tFixedFlowInput component settings

  • Add the tHashOutput component to the canvas
  • Connect the tFixedFlowInput component to the tHashOutput component (Row -> Main) and sync your columns of the tHashOutput component

OrderData

3.3. Connect the two input flows

  • You have to connect the first tFixedFlowInput to the second tFixedFlowInput with “Trigger ->On Subjob Ok”

Data input

4. Build JSON document

Now we have the input data ready it’s time to get started with the JSON string.

We are going to use three type of JSON components:

  1. tJSONDocOpen – Holds the root of the JSON document and can be initially loaded from various sources
  2. tJSONDocOutput – Builds JSON objects or arrays and sets their attributes
  3. tJSONDocSave – Renders the final JSON tree pretty formatted as string

4.1 Add tJSONDocOpen

  • Add the tJSONDocOpen component to the canvas
  • Go to the “Basic settings” of the tJSONDocOpen component view and select “New empty object”

tJSONDocOpen settings

  • Connect the tFixedFlowInput that contains the order data to the tJSONDocOpen component as “Trigger -> On Subjob Ok”

4.2 Set orderData as root of JSON document

  • Add another tFixedFlowInput component to the canvas. You have to add this tFixedFlowInput component because you can’t connect the tJSONDocOpen to the tJSONDocOuput component directly.
  • Connect the tJSONDocOpen component to this tFixedFlowInput component (Trigger -> On Subjob Ok)
  • Add the schema below

tFixedFlowInput schema dummy

  • Set the mode to “Use Inline Table”
  • Add the data

tFixedFlowInput settings dummy

  • Add the tJSONDocOutput component to the canvas
  • Connect the  tFixedFlowInput component to the tJSONDocOutput component (row -> main). The tFixedFlowInput with the dummy attribute is needed because it’s not possible to connect the tJSONDocOpen component directly to the tFixedFlowInput component.

tJSONDocOutput add orderData records view

  • Open the component view of the tJSONDocOutput and press “sync columns”
  • Select the Parent JSON Document (select the tJSONDocOpen component that you have just created – in my example it’s the tJSONDocOpen_1)
  • Add “orderData” as the JSON path for the parent
  • Select “One single json object with the schema columns as attributes”

tJSONDocOutput orderData

4.3. Add customer records to JSON

  • Go to the component tab, basic settings of the tHashOutput component with the customer records.
  • Go to the component tab, basic settings and select “Edit schema”
  • Press the “Export all rows into XML file” button and save the file at your desktop for example

Export all rows into XML file

  • Add the tHashInput component to the canvas
  • Go to the component tab, basic settings and select “Edit schema”
  • Press the “Replace all rows by import from XML file” button

Replace all rows from XML file

  • Select the XML file that you have saved
  • Press OK
  • Connect the tFixedFlowInput component to the tHashInput component (Trigger – On Subjob Ok)

Connect to tHashInput

  • Add the tFlowToIterate component to the canvas. This component sends the unique emailaddresses into the flow by adding them to a global variable
  • Connect the tHashInput component to the tFlowToIterate component (Row -> Main)
  • Go to the basic settings of the tFlowToIterate component.
  • Uncheck the “Use the default (key,value) in global variables” option
  • Add the key and value (this step creates a global variable “_varEmail”. This is the name of the global variable where the emailaddress is going to be stored in during each iterate.

tFlowToIterate add orderDataRecords

  • Add the tSetGlobalVar component to the canvas
  • Connect the tFlowToIterate component to the tSetGlobalVar component (Row -> Iterate)
  • Configure the basic settings of the tSetGlobalVar component. I have only used this component to be able to connect the tFlowToIterate component to the tMap component which we are going to add in the next step.

tSetGlobalVar settings Add orderDataRecords

  • Add the tMap component to the canvas next to the tSetGlobalVar component
  • Connect the tSetGlobalVar component to the tMap component (Row -> Main)
  • Double click the tMap component
  • Click the + button in the middle and add the variable as you can see in the picture below. The expression uses the global variable that you have created within the tFlowToIterate component

Add variable tMap

  • Add an output table (out1)
  • Drag and drop the variable to the output table
  • Rename the column name to “email”

tMap addOrderData records

  • Add the tJSONDocOutput component to the canvas
  • Connect the tMap component to the tJSONDocOutput component *Row -> out1)
  • Open the tJSONDocOutput component and press “Sync columns” if needed
  • Configure the basic settings like you can see in the picture below

tJSONDocOutput add orderData records

This is how the add orderDataRecords part should look like:

Add orderData records to JSON

4.4. Add tJSONDocSave

To see the result of the job so far you need to follow the next steps:

  • Add the tJSONDocSave component to the canvas
  • Connect the tHashInput component to the tJSONDocSave component  (Trigger -> On Subjob Ok)

tJSONDocSave flow

  • Configure the basic setting of the tJSONDocSave component (in this case we are going to write the output to a tLogRow component so you can uncheck the “Write output as file” checkbox

tJSONDocSave component

  • Add the tLogRow component to the canvas
  • Connect the tJSONDocSave component to the tLogRow component (Row -> Main)
  • Run your job

Your output should look like this:

Job output JSON1

4.5. Add fieldValues to JSON

  • Add another tHashInput to the canvas
  • Edit the schema with the “replacing all rows from import XML file” option. You can use the same XML file as you used before
  • Connect the tJSONDocOutput to the tHashInput with row -> iterate

Add second tHashInput to the canvas

  • Add a tFilterRow component next to the tHashInput component and add the basic settings.
  • Press “Sync columns” if needed

tFilterRow add fieldValues

  • Add a tFlowToIterate next to the tLogRow
  • Connect the tFilterRow to the tFlowToIterate (Row -> Filter)
  • Configure the basic settings

tFlowToIterate add fieldValues

  • Add a tSetGlobalVar component next to the tFlowToIterate component
  • Connect the tFlowToIterate component to the tSetGlobalVar component (Row -> Iterate)
  • Configure the basic settings

tSetGlobalVar add fieldValues basic setting

  • Add a tMap component next to the tSetGlobalVar component
  • Connect the tSetGlobalVar component to the tMap component (Row -> Main)
  • Configure the tMap component the same way as you did in your previous tMap component

tMap add fieldValues

  • Add a tJSONDocOutput component
  • Connect the tMap component to the tJSONDocOutput component (Row -> Out2)
  • Configure the basic settings.
  • Select your previous tJSONDocOutput component as “Parent JSON Document”. In my example it’s the tJSONDocOutput_4

tJSONDocOutput component settings2

  • Run your job and see the result. Your result should look like this:

JSON Code fieldValues

4.6 Add the optionalData to the JSON string

Now we are going to add the optionValues part to the JSON string which should look like this eventually:

SON Code optionalValues part

Our tFixedFlowInput datasource contains only one row for this emailaddress, so we have to split that row into multiple rows with a name and value column.

  • Add another tHashInput to the canvas
  • Edit the schema with the “replacing all rows from import XML file” option. You can use the same XML file as you used before
  • Connect the previous tJSONDocOutput to the tHashInput (Row -> Iterate)

Add third tHashInput customer data

  • Add a tFilterRow component next to the tHashInput component
  • Connect the tHashInput component to the tFilterRow component (Row -> Main)
  • Add the basic settings

tFilterrow add fieldValues

  • Add the tSplitRow component to the canvas
  • Connect the tFilter to the tSplitRow component (Row -> Filter)
  • Configure the tSplitRow component (in my example row10 is the input row, but it could be different in your job)

tSplitRow optionalData

  • Add a tFlowToIterate next to the tSplitRow
  • Connect the tSplitRow to the tFlowToIterate (Row -> Main)
  • Configure the basic settings of the tFlowToIterate component

tFlowToIterate add optionalValues

  • Add a tSetGlobalVar component next to the tFlowToIterate component
  • Connect the tFlowToIterate component to the tSetGlobalVar component (Row -> Iterate)
  • Configure the basic settings

tSetGlobalVar settings Add optionalValues

  • Add a tMap component next to the tSetGlobalVar component
  • Connect the tSetGlobalVar component to the tMap component (Row -> Main)
  • Configure the tMap component. Use “out3” for the outputflow

tMap optionalValues

  • Add a tJSONDocOutput component
  • Connect the tMap component to the tJSONDocOutput component (Row -> out3)
  • Configure the basic settings.
  • Select your previous tJSONDocOutput component as “Parent JSON Document”. In my example it’s the tJSONDocOutput_4. The tJSONDocOutput will put the “optionalValues” at the same level as the “fieldValues”

tJSONDocOutput component settings optionalValues

  • Run your job and see the result. Your result should look like this:

JSON Code optionalValues part2

4.7 Add the ORDERLINEITEMS to the optionalValues

The last step to finish this job is adding the orderlineitems to the JSON string. We are going to do this in two steps:

  1. First we are going to add a placeholder to the optionalValues
  2. The second step is to merge the orderlineitems to value attribute of the placeholder
  • Add a tFixedFlowInput component to the canvas
  • Connect the tHashInput to the tFixedFlowInput component (Trigger – On Component Ok)

tFixedFlowInput ORDERLINEITEMS

  • Add the basic settings

tFixedFlowInput ORDERLINEITEMS basic settings

  • Add a new tJSONDocOutput component
  • Connect the tFixedFlowInput component to the tJSONDocOuput component (Row -> Main)
  • Configure the basic settings.
  • Select your tJSONDocOutput component with the orderDataRecords as “Parent JSON Document”. The tJSONDocOutput will add the new “optionalValue” as the last item of the “optionalValues”

tJSONDocOutput component settings orderlineitems optionalValues placeholder

  • Add a new tHashInput component to the canvas
  • Connect the tJSONDocOutput component to the tHashInput component (Row -> Iterate)

tHashInput ORDERLINEITEMS

  • Configure the basic settings of the tHashInput component. In this case you have to use the tHashOutput with the orderline data.

tHashInput ORDERLINEITEMS basic settings

  • Go to the component tab, basic settings of the tHashOutput component with the orderline records.
  • Go to the component tab, basic settings and select “Edit schema”
  • Press the “Export all rows into XML file” button and save the file at your desktop for example

Export all orderline rows to XML file

  • Go to the component tab, basic settings and select “Edit schema”
  • Press the “Replace all rows by import from XML file” button
  • Select the XML file that you have just created
  • Add a tFilterRow component next to the tHashInput component
  • Connect the tHashInput component to the tFilterRow component (Row -> Main)
  • Add the basic settings

tFilterrow OrderLineItems

  • Add a tJSONDocOutput component
  • Connect the tFilterRow component to the tJSONDocOutput component (Row -> Main)
  • Configure the basic settings.
  • Select your previous tJSONDocOutput component (the placeholder) as “Parent JSON Document”. In my example it’s the tJSONDocOutput_5. The tJSONDocOutput will add the orderlineitems to the value attribute.

tJSONDocOutput OrderLineItems

This is how your job should look like:

Job How to create a JSON document with arrays

  • Run you job and see the result:

JSON code

Leave a Reply