Create our first AssemblyLine

IBM Tivoli Directory Integrator

Overview

Returning to the example scenario outlined in the introduction, we will now create an AssemblyLine that migrates information from D1 to D3.

The Tutorial folder, which we copied from TDI_INSTALL/examples to our Solution Directory, contains a file named People.csv:

First;Last;Title
Bill;Sanderman;Chief Scientist
Mick;Kamerun;CEO
Jill;Vox;CTO
Roger
Gregory;Highpeak;VP Product Development
Ernie;Hazzle;Chief Evangelist
Peter;Belamy;Business Support Manager

The file is in character-separated values (CSV) format and represents our D1 input data source. The AssemblyLine will extract this data and transfer it to an XML document, which will be our D3 output target.
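To make the goal concrete before building anything, here is a minimal sketch in plain JavaScript of the CSV-to-XML transformation described above. It is not TDI code and does not use the TDI Parsers. The <People> and <Person> element names are assumptions chosen for illustration, and the sketch simply leaves columns unset for a short line such as "Roger"; how the actual CSV Parser treats that line is not covered here.

```javascript
// Plain JavaScript sketch (not TDI code): semicolon-separated text in, XML out.
// The <People>/<Person> element names are assumptions made for this example.
const csvText = [
  "First;Last;Title",
  "Bill;Sanderman;Chief Scientist",
  "Roger",
  "Jill;Vox;CTO",
].join("\n");

// Escape the characters that are special inside XML text nodes.
function escapeXml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Read the header line, then turn each data line into an object keyed by column name.
function parseCsv(text, separator = ";") {
  const [headerLine, ...dataLines] = text.split("\n");
  const columns = headerLine.split(separator);
  return dataLines.map((line) => {
    const fields = line.split(separator);
    const entry = {};
    columns.forEach((name, i) => {
      // A short line such as "Roger" leaves the later columns unset.
      if (fields[i] !== undefined) entry[name] = fields[i];
    });
    return entry;
  });
}

// Emit one <Person> node per entry, with one child element per value present.
function toXml(entries) {
  const nodes = entries.map((entry) => {
    const children = Object.keys(entry)
      .map((name) => `    <${name}>${escapeXml(entry[name])}</${name}>`)
      .join("\n");
    return `  <Person>\n${children}\n  </Person>`;
  });
  return `<People>\n${nodes.join("\n")}\n</People>`;
}

const entries = parseCsv(csvText);
console.log(toXml(entries));
```

In the AssemblyLine built below, the two halves of this sketch correspond to the input Connector with its CSV Parser and the output Connector with its XML Parser.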

Procedure

Click on New AssemblyLine in the topmost toolbar.
Call the new AssemblyLine 'CSV2XML'.
Press the Finish button to open the AssemblyLine in an AssemblyLine editor tab.
Empty AssemblyLine editor

The left part of the AssemblyLine editor contains the list of components that make up this AssemblyLine and is empty right now except for the section names:
- Feed
- Data Flow
The right-hand area displays all Attributes being mapped in and out of the AssemblyLine.
For each line in the CSV file we will create a new node in the XML output document. Looping behavior is provided automatically by the TDI kernel, driving components listed under the AssemblyLine Data Flow section as long as there is input data coming from Connectors in the Feed section¹.
Create a Connector for reading our CSV input file by right-clicking on the Feed section folder and selecting Add Component...

On the Choose Component wizard panel select File System Connector.
Change the Connector name to Read_CSV_File² and select Iterator from the Mode drop-down.

Mode settings of a Connector inform the AssemblyLine execution logic what role the component plays in the flow. Iterator Mode results in the for-each behavior required to drive data from the CSV file, one entry at a time, to the components we will add to the Data Flow section.
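The for-each behavior described above can be pictured with a small conceptual model. This is only a sketch in plain JavaScript: the real looping is done by the TDI kernel, and the names iterate and runAssemblyLine are hypothetical stand-ins, not TDI APIs.

```javascript
// Conceptual model only; in TDI the kernel provides this loop for us.

// Hypothetical stand-in for an Iterator-mode Connector: hands over one entry per read.
function* iterate(rows) {
  for (const row of rows) {
    yield { ...row };
  }
}

// Hypothetical stand-in for the AssemblyLine cycle logic: as long as the Feed
// delivers entries, every Data Flow component is run once per entry, in order.
function runAssemblyLine(feed, dataFlow) {
  let cycles = 0;
  for (const work of feed) {
    for (const component of dataFlow) {
      component(work);
    }
    cycles += 1;
  }
  return cycles;
}

const written = [];
const cycles = runAssemblyLine(
  iterate([{ First: "Bill" }, { First: "Mick" }]),
  [(work) => written.push(work.First)]
);
console.log(cycles, written);
```

The loop ends when the feed has nothing more to deliver, which mirrors the statement that Data Flow components are driven only as long as input data keeps coming from the Feed section.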
Press Next.
On the File System Connector Configuration panel, set File Path to Tutorial/People.csv
The path to the People.csv file can be either absolute or relative to the Solution Directory³.
Press Next and proceed to Parser Configuration.
Click on the CSV Parser to select it.
Press Finish.
We will now see the Parser Configuration panel.
Press Finish.

To have the Connector discover the schema of the input source, so that we can map its values into our AssemblyLine, right-click on the new Iterator Connector in the AssemblyLine Components tree and select Browse Data from the context menu⁴.

This will open the Data Browser in a new editor tab.

Area 1	Choose the selected Parser.
Area 2	Details tab that shows the raw byte stream to be parsed. There are also tabs for changing connection parameters and configuring the chosen Parser.
Area 3	For connecting to the data source and discovering which Attributes are available.

Press Connect and then the Next button.

We have now discovered the schema of this file.
Select the Attributes we want to map in, which in this case is all of them, by either selecting the checkbox next to each one, or using the Select All button.
Press the Ctrl-W shortcut to close the Data Browser tab, or click on the Close symbol (X) at the right edge of the tab, and return to the AssemblyLine editor, where the AssemblyLine should look like the screenshot below⁵.

Details for the selected component are shown to the right of the AssemblyLine component list, including the three mapping rules we just set up in the Input Map. Each Attribute Map item has an Assignment, which is a snippet of script that is evaluated in order to set the value (or values) of the target Attribute.
Before continuing, take a moment to reflect on these Assignments. Recall from the Entry-Attribute-value data model section that the AssemblyLine has a globally available Work Entry that carries all data being transported down the AssemblyLine. This object is referenced in script code by using the pre-registered script variable work.
The Interface of every Connector has its own Conn Entry that is used as a cache for reads and writes. This component-specific object is accessed from script through the pre-registered variable conn. The conn variable is only available for limited periods, as shown in the TDI Hook Flow Diagrams. Outside this scope it is still accessible by querying a component for its conn Entry.
Consider the first mapping rule, which creates an Attribute in the Work Entry named 'First'. Its value is derived from the assignment conn.First.
This shorthand notation references the Attribute called 'First' that was just read into the Conn Entry, and its values are used to populate the new Work Entry Attribute. A comparable assignment script would be conn.getAttribute("First")⁷.
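As a rough illustration of what the three Input Map rules do, the sketch below evaluates one assignment per target Attribute against a mock Conn Entry and stores the result in a mock Work Entry. The plain objects and the inputMap table are assumptions made for this example; in TDI, conn and work are Entry objects supplied by the runtime, and the map is configured in the editor rather than written by hand.

```javascript
// Mock objects standing in for the Conn Entry and the Work Entry. In TDI these
// are provided by the runtime as the pre-registered variables conn and work.
const conn = { First: "Bill", Last: "Sanderman", Title: "Chief Scientist" };
const work = {};

// Input Map: target Work Entry Attribute name -> assignment evaluated against conn.
// Each function body plays the role of the one-line assignment script.
const inputMap = {
  First: (conn) => conn.First,
  Last: (conn) => conn.Last,
  Title: (conn) => conn.Title,
};

// Evaluate every assignment and store its result under the target name.
for (const [target, assignment] of Object.entries(inputMap)) {
  work[target] = assignment(conn);
}

console.log(work);
```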
Add the output Connector for creating the target XML document (data source D3) by clicking the Add component button at the top of the AssemblyLine Components panel.
Choose the File System Connector, renaming it to 'Write_XML_File'. Leave the Mode setting as AddOnly.
Press Next.
In the Connector Configuration panel, set the File Path parameter to write to a file called Output.xml in the Tutorial folder.
Choose 'XML Parser' in the next Wizard panel.
Press Finish since we don't need to change the XML Parser configuration.
Note that in the case of the output Connector, we can't do Schema Discovery since there is no Output.xml file to discover from.
AssemblyLine with two Connectors in place...

You may have noticed that when we select a component, its details appear in the right part of the editor screen. Whenever we select either the 'Feed' or the 'Data Flow' folder, we are presented with an overview of all Attribute Maps for this AssemblyLine. This is a handy display for copying input Attributes to the Output Map of the Connector we just added.
Click on either 'Feed' or 'Data Flow'.
Here we see the list of Attributes (three in total) that are being brought into our AssemblyLine by the Iterator-mode Connector.
Select these Input Map Attributes and drag them down to the Output Map of the 'Write_XML_File' Connector, completing the data flow.
We can Control-click to select multiple Attributes, or Shift-click to select a range.

The Assignment is automatically converted from input format to output.
For example, the first map item in the Input Map of the 'Read_CSV_File' Connector will create an Attribute in the Work Entry named 'First' to hold any values found in conn.First (that is, the Attribute called 'First' that was read into the Conn Entry).
When you drag this input mapping rule to an Output Map then its assignment is changed so that the value now comes from the Work Entry instead, and it is creating a target Attribute in the Connector's cache (the Conn Entry).
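The change of direction can be sketched as follows. The helper toOutputAssignment is hypothetical and only mimics the conversion as described above (an assignment reading from conn becomes one reading from work); it is not how the editor is implemented. The conn and work objects are mocks.

```javascript
// Hypothetical helper mimicking the described conversion: an Input Map
// assignment that reads from conn becomes an Output Map assignment that
// reads from work.
function toOutputAssignment(inputAssignment) {
  return inputAssignment.replace(/^conn\./, "work.");
}

// Mock Work Entry, as it would look after the Input Map has run.
const work = { First: "Bill", Last: "Sanderman", Title: "Chief Scientist" };
// Mock Conn Entry of the output Connector, filled by the Output Map.
const conn = {};

for (const name of ["First", "Last", "Title"]) {
  const assignment = toOutputAssignment(`conn.${name}`); // for example "work.First"
  const sourceName = assignment.slice("work.".length);
  conn[name] = work[sourceName];
}

console.log(conn);
```

On the input side the Conn Entry is the source and the Work Entry the target; on the output side the roles are swapped, which is why the same rule can be reused after the rewrite.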
In the Write_XML_File section, right-click the Attributes and rename First to FirstName and Last to LastName.
Add a new map item to this Output Map by right-clicking on the 'Write_XML_File' Output Map itself and choosing Add Attribute.
Adding the 'FullName' Attribute to the Output Map
Call the target of this new mapping rule 'FullName'.
Press OK and then double-click on it to edit its assignment.
This opens up the Script editor panel and presents you with a default assignment script that reads work.FullName.
There is no FullName Attribute in the Work Entry, so this map will not be able to set any values. Instead, compute this value by changing the script so that it concatenates the First and Last Attributes, leaving a single space between these values.

The script should read as follows: work.First + " " + work.Last
Note that no terminating semi-colon is required for one-liner Attribute Map assignment scripts like this⁹.
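To see the finished Output Map at a glance, the sketch below evaluates it against a mock Work Entry. The outputMap table and the plain objects are assumptions for illustration, and the FullName expression is inferred from the description of concatenating First and Last with a single space; in TDI, each function body would be the one-line assignment script of the corresponding map item.

```javascript
// Mock Work Entry; in TDI, work is provided by the runtime.
const work = { First: "Bill", Last: "Sanderman", Title: "Chief Scientist" };

// Output Map after the two renames and the added FullName item.
// Keys are target Attributes in the Conn Entry of Write_XML_File.
const outputMap = {
  FirstName: (work) => work.First,
  LastName: (work) => work.Last,
  Title: (work) => work.Title,
  // Computed value: First and Last joined with a single space.
  FullName: (work) => work.First + " " + work.Last,
};

// Mock Conn Entry of the output Connector.
const conn = {};
for (const [target, assignment] of Object.entries(outputMap)) {
  conn[target] = assignment(work);
}

console.log(conn.FullName);
```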
Press the Close button at the top-right of the Script editor panel when you are done.

Footnotes

¹ Note that only one Feeds Connector will be delivering data to the AssemblyLine at a time. If you put more than one Iterator Connector here then the topmost one will empty first before the next one in line begins reading from its source.

² Although we can name Connectors as we like, it is recommended that you name them in the same way that you would a script variable: start with a letter, followed with any number of letters, digits and underscore characters. This is because all AssemblyLine components are automatically registered as script variables, making it easier if you later want to reconfigure and drive them from your script code.

³ This technique makes the solution easier to move and share since all we have to do is specify the desired Solution Directory and all relative paths will work unaltered.

⁴ Since you know the file is in CSV format, the quickest approach would be to just click on the Connect and Next buttons in the Schema area of the Iterator Connector. Then you drag discovered Attributes into the Input Map as desired. The Data Browser is useful when you are unsure of the format. But I still thought you ought to try it :)

⁵ If for some reason your Connector is in the Data Flow section, simply drag it up to Feed. If the mode setting is not Iterator then right-click on the Connector, select Mode and then choose Iterator.

⁷ For users familiar with version 6.x and earlier, we can also use the pre-7.0 syntax:

    ret.value = conn.getAttribute("First")

⁹ Pre-7.0 syntax is also supported so map assignment scripts can still start with "ret.value =".