Create a Project
Create a new EasyETL Project by pressing the New Project button at the top of the Project Navigator.
New Project button
Name this Project 'CSVtoXML' and press Finish. This opens the new Project in the Simple AL editor.
Simple AssemblyLine editor
The editor provides two drop-downs: one for selecting the Input source and one for the target. The area below is empty (apart from the assistance text) until you have chosen your source.
Set up input for your ETL AL
Configure input for your EasyETL AssemblyLine by clicking on the drop-down for Source Information and selecting the File ConnectorSelecting Source information
You will then be presented with the configuration dialog for this Connector. Point the File Path parameter at the ‘People.csv' file found here:
TDI_HOME/examples/Tutorialwhere TDI_HOME is replaced with the Security Directory Integrator installation directory on your machine1.Setting the File Path parameter
Now click on the tab labeled Parser and select the CSV Parser, keeping the default configuration parameters as-is. Finally, press the Connect button at the bottom of the dialog box to test the connection and discover available Attributes
Testing the connection and discovering schema
The schema for the connected system is displayed and from here we can choose which of these to use in your data flow. For this example, use the Select All button and then press OK to close the configuration dialog.
Back in the EasyETL workbench you will see that the bottom half of the Simple AL editor has changed to reflect that you now have an Input Source configured. SDI now presents you with a button for stepping through this information one record at a time, as well as buttons to run the ETL task to completion and to stop it.
Input Source configured
Below these buttons are two grid boxes called Data Viewers that list the Attributes handled by your data flow. The Data Viewer on the left shows your input Attributes. Those that you selected for reading in the previous step are displayed in bold at the top and are called the Input Map. Below will be any unselected Attributes displayed in gray, and we can include these for input mapping by double-clicking on them. Similarly, you remove Attributes from the map by either double-clicking or deleting them2.
The box to the right is the Output Viewer and it shows the set of Attributes to be written. In SDI terms, this is your Output Map and by default it is identical to the list you selected for your Input Map. Note that we can change the name of any Output Attribute by clicking in the right-hand column and then editing the value. Use this technique to rename the Attributed called ‘Title' to ‘JobTitle'.
Renaming an Output Attribute
Your EasyETL project is now ready for its first test.
Run your EasyETL AssemblyLine
Select the Data Collector view at the bottom of the screen and then press the Read and Write next record button. This causes the following to happen:
- There is a delay as your EasyETL AssemblyLine is transferred to the running Server and started;
- The first record is read and parsed from your CSV input source and the data is displayed in both the input and output grid displays;
- The Attributes you selected for output are written and collected in the Data Collector view3.
So even though you have not selected an Output target yet, you can still run and test your ETL project, viewing the data as it flows down the AssemblyLine
One record read and collected
Each time your press the Read and Write… button, another record is read, displayed and collected. If you now press the Run button then your ETL job runs to completion and you will see this AL report
EasyETL AssemblyLine completed
As shown in the dialog above, no records were actually written anywhere. However, the Data Collector still provides handy visual feedback on the information being extracted and transferred.
Furthermore, we can select rows in the collected data and copy/paste this information to a file or other target4.
Transformations
Up to now your Output values have been identical to your input. However, there will be situations where you want to manipulate or even compute these based on the data read. You do this in SDI EasyETL by writing Transformations in JavaScript.In order to work with Transformations you first have to enable them by selecting the Show data transformations check box.
Figure 10. Enabling Transformation
This causes a new grid box to appear between the Data Viewers: the Transformation Viewer. Here you see arrows indicating that all three Output Attributes get their values directly from Input Attributes – in other words, no transformations. You are going to define a new Output Attribute and then add the Transformation script to compute its value. Do this now by right-clicking in the Output Map, choosing Add Attribute and naming it ‘FullName'. Now double-click on the Transformation to the left of this new Output Attribute and then enter this script snippet:5 return First + " " + LastFigure 11. Show Transformation script
There is an Evaluate button in the Transformation Script editor dialog for testing the script, along with one for bringing up some JavaScript tips and examples.
Press Evaluate now to get an idea how your Transformation works.
Figure 12. Evaluate Expression
The Output value shown was computed using data that you collected when you ran the AssemblyLine. Close the evaluation results dialog by pressing OK.
Now press OK now to accept this script and close the Transformation Script editor and re-run your EasyETL AssemblyLine by pressing Run and then viewing collected entries. Notice how the Data Collector now gives you two Component Collections to choose from: Output and Input. Choose Output and see how your Transformation script generated a ‘FullName' value for each entry.
Figure 13. Output collection with computed FullName Attribute
So now you know how to set up an Input Source and select the Attributes to extract, as well as how to transform this data to fit your output needs. The next step is selecting an Output target and driving the data there.
Selecting an output target
Select the File Connector for your Output target using the drop-down above the Output View and have it write to a file called ‘Output.xml'6. Choose the XML Parser and press OK. Note that you could use the Connect button to ensure that the file path you entered is valid. However, there will be no data to discover – unless of course you point your Connector at an existing XML file.Once our Output is configured, run your ETL AssemblyLine again. Once completed we can open the output file and verify the results.
Figure 14. XML Output
Your Output target could as easily have been a database table, just as your input could be coming from a Lotus® Notes® application or LDAP directory. The steps you took to create this example ETL job are the same, regardless of the systems or data stores you are working with.
Parent topic:
EasyETL Guide1 Note that SDI supports both forward slash and backslash as the path separator when running on Windows. Your SDI solutions will be more portable between Windows and all the other platforms SDI runs on if you use forward slash.2 We can also right-click and choose Add Attribute or Remove Attribute.3 The leftmost button at the top of the Data Collector view opens a configuration dialog where you can increase the Data Collector Buffer size.
Note that if you plan to collect large amounts of data then you may need to increase the memory available to SDI. Do this by locating the ‘ibmditk' batch-file/script in the SDI installation folder and opening it in an editor. Near the bottom is the line that launches ‘miadmin' and you could insert the following text after the –vmargs option: –Xmx512M This example will allow SDI data memory to grow to 512 Mbytes. 4 Data is copied in CSV format to simplify importing values into spreadsheets and report tables. ‘CSV' stands for Character Separated Value, and the separator character used by SDI is the semi-colon (;).5 Note that we can press Ctrl + Space to get up a list of suggestions to what we can type. This list includes some special objects like 'task' and 'main' and lists the Attributes being read in from your Input Source at the bottom. Note also that the notation shown here for accessing Input Attributes only works in EasyETL. In the full SDI Workbench you must prefix an Attribute name with the Entry object carrying it - for example the Work Entry (work). As a result, the above example would look like this:
return work.First + " " + work.LastThe Work Entry and other SDI concepts are discussed in the first chapter of this document.6 If you do not enter a full path, or a relative one, then SDI will base this from the Solution Directory that you specified during installation.