Detecting Changes

Security Directory Integrator provides a number of features for detecting changes in input data. In addition to offering a set of Change Detection Connectors, you also have the option of enabling the Delta Engine for your Input source.

The Delta Engine takes snapshots of data as it's read and then compares these with snapshots taken during the previous run to determine what has changed. Those entries that are unchanged are skipped, and only modified entries are retrieved for processing in your EasyETL AssemblyLine.
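The snapshot comparison described above can be illustrated with a minimal Python sketch. This is not SDI code: the `snapshot` and `detect_changes` functions, the in-memory `store` dictionary (standing in for the System Store), and the use of a hash digest as the snapshot are all assumptions made for illustration.

```python
import hashlib
import json


def snapshot(entry):
    """Return a stable digest of an entry's attribute values."""
    canonical = json.dumps(entry, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(entries, key_attr, store):
    """Yield (operation, entry) for new or modified entries only.

    `store` maps a unique key to the digest saved on the previous run and is
    updated in place, standing in for a persistent snapshot store.
    """
    for entry in entries:
        key = entry[key_attr]
        digest = snapshot(entry)
        previous = store.get(key)
        if previous == digest:
            continue  # unchanged since the last run: skip it
        store[key] = digest
        yield ("add" if previous is None else "modify", entry)


store = {}
rows = [
    {"First": "Ann", "Last": "Lee", "Title": "Engineer"},
    {"First": "Bob", "Last": "Ray", "Title": "Analyst"},
]

first_run = list(detect_changes(rows, "First", store))   # baseline: both are new
second_run = list(detect_changes(rows, "First", store))  # nothing has changed

rows[1]["Title"] = "Manager"
third_run = list(detect_changes(rows, "First", store))   # only Bob is modified
```

The first run establishes the baseline, the second finds nothing to do, and the third reports only the edited row.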

Press the Configure button for your Input source and then select the Delta tab.

Figure: Delta configuration

You must first enable the Delta Engine by selecting the check box at the top of the configuration panel. Then use the drop-down to select ‘First’ as the Unique Attribute Name [1].

There are several other parameters available here, some of which make more sense when working in the standard SDI Workbench and not in EasyETL. For example, although an EasyETL AL can detect and transfer new and modified entries, it will not handle deleting a row from a database or an entry in a directory. However, it will write this information to an Output target like a File Connector with the LDIF Parser. LDIF files can contain change operation tags, and some systems support LDIF import.
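To show what change operation tags look like, the following Python sketch renders LDIF change records in the general form defined by RFC 2849. The `ldif_change` helper and the example DN are hypothetical, and the exact output produced by the SDI LDIF Parser may differ.

```python
def ldif_change(dn, changetype, attributes=None):
    """Render one LDIF change record as text.

    Assumes every value is a plain ASCII string that needs no base64 encoding.
    """
    lines = [f"dn: {dn}", f"changetype: {changetype}"]
    attributes = attributes or {}
    if changetype == "add":
        for name, value in attributes.items():
            lines.append(f"{name}: {value}")
    elif changetype == "modify":
        # Each modified attribute gets its own replace block, closed by "-".
        for name, value in attributes.items():
            lines += [f"replace: {name}", f"{name}: {value}", "-"]
    elif changetype != "delete":
        raise ValueError(f"unsupported changetype: {changetype}")
    return "\n".join(lines) + "\n"


# Hypothetical DN used only for illustration.
dn = "cn=Bob Ray,ou=people,dc=example,dc=com"
modify_record = ldif_change(dn, "modify", {"title": "Manager"})
delete_record = ldif_change(dn, "delete")
print(modify_record)
print(delete_record)
```

A system that supports LDIF import can act on the `changetype` line of each record, which is how a delete detected by the Delta Engine can still be carried to the target.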

You can learn more about the full Delta Handling features of SDI here:

http://www.tdi-users.org/twiki/pub/Integrator/HowTo/HowTo_SyncData_6.1.1070523.pdf

One change that you may wish to make is to the Commit parameter. This controls when new and changed snapshots are committed to the SDI System Store database. By default this is set to ‘After every database operation’ and so occurs during the read phase.

However, if you wish to ensure that a change has been successfully transferred before the snapshot is committed, set this drop-down to ‘On end of AL cycle’ instead so that the commit happens after the Output target has been updated.
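The effect of the commit timing can be sketched in Python as follows. This is a simplified model, not SDI behaviour: `run_cycle`, `flaky_write`, and the dictionary stores are hypothetical, and real error handling in an AssemblyLine is more involved.

```python
def run_cycle(entries, store, write, commit_after_write=True):
    """Process entries once; return the keys whose snapshots were committed."""
    committed = []
    for key, value in entries.items():
        if store.get(key) == value:
            continue  # unchanged since the last committed snapshot
        if not commit_after_write:
            store[key] = value  # committed during the read phase
            committed.append(key)
        try:
            write(key, value)
        except OSError:
            continue  # output failed; a late commit is never reached
        if commit_after_write:
            store[key] = value  # committed only after a successful write
            committed.append(key)
    return committed


def flaky_write(key, value):
    """Stand-in output target that fails for one entry."""
    if key == "bob":
        raise OSError("target unavailable")


entries = {"ann": "Engineer", "bob": "Analyst"}

early_store = {}
run_cycle(entries, early_store, flaky_write, commit_after_write=False)

late_store = {}
run_cycle(entries, late_store, flaky_write, commit_after_write=True)
```

With the early commit, the snapshot for the failed entry is already stored, so the next run treats it as unchanged. With the late commit, the snapshot is missing and the change is picked up again on the next run.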

In order for the Delta Engine to do its work it needs a baseline snapshot set. You create this by running your ETL job the first time after Delta has been enabled. Once it has completed you will notice that the popup reports twice as many writes occurring. This is because SDI also counts the snapshots being written to the System Store, so you get two writes for every entry processed.

Try running your EasyETL AssemblyLine again and you will see that no entries were written this time. The Delta Engine detected that input records were all unchanged and skipped them.

Figure: All entries unchanged and skipped

As a final test, bring up the input CSV file and change any of the field values except for ‘Last’ [2]. Save the change and then re-run your ETL job and you will see that only modified entries are processed.


Configure the output target for Updates

The current setup works fine for output to a file. However, if you are driving these changes to a directory, RDBMS or similar data store, then you will want to add new data as well as update existing records. For your EasyETL job to do this, first select which Output Attribute to use as the criteria for locating the record to modify.

This is done by right-clicking on the Output Attribute you want and selecting the Use as link criteria option.

Figure: Selecting your link criteria

Now when the Output Connector writes to the target, it first searches for a record using the Link Criteria attribute specified. If no match is found then a new entry is added. If the match was successful then this record is updated.
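The search-then-add-or-update behaviour can be sketched in a few lines of Python. The `upsert` function and the list of dictionaries standing in for the target are hypothetical and only illustrate the logic, not how the Output Connector is implemented.

```python
def upsert(target, entry, link_attr):
    """Update the record matching entry[link_attr], or add the entry if none matches."""
    for record in target:
        if record.get(link_attr) == entry[link_attr]:
            record.update(entry)  # match found: modify the existing record
            return "update"
    target.append(dict(entry))    # no match: add a new record
    return "add"


target = []
r1 = upsert(target, {"First": "Ann", "Last": "Lee", "Title": "Engineer"}, "First")
r2 = upsert(target, {"First": "Ann", "Last": "Lee", "Title": "Manager"}, "First")
r3 = upsert(target, {"First": "Bob", "Last": "Ray", "Title": "Analyst"}, "First")
print(r1, r2, r3, len(target))
```

Here the link criteria attribute is `First`, so the second call finds the record written by the first call and updates it instead of creating a duplicate.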

It's as simple as that: your ETL job has now been configured to provide ongoing synchronization between your input source and output target.


Command line assets for running and scheduling your ETL job

Once your ETL AssemblyLine is ready for deployment, you can right-click on the Project in the Navigator and choose the Create files needed… option.

Figure: Create command line assets to run the ETL job

This brings up an Export Files dialog where you choose where to write the script or batch file.

Note that it will be given the same name as the Project, so in the case of this tutorial exercise running on Windows it will be called ‘CSV2XML.bat’. Executing your EasyETL Project from the command line provides maximum performance for the solution.

You will also get an XML file created in the same location. This is called an SDI Config file and contains the details of your EasyETL AssemblyLine that the SDI Server needs to run it. If you open the generated script in a text editor you will see the one-liner needed to start an SDI Server, point it at a Config and then specify the AssemblyLine to run. All you need to do now is set up a scheduled task or cron job to periodically invoke this script, and your synchronization/migration service will be in place.
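If the scheduler should also record whether each run succeeded, the generated script can be invoked through a small wrapper. The Python sketch below is one hypothetical way to do this; `run_job` is not part of SDI, and it assumes the script signals failure with a non-zero exit code, which you should verify for your installation. A stand-in command is used here so that the example runs anywhere.

```python
import subprocess
import sys


def run_job(command):
    """Run a command (for example the generated script) and return its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface the error output so the scheduler's log captures it.
        print(completed.stderr, file=sys.stderr)
    return completed.returncode


# Stand-in command for illustration. In a deployment, replace it with the path
# to the generated script, such as the CSV2XML.bat file from this tutorial.
exit_code = run_job([sys.executable, "-c", "print('job finished')"])
```

The scheduled task or cron job would then call this wrapper instead of the script itself.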



[1] As you may have deduced, the Delta Engine uses one of your input attributes to uniquely identify snapshots. If there is no unique value available in the input data, you can specify multiple attributes that will be concatenated together to form the snapshot id. You do this by typing in the names of multiple attributes separated by a plus symbol (+). For example: First + Last

[2] Since this is the attribute used to identify snapshots, any change to its value for an entry will cause it to appear as a new record to the Delta Engine.
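The composite identifier described in the first footnote, and the consequence noted in the second, can be sketched in Python. The `snapshot_id` function is hypothetical, and plain concatenation with no separator is an assumption; SDI may combine the values differently.

```python
def snapshot_id(entry, unique_attribute_name):
    """Build a snapshot id from a spec such as "First" or "First + Last"."""
    names = [name.strip() for name in unique_attribute_name.split("+")]
    return "".join(str(entry[name]) for name in names)


entry = {"First": "Ann", "Last": "Lee", "Title": "Engineer"}
before = snapshot_id(entry, "First + Last")

# Changing an identifying attribute produces a different id, so the entry
# no longer matches its earlier snapshot and looks like a new record.
entry["Last"] = "Leigh"
after = snapshot_id(entry, "First + Last")
print(before, after)
```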