
Delta feature for Iterator mode

Connectors in Iterator mode can generate Delta entries. This feature uses the Delta Engine and the Delta Store to detect changes.


Delta Engine

The Delta Engine lets you read through a data source and detect what has changed since the previous read. This makes it possible to detect new entries, changed entries and even deleted entries. For certain data sources (such as LDIF files and LDAP servers), Security Directory Integrator can even detect whether attributes and values within entries have changed. You can configure Delta settings on Connectors in Iterator mode only.

The Delta Engine knows whether Entries or Attributes have been added, changed or deleted by keeping a local copy of each Entry in a persistent store, which is part of the System Store. This local repository is called a Delta Store and consists of Delta tables. Each time the AssemblyLine is run, the Delta Engine compares the data being iterated with its copy in the Delta Table. When a change is detected the Connector returns a Delta Entry.

Note: Do not manually modify Delta Store tables. Otherwise, the Delta snapshot information will become inconsistent, and the Delta Engine will fail.

Note: In versions earlier than SDI V6.1, snapshots written to the Delta Store during Delta Engine processing were committed immediately. As a result, the Delta Engine would consider a changed Entry as handled even if processing in the AL Flow section failed. This limitation is addressed through the Commit parameter on the Connector Delta tab. This parameter controls when the Delta Engine commits snapshots of incoming data to the System Store.
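The effect of deferring the commit can be illustrated with a small sketch. It is not the Delta Engine's implementation: the class and function names (SnapshotStore, process) are hypothetical, and it assumes that a snapshot that has not been committed is discarded when the Flow section fails.

```python
class SnapshotStore:
    """Hypothetical stand-in for a Delta Table with deferred commits."""

    def __init__(self):
        self.committed = {}  # snapshots the engine treats as handled
        self.pending = {}    # snapshots written but not yet committed

    def write(self, key, snapshot):
        self.pending[key] = snapshot

    def commit(self):
        self.committed.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


def process(store, key, entry, flow):
    """Write a snapshot, run the flow, and commit only if the flow succeeds."""
    store.write(key, entry)
    try:
        flow(entry)
    except Exception:
        # Assumption: a failed flow leaves the previous snapshot untouched,
        # so the same change is detected again on the next run.
        store.rollback()
        return False
    store.commit()
    return True
```

With an immediate commit, the snapshot would already be stored when the flow fails, and the change would not be reported again on the next run.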


Unique Attribute name

In order for the Delta mechanism to uniquely identify each Entry, you must specify a unique Attribute to use as the Delta key. The values of this attribute must be unique within the data source. You can specify the Delta key in the Delta tab of the Connector by entering or selecting an attribute name in the Unique Attribute Name parameter. This attribute must be present in the Input Map of your Iterator, and can be either an attribute read from the connected system or a computed attribute (using script in the Attribute Mapping).

You can also specify multiple attributes by separating them with a plus sign ( + ):

LastName+FirstName+BirthDate

At least one of the attributes specified in the Unique Attribute Name parameter must contain a value. When several attributes are specified, their string values are concatenated into one string, which then becomes the unique Delta identifier. Attributes with no values (for example, blank or NULL) are skipped when the Delta key is built for an Entry.
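The key-building rule described above can be sketched as follows. The function name build_delta_key is hypothetical, an Entry is modeled as a plain dictionary, and the sketch assumes the values are joined without a separator, since the text only says they are concatenated.

```python
def build_delta_key(entry, unique_attribute_names):
    """Build a Delta key from a '+'-separated list of attribute names."""
    names = [name.strip() for name in unique_attribute_names.split("+")]
    parts = []
    for name in names:
        value = entry.get(name)
        if value is None or str(value) == "":
            continue  # attributes with no value are skipped
        parts.append(str(value))
    if not parts:
        # At least one of the listed attributes must contain a value.
        raise ValueError("no value found for any of: %s" % ", ".join(names))
    return "".join(parts)
```

Under these assumptions, an Entry with LastName "Smith", FirstName "Ann" and an empty BirthDate produces the key "SmithAnn". Because empty values are skipped, different combinations of values can produce the same string, so the chosen attributes should be ones that are reliably populated.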


Delta Store

The Delta Store is physically located in the System Store. It consists of one Delta Systable (DS) and one or more Delta Tables. Each Delta Table holds the Delta Store data for one Iterator Connector that has Delta enabled.

Although Delta Store tables can be accessed with both the JDBC Connector and the System Store Connector, it is inadvisable to change them without a deep understanding of how these tables are structured and handled by the Delta Engine.


Delta Table structure

Every Delta Table (DT) contains information about each Entry processed by the Delta Engine for a particular Connector. A Delta Systable (DS) maintains a list of all Delta Tables currently in use by the Delta Store.


Delta process

Given the above Delta Store structure, the sequence number is used to determine which entries are no longer part of the source data set. Every time an AssemblyLine is run, the sequence number for the Delta Table used by the Connector is read from the Delta Systable and incremented. The incremented value is then used to mark the updated entries for the entire AssemblyLine execution.

The Delta Engine process works in two passes.

  1. Read → Look up → Compare → Update → Set current SequenceID

    1. The Iterator reads entries from the input data source.
    2. The Delta process looks for a corresponding Entry in the Delta Table using the unique attribute's value.
    3. If a match is found, the Delta process compares each Attribute (and its values) to determine whether the Entry has been modified. Based on the result of the comparison, the Delta Engine returns a Delta Entry tagged with the relevant operation code, modify or unchanged:

      • Modify Entry – the Entry that was read and the corresponding Entry from the Delta Table are considered different; the Entry is updated in the Delta Table
      • Unchanged Entry – the Entry that was read and the corresponding Entry from the Delta Table are considered equal.

    4. If a match is not found in the Delta Table the Entry is treated as new:

      • Add Entry – the Entry is added to the Delta Table.

    5. In both steps 3 and 4, the sequence number value in the Delta Table is updated with the sequence number used for the current AssemblyLine execution.

  2. Check for data with (SequenceID < current SequenceID) → Mark as Deleted

    Once End of Data is reached by the Iterator, the Delta Engine makes a second pass through the Delta Table, looking for entries that were not accessed during the first pass. These Entries are easily recognized because their sequence number has not been updated to the current sequence number. Therefore any Entries in the Delta Table with a sequence number lower than the current sequence number are considered to be deleted entries and are returned as deleted.

    Note: This pass happens only when the iteration through the input data completes successfully. If an error occurs during that iteration, no Entries are tagged as deleted and returned by the AssemblyLine, and none are removed from the Delta Table. This does not affect the original data source, and the next time the AssemblyLine executes successfully the deleted Entries are processed correctly.
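The two passes can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: the Delta Table and Delta Systable are modeled as plain dictionaries, and the names run_delta, delta_table, systable and key_attr are hypothetical.

```python
def run_delta(source_entries, delta_table, systable, table_name, key_attr):
    """Return (operation, entry) pairs for one AssemblyLine run.

    delta_table maps a Delta key to {"entry": dict, "seq": int}.
    systable maps a Delta Table name to its last used sequence number.
    """
    # Read the sequence number from the systable and increment it.
    current_seq = systable.get(table_name, 0) + 1
    systable[table_name] = current_seq
    results = []

    # Pass 1: read, look up, compare, update, set the current sequence number.
    for entry in source_entries:
        key = entry[key_attr]
        stored = delta_table.get(key)
        if stored is None:
            operation = "add"
        elif stored["entry"] != entry:
            operation = "modify"
        else:
            operation = "unchanged"
        delta_table[key] = {"entry": dict(entry), "seq": current_seq}
        results.append((operation, entry))

    # Pass 2: reached only if pass 1 finished without raising an error.
    stale_keys = [k for k, row in delta_table.items() if row["seq"] < current_seq]
    for key in stale_keys:
        row = delta_table.pop(key)
        results.append(("delete", row["entry"]))
    return results
```

If reading the source fails partway through pass 1, the exception propagates before pass 2 runs, so nothing is reported as deleted. Entries that were not reached keep their older sequence number and are handled on the next successful run.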


Row Locking

This parameter is available in the Delta tab for Iterator connectors and the Delta Function Component configuration. It allows you to set the transaction isolation level used by the connection established to the Delta Store database. Setting a higher isolation level reduces the transaction anomalies known as 'dirty reads', 'non-repeatable reads' and 'phantom reads' by using row and table locks. This parameter has the following values:

For more information about transaction isolation levels, see the online documentation of the java.sql.Connection interface: http://docs.oracle.com/javase/1.6.0/docs/api/java/sql/Connection.html.

Each database server sets a default transaction isolation level; the default value for Apache Derby, Oracle and Microsoft SQL Server is TRANSACTION_READ_COMMITTED. However, the default value of the Row Locking parameter, SERIALIZABLE, overrides this when using a Delta component (that is, the Delta functionality in Iterator Connectors or the Delta Function Component). Some database servers may not support all transaction isolation levels, so refer to the specific database documentation for accurate information about supported transaction isolation levels.

Note: Transaction isolation levels are maintained by the database server itself for every connection established to the database. Therefore, when a Delta component (with the transaction isolation level set to REPEATABLE_READ or SERIALIZABLE and the Commit parameter set to On Connector Close) starts its transaction, all other queries trying to modify the same data are blocked. This means that other components that need to modify the same data have to wait until the first component commits its transaction on termination. This waiting may cause the issued SQL queries to time out and leave the data unmodified.

Also, when a component has the Commit parameter set to No autocommit, you should manually commit the transactions in such a manner that other components do not wait indefinitely to perform a modification.
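As a quick reference for the trade-off described in this section, the sketch below records which anomalies each isolation level prevents, following the definitions in the java.sql.Connection documentation linked above. The integer values are those of the Java constants. The function name possible_anomalies is hypothetical, and the sketch does not claim that these four levels are exactly the values offered by the Row Locking parameter.

```python
# Integer values of the java.sql.Connection isolation level constants.
TRANSACTION_READ_UNCOMMITTED = 1
TRANSACTION_READ_COMMITTED = 2
TRANSACTION_REPEATABLE_READ = 4
TRANSACTION_SERIALIZABLE = 8

ALL_ANOMALIES = {"dirty read", "non-repeatable read", "phantom read"}

# Anomalies each level prevents, per the java.sql.Connection documentation.
PREVENTED = {
    TRANSACTION_READ_UNCOMMITTED: set(),
    TRANSACTION_READ_COMMITTED: {"dirty read"},
    TRANSACTION_REPEATABLE_READ: {"dirty read", "non-repeatable read"},
    TRANSACTION_SERIALIZABLE: set(ALL_ANOMALIES),
}


def possible_anomalies(level):
    """Return the anomalies that can still occur at the given level."""
    return sorted(ALL_ANOMALIES - PREVENTED[level])
```

The higher levels remove more anomalies but rely on more locking, which is why a long-running Delta transaction at REPEATABLE_READ or SERIALIZABLE can block other writers.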


Detect or ignore changes only in specific attributes

The Attribute List and Change Detection Mode parameters let the Delta Engine detect changes only in specific attributes instead of in all received attributes.

The Attribute List parameter is a comma-separated list of the attributes affected by Change Detection Mode. The Change Detection Mode parameter specifies how changes in these attributes are handled. It has three values:

Example use case

When using the Delta Engine, the received entries sometimes contain attributes that you consider unimportant and wish to ignore. In such cases, these attributes must not affect the result of the Delta computation, because Entries that differ only in these attributes would otherwise cause unnecessary updates of the Delta Store table.

The solution for this case is to use the Attribute List and Change Detection Mode parameters.

Here is an example scenario where two AssemblyLines receive changelog entries from two replicas of an LDAP server and these changes are applied to one Delta Store. To illustrate this we will use the following example changelog entries:

Entry1:

Entry attributes:
	targetdn (replace):	'cn=Niki,o=IBM,c=us'
	changetime (replace):	'20071015094646'
	$dn (replace):	'changenumber=78955,cn=changelog'
	ibm-changeInitiatorsName (replace):	'CN=ROOT'
	changenumber (replace):	'78955'
	objectclass (replace):	'top'	'changelogentry'	'ibm-changelog'
	changetype (replace):	'modify'
	cn (replace):	'Niki' 'Niky'
	changes (replace):	'replace: cn
					  cn: Niki
					  cn: Niky
					  -
					  '

Entry2:

Entry attributes:
	targetdn (replace):	'cn=Niki,o=IBM,c=us'
	changetime (replace):	'20071015094817'
	$dn (replace):	'changenumber=10076,cn=changelog'
	ibm-changeInitiatorsName (replace):	'CN=ROOT'
	changenumber (replace):	'10076'
	objectclass (replace):	'top'	'changelogentry'	'ibm-changelog'
	changetype (replace):	'modify'
	cn (replace):	'Niki' 'Nikolai'
	changes (replace):	'replace: cn
					  cn: Niki
					  cn: Nikolai
					  -
					 '

Entry3:

Entry attributes:
	targetdn (replace):	'cn=Niki,o=IBM,c=us'
	changetime (replace):	'20071037454817'
	$dn (replace):	'changenumber=112,cn=changelog'
	ibm-changeInitiatorsName (replace):	'CN=ADMIN'
	changenumber (replace):	'112'
	objectclass (replace):	'top'	'changelogentry'	'ibm-changelog'
	changetype (replace):	'modify'
	cn (replace):	'Niki' 'Nikolai'
	changes (replace):	'replace: cn
					  cn: Niki
					  cn: Nikolai
					  -
					     '

Modified attributes are marked in bold and attributes that can be ignored are marked in italics. The ignored attributes (such as changenumber, changetime, and so forth) are not considered when comparing the received Entry with the stored Entry. Therefore these attributes have to be listed in the Attribute List parameter. To specify that they should be ignored, set the Change Detection Mode parameter to Ignore changes for the following Attributes.

This is the workflow when the AssemblyLines receive the entries:

  1. When AL1 receives Entry1, it will be returned as modify and saved in the Delta Store table.
  2. When AL2 receives Entry2, its changetime, $dn, ibm-changeInitiatorsName, and changenumber attributes are modified but will be ignored. However, the cn and changes attributes are also modified, and therefore the resulting Delta Entry will be tagged as modify and saved in the Delta Store table.
  3. When AL2 receives Entry3, its changetime, $dn, ibm-changeInitiatorsName, and changenumber attributes are modified but will be ignored. The rest of the attributes are equal, so the resulting Delta Entry will be tagged as unchanged and will either be returned to the AssemblyLine (only if the Return unchanged parameter is checked) or skipped. The returned Delta Entry will be identical to the received Entry3. In this case the Delta Store is not updated. If the Attribute List and Change Detection Mode parameters were not used, Entry3 would have been tagged as modify and saved in the Delta Store.
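The comparison in this workflow can be sketched as follows. The function name entries_differ and the mode names "ignore", "detect_only" and "all" are hypothetical labels for this illustration, not the parameter's actual values. The entries are abbreviated versions of the examples above, with objectclass and changes omitted for brevity.

```python
def entries_differ(received, stored, attribute_list, mode):
    """Compare two entries, honoring a list of attributes and a mode.

    mode "ignore":      differences in listed attributes are ignored
    mode "detect_only": only listed attributes are compared
    mode "all":         every attribute is compared
    """
    names = set(received) | set(stored)
    listed = set(attribute_list)
    if mode == "ignore":
        compared = names - listed
    elif mode == "detect_only":
        compared = names & listed
    else:
        compared = names
    return any(received.get(name) != stored.get(name) for name in compared)


IGNORED = ["changetime", "$dn", "ibm-changeInitiatorsName", "changenumber"]

entry1 = {
    "targetdn": "cn=Niki,o=IBM,c=us",
    "changetime": "20071015094646",
    "$dn": "changenumber=78955,cn=changelog",
    "ibm-changeInitiatorsName": "CN=ROOT",
    "changenumber": "78955",
    "changetype": "modify",
    "cn": ["Niki", "Niky"],
}
entry2 = {**entry1, "changetime": "20071015094817",
          "$dn": "changenumber=10076,cn=changelog",
          "changenumber": "10076", "cn": ["Niki", "Nikolai"]}
entry3 = {**entry2, "changetime": "20071037454817",
          "$dn": "changenumber=112,cn=changelog",
          "ibm-changeInitiatorsName": "CN=ADMIN", "changenumber": "112"}
```

With the ignore list in place, comparing entry2 against the stored entry1 reports a difference because cn changed, while comparing entry3 against the stored entry2 reports none. Comparing all attributes would flag entry3 as modified.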

