Cleaning up the database with the DBPurge utility

It is a good maintenance practice to keep ISIM Server database to a manageable size. We can use the DBPurge utility to clean up the audit trail in the database by removing records that are related to completed workflow processes. The utility is useful for deleting historical workflow audit data, non-workflow audit events, and reconciliation reporting entries from the database. It handles only removal, not archiving, of these records. Use the utility sparingly to avoid any unforeseen problems.


Utility usage

The IBM Security Identity Manager virtual appliance command line interface version of db_purge is interactive. After you log in to the CLI, navigate to the isim > utilities > db_purge prompt, type execute and press Enter. You are prompted for more parameters.

Required parameter. Specify one of the following parameters:

1:age
Specify the age of the records to be removed, which must not be a negative number. 0 removes all data, including the data for the current day.

2:date
Deletion date and optional time. For example, '2010-08-15-22:00'. All records that are created on this date or earlier are deleted, based on the server time zone.

Enter 1 or 2 depending on requirements. Next, choose from the following optional parameters to customize the selection.

Optional parameter.

1: grouping
Number of deleted entries in a single commit. The group size must be 1 - 100, where 50 is the default value.

2: workflow
Determines whether workflow data is removed. The flag is Boolean, and its default setting is true.

3: process_type
Specifies a two-character parameter, which restricts the deletion of processes to the specified type.

For example, 'AP' removes only processes of type Account Password Change. This parameter is relevant only when workflow data is removed. If we do not specify this parameter, then processes of any type are removed if they match the other parameters. For more information about the valid values, see the TYPE column description in the Database and Directory > Server Schema Reference > Database tables reference > Workflow tables > PROCESS table section in the IBM Security Identity Manager documentation.

4: audit
Determines whether non-workflow auditing related data is removed. The flag is Boolean, and its default setting is true.

5: recon
Determines whether historical reconciliation report data is removed. The flag is Boolean, and its default setting is true. This option does not remove the recon process-related data. To remove the recon's processes-related data (that is, the data that appears in View requests), -workflow 'true' and '-process_type 'RC' must be used with the -recon option. See –workflow and –process_type parameters for more information.

6: threads
Number of threads to be created by the DBPurge process for the DB2 database. Allowed values are 1 - 8. The default value is 4.

7: continue
Runs the utility with the passed options.

If you continue, by entering 7, the message, Running DBPurge clean utility and the options that were passed are displayed. After you run the utility, it reports the number of records that are removed by the utility. It also reports success or failure of utility.

DBPurge required parameters < age or date >: 
1: age - Specify the age of the records to be removed, which must be non-negative, where 
a value of 0 will remove all data, including today's 
2: date - Deletion date and optional time. For example, '2010-08-15- 22:00'. 

All records created on this date or earlier are deleted, based on the server timezone. 

Select option: 1 
Enter age: 10 

DBPurge optional parameters: 
1: grouping - Number of deleted entries in a single commit. The group size 
must be between 1 - 100 inclusive, where 50 is the default value. 
2: workflow - Determines whether workflow data is removed. The flag is Boolean, and its 
default setting is true. 
3: process_type - Specifies a two-character parameter, which restricts the deletion of 
processes to the specified type. 
4: audit - Determines whether non-workflow data is removed. The flag is Boolean, and its 
default setting is true. 
5: recon - Determines whether historical reconciliation data is removed. The flag is 
Boolean, and its default setting is true. 
6: threads - Threads count for DB2 database. Allowed values are between 1 and 
8 including these values, and its default value is 4 
7: continue 

Select option: 7 

Running DBPurge clean utility. 
Options passed: [ -age '10' -grouping '50' -workflow 'true' -audit 'true' -recon 'true' 
-threads '4'] 
Database clean up is successful.


Scheduled execution

set_schedule
Log in to the CLI and navigate to isim > utilities > db_purge > schedule.

Enter set_schedule to set the DBPurge schedule and press Enter.. You are then prompted for the DBPurge schedule name. After you provide the DBPurge schedule name, provide any required or optional parameters as described in Utility usage. After you enter 7 to continue, we are prompted for scheduling parameters.

Enter schedule information. 
1: minute 
2: hour 
3: day of month 
4: month 
5: day of week 
6: continue 

After you enter 6 to continue, the message Setting DBPurge schedule is displayed. After the schedule is created, DBPurge schedule set is displayed.

The possible values for the day of the week is a number, between 0 to 6. For example: 0 represents Sunday. The following possible values are defined :

Value Day of the week
0 Sunday
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday

In the event of a failure, the DBPurge schedule could not be set message is displayed..

Enter schedule name: Schedule1 

DBPurge required parameters < age or date >: 
1: age - Specify the age of the records to be removed, which must be non-negative, 
where a value of 0 will remove all data, including today's 
2: date - Deletion date and optional time. For example, 
'2010-08-15- 22:00'. All records created on this date or earlier are deleted, based 
on the server timezone. 

Select option: 1 
Enter age: 0 

DBPurge optional parameters: 
1: grouping - Number of deleted entries in a single commit. The group size 
must be between 1 - 100 inclusive, where 50 is the default value. 
2: workflow - Determines whether workflow data is removed. The flag is Boolean, and 
its default setting is true. 
3: process_type - Specifies a two-character parameter, which restricts the deletion 
of processes to the specified type. 
4: audit - Determines whether non-workflow data is removed. The flag is Boolean, and 
its default setting is true. 
5: recon - Determines whether historical reconciliation data is removed. The flag is 
Boolean, and its default setting is true. 
6: threads - Threads count for DB2 database. Allowed values are between 
1 and 8 including these values, and its default value is 4 
7: continue 

Select option: 7 
Enter schedule information. 
1: minute 
2: hour 
3: day of month 
4: month 
5: day of week 
6: continue
Select option: 1 
Enter minute <0 to 59>: 10 
Enter schedule information. 
1: minute 
2: hour 
3: day of month 
4: month 
5: day of week 
6: continue 

Select option: 6 

Setting DBPurge schedule. 
DBPurge schedule set.

delete_schedule
Log in to the CLI and navigate to isim > utilities > db_purge > schedule. We must type delete_schedule to delete a schedule and press Enter. You are then prompted for DBPurge schedule name.
Enter schedule name: Schedule1 
DBPurge schedule deleted.

view_schedule
Log in to the CLI and navigate to isim > utilities > db_purge > schedule. We must type view_schedule to view existing schedules and press Enter.

For execution logs, see the dbpurge.log file in the virtual appliance generated support file. In the support file, the dbpurge.log file is located in /opt/ibm/identity/logs/. See Manage the support files.

Blank value in double quotes represent that no value has been set for that 
option. 
Schedules List: 
1: scheduleName="Schedule1" age="0" date="" grouping="50" workflow="true" processType="" 
audit="true" recon="true" threads="4" min="10" hour="*" dom="*" mon="*" dow="*"
Cron tab fields can be set to an asterisk (*), which is the default value.


Processing description

The following description illustrates the cleanup processing that occurs when you run DBPurge. Additional archive utilities can be built and run before running DBPurge. The exact implementation might vary. DBPurge runs the following queries to locate the primary records to remove:

1. SELECT ID FROM PROCESS WHERE COMPLETED <= timestamp
2. SELECT ID FROM AUDIT_EVENT WHERE TIMESTAMP <= timestamp AND WORKFLOW_PROCESS_ID IS NULL
3. SELECT RECONID, ACCOUNTID FROM RECONCILIATION_INFO WHERE RECONID IN (SELECT RECONID FROM 
RECONCILIATION WHERE COMPLETED <= timestamp)
4. SELECT RECONID FROM RECONCILIATION WHERE COMPLETED <= timestamp

The value of timestamp is based on the specified -age parameter and uses the ISIM date format yyyy-MM-dd HH:mm:ss:SSS GMT. As the primary records are selected, the data is removed along with data from the secondary, dependent tables that reference these identifiers. The deletion is done in groups. An adjusted age specification supports consistency so that the record age accurately reflects the time zone of the record time stamps. This strategy supports consistent handling of record time zones. The following values are valid:

0
Deletes any records that completed before the current time.

1
Deletes any records completed before exactly 24 hours ago.
This utility includes multi-threaded deletion. For all databases, separate threads and database connections to read record identifiers and to carry out deletions. For DB2 databases, multiple threads carries out the deletion and improve performance. Each thread requires its own database connection. The utility fails if the appropriate number of database connections is not available. For DB2 databases, DBPurge requires five connections; for other databases, it requires only two.

The following example is a high-level version of the statements for each table, and it illustrates the rows that are removed from each table.


Example

The following delete statements remove rows that reference identifiers from query (1) and from the PROCESS table.

DELETE FROM WORKITEM WHERE PROCESS_ID = ?
DELETE FROM ACTIVITY_LOCK WHERE PROCESS_ID = ?
DELETE FROM PROCESSLOG WHERE PROCESS_ID = ?
DELETE FROM PROCESSDATA WHERE PROCESS_ID = ?
DELETE FROM PENDING WHERE PROCESS_ID = ?
DELETE FROM PASSWORD_TRANSACTION WHERE PROCESS_ID = ?
DELETE FROM ACTIVITY WHERE PROCESS_ID = ?
DELETE FROM WORKFLOW_CALLBACK WHERE PROCESS_ID = ?
DELETE FROM SYNCH_POINT WHERE PROCESS_ID = ?
DELETE FROM AUDIT_MGMT_PROVISIONING WHERE EVENT_ID IN 
   (SELECT ID FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?)
DELETE FROM AUDIT_MGMT_TARGET WHERE EVENT_ID IN 
   (SELECT ID FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?)
DELETE FROM AUDIT_MGMT_DELEGATE WHERE EVENT_ID IN 
   (SELECT ID FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?)
DELETE FROM AUDIT_EVENT WHERE WORKFLOW_PROCESS_ID = ?
DELETE FROM SCHEDULED_MESSAGE WHERE REFERENCE_ID = ?
DELETE FROM LCR_INPROGRESS_TABLE WHERE CHILD_ID = ?
DELETE FROM PROCESS WHERE ID = ?

The following delete statements remove rows that reference identifiers from query (2) and from the AUDIT_EVENT table.

DELETE FROM AUDIT_MGMT_PROVISIONING WHERE EVENT_ID = ?
DELETE FROM AUDIT_MGMT_TARGET WHERE EVENT_ID = ?
DELETE FROM AUDIT_MGMT_DELEGATE WHERE EVENT_ID = ?
DELETE FROM AUDIT_EVENT WHERE ID = ?

The following delete statements remove rows that reference identifiers from query (3) and from the RECONCILIATION_INFO table.

DELETE FROM RECONCILIATION_INFO WHERE RECONID = ? AND ACCOUNTID = ?

The following delete statements remove rows that reference identifiers from query (4) and from the RECONCILIATION table.

DELETE FROM RECONCILIATION WHERE RECONID = ?

Parent topic: Troubleshooting ISIM Server problems