IBM Tivoli Composite Application Manager for Application Diagnostics, Version 7.1.0.1

Fine-tuning datacollector.properties - J2EE Agent

To best suit the needs of your environment, you can fine-tune the settings in the data collector properties file. The name of this file depends on the application server:


Locations of the data collector properties file

WebLogic If the monitored server instance is represented by a WebLogic machine:

DC_home/runtime/wlsapp_server_version.domain_name.machine_name.instance_name/wlsapp_server_version.domain_name.machine_name.instance_name.datacollector.properties

Otherwise:

DC_home/runtime/wlsapp_server_version.domain_name.host_name.instance_name/wlsapp_server_version.domain_name.host_name.instance_name.datacollector.properties

Tomcat DC_home/runtime/tomcatapp_server_version.host_name.instance_name/tomcatapp_server_version.host_name.instance_name.datacollector.properties
Sun Java System Application Server (SJSAS) DC_home/runtime/sjsasapp_server_version.domain_name.node_name.instance_name/sjsasapp_server_version.domain_name.node_name.instance_name.datacollector.properties
JBoss DC_home/runtime/jbossapp_server_version.host_name.instance_name/jbossapp_server_version.host_name.instance_name.datacollector.properties
NetWeaver DC_home/runtime/netweaverapp_server_version.sap_node_ID_host_name.sap_instance_number/netweaverapp_server_version.sap_node_ID_host_name.sap_instance_number.datacollector.properties
Oracle DC_home/runtime/oracleapp_server_version.host_name.node_name.instance_name/oracleapp_server_version.host_name.node_name.instance_name.datacollector.properties
J2SE DC_home/runtime/j2se.application_name.host_name.instance_name/j2se.application_name.host_name.instance_name.datacollector.properties

However, to facilitate future upgrades, do not change this file. Instead, add the settings that you want to modify to the data collector custom properties file, custom_directory/datacollector_custom.properties; values in this file override the values in the data collector properties file.
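For example, to override the memory limit described later in this topic, you could add a line like the following (the value 150 is illustrative) to the custom properties file:

 internal.memory.limit=150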

The following properties are in the data collector properties file. Only the properties that are recommended for you to modify are listed.

kernel.codebase

The installer fills in the value of this property during installation. It specifies where the Managing Server codebase can be found.

kernel.rfs.address

The installer fills in the value of this property during installation. The Application Monitor uses it to locate the Managing Server components.

probe.library.name

The default value is am. This property specifies the name of the native shared library that the data collector needs to run. If the value is am, the data collector searches for a shared library named libam.so on UNIX platforms and libam.dll on Windows. In most cases, this property does not need to be specified or changed from the default; change it only if you need to run a native shared library with a different name.

Example:

 probe.library.name=am

internal.probe.event.packet.size

The default value is 70 (70 × 1024 KB). Valid values are 1 to 4000000 (or up to the available process memory on the server); setting a value below the default is not recommended. This property specifies the size of the data collector's internal send buffer, which controls how much data the data collector can send to the Publish Server at a given time. In most situations, the default send buffer size is more than adequate and this property does not need to be changed. However, if you see a problem with the amount of data the data collector sends to the Publish Server, you can set this property to configure the size of the send buffer.
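For example, to double the send buffer size (140 is an illustrative value within the valid range of 1 to 4000000):

 internal.probe.event.packet.size=140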

internal.memory.limit

The default value is 100 (MB). This property limits the amount of memory the data collector may use.

internal.memory.accept.threshold

The default value is 2 (MB). This property specifies the minimum amount of free memory at which the data collector resumes accepting data after it has reached the upper limit specified by the internal.memory.limit property.
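For example, to raise the memory limit to 150 MB and resume accepting data when 4 MB of that limit is free (both values are illustrative):

 internal.memory.limit=150
 internal.memory.accept.threshold=4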

internal.url.limit

The default value is 1000. This property controls the maximum URL length accepted by the data collector.

internal.sql.limit

The default value is 5000. This property controls the maximum SQL length accepted by the data collector.
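For example, to accept longer URLs and SQL statements than the defaults allow (both values are illustrative):

 internal.url.limit=2000
 internal.sql.limit=10000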

internal.probe.event.queue.size.limit

The default value is 900000. This property controls the maximum size of the queue of events maintained by the data collector. When the queue is full, the data collector will drop events.
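For example, to reduce the event queue limit so that the data collector drops events sooner under load (the value is illustrative):

 internal.probe.event.queue.size.limit=500000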

internal.lockanalysis.collect.Ln.lock.event

The variable n can be 1, 2, or 3, corresponding to MOD (Monitoring On Demand) levels L1, L2, and L3. Possible values are true or false. This parameter controls whether lock acquisition and release events are collected. The recommended setting at all levels is false, because there is little benefit in displaying lock acquisition events for locks that are not experiencing contention.

Example:

internal.lockanalysis.collect.L1.lock.event = false

internal.lockanalysis.collect.Ln.contend.events

The variable n can be 1, 2, or 3, corresponding to MOD (Monitoring On Demand) levels L1, L2, and L3. Possible values are true, false, or justone. This parameter controls whether lock contention events are collected.

A value of true indicates that contention records are collected: for each lock acquisition request that results in contention, a pair of contention records is written for each thread that acquired the lock ahead of the requesting thread. A value of false indicates that contention records are not written. A value of justone indicates that contention records are written, but a maximum of one pair of contention records is written for each lock acquisition request that encounters contention, regardless of how many threads actually acquired the lock before the requesting thread.

Setting this parameter to true enables you to determine whether a single thread is holding a lock for an excessive time, or whether the problem is that too many threads are attempting to acquire the same lock simultaneously. The recommended setting at L1 is false. The recommended setting at L2 is justone, which collects just one pair of contention records for each lock acquisition that encounters contention. The recommended setting at L3 is true, but only for a limited period of time to reduce the performance cost; this enables you to identify every thread that acquired the lock ahead of the requesting thread.

Example:

internal.lockanalysis.collect.L2.contend.events = justone

internal.lockanalysis.collect.Ln.contention.inflight.reports

The variable n can be 1, 2, or 3, corresponding to MOD (Monitoring On Demand) levels L1, L2, and L3. Possible values are true or false. This parameter controls whether data is collected for the Lock Contention report. The recommended setting at L1 is false. The recommended setting at L2 and L3 is true.

Example:

internal.lockanalysis.collect.L3.contention.inflight.reports = true

deploymentmgr.rmi.port

It is not necessary to define the property deploymentmgr.rmi.port if you are running a standalone application server. This property is needed for version 5 application server clusters or application servers controlled by a Deployment Manager.

Example:

deploymentmgr.rmi.port=<Deployment Manager RMI (bootstrap) port>

deploymentmgr.rmi.host

It is not necessary to define the property deploymentmgr.rmi.host if you are running a standalone application server. This property is needed for version 5 application server clusters or application servers controlled by a Deployment Manager.

Example:

deploymentmgr.rmi.host=<Deployment Manager host>

networkagent.socket.resettime

By default, the connection is not reset. This property specifies the time interval after which the connection between the data collector and the Publish Server is reset.

Example:

networkagent.socket.resettime=-1

am.mp.cpuThreshold

The default value is 30 milliseconds. Only methods that take at least the minimum amount of CPU time specified in this property are captured for method profiling data; this avoids unnecessary clutter. Generally, methods that take more CPU time than this value are considered useful. You can reduce or increase this value if needed.

am.mp.clockThreshold

The default value is 30 milliseconds. Only methods that take at least the minimum amount of wall clock time specified in this property are captured for method profiling data; this avoids unnecessary clutter. Generally, methods that take more wall clock time than this value are considered useful. You can reduce or increase this value if needed.
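For example, to capture only methods that take at least 50 milliseconds of CPU time or wall clock time (both values are illustrative):

 am.mp.cpuThreshold=50
 am.mp.clockThreshold=50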

am.mp.leagueTableSize

The default value is 1000. This is the maximum number of methods that are monitored for method profiling data. You can reduce or increase this value if needed; decreasing it helps reduce memory requirements.

am.mp.methodStackSize

Default 100. This is the maximum stack size of any running thread that is recorded in method profiling.

am.mp.threadSize

The default value is 1000. This is the maximum number of running threads that can be monitored at any instant.
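For example, to reduce memory requirements by monitoring fewer methods for profiling (the value is illustrative):

 am.mp.leagueTableSize=500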

dc.turbomode.enabled

The default setting is true, which enables turbo mode.

By default, the Data Collector limits the amount of native memory it uses to 100 MB; see the description of internal.memory.limit. The data collector enters turbo mode when its native memory use exceeds 75% of the native memory limit, by default 75 MB. (You can adjust this percentage with the turbo.mem.ulimit property; however, do not set turbo.mem.ulimit unless directed by IBM Software Support.) The behavior when memory utilization is below 75 MB is the same whether turbo mode is enabled or disabled.

Behavior when dc.turbomode.enabled is enabled and the data collector is in turbo mode

When the data collector switches to turbo mode, a message Switching to Turbo Mode is logged in the trace-dc-native.log file.

In turbo mode, the data collector stops monitoring new requests, holds existing requests, and switches Network Agent and Event Agent threads to the higher priorities specified by the na.turbo.priority and ea.turbo.priority properties respectively. It also lowers the sleep time of the Event Agent and Network Agent threads specified by the ea.turbo.sleep and na.turbo.sleep properties respectively. All this is done to drain the native memory quickly by sending accumulated event data to the Publish Server.

In turbo mode, if a new request comes in, the data collector does not monitor the new request but continues to monitor the requests that are already running. For every new request that is not monitored in turbo mode, the data collector notifies the Managing Server by sending a dropped record. The Publish Server in turn reflects this status in its corrupted request counters, which you can obtain through amctl.sh ps1 status.

When turbo mode is enabled, data in the Application Monitor user interface is always accurate, but this comes at the cost of pausing application threads for a few seconds.

Behavior when dc.turbomode.enabled is enabled and the data collector is in normal mode

The data collector switches back to normal mode when its native memory use falls below 75% of the limit. When the switch to normal mode happens, the data collector releases the requests that were placed on hold when switching to turbo mode and resumes monitoring all requests from then on.

When the data collector switches to normal mode, a message Switching to Normal Mode is logged in the trace-dc-native.log file. It also logs memory utilization and a time stamp.

Behavior when dc.turbomode.enabled is disabled

A value of false disables turbo mode. When turbo mode is disabled, the data collector does not pause application threads when native memory use exceeds 75% of the limit. Instead, it drops the accumulated diagnostic data rather than sending it to the Managing Server. Therefore, the data shown in the Application Monitor user interface will be incomplete, but the response time of the application threads will not be negatively affected. A message indicating that data is dropped is logged in msg-dc-native.log and trace-dc-native.log. The Managing Server discards all diagnostic data gathered for a request when the data collector drops records related to that request.

Disable dc.turbomode.enabled


If any of the following conditions apply, disable turbo mode by setting dc.turbomode.enabled to false:

  • Within the first 10 minutes after starting the data collector, it goes into turbo mode (search for the message Switching to Turbo Mode in trace-dc-native.log).

  • You do not want your applications to be paused temporarily when the data collector native memory use exceeds 75% of the limit. Disabling turbo mode comes at the cost of losing the monitoring data when this boundary condition is reached.
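To disable turbo mode, add the following line to the data collector custom properties file:

 dc.turbomode.enabled=false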

An alternative is to increase internal.memory.limit to allow more native memory use, at the risk of requesting more native memory from the JVM than is available, in which case the JVM issues OutOfMemory errors. See the description of internal.memory.limit.


Parent topic:

Customization and advanced configuration for the data collector