Indexes versus column statistics

 

If you are trying to decide whether to use statistics or indexes to provide information to the Statistics Manager, keep the following differences in mind.

One major difference between indexes and column statistics is that indexes are permanent objects that are updated when changes to the underlying table occur, while column statistics are not. If your data is constantly changing, the Statistics Manager may need to rely on stale column statistics. However, maintaining an index after each change to the table might take up more system resources than refreshing the stale column statistics after a group of changes to the table have occurred.

Another difference is the effect that the existence of new indexes or column statistics has on the Optimizer. When new indexes become available, the Optimizer will consider them for implementation. If they are candidates, the Optimizer will re-optimize the query and try to find a better implementation. However, this is not true for column statistics. When new or refreshed column statistics are available, the Statistics Manager will interrogate immediately. Reoptimization will occur only if the answers are significantly different from the ones that were given before these refreshed statistics. This means that it is possible to use statistics that are refreshed without causing a reoptimization of an access plan.

When trying to determine the selectivity of predicates, the Statistics Manager considers column statistics and indexes as resources for its answers in the following order:

  1. Try to use a multi-column keyed index when ANDed or ORed predicates reference multiple columns

  2. If there is no perfect index that contains all of the columns in the predicates, it will try to find a combination of indexes that can be used.

  3. For single column questions, it will use available column statistics

  4. If the answer derived from the column statistics shows a selectivity of less than 2%, indexes are used to verify this answer

Accessing column statistics to answer questions is faster than trying to obtain these answers from indexes.

Column statistics can only be used by SQE. For CQE, all statistics are retrieved from indexes.

Finally, column statistics can be used only for query optimization. They cannot be used for the actual implementation of a query, whereas indexes can be used for both.

 

Parent topic:

Collecting statistics with the Statistics Manager