In Version 9.1D and later, the UPDATE STATISTICS statement has new features and new syntax to use those new features. The statistics gathered can now include index statistics, in addition to the table and column data distribution statistics already supported. These index statistics can be especially helpful to the SQL optimizer in choosing the best index to scan or join on. New, upward compatible syntax lets the user choose the type of statistics he or she needs.
New SQL syntax for UPDATE STATISTICS - Statement format:
UPDATE [TABLE | INDEX | [ALL] COLUMN] STATISTICS [FOR table_name]
Notes
* Specifying TABLE STATISTICS gets just table cardinalities. Table cardinalities are stored in _Systblstat.
* Specifying INDEX STATISTICS gets statistics on the number of unique values in each index. Index statistics are stored in _Sysidxstat.
* Specifying COLUMN STATISTICS (without ALL) gets statistics on the data distribution of values for each column which is an index key component.
* Specifying ALL COLUMN STATISTICS gets statistics on the data distribution of values for all columns.
* The STATISTICS phrase can be repeated, so that up to 3 types of statistics can be requested by a single UPDATE ... STATISTICS statement.
* By default, for the simple statement "UPDATE STATISTICS" where the type of statistics is not specified, SQL will get table and index column statistics. This is equivalent to the statement: UPDATE TABLE STATISTICS AND COLUMN STATISTICS ;
* The FOR TABLE phrase retains its existing meaning.
* A table containing LONG datatypes can get table, index, and/or column statistics. The columns which are LONG datatypes cannot get statistics.
* Getting table statistics runs in time proportional to the table's primary index.
* Getting column statistics runs in time proportional to the table's primary index, plus an additional amount proportional to the number of columns in the table.
* Getting index statistics runs in time proportional to the total size of all the indexes for the table.
* Table statistics are most often the most useful statistic. They influence join order substantially.
* Index statistics are very important when a table has many indexes (say, more than 5). This is especially true if some of the &.nbsp; indexes are similar to one another.
* Column statistics are most useful when applications use range predicates, such as BETWEEN and the operators "<", "<=", ">", and ">="..nbsp; indexes are similar to one another.
* Column statistics are most useful when applications use range predicates, such as BETWEEN and the operators "<", "<=", ">", and ">=".. |