Information

Title	How is scatter factor calculated by DBANALYS

URL Name	P176284

Article Number	000132413

Environment	Product: Progress Version: 8.x, 9.x Product: OpenEdge Version: 10.x, 11.x, 12.x OS: All supported platforms

Question/Problem Description

How is scatter factor calculated by PROUTIL tabanalys dbanalys?
Why is the scatter factor high for a table which lives in an area by itself (no other tables and no other indices)?
Why does dbanalys or tabanalys report low record fragmentation but a high scatter factor

Steps to Reproduce

Clarifying Information

Error Message

Defect Number

Enhancement Number

Cause

Resolution

The TABANALYS section of DBANALYS reports the scatter factor value.

Scatter Factor for Type II Storage Areas

The Scatter Factor for tables in Type II Areas reported by DBANALYS/TABANALYS is determined differently to that reported for Type I Areas since OpenEdge 10.1B (10.1A02). Type I Scattering Calculation (defined below) does not apply to Type II Areas because the data in any block is only ever associated with one particular database object (Table).

Scatter Factor for tables in Type II Areas is determined by:

The number of blocks occupied by the object
divided by
The minimum number of blocks that table could occupy.

Imbedded interleaving free space in Type II clusters will increase the scatter and indicate that a dump and load would pack the records better thereby minimizing i/o seek operations reading these records.

Scatter Factor for Type I Areas ( all OpenEdge versions )

When determining the scatter factor in Type I Areas, it takes stock of the recid of each record as it sequentially traverses database blocks in each Area. The distance between the current record and the previous record and with some juggling of an internal algorithm, makes a determination of what the minimum preferred distance and the actual distance between the current recid and the last recid and averages all the results.

Scatter Factor for tables in Type I Areas is determined by:

Scatterfactor = log10(# blocks / Minimum # of blocks)

Where:

Minimum # of blocks = MAX(RPB, (blocksize/avg_record_size))
# blocks = total blocks - free blocks

A Scatter Factor value close to 1 indicates a good quality of record allocation as it is essentially reporting:

"How far a table’s records are from each other", or
"How far a table’s records are from contiguous perfection".

Example of a big scatter value in context:

Consider a Table with 95% records loaded recently in one go and then 5% has been added later:

TABANALYS will most likely show a “bad” scatter.

Does this table need to be dumped and loaded?

Most probably not as first 95% is loaded contiguously and other 5% as well, but these 2 sets are just "too far" away from each other.
When frequent deletes of data are done in random sections of a table then the sequential record order may be bouncing back and forth within the Area due to new records being assigned to the next available space within the Area. While the records for this table are all in contiguous blocks to one another they are never-the-less not sequentially arranged on the chains. This can account for why a table which lives in an Area by itself can have a large scatter factor value.
The PROUTIL -C chanalys report can be used to help understand this situation better.

dbanalys or tabanalys record fragmentation and scatter factor are related but independent metrics:

What does tabanalys or dbanalys Fragments Factor value mean?

Workaround

Notes

Progress Articles:

How does PROUTIL tabanalys gather record information?
What does progress do when deleting records from the database?
Why deleting records in a table leads to index fragmentation ?
Fragmentation Factor is calculated incorrectly on large tables

Keyword Phrase