Salesforce

How is scatter factor calculated by DBANALYS

« Go Back

Information

 
TitleHow is scatter factor calculated by DBANALYS
URL NameP176284
Article Number000132413
EnvironmentProduct: Progress
Version: 8.x, 9.x
Product: OpenEdge
Version: 10.x, 11.x, 12.x
OS: All supported platforms
Question/Problem Description
How is scatter factor calculated by PROUTIL tabanalys dbanalys?
Why is the scatter factor high for a table which lives in an area by itself (no other tables and no other indices)?
Why does dbanalys or tabanalys report low record fragmentation but a high scatter factor
Steps to Reproduce
Clarifying Information
Error Message
Defect Number
Enhancement Number
Cause
Resolution
The TABANALYS section of DBANALYS reports the scatter factor value.

Scatter Factor for Type II Storage Areas

The Scatter Factor for tables in Type II Areas reported by DBANALYS/TABANALYS is determined differently to that reported for Type I Areas since OpenEdge 10.1B (10.1A02). Type I Scattering Calculation (defined below) does not apply to Type II Areas because the data in any block is only ever associated with one particular database object (Table).

Scatter Factor for tables in Type II Areas is determined by:

The number of blocks occupied by the object 
divided by 
The minimum number of blocks that table could occupy. 

Imbedded interleaving free space in Type II clusters will increase the scatter and indicate that a dump and load would pack the records better thereby minimizing i/o seek operations reading these records.

Scatter Factor for Type I Areas ( all OpenEdge versions )

When determining the scatter factor in Type I Areas, it takes stock of the recid of each record as it sequentially traverses database blocks in each Area. The distance between the current record and the previous record and with some juggling of an internal algorithm, makes a determination of what the minimum preferred distance and the actual distance between the current recid and the last recid and averages all the results.

Scatter Factor for tables in Type I Areas is determined by:  

Scatterfactor = log10(# blocks / Minimum # of blocks)


Where:

  • Minimum # of blocks = MAX(RPB, (blocksize/avg_record_size)) 

  • # blocks = total blocks - free blocks

A Scatter Factor value close to 1 indicates a good quality of record allocation as it is essentially reporting:

  • "How far a table’s records are from each other", or

  • "How far a table’s records are from contiguous perfection".

Example of a big scatter value in context:

Consider a Table with 95% records loaded recently in one go and then 5% has been added later:

  • TABANALYS will most likely show a “bad” scatter. 

Does this table need to be dumped and loaded?

  • Most probably not as first 95% is loaded contiguously and other 5% as well, but these 2 sets are just "too far" away from each other.
  • When frequent deletes of data are done in random sections of a table then the sequential record order may be bouncing back and forth within the Area due to new records being assigned to the next available space within the Area.  While the records for this table are all in contiguous blocks to one another they are never-the-less not sequentially arranged on the chains.  This can account for why a table which lives in an Area by itself can have a large scatter factor value.
  • The PROUTIL -C chanalys report can be used to help understand this situation better.
dbanalys or tabanalys record fragmentation and scatter factor are related but independent metrics:
Workaround
Notes
Keyword Phrase
Last Modified Date10/4/2021 8:20 AM

Powered by