How fast Progress utilities read the disks? - Forum - OpenEdge RDBMS - Progress Community


  • The recent issue with viewB2 raised the question of how fast Progress utilities read the disks. How many disk reads can they generate? Do they read the data volume we expect them to?

    The post is long. It’s a good idea to grab a beer before you start reading.

    Test environment:
    Old Sparc server. SunOS 5.10. Slow CPU (1165 MHz). 32 GB RAM. A single HDD that the database resides on. I used the sports database blown up to 13 GB (with 3 GB of real data inside). Db blocksize 8K. OpenEdge 11.7.5. I’m not sure that all results will be the same on modern hardware, but I have only what I have.

    File system cache was dropped each time before a test:
    time umount /test
    where /test is the mount point of the file system the database resides on.

    umount: /test busy
    real    0m7.993s
    user    0m0.003s
    sys     0m7.956s

    On Linux you can use: echo 3 > /proc/sys/vm/drop_caches

    The results are statistically consistent across runs. The next run is remarkably faster if the cache is NOT dropped.

    The execution time of the Progress utilities was estimated from the timestamps of the messages in the database log. Some utilities were run using a modified PROMSGS file to redirect the additional messages to the log file. The read rate was also verified with iostat.

    The number of blocks read by the utilities was verified through promon when the utility was re-run online. The database area of interest ("Customer/Order Area") was mapped to the alternate buffer pool.

    The utilities were run with a minimal set of options so that disk reads would be dominant.


    Sequential disk reads
    Speed: 113.36 MB/sec = 1 base unit a.k.a. the “speed of light”, iostat: 115820 kps, 128 tps

    time dd if=sports_8.d1 of=/dev/null bs=8192
    Time: 1m55.205s, Size: 13694533632 bytes = 12.75 GB
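    The base unit is just the dd throughput; a quick sanity check of the arithmetic, using the size and time from the run above (the helper name is mine):

```python
# Hypothetical helper (not from the post): bytes read by dd divided by
# elapsed seconds, expressed in MB (2**20 bytes) per second.
def read_speed_mb_per_sec(size_bytes: int, seconds: float) -> float:
    return size_bytes / 2**20 / seconds

print(round(read_speed_mb_per_sec(13694533632, 115.205), 2))  # → 113.36
```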


    Offline probkup to /dev/null
    Speed: 45.35 MB/sec = 0.40 base unit, iostat: 48511 kps, 54 tps

    time probkup sports /dev/null
    Time: 302.366 sec, Size: 1754885 8K blocks = 13.4 GB

    [2020/02/22@15:30:48.843+0300] P-21081      T-1     I BACKUP   : (451)   Probkup session begin for root on /dev/pts/1.
                00:05:02.366 = 302.366 sec
    [2020/02/22@15:35:51.209+0300] P-21081      T-1     I BACKUP   : (334)   Probkup session end.
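    The elapsed times in this post are computed from pairs of log timestamps like the ones above; the Progress timestamp format can be parsed with standard tools (a minimal sketch, not code from the post):

```python
from datetime import datetime

# Progress db log timestamp format: [2020/02/22@15:30:48.843+0300]
FMT = "%Y/%m/%d@%H:%M:%S.%f%z"

def elapsed_seconds(begin: str, end: str) -> float:
    """Seconds between two db log timestamps (brackets stripped)."""
    return (datetime.strptime(end, FMT) - datetime.strptime(begin, FMT)).total_seconds()

print(elapsed_seconds("2020/02/22@15:30:48.843+0300",
                      "2020/02/22@15:35:51.209+0300"))  # → 302.366
```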

    Online probkup to /dev/null
    Speed: 40.81 MB/sec = 0.36 base unit, iostat: 41858 kps, 46 tps

    time probkup online sports /dev/null
    Time: 335.939 sec, Size: 1754885 8K blocks = 13.4 GB

    [2020/02/25@11:10:20.126+0300] P-25301      T-1     I BACKUP  5: (5461)  Begin backup of Data file(s).
                00:05:35.939 = 335.939 sec
    [2020/02/25@11:15:56.065+0300] P-25301      T-1     I BACKUP  5: (5462)  End backup of Data file(s).
    [2020/02/25@11:15:56.069+0300] P-25301      T-1     I BACKUP  5: (13625) Wrote a total of 51890 backup blocks using 13.5 GBytes of media.

    Online probkup is 10% slower than offline probkup, most likely due to the latches and the buffer locks. Note that the CPUs on this server are slow; hence the operations with data in shared memory are slow.


    “In-memory” probkup to /dev/null
    Speed: 71.58 MB/sec = 0.63 base unit, iostat: 0 kps, 0 tps

    proserve sports -B 2000000
    proutil sports -C dbanalys
    time probkup online sports /dev/null
    Time: 191.547 sec, Size: 1754885 8K blocks = 13.4 GB

    Logical reads: 1754952
    O/S reads: 0

    So probkup spends only 37-43% of its time reading the disks.
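    That estimate follows from the run times above: the "in-memory" run approximates the non-disk work, so the disk share of a normal run is (total - in_memory) / total. A quick check (the helper name is mine):

```python
# Hypothetical helper: fraction of a probkup run spent on disk reads,
# estimated by subtracting the "in-memory" run time.
def disk_share(total_sec: float, in_memory_sec: float) -> float:
    return (total_sec - in_memory_sec) / total_sec

print(round(disk_share(302.366, 191.547) * 100))  # offline → 37
print(round(disk_share(335.939, 191.547) * 100))  # online  → 43
```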


    proutil sports -C dbscan
    Speed: 109.26 MB/sec = 0.96 base unit, iostat: 114353 kps, 126 tps

    proutil sports -C trunc
    time proutil sports -C dbscan
    Time: 125.485 sec, Size: 1754885 8K blocks = 13.4 GB

    [2020/02/25@19:32:50.714+0300] P-25923      T-1     I DBUTIL     (451)   Database Repair session begin for root on batch.
                00:02:05.485 = 125.485 sec
    [2020/02/25@19:34:56.199+0300] P-25923      T-1     I DBUTIL     (334)   Database Repair session end.

    One of the fastest Progress utilities but it can be run only offline.


    Dbanalys, chanalys, ixanalys, tabanalys
    Their speed may vary significantly depending on the length of the chains in a database.

    All of them scan the Free Cluster Chain:
    Dbanalys and chanalys do this during the “FREE CLUSTER CHAIN ANALYSIS” phase.
    Ixanalys and tabanalys do this before the AREA "Name":Num BLOCK ANALYSIS phase (note: it’s not the same as the “AREA BLOCK ANALYSIS” phase).

    All of them read only two blocks in each data cluster on the Free Cluster Chain: the first block (with the transactionId and serialNumber fields in the extended block header) and the last block (with the nextCluster and prevCluster fields in the block header).

    The information about every cluster on the chain is kept in memory to be used by the utilities during the final phase of the area scan. I guess they build something similar to the “Area Cluster Allocation Blocks” that were planned but are not yet implemented.

    The viewB2 utility also scans the Free Cluster Chain but it reads only one block from each cluster – the last one. It just counts (inefficiently) the blocks on the chain.
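    These per-cluster counts multiply out directly; a sketch (helper name is mine; 152,340 is the cluster count reported by chanalys below):

```python
# Hypothetical helper: blocks read while walking the Free Cluster Chain,
# given blocks touched per cluster (2 for the *analys utilities, 1 for viewB2).
def chain_block_reads(clusters: int, blocks_per_cluster: int) -> int:
    return clusters * blocks_per_cluster

print(chain_block_reads(152340, 2))  # *analys → 304680
print(chain_block_reads(152340, 1))  # viewB2  → 152340
```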


    Chanalys running on database with long Free Cluster Chain
    Speed: 20.53 MB/sec = 0.18 base unit, iostat: 115168 kps, 128 tps

    Note the mismatch between Progress reads and O/S reads according to iostat: 20.53 MB/sec vs 112.5 MB/sec! The iostat rate is the same as for the dd command.

    time proutil sports -C chanalys "Customer/Order Area" | tail
    Time: 115.936 sec, Size: 304680 8K blocks = 2.32 GB

    [2020/02/26@14:51:55.820+0300] P-26946      T-1     I CHANALYS6: (-----) FREE CLUSTER CHAIN ANALYSIS
                00:01:55.936 = 115.936 sec
    [2020/02/26@14:53:51.756+0300] P-26946      T-1     I CHANALYS6: (-----) 152340 cluster(s) found in the free cluster chain.

    Alternate Buffer Pool ("Customer/Order Area"):

    Logical reads 3
    O/S reads 304680 = 2 * 152340 (clusters found in the free cluster chain).

    Note: chanalys was forced to quit after it finished scanning the Free Cluster Chain.

    Primary Buffer Pool (“Schema Area”):
    Logical reads 1709
    O/S reads 14

    Note: the *analys utilities do not update the “Logical reads” counter while they scan the Free Cluster Chain, but they do update the “O/S reads” counter. They update “Logical reads” during the final phase of the area scan. Together with the unexplained high O/S reads, this looks like a bug.


    viewB2 running on database with long Free Cluster Chain
    Speed: 10.57 MB/sec = 0.09 base unit, iostat: 10860 kps, 1358 tps

    Note: the high number of transfers per second!

    Note: viewB2 and chanalys spend the same time reading the Free Cluster Chain, but viewB2 reads half as many database blocks and generates 10.6 times fewer O/S reads.

    proutil sports -C viewB2
    Time: 112.643 sec, Size: 152345 8K blocks = 1.16 GB

    [2020/02/26@14:08:44.226+0300] P-26874      T-1     I DBUTIL   : (-----) Area "Info Area":7 - Primary Buffer Pool
                00:01:52.643 = 112.643 sec
    [2020/02/26@14:10:36.869+0300] P-26874      T-1     I DBUTIL   : (-----) Area "Customer/Order Area":8 - Alternate Buffer Pool

    Note: viewB2 reports statistics per area when it finishes reading the area.

    Alternate Buffer Pool ("Customer/Order Area"):
    Logical reads 152346
    O/S reads 152345 (152340 - clusters found in the free cluster chain)


    Chanalys and /offline/ dbanalys also scan the object chains:
    Free Chain
    RM Chain
    Index Delete Chain

    Online dbanalys reports the same information about those chains as offline dbanalys does, but it does not really scan them (unlike offline dbanalys).

    The sports database does not have long object chains. Their read rate was not benchmarked in my tests.

    The final phase of these utilities is AREA "Area Name": AreaNum BLOCK ANALYSIS.
    The *analys utilities scan the whole area except the clusters on the Free Cluster Chain. If the chain is empty then the scan in this phase is sequential. The next results are for areas that do not have long chains.

    Ixanalys running on the area without long Free Cluster Chain
    Speed: 86.84 MB/sec = 0.77 base unit

    time proutil sports -C ixanalys "Customer/Order Area"
    Time: 150.389 sec, Size: 1671695 8K blocks = 12.75 GBytes

    [2020/02/22@16:22:02.091+0300] P-21134      T-1     I IXANALYS : (451)   Idxanalys session begin for root on /dev/pts/1.
                00:02:30.389 = 150.389 sec 
    [2020/02/22@16:24:32.480+0300] P-21134      T-1     I IXANALYS : (334)   Idxanalys session end.

    The type of data stored in the area does not matter. You can run ixanalys against a table area. The utility will sequentially scan the area up to the HWM (all active blocks).

    The fastest Progress utility that can be run per area.


    Tabanalys running on the area without long Free Cluster Chain
    Speed: 86.84 MB/sec = 0.77 base unit
    time proutil sports -C tabanalys "Customer/Order Area"
    Time: 150.389 sec, Size: 1671695 8K blocks = 12.75 GBytes

    Tabanalys can be a few times slower than ixanalys when it runs on an area where the records are fragmented. Otherwise its speed is the same as ixanalys. Tip: compare the times of tabanalys and chanalys for the same table’s area to decide whether you need a dump/load to defragment the records in the area.


    There are some interesting results for idxfix and idxcheck. I will post them later but right now the keyboard is tired.

  • Part 2: Tests of the idxfix utility

    The goal is to find out the main factors that define the idxfix execution time.

    The tests were run in 3 modes:
    1. Offline - an offline run with the default value of the -B parameter, but the file system cache was large enough to hold all data read during a run (the file system cache is dropped before each run);
    2. Online - the database is started with -B/-B2 parameters large enough to hold all data retrieved from disk;
    3. Cached - the second run in multi-user mode, when all data are already in the database buffer pools.

    The factors are:
    1. Logical Reads - the number of db requests done by the utility. It’s a Progress thing and it can be found only in promon or in the VSTs. The cost of an offline Logical Read is almost half that of an online Logical Read;
    2. OS Reads - the number of read() calls initiated by Progress. Apart from promon and the VSTs, it’s also reported by the -rusage option as “DISK reads”;
    3. Disk Reads - the number of blocks retrieved from disk. We can get the number from the iostat command. Note that ‘Disk Reads’ means something different here than in the -rusage option;
    4. Seek time. I’m not sure how to measure it, but the tests prove its existence. Is it a part of system time? Or is it the gap between real time and the sum of the user and system times?

    The number of Logical Reads should be higher than OS Reads. OS Reads can get data from the file system cache or from the disks. Hence the number of OS Reads should be higher than Disk Reads, but in my tests the index fix utility and ABL clients (but not the dd command) used to create about 30-50% more Disk Reads than OS Reads.
    Example:

    rusage DISK reads: 10576880 KB at 3496 KB/sec
       iostat -x kr/s: 13886082 KB at 4586 KB/sec

    Can anybody explain the difference?

    The -NL option seems to make the idxfix utility about 10% faster, but I did not run the full set of tests with -NL.

    The conclusion from the tests below: any of the factors above can be dominant under some circumstances, but in most cases the number of Logical Reads defines how long the index fix utility will run.

    Note: I use 8K blocks instead of KBs because it’s easier to compare the results with the number of blocks owned by a table or index.

    Baseline: dd command using 8K blocks

    Run        --- Time in seconds ---      OS Reads         Disk Reads        Disk     svc_t
    Mode       real  user    sys    gap    Blocks    /sec    Blocks    /sec %Busy %Wait  msec
    dd      116.044 7.798 61.798 46.448 1,671,760  14,406 1,672,573  14,401  90.2   1.0   7.4
    
    %Wait = Percentage of time that the queue is not empty
    %Busy = Percentage of time that the disk is busy
    svc_t = Average service time, in msec
    gap   = real time – user time – system time

    Disk is very busy: system time is large (many times higher than user time). Real time is not equal to the sum of user and system time. Does the difference (46.448 sec) equal the seek time? OS Reads = Disk Reads, as expected.
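    The "gap" column is simply real time minus CPU time; for the dd baseline above (helper name is mine):

```python
# Hypothetical helper: the unaccounted-for time in a run, i.e. real
# (wall-clock) time minus user and system CPU time.
def gap(real: float, user: float, sys: float) -> float:
    return round(real - user - sys, 3)

print(gap(116.044, 7.798, 61.798))  # → 46.448
```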

    Note that the dd command reads the whole table area.


    Another baseline is the time of idxbuild (-z -rusage -C idxbuild -TMB 256 -TF 80 -B 512 -datascanthreads 128 -mergethreads 32).
    Each index was built separately:

    Time      Index
    365.298   Order-Line.order-line
    331.205   Order-Line.item-num

    Only idxfix/scan indexes without recid validation was comparable in speed with idxbuild.

    Index Fix Utility
    1. Scan records for missing index entries with index block validation.
    8. Scan records for missing index entries.


    A few words about the “index block validation” option. It’s a part of menu option 1 and it’s undocumented. I may be wrong, but the index block validation seems to be enabled only when you select a single index and only if you specify “all“ as the range of records to scan through. During the index block validation phase the utility scans the whole index area where the index resides. What kind of validation can be done during a sequential scan of the index area? Ixanalys also scans the whole area. Does ixanalys validate the index blocks? Idxfix/2 (scan indexes for invalid index entries) does not scan the whole area (it goes through the index tree). Does option 2 do the same kind of validation? Or do any options of the idxcheck utility? Unfortunately I don’t have the answers. I did not test the “index block validation” option.

    By the way, there are more than a hundred error messages in the promsgs file that might be reported by the idxfix or idxcheck utilities. It would be nice to know which errors can be reported by which options of these utilities. The idxcheck utility seems to be able to find more index corruptions than idxfix. Are there any opposite situations where idxfix can find errors that idxcheck would miss?


    Scan records for missing index entries does not use the chain of data clusters owned by the selected table (in SAT2). The option scans the “whole” area. But the scanning is divided into two phases:
    1. Idling;
    2. Working phase.

    During the “idling” phase the index fix utility reads only the first block in each data cluster. If the block does not belong to the specified table then the utility jumps to the next cluster. A minor exception: if the first block is an object block then the utility reads all blocks in the same cluster regardless of the object (why?).

    If the first block belongs to the table then the utility will read and process the rest of the blocks in the same data cluster. This is the working phase of the scan.

    I did not test the idling phase in areas with a long Free Cluster Chain. I’m expecting that idxfix/scan records will scan the chain and that it will substantially increase the execution time.

    Idling phase of idxfix/scan records for missing index entries

    Formula for activity:
    Logical reads = the number of other objects’ blocks in the table's area divided by the data cluster size.
    O/S reads = Logical reads
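    A sketch of my reading of that formula - one read per data cluster of other objects in the area (helper name and sample numbers are mine, not from the post):

```python
import math

# Hypothetical helper: idling-phase reads = one read per data cluster
# occupied by other objects in the table's area.
def idling_phase_reads(other_objects_blocks: int, cluster_size: int) -> int:
    return math.ceil(other_objects_blocks / cluster_size)

# Made-up example with an 8-block cluster size (as in the test area):
print(idling_phase_reads(80_000, 8))  # → 10000
```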

    To test the idling phase I created an empty table residing in a large area and ran:
    proutil sports -C idxfix -silent -rusage
    8. Scan records for missing index entries
    Table: empty
    Index: idx

    Run        --- Time in seconds ---   Logical Reads       OS Reads       Disk Reads       Disk    svc_t
    Mode       real  user    sys    gap Blocks     /sec  Blocks    /sec   Blocks    /sec %Busy %Wait  msec
    Offline 113.418 7.690 18.449 87.279 212,749   1,876  209,011  1,843  210,771   1,854  74.2  3.0   0.4
    Online  113.346 8.692 18.378 86.276 212,749   1,877  209,010  1,844  210,274   1,850  73.2  3.0   0.4
    Cached    0.820 0.715  0.000  0.105 212,749 259,450

    Disk is 74% busy. System time is rather large. The gap between real time and the sum of user and system times is 87 seconds. The “cached” run is almost instant. OS Reads are 7.82 times slower than with the dd command. Note that the data cluster size in this area is 8. It does not seem to matter whether the utility reads only one block from each cluster or the whole cluster. I guess the reason is seek time. On SSDs the idling phase can probably be much faster, but I did not test SSDs.

    Working phase of idxfix/scan records for missing index entries

    The index fix utility retrieves a record from a data block and then goes through the index tree - from the root block to a leaf block - to check if the corresponding index keys exist.

    Formula for activity:
    Logical Reads in the table area = the number of records * 2;
    Logical Reads in the index area = the number of records * the number of levels in the index tree.

    Total:
    Logical Reads = the number of records * (2 + the number of levels in the index tree);
    OS Reads = the number of table’s blocks + the number of index’s blocks.
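    A worked check of the Logical Reads formula against the Order-Line runs below - 38,998,800 records and a 3-level index tree (helper name is mine):

```python
# Hypothetical helper: working-phase Logical Reads per the formula above -
# 2 requests in the table area plus one per index tree level, per record.
def working_phase_logical_reads(records: int, index_levels: int) -> int:
    return records * (2 + index_levels)

print(working_phase_logical_reads(38_998_800, 3))  # → 194994000
# promon reported 196,272,868; the formula gives the dominant term only.
```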

    There are two requests to a table’s block for each record in the block. My interpretation: on the first request the utility gets a record from a block. Then it finds the key related to the record. Then (for some reason) it re-reads the record based on the found index key.

    Note: the more records in a table, the more index keys and the more levels in the index tree. Hence the number of Logical Reads grows faster than the number of records. The more index keys, the more efficient the compression algorithm is and the more keys are stored per index block. Hence the number of index blocks grows slower than the number of records, though the number of index blocks is negligible compared to the number of record blocks. Anyway, the ratio of Logical Reads to OS Reads grows with the number of records. For example, in my “sports on steroids” database the ratio is 148, while in the “baby sports” database the ratio is 31 for the same Order-Line.order-line index.
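    The ratio quoted for the “sports on steroids” database follows directly from the order-line run below (Logical Reads divided by OS Reads):

```python
# Ratio of Logical Reads to OS Reads for the order-line scan below.
ratio = 196_272_868 / 1_322_110
print(round(ratio))  # → 148
```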


    idxfix / 8. Scan records for missing index entries
    Table: Order-Line
    Index: order-line

    The Order-Line table with 38,998,800 records uses 73% of the blocks in its area (1,218,720 blocks of 1,671,703). Hence the idling phase can be ignored.
    The Order-Line.order-line index has a perfect logical scatter factor (seek ratio = 1). It uses 46,752 blocks. The index tree consists of 3 levels.

    Run     ------ Time in seconds -------       Logical Reads       OS Reads          Disk Reads       Disk    svc_t
    Mode        real     user   sys    gap      Blocks     /sec  Blocks      /sec    Blocks    /sec %Busy %Wait  msec
    Offline 3024.831 2960.383 61.332 3.116 196,272,868   64,887 1,322,110     437 1,735,760     573   2.4   0.0   4.8
    Online  3911.647 3830.189 76.651 4.807 196,272,868   50,177 1,322,110     338 1,735,544     443   2.0   0.0   5.0
    Cached  3626.903 3626.602  0.197 0.104 196,272,868   54,116

    The offline run is faster even than the cached run due to the lower cost of Logical Reads - they don’t use buffer locks.
    The cached run is just a bit faster than the online one because the bottleneck is Logical Reads rather than Disk Reads. Disk is not busy. System time is just a small percent of the total (real) time.


    idxfix / 8. Scan records for missing index entries
    Table: Order-Line
    Index: item-num

    The size of the index is 17,664 blocks (2.65 times smaller than the order-line index), but its logical scatter factor is bad (seek ratio = 28.10). The index tree also consists of 3 levels.

    Run            Time in seconds         Logical Reads         OS Reads        Disk Reads       Disk    svc_t
    Mode        real     user    sys      Blocks     /sec    Blocks    /sec    Blocks    /sec %Busy %Wait  msec
    Offline 2276.575 2212.363 58.892 196,273,152   86,214 1,293,022     568 1,691,185     743   3.0   0.0   3.9
    Online  3126.820 3057.882 63.270 196,273,152   62,771 1,292,969     413 1,690,714     541   2.2   0.0   3.9
    Cached  3071.654 3071.371  0.178 196,273,152   63,898

    The scans of the “item-num” index are a bit faster than those of the “order-line” index. The logical scatter factors of the indexes don’t matter for this option of idxfix.


    idxfix / 8. Scan records for missing index entries
    Table: Order-Line
    Index: empty (inactive)

    It’s an inactive index that consists of only one block. And the number of levels is, of course, 1.

    Note: the idxfix utility was run with the -silent option. The undocumented feature of this option: it suppresses the error messages. That’s not what we would expect when we check the health of the indexes, but it’s a useful option, for example, when we would like to pre-build an index. Without the -silent option we would get 39 million error messages in the database log.

    Run     ------ Time in seconds -------      Logical Reads        OS Reads          Disk Reads      Disk     svc_t
    Mode        real     user   sys    gap     Blocks     /sec     Blocks   /sec     Blocks    /sec %Busy %Wait  msec
    Offline  775.499  712.981 60.238 2.280 118,275,549  152,515 1,275,384   1,664 1,673,266   2,157   7.7   0.0   4.2
    Online  1502.708 1432.440 67.788 2.480 118,275,549   78,708 1,275,384     849 1,672,800   1,113   4.2   0.0   4.3
    Cached  1381.163 1380.960  0.088 0.115 118,275,549   85,635

    The previous runs create 5 block requests per record. Now we have only 3 requests per record. But the runs became faster by more than a factor of 5/3.
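    Comparing the cached runs above shows the speed-up exceeding the request ratio:

```python
# Request ratio vs actual speed-up (cached runs: order-line vs empty index).
print(round(5 / 3, 2))                # → 1.67
print(round(3626.903 / 1381.163, 2))  # → 2.63
```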

    If Logical Reads are the bottleneck then I would expect all tests above to show the same number of Logical Reads per second. But in fact we have, for example, for the cached runs:

                          Index  Logical Reads
    Index                 Blocks    /sec
    Order-Line.order-line 46,752  54,116
    Order-Line.item-num   17,664  63,898
    Order-Line.empty           1  85,635

    Why does the size of an index matter for the speed of Logical Reads?

    The closest analogue of the ‘scan records’ option is FOR EACH TABLE-SCAN, though it does not check the index keys and it uses only one database request per record.

    FOR EACH Order-Line NO-LOCK TABLE-SCAN WHERE FALSE

    Run     ------ Time in seconds -------        Logical Reads        OS Reads          Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap       Blocks     /sec     Blocks   /sec     Blocks    /sec %Busy %Wait  msec
    Offline  505.624  447.621 56.210 1.793   78,149,949  154,561  1,218,716  2,410  1,673,177   3,308  11.8   0.0   4.1
    Online   588.675  526.321 60.461 1.893   78,149,943  132,756  1,218,715  2,070  1,672,695   2,863  10.2   0.0   4.1
    Cached   473.193  473.060  0.110 0.023   78,149,943  165,154

    FOR EACH Order-Line NO-LOCK TABLE-SCAN WHERE TRUE

    Run     ------ Time in seconds -------        Logical Reads        OS Reads          Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap       Blocks     /sec     Blocks   /sec     Blocks    /sec %Busy %Wait  msec
    Offline 1704.229 1645.443 56.671 2.115   78,149,949   45,856  1,218,716     715 1,673,178     981   3.7   0.0   4.3
    Online  1907.015 1832.661 72.028 2.326   78,149,943   40,980  1,218,715     639 1,672,743     879   3.2   0.0   4.3
    Cached  1788.237 1788.034  0.187 0.016   78,149,943   43,702

    As a note: TABLE-SCAN + CAN-FIND runs for almost 2 hours.

    Comparing the cached runs:

    Run                      Time in seconds         Logical Reads
    Mode                  real     user    sys      Blocks     /sec
    idxfix order-line 3626.903 3626.602  0.197 196,272,868   54,116
    idxfix item-num   3071.654 3071.371  0.178 196,273,152   63,898
    idxfix empty      1381.163 1380.960  0.088 118,275,549   85,635
    TABLE-SCAN FALSE   473.193  473.060  0.110  78,149,943  165,154
    TABLE-SCAN TRUE   1788.237 1788.034  0.187  78,149,943   43,702

    Logical Reads per second created by idxfix/scan records are in-between FOR EACH TABLE-SCAN WHERE FALSE and WHERE TRUE.

    proutil sports -C tabanalys "Customer/Order Area"

    Run     ------ Time in seconds -------        Logical Reads        OS Reads          Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap       Blocks     /sec     Blocks   /sec     Blocks    /sec %Busy %Wait  msec
    Offline  187.480  112.553 72.001 2.926    1,676,169    8,941  1,671,749   8,917 1,672,751   8,929  32.0   0.0   4.2
    Online   199.795  121.250 75.832 2.713    1,676,169    8,389  1,671,749   8,367 1,672,627   8,380  30.1   0.0   4.2
    Cached    63.139   63.006  0.067 0.066    1,676,169   26,547

    Disk is 30% busy. Tabanalys does not use a Logical Read for each record in a data block. That is why tabanalys is a few times faster than idxfix.

    Conclusion: idxfix/scan records is CPU bound rather than I/O bound. Fortunately, it would be easy to make a multi-threaded version of this option to let it use more than one CPU.


    Index Fix Utility
    2. Scan indexes for invalid index entries.


    It has a sub-option:
    Validate recids for index entries? Yes or No

    Without recid validation it reads only index blocks. The utility goes through the index tree - unlike menu option 1, it does not scan the whole area.

    Formula for activity:
    Logical Reads in the index area = the number of index keys.
    OS Reads in the index area = the number of blocks owned by the index.

    With ‘validate recids for index entries’ the utility will read record blocks:
    Logical Reads in the table area = the number of index keys.
    OS Reads in the table area = the number of blocks owned by the table.
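    A sketch of my reading of these request counts, ignoring the small tree-traversal overhead (helper name is mine):

```python
# Hypothetical helper: Logical Reads for idxfix option 2 - one request
# per index key, plus one more per key in the table area when recids
# are validated.
def scan_index_logical_reads(keys: int, validate_recids: bool) -> int:
    return keys * (2 if validate_recids else 1)

print(scan_index_logical_reads(38_998_800, False))  # → 38998800
print(scan_index_logical_reads(38_998_800, True))   # → 77997600
# promon reported 39,189,311 and 78,188,111 respectively for order-line;
# the extra reads are the non-leaf tree traversal.
```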


    Validate recids for index entries? No


    idxfix / 2. Scan indexes for invalid index entries
    Table: Order-Line
    Index: order-line
    Validate recids for index entries? No

    Index 24 (PUB.Order-Line, order-line): 38998800 keys.

    Run     ------ Time in seconds -------        Logical Reads        OS Reads          Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap       Blocks     /sec     Blocks   /sec     Blocks    /sec %Busy %Wait  msec
    Offline  172.039  169.612  2.182 0.245   39,189,311  227,793     46,726     272    63,188     363   2.0   0.0   5.5
    Online   439.070  436.575  2.284 0.211   39,189,311   89,255     46,726     106    62,691     142   1.0   0.0   5.8
    Cached   432.144  432.099  0.044 0.001   39,189,311   90,686

    Disk is idle. Without recid validation the bottleneck is Logical Reads. The offline run is much faster than the online one.

    idxfix / 2. Scan indexes for invalid index entries
    Table: Order-Line
    Index: item-num
    Validate recids for index entries? No

    Index 25 (PUB.Order-Line, item-num): 38998800 keys.

    Run     ------ Time in seconds -------       Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user   sys    gap      Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline  183.883  156.881 1.557 25.445  39,072,812  212,487    17,585      96    18,286      96  13.8   0.0   1.5
    Online   429.937  401.006 1.615 27.316  39,072,812   90,880    17,585      41    17,741      41  6.4    0.0   1.6
    Cached   398.554  398.531 0.022  0.001  39,072,812   98,036

    Disk is a bit busy and the time gap is large. The possible reason is the bad logical scatter factor of the index. The offline run is still much faster than the online one.

    The idxblockreport utility reads exactly the same blocks in exactly the same order as idxfix/2, but it does not use a Logical Read per key in the leaf blocks.

    proutil sports -C idxblockreport Order-Line.order-line

    Run     ------ Time in seconds -------       Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user   sys    gap      Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline    5.723    2.449 2.095  1.179      46,743    8,168     46,743  8,168    63,098  12,910  64.0   1.0   5.7
    Online     5.897    2.854 2.220  0.823      46,743    7,927     46,743  7,927    62,607  11,958  55.5   1.0   5.4
    Cached     0.999    0.935 0.053  0.011      46,743   46,790

    proutil sports -C idxblockreport Order-Line.item-num

    Run     ------ Time in seconds -------       Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user   sys    gap      Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline   61.279    1.019 1.616 58.644      17,602      287     17,602    287    18,199     289  95.4   0.0   3.3
    Online    61.121    1.172 1.601 58.348      17,602      288     17,602    288    17,708     289  95.0   0.0   3.3
    Cached     0.478    0.405 0.050  0.023      17,602   36,824

    Note: idxblockreport does not update the counter of Logical Reads. The numbers in these tables are based on the assumption that the utility creates requests only for the keys in non-leaf blocks.

    Idxblockreport had a disk bottleneck for both indexes (and the time gap is huge), especially for the Order-Line.item-num index. That index is smaller than the Order-Line.order-line index, but it has a bad logical scatter factor: the idxblockreport utility reads 2.66 times fewer blocks, runs 10 times longer, and keeps the disk almost twice as busy. Nevertheless idxblockreport is much faster than idxfix/scan indexes without recid validation.

    OPEN QUERY q PRESELECT EACH <table> NO-LOCK USE-INDEX <index> is not equivalent to scan indexes without recid validation. A PRESELECT EACH query reads not only the index keys but the records as well (see the results below).


    Validate recids for index entries? Yes

    Formula for activity:
    Scan indexes for invalid index entries with recid validation creates exactly the same requests as a FOR EACH query: one database request per index key and one database request in the table area.

    Logical reads = 2 * the number of index keys
    OS Reads = the number of table’s blocks + the number of index’s blocks.


    idxfix / 2. Scan indexes for invalid index entries
    Table: Order-Line
    Index: order-line
    Validate recids for index entries? Yes

    Index 24 (PUB.Order-Line, order-line): 38998800 keys.

    Run     ------ Time in seconds -------       Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap      Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline  730.495  669.850 58.279 2.366  78,188,111  107,034  1,265,439  1,792 1,735,724   2,375   9.2   0.0   4.5
    Online  1187.298 1122.924 61.950 2.424  78,188,111   65,854  1,265,439  1,066 1,735,233   1,461   5.7   0.0   4.5
    Cached  1079.350 1076.590  2.550 0.210  78,188,111   72,440

    iostat -x shows an even disk load:

    Time      r/s    w/s     kr/s   kw/s wait actv  svc_t  %w  %b
    19:53:59 22.3    0.4 19,122.5    5.6  0.0  0.1    4.4   0  10
    19:54:09 20.8    0.2 18,953.6    1.0  0.0  0.1    4.4   0   9
    19:54:19 21.0    0.3 18,894.3    0.5  0.0  0.1    4.7   0  10
    ...
    20:05:29 20.4    0.0 19,046.6    0.0  0.0  0.1    4.5   0   9
    20:05:39 21.6    0.1 18,961.5    1.0  0.0  0.1    4.6   0  10
    20:05:49 22.5    0.1 18,953.8    0.8  0.0  0.1    4.6   0  10
    20:05:59 21.9    0.3 18,632.0    2.0  0.0  0.1    4.6   0  10

    Disk load remains uniform throughout due to the perfect logical scatter factor of the index. See the opposite result below (the Order-Line.item-num index).

    FOR EACH Order-Line NO-LOCK USE-INDEX order-line WHERE FALSE

    Run     ------ Time in seconds -------       Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap      Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline  687.922  627.083 58.621 2.218  78,184,327  113,653  1,265,439  1,840 1,735,676   2,522   9.7   0.0   4.4
    Online   843.996  765.257 76.062 2.677  78,184,325   92,636  1,265,439  1,499 1,735,433   2,066   8.0   0.0   4.4
    Cached   717.350  717.390  0.144 0.084  78,184,325  108,990

    FOR EACH Order-Line NO-LOCK USE-INDEX order-line WHERE TRUE

    Run     ------ Time in seconds -------       Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap      Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline 2020.960 1959.506 58.972 2.482  78,184,327   38,687  1,265,439    626 1,735,675     859   3.4   0.0   4.7
    Online  2159.702 2095.820 61.379 2.503  78,184,325   36,201  1,265,439    586 1,735,197     805   3.2   0.0   4.6
    Cached  2055.944 2055.915  0.205 0.076  78,184,325   38,028

    OPEN QUERY q PRESELECT EACH Order-Line NO-LOCK USE-INDEX order-line.

    Run     ------ Time in seconds -------       Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user    sys   gap      Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline 1789.726 1721.470 64.966 3.290  78,184,327   43,685  1,265,439    707 1,735,724     971   4.5   0.0   5.2
    Online  1931.358 1860.311 68.058 2.989  78,184,325   40,482  1,265,439    655 1,735,217     902   4.3   0.0   5.3
    Cached  1835.341 1829.911  5.419 0.011  78,184,325   42,599

    Comparing the cached runs:

    Run                  ------ Time in seconds -------       Logical Reads
    Mode                     real     user    sys   gap      Blocks     /sec
    idxfix/scan indexes  1079.350 1076.590  2.550 0.210  78,188,111   72,440
    FOR EACH WHERE FALSE  717.350  717.390  0.144 0.084  78,184,325  108,990
    FOR EACH WHERE TRUE  2055.944 2055.915  0.205 0.076  78,184,325   38,028
    PRESELECT EACH       1835.341 1829.911  5.419 0.011  78,184,325   42,599

    Idxfix/scan indexes is a bit slower than FOR EACH WHERE FALSE but much faster than the other two queries.
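    The same comparison can be expressed per key (a simple calculation from the cached real times above and the 38,998,800 keys in the index):

    ```python
    # Per-key throughput of each access method, from the cached runs above.
    keys = 38_998_800
    real_seconds = {
        "idxfix/scan indexes":  1079.350,
        "FOR EACH WHERE FALSE":  717.350,
        "FOR EACH WHERE TRUE":  2055.944,
        "PRESELECT EACH":       1835.341,
    }
    for mode, elapsed in real_seconds.items():
        print(f"{mode:22s} {keys / elapsed:8,.0f} keys/sec")
    ```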


    idxfix / 2. Scan indexes for invalid index entries
    Table: Order-Line
    Index: item-num
    Validate recids for index entries? Yes

    1 indexes, 38998800 keys checked.

    Run     -------- Time in seconds ---------      Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user      sys     gap     Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline 3047.638 1757.930 1192.015  97.693 78,071,612   25,617 29,821,120  9,728  1,692,711    529   4.6   0.0   2.4
    Online  1443.833 1109.316   95.257 239.260 78,071,612   54,072  1,236,298    856  2,181,198  1,441  20.3   0.2   2.1
    Cached  1077.835 1077.714    0.120   0.001 78,071,612   72,434

    iostat -x

    Time       r/s  w/s      kr/s  kw/s wait actv svc_t  %w  %b
    21:00:46 127.4  0.3 112,181.3   1.9  0.0  1.0   7.7   1  90
    21:00:56 127.2  0.2 112,640.5   1.6  0.0  0.9   7.4   1  90
    21:01:06 123.5  0.2 109,642.3   0.4  0.0  0.9   7.7   1  90
    21:01:16 127.2  0.3 112,486.3   1.9  0.0  0.9   7.5   1  90
    21:01:26 125.7  0.2 111,488.9   0.4  0.0  0.9   7.6   1  90
    21:01:36 127.9  0.0 112,718.3   0.0  0.0  1.0   7.5   1  90
    21:01:46 127.9  0.1 112,716.2   0.0  0.0  0.9   7.3   1  89
    21:01:56 128.1  0.1 112,899.2   0.3  0.0  0.9   7.4   1  89
    21:02:06 125.6  0.0 111,614.8   0.0  0.0  0.9   7.5   1  89
    21:02:16 127.2  0.1 111,487.5   1.0  0.0  0.9   7.4   1  89
    21:02:26 123.9  0.1 109,951.4   0.4  0.0  0.9   7.6   1  89
    21:02:36  49.7  0.0  38,107.5   0.0  0.0  0.4   7.4   0  33
    21:02:46   6.0  0.2      48.0   0.5  0.0  0.0   1.1   0   1
    ...
    21:51:06   3.0  0.0      24.0   0.0  0.0  0.0   5.2   0   2
    21:51:16   3.0  0.2      24.0   0.7  0.0  0.0   6.4   0   2

    The previous tests produced more or less even disk activity. The test above is different because of the bad logical scatter factor: at the beginning of the test the utility actively reads the record blocks, and the disk was 90% busy. But two minutes later, when almost all record blocks had been loaded into memory, disk activity dropped and stayed low for the remaining 50 minutes of the run.

    Another feature of this test: the default -B value in the offline run is not enough to keep all the needed database blocks, which results in a large number of OS reads. The data were cached in file system memory, so Disk Reads are almost the same as in the online run with a large -B/-B2. Nevertheless, the offline run is much slower than the others. In other words, OS reads are costly even when the data is already in the file system cache, and at least more costly than logical reads in Progress. The cost also shows up as very high system time.

    FOR EACH Order-Line NO-LOCK USE-INDEX item-num WHERE FALSE

    Run     -------- Time in seconds ---------      Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user      sys     gap     Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline 3061.844 1768.078 1193.846  99.920 78,067,844   25,497 29,821,121  9,740 1,693,607     553   4.8   0.0   2.4
    Online   955.959  818.810   60.097  77.052 78,067,842   81,664  1,236,298  1,293 1,690,457   1,798  13.8   0.1   2.2
    Cached   745.732  745.735    0.170   0.073 78,067,842  104,686

    FOR EACH Order-Line NO-LOCK USE-INDEX item-num WHERE TRUE

    Run     -------- Time in seconds ---------      Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user      sys     gap     Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline 4277.942 2985.670 1224.835  67.437 78,067,844   18,249 29,821,121  6,971 1,690,771     394   3.3   0.0   2.2
    Online  2561.496 2138.432   91.325 331.739 78,067,842   30,477  1,236,298    483 2,083,202   1,585  15.5   0.2   2.2
    Cached  2113.301 2113.193    0.285   0.077 78,067,842   36,941

    OPEN QUERY q PRESELECT EACH Order-Line NO-LOCK USE-INDEX item-num.

    Run     -------- Time in seconds ---------      Logical Reads        OS Reads         Disk Reads      Disk     svc_t
    Mode        real     user      sys     gap     Blocks     /sec     Blocks   /sec    Blocks    /sec %Busy %Wait  msec
    Offline 4080.298 2759.787 1236.630  83.881 78,067,844   19,133 29,821,121  7,309 1,695,263     385   3.8   0.0   3.2
    Online  2437.406 1912.184  103.648 421.574 78,067,842   32,029  1,236,298    507 2,170,731     848  20.2   0.3   2.7
    Cached  1872.175 1866.495    5.622   0.058 78,067,842   41,699

    All these queries create the same uneven disk load as the index fix utility.

    Comparing the cached runs:

    Run                  -------- Time in seconds ---------      Logical Reads
    Mode                     real     user      sys     gap     Blocks     /sec
    idxfix/scan indexes  1077.835 1077.714    0.120   0.001 78,071,612   72,434
    FOR EACH WHERE FALSE  745.732  745.735    0.170   0.073 78,067,842  104,686
    FOR EACH WHERE TRUE  2113.301 2113.193    0.285   0.077 78,067,842   36,941
    OPEN QUERY PRESELECT 1872.175 1866.495    5.622   0.058 78,067,842   41,699

    FOR EACH WHERE FALSE is faster than idxfix/scan indexes.

    OPEN QUERY PRESELECT has a higher system time because it writes the result list to a temp file on disk. It writes approximately 12 bytes per record.

    By the way, the fastest way to count records in 4GL is to run FOR EACH WHERE FALSE and check the number of record reads in the _UserTableStat VST. If you use a unique index, you can additionally cross-check the number of its index entry locks.
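    A minimal 4GL sketch of this trick (VST field names as I recall them from the DICTDB schema; note the table's number must fall inside the -tablerangesize range, otherwise the _UserTableStat row will not exist, so verify against your own schema):

    ```
    /* Count Order-Line records via a FOR EACH WHERE FALSE scan
       and the per-connection _UserTableStat VST. */
    DEFINE VARIABLE iReadsBefore AS INT64 NO-UNDO.

    FIND DICTDB._File NO-LOCK WHERE _File._File-Name EQ "Order-Line".
    FIND DICTDB._MyConnection NO-LOCK.
    FIND DICTDB._UserTableStat NO-LOCK
         WHERE _UserTableStat-Conn EQ _MyConnection._MyConn-UserId
           AND _UserTableStat-Num  EQ _File._File-Number.
    iReadsBefore = _UserTableStat-Read.

    FOR EACH Order-Line NO-LOCK USE-INDEX order-line WHERE FALSE:
    END.

    /* Re-FIND to pick up the updated counter: */
    FIND DICTDB._UserTableStat NO-LOCK
         WHERE _UserTableStat-Conn EQ _MyConnection._MyConn-UserId
           AND _UserTableStat-Num  EQ _File._File-Number.
    DISPLAY _UserTableStat-Read - iReadsBefore LABEL "Records".
    ```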

    Conclusion: idxfix/scan indexes is CPU bound, but it is also sensitive to the size of the database buffer pool when the index has a bad logical scatter factor. Unfortunately, there are not many ways to speed this option up: it can only run single-threaded. Fortunately, it is faster than idxfix/scan records by a factor of (1 + the_number_of_index_tree_levels / 2).
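    For illustration, simple arithmetic on that formula for a few typical B-tree heights:

    ```python
    # Expected speedup of idxfix/scan indexes over idxfix/scan records,
    # per the formula above: 1 + index_tree_levels / 2.
    for levels in (2, 3, 4):
        print(f"{levels}-level index: {1 + levels / 2:.1f}x faster")
    ```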

    Next are the tests of idxcheck.