Why is a query on my primary index 100 times faster than a secondary index? - Forum - OpenEdge Deployment - Progress Community

Why is a query on my primary index 100 times faster than a secondary index?

 Forum

Why is a query on my primary index 100 times faster than a secondary index?

  • Index guids does not perform if used as foreign key.
    Problem description:
    Some queries using indexes of table “ctbEmploymentPeriodResultHist” are not performing.
    I created a query to browse the table using the primary index and one to browse a secondary index. The query on the secondary index is 100 times slower than the primary index.
    Preparation:

    ·         Created a local database for this table (prostrct create <datbaseName> -blocksize 8192)

    ·         Procopy of the progress Empty8 database and enabled Large Files  (proutil <databaseName> -C EnableLargeFiles)

    ·         Loaded the data of the customer into the table.

    ·         Table “All” rebuild

    ·         Started the database for multiple users with proserve (proserve <DatabaseName> -B 5000 )

    ·         Start the test.

    Note: Each time before running a query I:

    ·         Started the database for multiple users with proserve.

    ·         Checked if it was using the correct query.

    ·         Started promon to monitor the database activity.

    ·         Run the query

    ·         Stopped the database

    Primary Index query test:
    define variable ResultHistCounter as int64 no-undo.
    etime(yes).
    for each ctbEmploymentperiodresultHist no-lock:
        ResultHistCounter = ResultHistCounter + 1.
    end.
    message etime skip ResultHistCounter view-as alert-box.
    Result:

    ·         The Buffer hit ratio was at 98%.

    ·         Record reads per second varies between 70000 and 220000.

    ·         Elapsed Time:  +/- 4 to 5 minutes.

    Secondary Index query:
    define variable ResultHistCounter as int64 no-undo.
    etime(yes).
    for each ctbEmploymentperiodresultHist no-lock use-index FK_oidresult:
        ResultHistCounter = ResultHistCounter + 1.
    end.
    message etime skip ResultHistCounter view-as alert-box.
    Result:

    ·         The Buffer hit ratio varies between  60 and 80%.

    ·         Record reads per second varies between 600 and 1200.

    ·         Elapsed Time:  +/- 7 hours.

    Moved the index “FK_oidresult” to a new Storage area, used different values for the  –B startup parameters but the conclusion was the same. Every test I did to improve the performance ended with the same test results.  Tried the same tests with Progress version 11.2, same  results.
    Environment:

    ·         Progress V 10.2b and Progress V 11.2

    ·         Window 7 Professional (service pack1)

    ·         Intel® Core i5-2400 CPU @3.10 GHz

    ·         Memory: 4,00 GB

    ·         32 Bit system.

    Table details:

    ·         Name                         : ctbEmploymentPeriodResultHist.

    ·         Number of fields          : 23

    ·         Record count of table   :  29236098

    ·         Number of indexes       : 6

    ·         Storage area               : Table     --> d EmpPerRestHistData":25,64;64 .

    ·                                             Indexes -->  d " EmpPerRestHistIDX":26,64;8 .

    ctbEmploymentPeriodResultHist.df.zip

  • If the primary index was unique and the secondary was not, that could offer some explanation.

  • I agree but in this case the secundary index is actually a foreign key wich is a guid (36 char).

    My database has to order about 5 million keys over a total of 29 million records.

    My primary key is also a guid. The difference in time between the 2 indexes are really big.

    Bigger then expexted. I tried everything to make it perform faster, a various of database parameters, index rebuild, index move and rebuild.

    It will not perform.

  • use idxanalys

    proutil db-name -C idxanalys

    check the results and see if anything is revealed

    also check this KB entry for hints

    http://knowledgebase.progress.com/articles/Article/21091?retURL=%2Fapex%2FProgressKBHome&popup=false

  • Also 2 more notes

    Use the XREF option of the COMPILE statement to determine which index(es) ABL has chosen for a query.

    using XREF can highlight the # if reads.

    so compile example 1 and example 2 separately, with XREF option, then compare results.

    A general comment:

    Note that USE-INDEX CAN prevent the AVM from doing index bracketing and might actually result in slower query performance.

    both from this kb:

    http://knowledgebase.progress.com/articles/Article/P145706?retURL=%2Fapex%2Fprogresskbsearch&popup=false

    however the comment is cherry-picked a bit. Just looking for some explanation.

    Also - I haven't dealt much with foreign keys, but the XREF compile may point out something to do with the foreign key mucking up the works a bit.

  • The short answer is because sequential I/O is much faster than random I/O.

    Whether an index is primary or not and whether the index is unique or not has very little bearing on the results you observe.

    The data in the table are stored in some physical order on disk and the rows are packed at some average density with some amount of empty space also. When the phyiscal storage order matches the ordering given by an index fairly well, then when you read all the rows by that index, you can do sequential reads from disk. In addition, the operating system can perform read-ahead and read the next block into the filesystem buffer before you request it. Further, when a data block in the database buffer pool contains the next row that you need according to the index ordering, then we can fetch that row immediately with out any I/O at all for that row.

    You said you observe a difference in performance of 100X. It can be 1000X.

  • Will adding a FIELDS phrase with just the secondary index field prevent the record from being read? ie result in a sequential read over the index?

    No idea if this helps OP, but just curious.

  • It will not. However, doing so may improve things when going over a

    network connection.

    winmail.dat.zip