Salesforce

Why deleting records in a table leads to index fragmentation ?

« Go Back

Information

 
TitleWhy deleting records in a table leads to index fragmentation ?
URL NameP103055
Article Number000151525
EnvironmentProduct: Progress
Version: 9.x
Product: OpenEdge
Version: 10.x, 11.x
OS: All supported platforms
Question/Problem Description
Why deleting records in a table leads to index fragmentation ?
Why IDXANALYS shows large fragmentation after a large record delete?
When to run IDXCOMPACT?
Why are index entries occupying up to 10 times more physical space after mass record deletion?
Is it possible to view the contents of the index tree?
Steps to Reproduce
Clarifying Information
Error Message
Defect Number
Enhancement Number
Cause
Resolution
When records are deleted, index-trees are not rebuilt. The entire B-tree is left intact. This was a change introduced in Progress 9.x, where index entry locks are not removed at the end of transaction:  
  • During a transaction, when a record is deleted, if there is a unique index, a delete place holder is put in the place of the RECID for the deleted record until the transaction commits. This is done so that another transaction running at the same time cannot add the same unique entry, which would thereby prevent the deleting transaction from rolling back.
  • At the end of the transaction, it is converted to a negative number and put it in place of the RECID in the index as a placeholder.  i.e.  Progress treats the index entry locks as index entries during subsequent read operations.  
  • The specified blocks are not read because RECID is higher than any possible High Water Mark (HWM) for that Area. But it's still a logical request.
  • Having deleted a sufficient amount of records of a table in a short time span, would leave a situation of high fragmentation and query performance issues when read queries traverse the entries in an index bracket, the placeholders are skipped over. When there are a large number of placeholders within a bracket, possibly in multiple index blocks (which often occurs when one deletes many rows in a table), this can cause noticeable delays and extra index block reads.
  • These placeholders are cleaned out the next time the index block that contains them is used and written to disk. However: if there are no modifications to this index bracket, the placeholders remain
  • As a result, corrective actions should be considered after delete operations in order to reduce the number of index blocks in an effort to increase index utilization and therefore performance:
    1. Ideally, an IDXBUILD operation on the affected table(s) would correct this immediately, but the utility has to be run offline and requires after-imaging to be disabled.  
    2. If the database cannot be taken offline, after the delete operation:
    • Use "proutil -C idxcompact" which can be run both online and offline to reduce the number of blocks in the B-Tree and possibly the number of B-Tree levels, or  
    • Use "proutil -C idxmove" which can be run both online and offline. Move indexes to another area and optionally back again to the current area the indexes are stored. This action will rebuild the indexes of the remaining records in this table, but the utility locks modifications to the table until complete.
How index compact would help matters is as follows:
  1. If the index is a unique index, the delete chain is scanned and the index blocks are cleaned up by removing deleted entries.
  2. All non-leaf levels of the B-tree are compacted starting at the root working toward the leaf level.
  3. The leaf level is compacted.
  4. When run online it may skip over parts of an index to avoid contention with other processes/clients. Multiple online IDXCOMPACT runs may be required to bring performance back to normal.
In other words, in addition to compacting an index, the idxcompact utility additionally clears dead entries left after entries have been deleted from unique indexes.

It is strongly recommended that OpenEdge 10.1C04 or 10.2A01 or later is in use before running IDXCOMPACT online.  Previously there were a number of issues with running the IDXCOMPACT utility online, in particular an issue that resulted in excessive bi growth due to the scope of the idxcompact transaction being handled as a single transaction.

The command line argument for IDXCOMPACT is:

$  proutil dbname -C idxcompact tablename.indexname n

Where:
n, Specifies the degree of index compaction and takes an integer range >=50 and <=100.
The default value is 80. If not specified, 80 is used.

Command line arguments can be created and then edited to run through a shell script, for example, as follows:
OUTPUT TO idxcomp.out.
 FOR EACH _file WHERE _file-number > 0 AND
     NOT _file-name BEGINS "SYS" NO-LOCK:
     FOR EACH _index OF _file NO-LOCK: 
         PUT UNFORMATTED "proutil dbname -C idxcompact " + _file-name + "." + _index-name "90" SKIP.
     END.
END.
OUTPUT CLOSE.
​DISPLAY "file is ready for editing".

** As with running any update database utility, it is suggested a valid database backup beforehand ** 

Further Considerations:

If this "delete operation" is a regular feature of the application resulting in full table content deletion:  the contents of a complete table are deleted at the end of every business day:
  • Consider moving this table and it's associated indexes to a separate Storage Area and then save time deleting the entire table content by instead running the offline "proutil dbname -C truncate area" as an alternate method. Indexes are disabled as part of this operation and need to be rebuilt afterwards which is fast because there are no records remaining.
An alternative is to stop the database and run the mass record deletion in single-user mode.
In single-user mode there is no risk of conflict between transactions, therefore the delete placeholder mechanism isn't used as it isn't needed.
Workaround
Notes
Keyword Phrase
Last Modified Date11/20/2020 7:04 AM

Powered by