Salesforce

Defect: Database crashes with 8783 or 210 error due to index corruption.

« Go Back

Information

 
TitleDefect: Database crashes with 8783 or 210 error due to index corruption.
URL NameDefect-Database-crashes-with-8783-or-210-error-due-to-index-corruption
Article Number000178243
EnvironmentProduct: OpenEdge
Version: 10.1B, 10.1C, 10.2x, 11.0 to 11.6.3 inclusive
OS: All supported platforms
Question/Problem Description
Production database crashed with one of the following errors:
SYSTEM ERROR: Attempt to read block 0 which does not exist in area , database (210)
SYSTEM ERROR: Attempt to read block 0 which does not exist in area , database (14684)
SYSTEM ERROR: Index x (table-name,index-name): couldn't find key recid nnnnnnnnn (8783)

IDXFIX is unable to locate the index corruption.

IDXCHECK Option 3 can detect the index corruption, but can crash in OpenEdge versions prior to 11.7 due to the index corruption.
Index corruption seems to reoccur relatively quickly after an IDXBUILD has been done to address the index corruption issue.
Steps to Reproduce
Clarifying Information
Database is very large.
Table and Index areas are both very large and have dbkeys within the 64-bit dbkey range.
 
Error MessageSYSTEM ERROR: Attempt to read block 0 which does not exist in area <area number>, database <database name>. (210)
SYSTEM ERROR: Attempt to read block 0 which does not exist in area <area number>, database <database name>. (14684)
SYSTEM ERROR: Index x (table-name,index-name): couldn't find key recid nnnnnnnnn (8783)

(4423) SYSTEM ERROR: Index 258, block 158718208, element no. 0: bad compression size.
(2816) prev size = 0, cs = 31, ks = 3, is = 30, key count = 0.
(4423) SYSTEM ERROR: Index <index num>, block <block num>, element no. <element num>: bad compression size.
(2816) prev size = 0, cs = 31, ks = 3, is = 30, key count = 0.
(2816) prev size = %i, cs = %i, ks = %i, is =%i, key count = %i.
(14037) Index 258 block validation error data: nment is 8039, nlength is 8046, level is 107, current key is 0, offset is 0, func is cxDoInsert
(14037) Index %d block validation error data: nment is %d, nlength is %d, level is %d, current key is %d, offset is %d, func is %s
(17808) Index 258 partition 0 block validation error data: nment is 8039, nlength is 8046, level is 107, current key is 0, offset is 0, function is cxDoInsert
(17808) Index %d partition %d block validation error data: nment is %d, nlength is %d, level is %d, current key is %d, offset is %d, function is %s
(14031) Invalid Index Block Detected
Defect NumberDefect OCTA-7749, PSC00350688
Enhancement Number
Cause
An index manager defect corrupts the index.  The issue is only possible on databases that have very large indexes where the dbkeys straddle the 32/64-bit boundary, in a multi-level index, when a split occurs to accommodate an additional index block. This may result in corruption.  This does not effect OpenEdge 10.1B, Progress 9 or earlier which cannot hold recids higher than the 32-bit boundary.

IDXFIX is not able to detect this type of index corruption as it traverses the indexes differently.

In OpenEdge versions prior to 11.7, the index check utility (IDXCHECK) with Option 3 - Validate record for each key, in the Validation menu, is able to detect the index corruption but may crash as a result of the index corruption. 
After Image files and OpenEdge Replication can reproduce the index corruption when the changes are applied in the target databases.

To determine if the 64 bit boundary for block identifiers has been crossed perform the following calculations:
  1. Review the records per block defined for all areas containing indexes. 
  1. Determine how many blocks are in the area.  
If the area has more blocks than (2 ^ 31 records / <records per /block>) it is possible to encounter this problem.

Run a prostrct statistics report (online or offline) and look at the active blocks and records/block listed for the area.

Example for sports2000 Cust_Index Area:
 
Database Block Usage for Area: Cust_Index

  Active blocks: 21
    Data blocks: 21
    Free blocks: 0
   Empty blocks: 75
   Total blocks: 96
  Extent blocks: 2
  Records/Block: 32
   Cluster size: 1
Maximum # of blocks (for 2 Billion records) =  2,147,483,648 records / (32 records / block)  =   67,108,864 blocks

Active blocks: 21 <  67,108,864  blocks

No problems here.

Another example:
 
Database Block Usage for Area: Index ABC

  Active blocks: 19863639
    Data blocks: 19863635
    Free blocks: 4
   Empty blocks: 121
   Total blocks: 19863760
  Extent blocks: 83
  Records/Block: 256
   Cluster size: 8
Maximum # of blocks (for 2 Billion records) =  2,147,483,648 records / (256 records / block)  =   8,388,608 blocks

Active blocks: 19,863,639  > 8,388,608 blocks

This area is vulnerable.

The following ABL code can be run to review the output of prostrct statistics [dbname] > statistics_output.txt to determine if a database is susceptible to this defect:
 
DEFINE VARIABLE cInFile AS CHAR NO-UNDO INIT "./statistics_output.txt".
DEFINE VARIABLE cHolder AS CHAR NO-UNDO.
DEFINE VARIABLE cTempName AS CHAR NO-UNDO.
DEFINE VARIABLE iCount AS INT NO-UNDO INIT 0.
DEFINE VARIABLE iMaxRecords AS INT64 INIT 2147483648 NO-UNDO.

DEFINE TEMP-TABLE ttAreas
FIELD  cAreaName AS CHAR
FIELD  iActiveBlocks AS INT64 FORMAT ">>>>>>>>>>>>>>>9"
FIELD  iRPB AS INT 
FIELD  iClusterSize AS INT
INDEX INDEX-name cAreaName
INDEX INDEX-clustersize iClusterSize.

IF (OPSYS = "WIN32") THEN
DO:
  ASSIGN DEFAULT-WINDOW:WIDTH-CHARS = 180
         CURRENT-WINDOW = DEFAULT-WINDOW
         SESSION:SYSTEM-ALERT-BOX   = TRUE.
END.

INPUT FROM VALUE(cInFile).
REPEAT:
    IMPORT UNFORMATTED cHolder NO-ERROR.
    IF ERROR-STATUS:ERROR THEN LEAVE.
    
    REPEAT WHILE INDEX(cHolder,"Database Block Usage for Area") < 1:
        IMPORT UNFORMATTED cHolder NO-ERROR.
            IF ERROR-STATUS:ERROR THEN LEAVE.
        IF INDEX(cHolder,"Control Area") > 1 OR 
            INDEX(cHolder,"Primary Recovery Area") > 1 THEN
        DO:
            IMPORT UNFORMATTED cHolder NO-ERROR.
            NEXT.
        END.
        IF INDEX(cHolder,"Database Block Usage for Area") > 0 THEN
        DO:
            cTempName=TRIM(ENTRY(2,cHolder,":")).
            FIND FIRST ttAreas where cAreaName = cTempName NO-ERROR.
            IF NOT AVAILABLE ttAreas THEN
            DO:
             CREATE ttAreas.
             ASSIGN ttAreas.cAreaName = cTempName.
            END.
        END.
    END.

    REPEAT WHILE INDEX(cHolder,"Active blocks") < 1:
        IMPORT UNFORMATTED cHolder NO-ERROR.
            IF ERROR-STATUS:ERROR THEN LEAVE.
        IF INDEX(cHolder,"Active blocks") > 0 THEN
        DO:
            IF AVAILABLE ttAreas THEN
            DO:
             ASSIGN ttAreas.iActiveBlocks = INT64(TRIM(ENTRY(2,cHolder,":"))).
            END.
            ELSE 
            DO:
             CREATE ttAreas.
             ASSIGN ttAreas.iActiveBlocks = INT64(TRIM(ENTRY(2,cHolder,":"))).
            END.
        END.
    END.

    REPEAT WHILE INDEX(cHolder,"Records/Block") < 1:
        IMPORT UNFORMATTED cHolder NO-ERROR.
            IF ERROR-STATUS:ERROR THEN LEAVE.
        IF INDEX(cHolder,"Records/Block") > 0 THEN
        DO:
            IF AVAILABLE ttAreas THEN
            DO:
             ASSIGN ttAreas.iRPB = INT(TRIM(ENTRY(2,cHolder,":"))).
            END.
        END.
    END.

    REPEAT WHILE INDEX(cHolder,"Cluster size") < 1:
        IMPORT UNFORMATTED cHolder NO-ERROR.
            IF ERROR-STATUS:ERROR THEN LEAVE.
        IF INDEX(cHolder,"Cluster size") > 0 THEN
        DO:
            IF AVAILABLE ttAreas THEN
            DO:
             ASSIGN ttAreas.iClusterSize = int(TRIM(ENTRY(2,cHolder,":"))).
            END.
        END.
    END.    
    iCount = iCount + 1.
    IF iCount > 20 THEN LEAVE.
END.

FOR EACH ttAreas NO-LOCK:

 IF (ttAreas.iActiveBlocks > (iMaxRecords / ttAreas.iRPB)) THEN 
    DISPLAY 
    ttAreas.cAreaName FORMAT "x(20)"  
    STRING(ttAreas.iActiveBlocks - (iMaxRecords / ttAreas.iRPB))  " 
    is susceptible." WITH WIDTH 100.
 ELSE
    DISPLAY ttAreas.cAreaName FORMAT "x(20)"  
    STRING(ttAreas.iActiveBlocks - (iMaxRecords / ttAreas.iRPB))  " 
    is not susceptible." WITH WIDTH 100.
END.

INPUT CLOSE.

Alternatively, the following VST's will identify vulnerable bit boundary areas. 
DEFINE VARIABLE Vulnerable AS LOGICAL NO-UNDO FORMAT "*/ ". 
DEFINE VARIABLE RowsPerBlock AS INTEGER NO-UNDO. 
DEFINE VARIABLE Limit AS INT64 NO-UNDO. 
DEFINE VARIABLE TwoBillion AS INT64 NO-UNDO. 

ASSIGN TwoBillion = EXP(2, 31). 

FOR EACH _Area WHERE _Area-number GT 3,  
FIRST _areastatus WHERE _Areastatus-areanum = _Area-number , 
FIRST _storageobject OF _area WHERE _object-type = 2: 

ASSIGN 
RowsPerBlock = EXP(2, _area-recbits) 
Limit = (TwoBillion / RowsPerBlock) 
Vulnerable = (_areastatus-HiWater >= Limit). 

DISPLAY 
_area-name RowsPerBlock label "r/block" 
_areastatus-HiWater Vulnerable LABEL "Vuln". 

END
Resolution
1. Upgrade to OpenEdge 11.6.4, 11.7 or later.
If unable to upgrade to 11.7 at this time, the following hotfixes are available to be downloaded from ESD:
  • OpenEdge 10.2B08 and apply hotfix 64.
  • OpenEdge  11.4 and apply hotfix 044.
  • OpenEdge 11.5.1 and apply hotfix 033.
  • OpenEdge 11.6.3 and apply hotfix 003.
 
2.  An IDXBUILD is necessary to correct the existing index corruption.
The above hotfixes and versions prevent future occurrences of the problem.  They do not correct or tolerate the corruption which already exists in the index.
A full index rebuild of effected indices (if known) or all indices (if specific indices which have this problem are unknown) is required to eliminate index corruption created prior to the applicaiton of the fix using an IDXBUILD from the upgraded version with the fix.

3. In OpenEdge 11.7, Option (5 - Validate tree) was added to IDXCHECK which is fast and provides more deterministic error messages if the issue is present.  This Option is not enabled by default when running IDXFIX.

Sample error messages:  
 
If the index is a non-Multi-tenancy/Table partitioned index:
Detected 32/64 dbkey mismatch for index , level , block . (19005)
An entry(entry number %d) has a 64 bit flag but points to a 32 bit DBKEY %J.(19006)

If the index is a Multi-tenancy/Table partitioned index:
Detected 32/64 dbkey mismatch for index %d, partitionId ,level , block .(19007)
An entry(entry number %d) has a 64 bit flag but points to a 32 bit DBKEY %J.(19006)
 
Workaround
To reduce the potential vulnerability of the problem until an upgrade can be scheduled:

Run periodic PROSTRCT STATISTICS or VST reports on Area size as outlined above to identify vulnerable areas
  1. Create a new Area with a lower records per block.
  2. Use PROUTIL with the IDXMOVE Option to move the older indices to the new Area or create new indices which match the old definitions but with new names.
  3. Activate or IDXBUILD the new indices.
  4. Make one of the matching index in the new area the primary index.
  5. Rename the indices in the old area.
  6. Rename the indices in the new area to match the original index names.
Notes
Quick Reference to 64 bit boundary for block identifiers: 64-bit Boundary
RPB
BLOCKS
1
2,147,483,648
2
1,073,741,824
4
   536,870,912
8
   268,435,456
16
   134,217,728
32
    67,108,864
64
    33,554,432
128
    16,777,216
256
      8,338,608

Critical Alert – Index manager defect can corrupt large indexes where dbkeys straddle 32/64-bit boundary., ​
How to get a table name from a dbkey index block?

64-bit recids were first introduced in 10.1B.
This problem does not apply to 10.1A or earlier versions such as 9.x, 8.x, etc.
Keyword Phrase
Last Modified Date11/9/2021 3:22 PM

Powered by