Are there any caveats against the -lruskips 2147483647? - Forum - OpenEdge RDBMS - Progress Community

  • I would like to temporarily (say, for just 1 second) increase -lruskips to its maximum value (2147483647). Are there any negative effects this might cause?

    Maybe it would be better to use a value that is a bit less than 2147483647? The -lruskips parameter reduces the number of LRU latch locks by a factor of (-lruskips + 1). I guess Progress reads the number of block accesses ("Usect"?) stored in the pool of buffer headers and takes it modulo (-lruskips + 1). If the result is zero then the block is moved on the LRU chain. With -lruskips 2147483647 that modulus would be 2147483648, which is too large for a signed 4-byte integer. But the tests did not reveal any problems with this -lruskips value.

    BTW, the kbase incorrectly states that the maximum is 2147483648:

    Article: Why is a maximum value of 2GB stated in the OpenEdge documentation for the LRU force skips (-lruskips) parameter?

    http://knowledgebase.progress.com/articles/Article/Why-is-a-maximum-value-of-2GB-stated-in-the-OpenEdge-documentation-for-the-LRU-force-skips-lruskips-parameter


    Thanks in advance,
    George
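
The arithmetic behind the question above can be sketched in a few lines of shell. This is only an illustration of the guess that one LRU latch lock is taken per (-lruskips + 1) accesses; it is not Progress code. The helper names `lru_locks` and `fits_int32` are hypothetical, and the sketch assumes the shell evaluates arithmetic in 64 bits.

```shell
#!/bin/sh
# Largest value of a signed 4-byte integer (also the maximum -lruskips).
INT32_MAX=2147483647

# Expected number of LRU latch locks for $1 accesses to one block with
# -lruskips $2, ASSUMING one lock per (lruskips + 1) accesses.
lru_locks() {
  echo $(( $1 / ($2 + 1) ))
}

# Succeeds if $1 fits in a signed 4-byte integer.
fits_int32() {
  [ "$1" -le "$INT32_MAX" ] && [ "$1" -ge -2147483648 ]
}

lru_locks 1000000 0              # every access locks the LRU latch
lru_locks 1000000 99             # one lock per 100 accesses
lru_locks 1000000 "$INT32_MAX"   # effectively no locks at all

if fits_int32 $(( INT32_MAX + 1 )); then
  echo "lruskips + 1 fits in a signed 4-byte integer"
else
  echo "lruskips + 1 does not fit in a signed 4-byte integer"
fi
```

With one million accesses the three calls print 1000000, 10000 and 0, and the last check reports that 2147483648 does not fit, which is the reason for the concern about the maximum value.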

  • I'd be interested in hearing what happens :)

    --
    Tom Bascom
    tom@wss.com

  • And new tests did find an inconsistency. If I set a high value of -lruskips online using promon and then set it back to zero, the behaviour of Progress sessions is not the same as just after db startup with -lruskips 0. I used promon itself to access the blocks. Any action in promon ("U"pdate, "S"ample, "Z"ero and even "R"epeat?!) reads the ACO object blocks - one block per data area, including the "Control Area". And, by the way, queries of any VSTs do the same. It's VERY unfortunate that these blocks are on the LRU chain. So in the sports db promon's actions will read 7 blocks (and will create 7 BHT and 7 LRU latch locks). We can create db accesses and check their latch locks without leaving the "Activity: Latch Counts" screen. After changing -lruskips forth and back, these actions do not lock the LRU latch anymore. Really, it's excellent news! But I would like to understand if it's a real thing or just a "mirage". ;-)

  • What is "ACO"?

    --
    Tom Bascom
    tom@wss.com

  • "Area Control Object" object block = bk_type 12 + objectId 0, block 3 in each area:

    OBJBLK:
    0040 totalBlocksOld:               0x%016x %d
         hiWaterBlockOld:              0x%016x %d
         chainFirst[FREECHN]:          0x%016I64x %I64u
    0050 chainFirst[RMCHN]:            0x%016I64x %I64u
         chainFirst[LOCKCHN]:          0x%016I64x %I64u
    0060 numBlocksOnChainOld[FREECHN]: 0x%016x %d
         numBlocksOnChainOld[RMCHN]:   0x%016x %d
         numBlocksOnChainOld[LOCKCHN]: 0x%016x %d
    0070 chainLast[FREECHN]:           0x%016I64x %I64u
         chainLast[RMCHN]:             0x%016I64x %I64u
    0080 chainLast[LOCKCHN]:           0x%016I64x %I64u
         objectId:                     0x%04hx             %d
         objectType:                   0x%04hx             %d
    0090 serialNumber:                 0x%016I64x %I64u
         firstFreeCluster:             0x%016I64x %I64u
    00A0 lastFreeCluster:              0x%016I64x %I64u
         totalBlocks:                  0x%016I64x %I64u
    00B0 hiWaterBlock:                 0x%016I64x %I64u
         numBlocksOnChain[FREECHN]:    0x%016I64x %I64u
    00C0 numBlocksOnChain[RMCHN]:      0x%016I64x %I64u
         numBlocksOnChain[LOCKCHN]:    0x%016I64x %I64u
    00D0 partitionId:                  0x%04hx             %d
    
  • Got it.  I couldn't figure out the abbreviation -- but that makes sense.

    --
    Tom Bascom
    tom@wss.com

  • The explanation is found. A new value of -lruskips is used for a block only after the block has been accessed N times, where N is the previous value of -lruskips. So if we increase -lruskips to 2 billion and then change it back to 0, it will take a long time (an "eternity") before the previous value "expires". The blocks that were not accessed while -lruskips was set to 2 billion will use the current -lruskips value immediately. Excellent! It's what I need. We can start the db with -lruskips 2147483647 and then immediately change it to the "working" value. The -lruskips 2147483647 will stay in use for the blocks accessed at db startup, including the ACO blocks. Or we can increase -lruskips at any time, take a Z, U, L or R action on a promon screen and restore the previous value of -lruskips. It can be done almost instantly. The ACO blocks (and most likely only these blocks) will be "infected" by the maximum -lruskips value.

    Tom, the trick can be useful for scripts (like my dbmon) that use promon to gather db statistics, as well as for 4GL programs that use VSTs (like ProTop). It makes them insensitive to contention on the LRU latch. I have seen promon hang for seconds while the LRU latch was a bottleneck, and in such cases a sampling interval missed the period with the highest activity. I'm sure the same is true for VSTs.

  • The tests were redone with a fresh mind. The trick is a bit harder than I thought yesterday: -lruskips sets a new "countdown" value in a buffer header only when the buffer is accessed (obviously) and only when its current "countdown" counter is zero (which I missed yesterday).

    proserve sports -lruskips 0
    promon sports

    Step 1: promon reads ACO blocks at its startup:

    promon/R&D/debghb/6/1. Cache Entries

    05/09/16        Status: Cache Entries  
    
      Num   DBKEY Area   Hash T S Usect Flags   Updctr   Lsn Chkpnt  Lru   Skips
                                                                          
       33      64    1    139 O       0 L            4     0      0    0       0
       34      64    6    824 O       0 L         1385     0      0    0       0
       35      64    7     74 O       0 L           43     0      0    0       0
       36      64    8    211 O       0 L           10     0      0    0       0
       37       2    9    348 O       0 L            6     0      0    0       0
       38       2   10    485 O       0 L            6     0      0    0       0
       39      64   11    622 O       0 L            6     0      0    0       0

    Step 2: Increase the -lruskips and access the ACO blocks.

    4. Administrative Functions ...
    4. Adjust Latch Options
    8. Adjust LRU force skips: 100

    U - Update activity counters (any "Activity" screen) in promon or read any VST table.

    Look at the "Skips" column.

      Num   DBKEY Area   Hash T S Usect Flags   Updctr   Lsn Chkpnt  Lru   Skips
       33      64    1    139 O       0 L            4     0      0    0     100
       34      64    6    824 O       0 L         1385     0      0    0     100
       35      64    7     74 O       0 L           43     0      0    0     100
       36      64    8    211 O       0 L           10     0      0    0     100
       37       2    9    348 O       0 L            6     0      0    0     100
       38       2   10    485 O       0 L            6     0      0    0     100
       39      64   11    622 O       0 L            6     0      0    0     100

    Step 3: Access the ACO blocks again.

    U - Update activity counters

      Num   DBKEY Area   Hash T S Usect Flags   Updctr   Lsn Chkpnt  Lru   Skips
       33      64    1    139 O       0 L            4     0      0    0      99
       34      64    6    824 O       0 L         1385     0      0    0      99
       35      64    7     74 O       0 L           43     0      0    0      99
       36      64    8    211 O       0 L           10     0      0    0      99
       37       2    9    348 O       0 L            6     0      0    0      99
       38       2   10    485 O       0 L            6     0      0    0      99
       39      64   11    622 O       0 L            6     0      0    0      99

    Step 4: Increase the -lruskips to its maximum value and access the ACO blocks.

    8. Adjust LRU force skips: 2147483647
    U - Update activity counters

      Num   DBKEY Area   Hash T S Usect Flags   Updctr   Lsn Chkpnt  Lru   Skips
       33      64    1    139 O       0 L            4     0      0    0      98
       34      64    6    824 O       0 L         1385     0      0    0      98
       35      64    7     74 O       0 L           43     0      0    0      98
       36      64    8    211 O       0 L           10     0      0    0      98
       37       2    9    348 O       0 L            6     0      0    0      98
       38       2   10    485 O       0 L            6     0      0    0      98
       39      64   11    622 O       0 L            6     0      0    0      98

    Step 5: Access the ACO blocks another 98 times

    UUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU... ;-)

      Num   DBKEY Area   Hash T S Usect Flags   Updctr   Lsn Chkpnt  Lru   Skips
       33      64    1    139 O       0 L            4     0      0    0       0
       34      64    6    824 O       0 L         1385     0      0    0       0
       35      64    7     74 O       0 L           43     0      0    0       0
       36      64    8    211 O       0 L           10     0      0    0       0
       37       2    9    348 O       0 L            6     0      0    0       0
       38       2   10    485 O       0 L            6     0      0    0       0
       39      64   11    622 O       0 L            6     0      0    0       0

    Step 6: Access the ACO blocks just one more time:

      Num   DBKEY Area   Hash T S Usect Flags   Updctr   Lsn Chkpnt  Lru      Skips
       33      64    1    139 O       0 L            4     0      0    0 2147483647
       34      64    6    824 O       0 L         1385     0      0    0 2147483647
       35      64    7     74 O       0 L           43     0      0    0 2147483647
       36      64    8    211 O       0 L           10     0      0    0 2147483647
       37       2    9    348 O       0 L            6     0      0    0 2147483647
       38       2   10    485 O       0 L            6     0      0    0 2147483647
       39      64   11    622 O       0 L            6     0      0    0 2147483647
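
The six steps above can be reproduced with a small toy model in shell. It is a sketch of the behaviour observed in the "Skips" column only, under the assumption that a buffer reloads its countdown from the current -lruskips value (and takes the LRU latch) only when the countdown is zero. The variable and function names are hypothetical and are not Progress internals.

```shell
#!/bin/sh
# Toy model of ONE buffer header; all names are hypothetical.
lruskips=0    # current value of the -lruskips setting
skips=0       # per-buffer "Skips" countdown
lru_locks=0   # LRU latch locks taken by this model

# One access to the block: reload the countdown (and lock the LRU latch)
# only when the countdown has reached zero, otherwise just count down.
access_block() {
  if [ "$skips" -eq 0 ]; then
    lru_locks=$((lru_locks + 1))
    skips=$lruskips
  else
    skips=$((skips - 1))
  fi
}

# Access the block $1 times.
access_times() {
  n=$1
  while [ "$n" -gt 0 ]; do
    access_block
    n=$((n - 1))
  done
}

access_block;                       echo "Step 1: Skips=$skips"
lruskips=100;        access_block;  echo "Step 2: Skips=$skips"
access_block;                       echo "Step 3: Skips=$skips"
lruskips=2147483647; access_block;  echo "Step 4: Skips=$skips"
access_times 98;                    echo "Step 5: Skips=$skips"
access_block;                       echo "Step 6: Skips=$skips"
echo "LRU latch locks in the model: $lru_locks"
```

The model prints Skips values of 0, 100, 99, 98, 0 and 2147483647 for the six steps, matching the screens above, and it takes the latch only three times (steps 1, 2 and 6).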

  • Yesterday I was wrong about VSTs: they do not read the ACO blocks like promon does. Sorry for the misleading statement.
     
    Why the ACO blocks should not be on the LRU chain:
    For example, if you're planning to kill an "annoying" self-service session, it's recommended to stop the session first (kill -SIGSTOP), to check whether the session is holding any regular latches, and then to use kill -9, hoping that the session does not hold a multiplexed latch. If the session is actively reading data from the database buffer pool (like readalot.p does in the readprobe test) and the database was started with -lruskips 0, then the chance of stopping a self-service process while it holds the LRU latch is approximately 3-10%. If that really happens, the database will hang: nobody will be able to connect to it. If you had a promon session that was already connected to the database, you will be unable to update any of its screens: promon will try to read the ACO blocks and will wait for the LRU latch. The "Activity: Latch Counts" screen in promon should not show an owner of the LRU latch even though it's a regular latch. In fact it sometimes does, but only because a process got the latch after promon had already started reading the shared memory with the latch information. It's just a 4K area (32*128 bytes) and promon reads it almost instantly, but not fast enough compared with the latch operations. If a process is holding the LRU latch persistently then you will even be unable to initiate an emergency shutdown. And there will be no footprints in the logs that could explain what is going on in your database. It's an ideal situation for malicious minds. ;-)
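
The stop, check, kill sequence described above could be wrapped roughly as follows. This is a minimal sketch, not a tested procedure: the latch check is passed in as a command because the thread does it by hand on the promon "Activity: Latch Counts" screen, so `stop_check_kill` and the check command are hypothetical names. The check command receives the PID as its last argument and must succeed if the session holds a latch.

```shell
#!/bin/sh
# Usage: stop_check_kill PID CHECK_CMD [ARGS...]
# CHECK_CMD is a HYPOTHETICAL command supplied by the caller; it must
# return 0 if the process PID currently holds a latch.
stop_check_kill() {
  pid=$1
  shift
  # Freeze the session first so its latch state cannot change.
  kill -STOP "$pid" || return 2
  if "$@" "$pid"; then
    # A latch is held: resume the session and let the caller retry later.
    kill -CONT "$pid"
    return 1
  fi
  # No latch reported: terminate the session (same as kill -9).
  kill -KILL "$pid"
}

# Example (my_latch_check is a placeholder for your own check):
#   stop_check_kill 12345 my_latch_check || echo "session not killed"
```

Resuming the session when a latch is reported is a design choice of this sketch; it avoids leaving a stopped process holding the LRU latch, which is the hang scenario described above.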
     
    The solution is a "flu shot" given while the database is still working fine, or at least before you use SIGSTOP. Here is the script that does the trick:
    #------------------------------------------------------------------------------
    
    LruShot()
    {
    # "Flu Shot": make ACO ("Area Control Object") Object Blocks insensitive to
    # the contention on LRU latch. Promon will work even if LRU latch is locked.
    # Script sets "Skips" value in Cache Entries (promon/R&D/debghb/6/1)
    # to 2147483647 (the maximum value of -lruskips). The next 2 billion accesses
    # to these blocks will not acquire the LRU latch.
    #
      Db=$1
    
      MaxSkips=2147483647
      MinSkips=2000
    
    # Do nothing if the current -lruskips is higher than MinSkips.
    # Otherwise set it to MaxSkips for a short period of time.
    # The higher the current lruskips the longer the script will work:
    # Approximately 1 sec per 1000 skips.
    
      PROSHUT=${PROSHUT-$DLC/bin/_mprshut}
    
    # Get the current value of the -lruskips:
      LruSkips=`
       (echo "R&D"     # Advanced options
        echo "4"       # 4. Administrative Functions ...
        echo "4"       # 4. Adjust Latch Options
                       # 4. Adjust LRU force skips: 0
       ) | \
        $PROSHUT $Db -0 -NL 2>/dev/null | tr -d "\f" | \
        awk '/Adjust LRU force skips:/ {print $NF}'
      ` # LruSkips
    
      echo The current lruskips: $LruSkips
    
      test $LruSkips -le $MinSkips && \
      echo Reading ACO blocks in loop... && \
      time \
     (
    # Set the -lruskips to MaxSkips:
      echo "R&D"       # Advanced options
      echo "4"         # 4. Administrative Functions ...
      echo "4"         # 4. Adjust Latch Options
      echo "4"         # 4. Adjust LRU force skips:
      echo "$MaxSkips" # Enter new LRU force skips value
    
    # Read ACO blocks = Update activity counters:
      echo "T"         # Return to the top level (main) menu.
      echo "2"         # 2. Activity Displays ...
      echo "9"         # 9. I/O Operations by File
    
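    # Access the ACO blocks (LruSkips + 1) times: up to LruSkips accesses to
    # count the old "Skips" value down to zero, plus one more to load MaxSkips.
      MinSkips=$LruSkips
      while [ $MinSkips -ge 0 ]
      do
        MinSkips=`expr $MinSkips - 1`
        echo "U"       # Update activity counters.
      done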
    
    # Reset the -lruskips to its initial value:
      echo "T"         # Return to the top level (main) menu.
      echo "4"         # 4. Administrative Functions ...
      echo "4"         # 4. Adjust Latch Options
      echo "4"         # 4. Adjust LRU force skips:
      echo $LruSkips   # Enter new LRU force skips value
      echo "X"         # Exit from the OpenEdge Monitor utility.
     ) | \
     $PROSHUT $Db -0 -NL 2>/dev/null 1>&2
    
    } # LruShot
    
    #------------------------------------------------------------------------------
    
    LruShot sports
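
One possible hardening of the script above: if promon cannot connect or its output changes, the awk extraction may return an empty or non-numeric string, and `test $LruSkips -le $MinSkips` then fails with a syntax error. A small guard like the following could be called right after LruSkips is read. The helper name `is_uint` is hypothetical and is not part of the original script.

```shell
#!/bin/sh
# Succeeds only if $1 is a non-empty string of decimal digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *)           return 0 ;;
  esac
}

# Example of how it might be used inside LruShot (sketch):
#   is_uint "$LruSkips" || { echo "Cannot read -lruskips for $Db" >&2; return 1; }
```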