Why should you not truncate the bi of the target database when relying on an AI DR plan?

URL Name: P26615
Article Number: 000128492
Environment
Product: Progress
Product: OpenEdge
Version: All Supported Versions
OS: All Supported Operating Systems
Question/Problem Description
Why should you not truncate the bi of a hotspare target database when relying on an AI standby disaster recovery plan?

Why is the bi file larger on the standby hotspare database than on the original source database?
Why does rolling forward after-image (AI) notes on a hotspare database cause the bi file to grow bigger than on the live database?
Why is the target database's bi file so much larger than the ai files used to roll forward against it?
How to reduce the hotspare bi file size when applying ai notes with RFUTIL roll forward?

How to manage the size of the bi file on a rolled forward target?
When the hotspare bi file grows to largefile limits, can another bi extent be added?
Can we schedule truncating a standby hotspare using aiscan to determine which ai file closed the last open transactions?
When aiscan reports zero active transactions, can the target bi file space be recovered?
Can the bi file size be reduced when the last ai file applied had no open transactions?
Can a roll forward target be truncated and still guarantee further roll forward operations?
When active transactions are zero from the last roll forward, can the bi file of a hotspare database be truncated and still allow subsequent roll forward operations?

Will truncating a hotspare bi file break updating the standby hotspare?
Is it safe to truncate the bi of the hotspare database after roll forward?
Is it supported to truncate the bi file of a hot standby database?
Resolution
The bi on a hotspare roll forward target database will grow larger than the bi on the source database. To review a detailed discussion from a DB internals perspective, refer to Article:
In standby hotspare environments, the bi file in the target environment can grow progressively larger as source database transaction activity is applied. This Article discusses:
  1. Why truncating the bi of the target (hotspare) database is not a supported method of managing the bi size of the target database when subsequent roll forward operations are still required.
  2. How to reclaim bi file space used by the target roll forward database.
  3. How to avoid bi growth by aligning roll forward operations with bi cluster re-use.
Truncating the bi of the target (hotspare) database is not a supported method of managing bi growth

One method used to manage a large bi file size on the roll forward database is to truncate the bi file when there are zero in-flight transactions at the end of the last roll forward, and then apply the next AI files.

The zero-transaction condition is confirmed with RFUTIL aiscan reports, or by parsing the database lg file for the following message while applying AI files:

At the end of the .ai file, 0 transactions were still active. (1636)
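
As a minimal sketch of the log check described above, the following Python scans lg file text for message (1636) and reports the active-transaction count after each AI file. The function names and the sample text are hypothetical, and the pattern assumes the message wording quoted above; other releases may word it differently. A zero count only identifies the condition, it does not make truncating the target bi a supported operation.

```python
import re

# Pattern for message (1636) as quoted in this Article. Assumption: the
# wording is the same in the release being checked; adjust if it differs.
ACTIVE_TX_1636 = re.compile(
    r"At the end of the \.ai file, (\d+) transactions? (?:were|was) still active\. \(1636\)"
)


def active_transactions_per_ai_file(log_text):
    """Return the active-transaction count from each (1636) message, in log order."""
    return [int(m.group(1)) for m in ACTIVE_TX_1636.finditer(log_text)]


def last_ai_file_closed_clean(log_text):
    """True only if the most recent (1636) message reports 0 active transactions."""
    counts = active_transactions_per_ai_file(log_text)
    return bool(counts) and counts[-1] == 0


if __name__ == "__main__":
    # Illustrative text only, not taken from a real lg file.
    sample = (
        "At the end of the .ai file, 3 transactions were still active. (1636)\n"
        "At the end of the .ai file, 0 transactions were still active. (1636)\n"
    )
    print(active_transactions_per_ai_file(sample))
    print(last_ai_file_closed_clean(sample))
```

Only the last (1636) message matters for this check, because earlier AI files in the same batch may legitimately end with transactions still open.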

While this method is safe, it should never be perceived as reliable, and it does not guarantee standby baseline continuity for the next roll forward operations.

This method appears to work and to allow further roll forward operations after bi truncation when there are no in-flight transactions, particularly in test environments and, most of the time, once implemented in production.
That is not enough to depend on it as a standard practice, particularly when the consequence of the next ai roll forward failing is having to re-seed the standby hotspare baseline. In an emergency it can be used to reclaim bi space, provided it is understood that if it fails, re-baselining from a more recent backup (preferably one without long outstanding transactions) is the accepted method to reset the bi file size.

The need to recover bi space is most apparent after an unexpected bi growth event on the source database that is eventually resolved without having to disable after-imaging. The sequence of events is typically:
  • All outstanding transactions are eventually committed by the end of the most recent AI file applied.
  • PROUTIL truncate bi reclaims disk space in the standby environment.
  • The next set of ai files in the sequence fails to roll forward.
  • The standby database has to be re-sourced unexpectedly because the roll forward baseline is broken as a consequence of truncating the hotspare bi file.
This method is not supported because this usage was not anticipated or considered as intended behavior when the multiple AI extent feature was designed for the first Progress 7 release. It remains unsupported not because it is undesirable, but because the feature was not designed to accommodate the complexities of internal transaction notes beyond those that are simply related to application activity. This is why it is not documented, and why no regression tests are designed to ensure it will continue to work this way in future releases.

Why does a connection to the hot standby (target) put the database through crash recovery?

When the standby hotspare is opened for access before the final roll forward session required to restore the database has been completed, subsequent roll forward redo processing will fail.

When the database is opened, before-image recovery runs and the model presumes any outstanding transaction (without a commit) must be aborted, before access is granted.
  • There are three phases of BI recovery that take place when the database is opened.
  • All transactions that have not been committed initiate the Physical and Logical UNDO Phases to effectively abort the transaction.
  • When the roll forward database is prematurely accessed by any OpenEdge utility capable of making changes, by a single-user client, or by being started for multi-user access:
  1. Subsequent roll forwards fail due to timestamp mismatches between the Master Block and the .bi file.
  2. Data is lost when transactions that span AI extents are undone by the Physical and Logical UNDO Phases.
RFUTIL only runs the BI recovery REDO Phase, to allow further transaction 'redo' as AI extents are applied, until the target reflects the original database at the time of a crash, disaster or intended roll forward time/transaction. Access can be restricted using the OPLOCK/OPUNLOCK qualifiers to prevent unintended access before the final transaction notes are applied. For further information refer to Article:
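
To illustrate how the OPLOCK/OPUNLOCK qualifiers fit into a roll forward sequence, here is a small Python sketch that only builds the command lines; it does not run them. The helper names are hypothetical, and the RFUTIL syntax shown (`roll forward oplock -a <ai-file>` and `roll opunlock`) is an assumption that should be verified against the RFUTIL documentation for the OpenEdge release in use.

```python
def roll_forward_oplock_cmd(db_name, ai_file):
    """Command to apply one AI file while keeping the target locked.

    Assumed syntax: rfutil <db> -C roll forward oplock -a <ai-file>
    """
    return ["rfutil", db_name, "-C", "roll", "forward", "oplock", "-a", ai_file]


def roll_opunlock_cmd(db_name):
    """Command to release the lock once the final AI file has been applied.

    Assumed syntax: rfutil <db> -C roll opunlock
    """
    return ["rfutil", db_name, "-C", "roll", "opunlock"]


def failover_command_plan(db_name, ai_files):
    """Apply every AI file under OPLOCK, then unlock as the very last step."""
    plan = [roll_forward_oplock_cmd(db_name, ai) for ai in ai_files]
    plan.append(roll_opunlock_cmd(db_name))
    return plan


if __name__ == "__main__":
    # Hypothetical database and AI file names, for illustration only.
    for cmd in failover_command_plan("standby", ["sports.a1", "sports.a2"]):
        print(" ".join(cmd))
```

The unlock step is deliberately last: until it runs, the target stays in roll forward mode, so before-image recovery cannot undo transactions that span AI extents.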

To reset the BI file size on the target:
  • Stop the live (source) database.
  • Depending on site strategy, re-seed the standby hotspare:
    • Truncate the (source) bi, then use the OS backup method, ensuring the database is marked as backed up and the ai file is switched.
    • An online PROBKUP can be used when there is no long-running transaction activity also causing unexpected bi growth in the source database environment. Otherwise, an offline PROBKUP truncates the bi file as part of the process and switches the AI file.
  • Restore the target database for future roll forward operations.

Considerations to minimize bi growth on the roll forward target database

Review the following parameters which govern the design of before-image management.

1.   Cluster Ageing (-G)

Prior to Progress 9.1E02 and OpenEdge 10.0B02, ensure that the -G (Cluster Ageing) database startup parameter is set no lower than 60 seconds, and account for any reason it may need to be set higher. Due to the way data flushing is handled in these versions, cluster ageing lower than 1 minute introduces a high risk of losing cached data.

Since Progress 9.1E02 and OpenEdge 10.0B02, Cluster Ageing no longer needs to be considered. The default for -G is 0, made possible by design changes that use fdatasync() instead of sync() calls to assure data associated with closed BI Clusters is flushed to disk before eventually returning them for re-use. The only reason Cluster Ageing may need to be increased is for

In any Progress version, take care when using a Cluster Ageing (-G) value above the default value for normal operations. 

Example: A site had a 3-minute cluster ageing interval set (-G 180) and AI switches at 5-minute intervals, with AI files averaging 40MB each.
  • Roll forward was performed in half-hour batches (6 AI files per batch).
  • The associated AI files took less than 3 minutes to complete their roll forward operations. These transaction notes are also recorded in the target database's BI file.
  • The associated bi clusters cannot be re-used until the Cluster Ageing time expires.
  • Having to add BI Clusters because of the imposed 3-minute cluster ageing resulted in a 150% difference in bi file size between the source and target databases.
  • It would be considered unusual to need cluster ageing as part of normal operations unless there are inherent problems with the underlying cache and disk subsystem.
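
The arithmetic behind this example can be sketched roughly in Python. This is a deliberately simplified model, not how the BI subsystem actually accounts for space: it assumes notes arrive at a uniform rate, that the volume written to the BI is comparable to the AI volume, and it uses a hypothetical function name. It is not expected to reproduce the 150% figure, only to show why a batch applied inside the ageing window cannot re-use any cluster.

```python
def notes_before_first_reuse_mb(ai_file_mb, files_per_batch,
                                apply_seconds, cluster_age_seconds):
    """Rough MB of notes written before the oldest closed cluster can be re-used.

    Simplifying assumptions: uniform note rate, and BI note volume treated
    as equal to AI volume. Real cluster accounting is more involved.
    """
    batch_mb = ai_file_mb * files_per_batch
    if apply_seconds <= cluster_age_seconds:
        # The whole batch lands inside the ageing window: nothing is re-usable.
        return batch_mb
    # Otherwise only the share written during one ageing window is pinned.
    return batch_mb * cluster_age_seconds / apply_seconds


if __name__ == "__main__":
    # Target: 6 x 40MB applied in about 3 minutes with -G 180.
    print(notes_before_first_reuse_mb(40, 6, 180, 180))
    # Source: the same 240MB generated over 30 minutes with -G 180.
    print(notes_before_first_reuse_mb(40, 6, 1800, 180))
```

Under these assumptions the target has to hold the full batch in new clusters, while the source only pins about a tenth of it at any moment, which is why rolling forward in smaller batches or lowering -G narrows the gap.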

2.   Delay of Before-Image Flush (-Mf)

Ensure that "Delay of Before-Image Flush" (-Mf) is at the default of 3 seconds on the source database; otherwise, account for any reason it may need to be set higher.

Example: A site had the -Mf database startup parameter set to 120 seconds and consequently experienced a 40% difference in bi growth on the roll forward database. After dropping it back down to the default of 3 seconds, the bi file sizes were again within acceptable comparisons.

3.  BI Cluster Size and BI Blocksize

The BI Cluster Size and BI Blocksize on both databases should be the same.
Cluster ageing takes place at the close (checkpoint) of a BI cluster. When the bi cluster size is changed on the source, re-baseline the target database from a new backup so that the target database has the changed value.

4.   AI switch interval

Consider increasing the AI switch interval to allow more transactions to complete within an AI extent. 
Alternatively, consider an AI switch strategy based on 'bytes' rather than time switches by using fixed AI extents that switch only when filled.

5.   Roll Forward interval

Roll forward the AI notes in 'real time' or in smaller batches to allow cluster ageing to take effect.
Notes

References to Other Documentation:
Progress Article(s):
 Considerations when upgrading from OpenEdge 11 to a later OpenEdge 11 version.
 
Last Modified Date: 11/20/2020 6:57 AM
