Replication: How to recover from errors 6091 827 on ai files

Information

 
Title: Replication: How to recover from errors 6091 827 on ai files
URL Name: P109337
Article Number: 000148628
Environment:
Product: Fathom Replication
Version: 3.0A
Product: OpenEdge Replication
Version: 10.x, 11.x
OS: All supported platforms
Question/Problem Description
The source replication database goes down with errors 6091 and 827 on an after-image file.
The file-name in error 6091 refers to the current BUSY AI extent.
The target replication database has previously shut down, or the Replication Agent is not running.
When -aistall is not in use on the source database, the source database shuts down.
Error Message
Failed to switch to next after-image extent. (3784)
<function>:Insufficient disk space during <system call>, fd <file descriptor>, len <bytes>, offset <bytes>, file <file-name>. (6091)
** rlaixtn: Insufficient disk space to extend the after-image file. (827)
Resolution

What causes AI extents to become unavailable?

Error message 827 occurs when trying to extend a variable-length after-image extent. The extent cannot be extended because it has reached the file size limit or the disk has run out of space, and either it is the only ai extent or the next ai extent in the sequence is not a free "EMPTY" extent. In other words, the remaining ai extents are either "LOCKED" or "FULL" and therefore not available.

This scenario can also occur with FIXED ai extents, when the current BUSY extent needs to switch to the next ai extent in sequence and that extent's status is either FULL or LOCKED.

Under the Fathom/OpenEdge Replication model, when the Replication Agent (RPLA) of the target database terminates and/or the target database server goes down, the Replication Server (RPLS) on the source database will also terminate after the connect-timeout has expired. At this stage:

  • The source database is still running and so is the after-imaging.
  • The ai files continue to fill up during this time recording database transaction activity.
  • As AI files switch to the next ai extent, their status changes from "BUSY" to "LOCKED" under the replication model.
  • The "LOCKED" status is only released once the target database (and therefore the RPLA) has been restarted and "dsrutil source -C restart server" has been run against the source database, so that the RPLS can connect to the RPLA and resume applying the ai notes at block level from where it left off.

In other words: a "FULL" ai file that has not yet been applied to the target database stays in the "LOCKED" status until it has been applied. Once it has been applied, its status changes to FULL, and it can then be made available again with "RFUTIL source -C aimage empty". This is how the model works. There is no way to change the LOCKED status to anything else while Replication is enabled. It is therefore imperative to monitor after-image extent availability and to take proactive measures whenever the RPLS and RPLA have lost their connection.

How to make AI extents available again:

There is no need to disable replication on the source database. Other means to recover from this scenario are presented below, depending on the current status of the ai extents. These methods essentially involve making more or new ai extents available so that the source database can continue operations while the target database is recovered and synchronised, and so that ai notes can continue to be applied, eventually bringing the target in line with the source.

During this recovery operation it is worth stopping ai switch batch/cron jobs.
If AIMGT is enabled, change the timed interval to on demand with "rfutil source -C aiarchiver setinterval 0".

The current status of the ai extents can be queried with "RFUTIL source -C aimage list".
Without the -aistall startup parameter on the source database, the source database will have shut down. This is the starting point of the methods outlined below.
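
The status query above can be summarised per state with a small filter. This is only a minimal sketch: the function name count_ai_status is hypothetical, and it assumes that the listing prints one line of the form "Status:  <state>" per extent, which should be checked against the actual output of the installed version.

```shell
#!/bin/sh
# Hypothetical helper: tally after-image extent states from a listing on stdin.
# Assumption: every extent in the listing has a line "Status:  <State>",
# where <State> is Busy, Empty, Full or Locked (case is ignored).
count_ai_status() {
    awk '
        $1 == "Status:" { n[toupper($2)]++ }
        END {
            # Missing states print as 0.
            printf "BUSY=%d EMPTY=%d FULL=%d LOCKED=%d\n",
                   n["BUSY"], n["EMPTY"], n["FULL"], n["LOCKED"]
        }'
}

# Intended use on a machine with OpenEdge installed:
#   rfutil source -C aimage list | count_ai_status
```

A result with EMPTY=0 and a growing LOCKED count is the situation this article describes.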

OPTION A: If there are any FULL ai extents:

  • Manually mark these as empty with "rfutil source -C aimage empty"
  • Restart the target and source databases

If -aistall had been in place, it is only necessary to restart the RPLS with "dsrutil source -C restart server", because the source database will still be running, with no updates allowed until ai extents become available to record the related transaction notes.

OPTION B: If there are available variable EMPTY ai extents, but no disk space available

In other words, all "FULL" ai extents are currently marked LOCKED until replication resumes (except, of course, the current BUSY extent), and there are still variable EMPTY ai extents available, but no disk space:

  1. Shut the source database down with "proshut source -by" if -aistall is in use; otherwise the source database will already be down.
  2. Move the ai extents that were available (EMPTY) but had no disk space, together with the current "BUSY" ai extent, to another disk.
  3. Run "prostrct list source source.st" to generate a current structure file.
  4. Edit the resulting structure file (source.st) to reflect the new absolute file location of the moved ai files.
  5. Run "prostrct repair source source.st".
  6. Run "prostrct list source source.st" again and verify in the resulting source.st file that the control area of the source database now points to the new locations of the moved ai files.
  7. Consider adding more AI files with "prostrct add source addai.st", where addai.st contains the additional ai files.
  8. Restart the source and target databases.
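
The structure file edit in step 4 can be scripted rather than done by hand. The following is a minimal sketch, not a supported tool: repoint_ai_extent is a hypothetical name, and it assumes that after-image extents appear in source.st as lines starting with "a" followed by the absolute path (optionally followed by "f <size>"), and that paths contain no spaces or backslashes.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the path of one after-image extent in a
# structure file read from stdin and write the result to stdout.
#   $1 = current absolute path of the extent
#   $2 = new absolute path of the extent
# Assumption: ai extent lines look like "a <path>" or "a <path> f <size>".
repoint_ai_extent() {
    awk -v old="$1" -v new="$2" '
        $1 == "a" && $2 == old { $2 = new }   # only the matching ai line changes
        { print }'
}

# Intended use (hypothetical paths), chaining one call per moved extent:
#   repoint_ai_extent /ai1/source.a5 /ai2/source.a5 < source.st |
#   repoint_ai_extent /ai1/source.a6 /ai2/source.a6 > source.st.new
```

Writing to a new file and reviewing it before running "prostrct repair" keeps the original structure file intact if the assumptions above do not hold.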

OPTION C: If there are no FULL ai extents

In other words, all ai extents are marked LOCKED (except, of course, the current BUSY extent):

This option is only available in Progress 9.1E, OpenEdge 10.1B03 or later. Before proceeding, refer to Article: How to switch to new ai extent after adding a new ai extent to the database

  1. Shut the source database down with "proshut source -by" if -aistall is in use; otherwise the source database will already be down.
  2. Add more ai extents by running: "prostrct add source addai.st" where addai.st defines where the new ai files will be placed. AI extents can be added anywhere there is disk space available.
  3. Run: "prostrct reorder ai source" to ensure that the EMPTY ai extents immediately follow the current BUSY ai extent
  4. Restart the source and target databases.
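
The addai.st file used in step 2 can be generated instead of typed. This is a minimal sketch under stated assumptions: make_addai_st is a hypothetical name, the extents are variable-length (a line of the form "a <path>" with no size), and the files follow the <dbname>.a<N> naming pattern; the directory and extent numbers in the usage line are examples only.

```shell
#!/bin/sh
# Hypothetical helper: print structure-file lines for new variable-length
# after-image extents.
#   $1 = directory with free disk space
#   $2 = database name
#   $3 = first new extent number, $4 = last new extent number
make_addai_st() {
    i=$3
    while [ "$i" -le "$4" ]; do
        printf 'a %s/%s.a%d\n' "$1" "$2" "$i"
        i=$((i + 1))
    done
}

# Intended use (example values), followed by the commands from steps 2 and 3:
#   make_addai_st /ai2 source 7 9 > addai.st
```

The extent numbers should continue from the highest existing ai extent shown by "prostrct list".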

OPTION D: Manually roll forward the LOCKED ai extents onto the target

Once all the transaction notes in the LOCKED AI extents have been applied to the replication target database, the LOCKED files will get cleared down very quickly once replication resumes.

IMPORTANT NOTE: This Option is only valid if the current BUSY extent still has space to write to, in other words, the source database can be started and still has a small amount of ai file space to write to.  Regardless, the following is also a very good technique to get the target in line with the source when replication has been down for some time and there are a lot of ai notes to synchronize.

For further information refer to Article: How to manually apply a LOCKED AI extent to the target database?

After applying the method appropriate to the current scenario, and once the target database is synchronised with the source, the ai notes will be processed against the target database while activity is allowed to continue on the source database. The progress of this activity can be monitored with "DSRUTIL target -C monitor", menu option A: Replication Agent. As soon as a "LOCKED" ai extent has finished being processed, it is marked "FULL"; it becomes available again once it is marked EMPTY with "rfutil source -C aimage empty", unless AIMGT is enabled, in which case the extents are managed automatically.

The key factor in this scenario is the availability of ai files during times when replication has ended and normal processing continues against the source database filling AI files with transaction notes.
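
The choice between Options A, B and C above depends only on how many FULL and EMPTY extents remain, so it can be written down as a small decision sketch. The function name suggest_option is hypothetical, and the sketch deliberately ignores Option D (an additional technique when the BUSY extent still has space) and Option E (the last resort); it also assumes the EMPTY extents could not be used because of disk space, as described in Option B.

```shell
#!/bin/sh
# Hypothetical helper: map extent counts to the option described in this article.
#   $1 = number of FULL extents
#   $2 = number of EMPTY extents
suggest_option() {
    if [ "$1" -gt 0 ]; then
        echo "A"    # FULL extents exist: mark them empty
    elif [ "$2" -gt 0 ]; then
        echo "B"    # EMPTY extents exist but cannot grow: move them to another disk
    else
        echo "C"    # everything except the BUSY extent is LOCKED: add extents
    fi
}

# Example: suggest_option 0 2
```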

OPTION E: Disable After-Imaging then re-enable Replication

This should be the option of last resort. It will require re-baselining the target database(s) with a new backup of the source.
Every site should have a documented procedure for this task. The following instructions simply outline the first steps to disable after-imaging so that production can continue running without having to accommodate the AI structure.

Online:
$ dsrutil source_dbname -C disablesitereplication source
Delete the *.repl.recovery file
$ rfutil source_dbname -C aimage aioff

Offline:
$ proutil source_dbname -C disablesitereplication source
Delete the *.repl.recovery file
$ rfutil source_dbname -C aimage end

Last Modified Date: 11/20/2020 7:39 AM