Unsafe error behavior in legacy ABL loops

Posted by dbeavon on 10-Oct-2018 15:06

In our legacy ABL/4GL code we often use loops to update records.  When writing these loops, the programmers often assumed that the ABL language will commit all records in an atomic transaction.  

However, there have been many occasions over the years when we discovered that an ABL transaction was not being committed atomically. 

On many occasions, several iterations of a loop would update records, but certain records would appear to be left alone.  In other words, some records are inexplicably excluded from a complex transaction.  At that point the normal course of action for the ABL programmer is to shrug and write a "one-time" program to clean things up after-the-fact.

I've got a small sample below that demonstrates the type of code that results in a transaction which is NOT atomic.

/* ************************************************************************ */
/* Locals                                                                   */
/* ************************************************************************ */
DEFINE TEMP-TABLE TT_Activity  FIELD ActivityRecord AS INTEGER.
DEFINE VARIABLE v_Loop AS INTEGER NO-UNDO.
DEFINE VARIABLE v_DatabaseUpdate  AS DECIMAL DECIMALS 5 NO-UNDO.
DEFINE VARIABLE v_Display AS CHARACTER NO-UNDO.
FORM WITH FRAME x-frame DOWN. 
   
/* ************************************************************************ */
/* Setup work                                                               */
/* Eleven things need to be done in a loop.                                 */
/* ************************************************************************ */
DO v_Loop = -5 TO 5:
   CREATE TT_Activity. TT_Activity.ActivityRecord = v_Loop.
END.

/* ************************************************************************ */
/* Transaction operates on all items atomically                             */
/* ************************************************************************ */
DO TRANSACTION:
   
   MESSAGE "STARTING TRANSACTION {&FILE-NAME} {&LINE-NUMBER}".
   PAUSE.

   /* ********************************************************************* */
   /* Operate on each of the 11 records                                     */
   /* ********************************************************************* */
   FOR EACH TT_Activity EXCLUSIVE-LOCK :
      
      /* Update Some Data */
      v_DatabaseUpdate = 10.0 / TT_Activity.ActivityRecord.
      RUN DoWork(v_DatabaseUpdate).

      /* Show results */         
      DISPLAY 
         "         AFTER DoWork COMPLETION FOR " + STRING(TT_Activity.ActivityRecord) @ v_Display 
         FORMAT "X(70)"
         WITH FRAME x-frame DOWN.
      DOWN WITH FRAME x-frame.
         
   END. /* TRANSACTION */ 
      
      
   /* ****************************************************************** */
   /* Finished                                                           */
   /* ****************************************************************** */
   MESSAGE "FINISHING TRANSACTION {&FILE-NAME} {&LINE-NUMBER}".
   PAUSE.
END.

RETURN.




/* ************************************************************************ */
/* Do some work                                                             */
/* Generates an error condition                                             */
/* ************************************************************************ */
PROCEDURE DoWork PRIVATE:
   
   DEFINE INPUT PARAMETER p_Input AS INTEGER NO-UNDO.
   IF p_Input = ? THEN RETURN ERROR "Failure".

   DISPLAY 
      "DoWork:  UPDATING DATABASE WITH INPUT " + STRING(p_Input) @ v_Display 
      FORMAT "X(70)"
      WITH FRAME x-frame DOWN.
   DOWN WITH FRAME x-frame.

END.    
   

Note that the DoWork method is a simplification of some complex operation that might generate an ERROR condition.  ERROR conditions may be raised manually or by the AVM itself. 

You will notice that, even when an error condition is raised, the loop runs to completion and the transaction commits. Only 10 iterations out of 11 are successful.  Half-way thru the loop there is a serious data error that occurs, but the transaction goes along its merry way like nothing had happened.

This seems like an "unsafe" behavior in that it doesn't commit data atomically.  I'm surprised that it is the default behavior.   Can someone please tell me what the fix would be to avoid partially committed data?  I'm looking for a "legacy" programming approach. 

(In some unrelated, OO-based source code I use "BLOCK-LEVEL ON ERROR UNDO, THROW" at the top of source files, but this would not have been available ten years ago when the legacy code was written.  I'd like a fix that would be familiar to an old-school ABL programmer, who is not comfortable with either OO or SEH.)

One option I could think of was to label the "DO TRANSACTION:" loop (eg. "OUTER_TRANS:").  Then the FOR EACH loop would also be modified to have an "ON ERROR" phrase like so: "ON ERROR UNDO OUTER_TRANS, RETURN".  This is the most straightforward option I can think of applying in a general way to our legacy ABL programs.  Unfortunately it is a bit laborious to introduce "ON ERROR" phrases to every loop in the system that may be participating in a transaction.  Even this approach wouldn't totally avoid the risk of partially committed data because, if the program was participating in a transaction that was already started in the calling program, then there would be no way of UNDO'ing it by name.

Any recommendations for fixing "legacy" ABL loops would be greatly appreciated.  It is not going to be a small task.

Posted by Lieven De Foor on 15-Oct-2018 08:30

If the code was written by experienced programmers, they have some catching up to do as the default error processing for a FOR statement is to RETRY (or advance to the next iteration when a retry would cause an infinite loop):

From the online help:

If you are using a REPEAT block or a FOR EACH block, and an error occurs, all of the processing that has been done in the current iteration of the block is undone, and the AVM retries the block iteration where the error occurred. (If the AVM detects that a RETRY of a FOR or iterating DO block would produce an infinite loop, it performs a NEXT instead. For more information, see OpenEdge Getting Started: ABL Essentials.

So this is all expected behaviour...

Your current options are:

-Add a BLOCK-LEVEL ON ERROR UNDO, THROW statement, which changes the above default behaviour and will cause the whole transaction to be rolled back

- Add a DO ON ERROR UNDO, RETURN ERROR block

- Add a DO ON ERROR UNDO, RETRY block + IF RETRY block to do error handling

- Add a DO ON ERROR UNDO, THROW block + CATCH block.

Posted by David Abdala on 11-Oct-2018 05:26

I think the problem has nothing to do with transaction "atomicity", but with the default ON ERROR of FOR EACH.

The fact that FOR EACH has an implicit ON ERROR UNDO, NEXT, means that a loop that fails, is simply ignored. This implies that the transaction is not undone in its entirely, but taken to the exact point it was at the beginning of the loop.

The fact that some records got updated and some didn't means the ON ERROR UNDO, NEXT is working properly, not that the transaction is not atomic.

This is the same as other databases that are capable of having multiple rollback points in the same transaction (most of them these days). While you are in a "global" transaction, you can set inner transaction point to wich you can undo the changes, and continue in the "global" transaction.

I'm not sure a BLOCK-LEVEL solves it, as I don't recall if that affects default ON ERROR.

The only "solution" to this program flow problem is to modify every FOR EACH, to have an ON ERROR UNDO, LEAVE, having a logical flag to know if the for each was successfull in a whole or not, and undoing the transaction based on this flag. You can pull this off too with ON  ERROR UNDO, RETRY.

I have been biten by this FOR EACH "feature" serveral times, still do (last week),,,

All Replies

Posted by David Abdala on 11-Oct-2018 05:26

I think the problem has nothing to do with transaction "atomicity", but with the default ON ERROR of FOR EACH.

The fact that FOR EACH has an implicit ON ERROR UNDO, NEXT, means that a loop that fails, is simply ignored. This implies that the transaction is not undone in its entirely, but taken to the exact point it was at the beginning of the loop.

The fact that some records got updated and some didn't means the ON ERROR UNDO, NEXT is working properly, not that the transaction is not atomic.

This is the same as other databases that are capable of having multiple rollback points in the same transaction (most of them these days). While you are in a "global" transaction, you can set inner transaction point to wich you can undo the changes, and continue in the "global" transaction.

I'm not sure a BLOCK-LEVEL solves it, as I don't recall if that affects default ON ERROR.

The only "solution" to this program flow problem is to modify every FOR EACH, to have an ON ERROR UNDO, LEAVE, having a logical flag to know if the for each was successfull in a whole or not, and undoing the transaction based on this flag. You can pull this off too with ON  ERROR UNDO, RETRY.

I have been biten by this FOR EACH "feature" serveral times, still do (last week),,,

Posted by dbeavon on 11-Oct-2018 11:06

I agree in principal to your ON ERROR recommendations.  Is this a standard programming practice whenever you are updating the database in a loop?  Is there some unstated pattern that says *every* ABL loop involved in a transaction must specify "ON ERROR" to avoid bugs?  Why is this pattern overlooked in so much of our legacy 4GL code?  The legacy code I'm referring to was not written by novices, but by experienced programmers/consultants who specialized in delivering entire ERP's based on Progress 4GL .

It is probably just a matter of semantics but the ABL language and runtime environment *is* allowing transactions to be partially committed.  I would agree with you that the underlying OE database itself may not be responsible for losing any data. But the language and runtime environment certainly is ensuring that the loss of data is the "default behavior" whenever an error condition is raised in a loop.

The problem with FOR EACH loops is that a developer may initially write them to do one simple thing.  That thing could be so simple that it can't possibly raise an ERROR, like assign an established memory variable to a list of database records.  But after some time another developer may need to come along and add a new line of code, and another, and another.  Soon there is a lot of code in the loop, some of which has the potential to raise an ERROR condition (whether that scenario has been explicitly tested for or not is another matter).    The final result of all this is very bad.  We suddenly start experiencing unexpected data loss in certain unusual scenarios - because individual database records are quietly UNDO'ing themselves.  

If there are other ideas for fixing our legacy code to avoid data loss, I would appreciate hearing about them.  Unfortunately I'm guessing there won't be a silver bullet that fixes the entire ERP at once.  Structured error handling (SEH) would certainly come in handy here but I'd like to avoid making changes that would be unfamiliar to an old-school ABL programmers, who aren't comfortable with either OO or SEH.

Posted by Lieven De Foor on 15-Oct-2018 08:30

If the code was written by experienced programmers, they have some catching up to do as the default error processing for a FOR statement is to RETRY (or advance to the next iteration when a retry would cause an infinite loop):

From the online help:

If you are using a REPEAT block or a FOR EACH block, and an error occurs, all of the processing that has been done in the current iteration of the block is undone, and the AVM retries the block iteration where the error occurred. (If the AVM detects that a RETRY of a FOR or iterating DO block would produce an infinite loop, it performs a NEXT instead. For more information, see OpenEdge Getting Started: ABL Essentials.

So this is all expected behaviour...

Your current options are:

-Add a BLOCK-LEVEL ON ERROR UNDO, THROW statement, which changes the above default behaviour and will cause the whole transaction to be rolled back

- Add a DO ON ERROR UNDO, RETURN ERROR block

- Add a DO ON ERROR UNDO, RETRY block + IF RETRY block to do error handling

- Add a DO ON ERROR UNDO, THROW block + CATCH block.

Posted by dbeavon on 15-Oct-2018 08:57

The structured error handling  approaches (THROW/CATCH, esp. the BLOCK-LEVEL default) would be my preference. But these would be out of place in the legacy code that I'm working with (written pre-OE10).  To a certain extent, a programmer would also need to understand OO to use SEH, since the errors are OO classes in an inheritance hierarchy.  Remember that there are many ABL programmers who deliberately avoid these new OO-based language features.   I think that I'm left with the need to change all our FOR loops, and retrofit them with the "ON ERROR" phrase.

As a side, I wonder whether the default/legacy handling of FOR loops might have been designed for interactive CHUI interfaces?  In that type of environment, the user would be directly informed by the AVM of failures, and would potentially be able to take measures to manually correct the problems in their data after-the-fact.  This type of default/legacy behavior may have worked OK for CHUI, but now that we are refitting the old 4GL code to run as a background process (or within appserver), then it causes data to become inconsistent in unexpected ways.  It is unfortunate that the default/legacy behavior of FOR loops is to create inconsistent data.

Thanks for the feedback.

Posted by Håvard Danielsen on 15-Oct-2018 12:18

You can use BLOCK-LEVEL declarations or THROW in code even if there are stubborn ABL programmers that deliberately avoid OO-based language features. The errors that are thrown as a result of the declaration or by throw statements will be regular error conditions and can be handled the same way as errors thrown by the do on error alternatives.    

This thread is closed