Transaction undo and life after logout - Forum - OpenEdge RDBMS - Progress Community

Transaction undo and life after logout

 Forum

Transaction undo and life after logout

This question is not answered

When transaction is interrupted by a signal it will be backed out, a session will log out from a database and… process will keep running for a while: the time of its life after logout equals approximately 50-80% of transaction’s duration.

The common terminating signals that can interrupt a Progress session can be divided into two groups - "stoppers" and "terminators".
"Stoppers": SIGINT, SIGPIPE, STOP and UNDO statements in 4GL (4GL statements are not the signals but anyway..).
"Terminators": SIGHUP, SIGQUIT, SIGTERM and proshut -C disconnect

These groups are different in many aspects:
Duration of transaction undo triggered by a "stopper" is just a bit longer than transaction duration (an undo adds BI reads).
Undo initiated by a "terminator" is much faster - 25-50% of transaction duration (the factor varies with transaction size). It’s the same time that broker/watchdog will spend on transaction undo if a session was terminated by SIGKILL.

If a session has opened the transactions in a few databases then all transactions will be undone one transaction in time, in other words, a session will be temporary "frozen" in all databases except one. Transaction backouts initiated by the "stoppers" will be done in the order in which a session has been connected to the databases. Transaction backouts initiated by the "terminators" will be done in the /reverse/ order: the last connected database will be the first one backed out.

The different scenarios, the different undo times - does it mean the different algorithms that Progress is using to undo the transactions depending on the trappable or untrappable STOP conditions? "Stoppers" do not set neither the "usertodie" flag (_Connect-Disconnect) nor the "resyncing" flag (_Connect-Resync). "Terminators" set the "usertodie" flags in all databases (except, of course, proshut -C disconnect that does it only for one database) and set the "resyncing" flag for one database - the one that is rolling back its transaction.

"Terminators" cause a session to write the backout messages to a log of the corresponding database (even if transaction was in the ALLOCATED status, i.e. when the transaction did not really change a database):
(2252) Begin transaction backout.
(2253) Transaction backout completed.
Of course, "stoppers" do not leave such messages.

If 4GL code is trapping the STOP condition and quits on STOP (IF RETRY THEN QUIT) then a session is terminated immediately after logout from the last connected database.
The "stoppers" (except SIGHUP) leave a session on pause. For example:
KILL signal received. (298)
Press space bar to continue.
The message really means "Press space bar to terminate the session".
The common signals ("stopper" or "terminator") will terminate a session in this state. Let’s treat this state as an end of session’s execution.

Between this state and the logout from the last database (= first connected database) the session keeps running for a while. Example:

[2018/05/12@09:24:19.113+0400] P-1135       T-140073628817216 I ABL     8: (452)   Login by root on /dev/pts/27. 
[2018/05/12@09:24:19.115+0400] P-1135       T-140073628817216 I ABL     8: (7129)  Usr 8 set name to root. 
[2018/05/12@09:24:19.224+0400] P-1135 Transaction begins.
[2018/05/12@09:25:19.228+0400] P-1135 Transaction is ready.
[2018/05/12@09:25:47.000+0400] kill -HUP 1135
[2018/05/12@05:25:47.000+0000] P-1135       T-140073628817216 I ABL     8: (562)   HANGUP signal received. 
[2018/05/12@09:25:47.117+0400] P-1135       T-140073628817216 I ABL     8: (2252)  Begin transaction backout. 
[2018/05/12@09:26:09.014+0400] P-1135       T-140073628817216 I ABL     8: (2253)  Transaction backout completed. 
[2018/05/12@09:26:09.014+0400] P-1135       T-140073628817216 I ABL     8: (453)   Logout by root on /dev/pts/27. 
[2018/05/12@09:26:53.000+0400] P-1135 is terminated.

Tran Time: 60.004 sec
Undo Time: 21.897 sec
Time after logout: 43.986 sec

Note: Undo + Time after logout seems to be always a bit longer than a transaction duration.

"P-1135 is terminated" was reported by:

while sleep 1
do
  kill -0 $PID 2>/dev/null && continue
  echo [`date '+%Y/%m/%d@%H:%M:%S.000%z'`] P-$PID is terminated. >>$LOG
  break
done &

Stack trace of a process after logout shows the calls:

fgetcxdll freadxdll frdfrom bfundo bfrej rnxitn_noval rninterpret rnrq main

I can be wrong but it looks like a process is reading lbi file. The size of lbi file created during transaction is almost twice (1.8-1.9 times) higher than the number of bytes written to BI file by transaction before it was interrupted. Lbi file is not growing during transaction undo. The size of lbi file is only slightly (700-800 bytes) increasing after database logout. I did not use the -nosavepoint parameter during the tests. The -lbimod parameter also has the default value.

If a session does read its lbi file at the final phase then it reads only the part of it related to the interrupted transactions. Anyway it’s not clear why process does not terminate immediately after logout from the last database.

Of course, it’s a minor issue because after logout a process is not using database resources. We can safely use SIGKILL to terminate a process provided we are 100% sure that a session is not connected to any other databases. But we will have 100% insurance only when a process is terminated.

Tested with 11.7.2 on Linux and HP-UX

All Replies
  • If I am following your logic one potential optimization might be to skip the LBI processing?

    --
    Tom Bascom
    tom@wss.com

  • Maybe the LBI processing (if it's really LBI) is just a bug?

  • For a "stopper" I can imagine it is a feature - you can catch the STOP condition in the startup procedure and then re-initialize your session.  In a case like that you would want any UNDO variables and temp-tables to be properly undone.

    For a "terminator" it seems like there would be no point and you could just throw it all away.

    Of course if everything is defined NO-UNDO and if you are not writing code with lots of nested sub-transactions there ought not to be much to undo anyway so the sheer size of what you are seeing seems a bit off kilter.

    If they feel so inclined someone with knowledge of the internals can probably correct my ramblings and explain why it needs to be done the way that it is being done.

    --
    Tom Bascom
    tom@wss.com

  • It would be interesting to know why there is a difference in transaction undoes triggered by a "stopper" and by a "terminator". Why a second one is much faster?

    Most likely the UNDO variables are treated as the fields of some pseudo tables (one pseudo table per procedure) in some pseudo database. But in my test program all variables are defined as NO-UNDO and 'strings lbi*' shown only the contents of the customer.name fields. I used the simple transactions in sports database:

      REPEAT:
        FOR EACH sports.customer EXCLUSIVE-LOCK:
          ASSIGN sports.customer.name = "x" + sports.customer.name.
          ASSIGN sports.customer.name = SUBSTRING(sports.customer.name, 2).
        END.
        
        IF NOW GE vEndTime THEN
        LEAVE.
      END.
    
  • Tom: it /doesn’t/ need to be done the way it is done. like many things, there is more than one way to do it. there are also opportunities to optimise the present implementation. I won’t get into those here.

    George: i’ll explain as best i can.

    Remember that transactions are an /error handling/ mechanism. There are really four variants of undo in the client (plus two in the database, but that is another topic). they are: 0) undo triggered by an error condition detected in application logic and execution of an UNDO statement, 1) undo triggered by some error condition like a duplicate key, 2) undo triggered by a stop condition, 3) undo and termination triggered by an error detected by a signal handler.

    The LBI file is used to record client-side history so that some or all actions can be undone. In all cases, there may or may not be LBI reading to accompany the undo action. The LBI file is composed of one or more nested “ar units” or “action-recovery units”. An ar unit matches execution of a block in the 4GL. One ar unit may align with a block that has a database transaction and that does not have to be the outermost ar unit. Only one because database transactions cannot be nested in the 4GL. Subsequent ar units will be sub-transactions and their database actions may be recorded in the LBI file or not if database savepoints are being used by the client. Also, a nested ar unit may involve the start of a transaction on another database. When that happens, the second database transaction “joins” the outer one in a sense. While actions on the second database can be undone, that transaction can’t be committed until the outer database transaction commits.

    Along with block nesting, the LBI file is used to record values of local variables, temp table actions, database actions, buffer connections and disconnections and a few other items related to client side state information.

    During forward processing, the LBI file is buffered in memory and written sequentially when the buffer fills. When an undo happens, the LBI is read backwards from eof and a new eof established when the undo is finished. Depending on how much is being undone, the new eof may be at the beginning of the LBI or at the start of some block’s ar unit.

    When an undo happens, the undo may go back to any active ar unit. The target ar unit may be the same one that matches a database transaction, higher, or lower. In the first four cases mentioned above, the 4GL session will or can continue, depending on the application logic. The undo processing is very similar for all of them, except that a stop condition may be triggered by a lost database connection. In such a case, database actions in subtransactions cannot be undone by the client. undo must include the ar unit for the outermost database transaction but may go farther than that but not does necessarily stop the session.

    In the last case, the client session is going to be terminated due to a fatal error. much of the undo processing is the same as above, except that it goes all the way to the outermost ar-unit and then the client will exit. In some cases, depending on the type of error (e.g. an access violation), undo processing consists only of rolling back the database transaction followed by database disconnect and process exit.

    I’m sure I have left out something important but i cannot remember what it is.

    -gus

  • > There are really four variants of undo in the client (plus two in the database, but that is another topic).

    Thanks, Gus. Are they mutually exclusive? I tried SIGINT + pause + SIGHUP  (numbers 2 and 3 in your list) and I got the rather unexpected results for transaction undo. And, by the ways, for this combination of signals a process is terminating immediately after logout.