How to debug PROGRESS process hang.

Title: How to debug PROGRESS process hang.
URL Name: P14220
Article Number: 000136338
Environment
Product: Progress OpenEdge
Version: All supported versions
OS: All supported platforms
Question/Problem Description
How to debug a PROGRESS process hang.
How to troubleshoot PROGRESS processes appearing to hang.
Resolution
The PROGRESS monitor utility, PROMON, is mentioned in the debugging steps below. When running PROMON to check hung processes, it should be run with the -F option. 
  • The -F option runs the PROMON utility without using the login semaphore.  
  • It should only be used if all other processes and the database appear to be hung, as this article discusses.
  • PROMON -F is not the same as forcing access to the database with the -F qualifier to proutil truncate bi and, unlike skipping crash recovery, does not pose any risk to database integrity.
This does not mean that running PROMON with -F is 'safe'. It should not be used in any circumstances other than an emergency in which the database and all processes connected to it are completely hung. With -F, the executable skips locking the login semaphore (i.e., it reduces the latch count for the USR _connect table). If other processes are logging in while _mprshut is logging in with -F, the results can be unpredictable. Among other things, two connections could end up using the same usrctl, which is likely to cause all sorts of problems.

When a process appears hung, check the following:

1.  PROMON Option 4: Record Locking Table

Check to see if the user is waiting to obtain an exclusive lock on a record that is share locked or exclusively locked by another user. If so, this may be what is causing the "hang".

If this is the case, there are two options:
  • The first is to let the user wait for the user who owns the lock to release it. 
  • The second is to disconnect one of the users, either the one waiting for the lock or the one who holds the lock. The user can be disconnected either from the shutdown menu in PROMON or using proshut.
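
The proshut route can be sketched as follows. This is a minimal sketch, not an official procedure: it assumes the documented "proshut dbname -C disconnect usernum" form, with the user number taken from the PROMON display. The helper name disconnect_cmd is hypothetical, and it only prints the command so that it can be reviewed before being run.

```shell
# Hypothetical helper: print (do not run) the proshut command that would
# disconnect one user from a database.
disconnect_cmd() {
    db=$1
    usernum=$2
    # Refuse anything that is not a plain user number.
    case $usernum in
        ''|*[!0-9]*)
            echo "usage: disconnect_cmd <dbname> <usernum>" >&2
            return 1
            ;;
    esac
    printf 'proshut %s -C disconnect %s\n' "$db" "$usernum"
}
```

For example, "disconnect_cmd sports 24" prints the command for user number 24 on a database named sports; both values are placeholders.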

2.  Circular RM Chain

When new records are created or existing records are updated:
  • The record manager (RM) chain of the database is searched for a block with enough free space to hold the new data.
  • The RM chain contains all blocks in the database that have some free space remaining. 
  • Each block on the RM chain points to the next block in the chain. 
  • If one of these pointers becomes corrupted, instead of sequentially searching through the RM chain, the process may end up looping through the RM chain.
  • Other processes that then access the same block while updating may also hang.
To determine if this is the cause of the hanging process, do the following:

2.1.  CPU Time

Check to see if the CPU time of the process is increasing.
  1. On UNIX, use the "ps -ef" command.
  2. On Windows, use the Task Manager.
If the CPU time is not increasing, then a circular RM chain is not the cause of the hang.
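
On UNIX, the CPU time check can be scripted by sampling the process twice and comparing the values. The following is a rough sketch under stated assumptions: it uses the POSIX "ps -o time= -p <pid>" form rather than "ps -ef", it assumes the TIME column is printed as [[dd-]hh:]mm:ss, and the function names to_seconds and cpu_increasing are hypothetical.

```shell
# Convert a ps TIME value ([[dd-]hh:]mm:ss) to whole seconds.
to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        n = NF
        s = $n
        m = (n >= 2) ? $(n - 1) : 0
        h = (n >= 3) ? $(n - 2) : 0
        d = (n >= 4) ? $(n - 3) : 0
        printf "%d\n", d * 86400 + h * 3600 + m * 60 + s
    }'
}

# Succeed (exit 0) if the CPU time of a process grew during the interval.
# Exit 2 if the process could not be sampled (for example, it has exited).
cpu_increasing() {
    pid=$1
    interval=${2:-10}   # seconds between the two samples
    first=$(ps -o time= -p "$pid") || return 2
    sleep "$interval"
    second=$(ps -o time= -p "$pid") || return 2
    [ "$(to_seconds "$second")" -gt "$(to_seconds "$first")" ]
}
```

A call such as "cpu_increasing 12345 30" (the process ID is a placeholder) succeeds only if the accumulated CPU time grew by at least one second over 30 seconds. A process that consumes very little CPU may need a longer interval before any change is visible.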

2.2.  PROMON > R&D > Option 2, Activity Displays > Option 10 - Space Allocation. 

Look at the number reported for "rm blocks examined". 
If the number is very high, in the thousands, then the database may have a circular RM chain.

2.3.  If the process is using CPU time and the rm blocks examined is high, then the RM and Free chains in the database should be rebuilt:

a.  Truncate the BI file
$   proutil dbname -C truncate bi
b.  Back up the database **Very important**
c.  Start the back-end database repair tool, DBRPR. This disables after-imaging (AI) if it is enabled.
$  proutil dbname -C dbrpr
Select Option 1. Database Scan Menu
Enter 7 to turn on Rebuild Free chain.
Enter 8 to turn on Rebuild RM chain.
Enter A to Apply scan to all areas or
Enter 10 to Change Current Working Area
Enter "G" for GO to start the rebuild.
 
The time to rebuild the RM and Free chains depends on the size of the database and on the speed and load of the machine. It is usually not a very time-consuming operation, but for very large databases it can take 2 hours or more.

3.  Further Hang Troubleshooting

If none of the above are likely causes of the hang, the following should be verified:

3.1.  Is the process using any CPU time? Is the process doing I/O?
It may be that the application is looping. The application code should also be checked.

3.2.  Generate a stack trace against the process.
The tool to use depends on the operating system and on the executable the process is running.
We typically advise taking at least 3 stack traces against each process.

For example:
Sending "kill -USR1 <process-id>" to the process will generate a core file and, depending on the platform, a protrace.
kill -USR1 will not generate a stack for a Java process; jstack can be used instead.
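
Taking several traces a fixed interval apart can be sketched as a small loop. This is an illustration only: the function name take_traces, the default of 30 seconds between traces, and the SIGNAL_CMD override (used here for dry runs) are all assumptions, not part of any Progress tooling. It applies to non-Java processes only.

```shell
# Hypothetical helper: send SIGUSR1 to a process several times,
# pausing between signals so that each trace captures a later moment.
take_traces() {
    pid=$1
    count=${2:-3}       # the article advises at least 3 traces
    interval=${3:-30}   # seconds between traces (assumed default)
    i=1
    while [ "$i" -le "$count" ]; do
        # SIGNAL_CMD defaults to kill; set SIGNAL_CMD=echo for a dry run.
        ${SIGNAL_CMD:-kill} -USR1 "$pid" || return 1
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

Running "SIGNAL_CMD=echo take_traces 12345 3 0" prints the three signal commands without sending anything, which is a safe way to check the arguments first.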

3.3.  Has anything been changed on the system or in the application? If so, investigate the changes.

3.4.  Are there any errors reported in the Database log file or related Application Server log files?

3.5.  Are only PROGRESS processes hanging, or are other processes hanging as well? If other processes are hanging as well, then the operating system vendor should be contacted and the system checked out.

3.6.  Is the hang reproducible? If so, determine what steps result in the hang on the system or in the application. If changes have been made to the environment or application, investigate the changes.

3.7.  Use the Gather script as indicated in the following article: Script for gathering database information on Unix.
Last Modified Date: 2/10/2021 9:08 AM
