
Investigating why the database crashes with error 4194

Title: Investigating why the database crashes with error 4194
URL Name: P23968
Article Number: 000146776
Environment:
Product: Progress, OpenEdge
Version: All supported versions
OS: All supported platforms
Question/Problem Description
Troubleshooting database crashes with error 4194 updating the database .lk file
What causes database crashes with error (4194)?
Database crashed with the error message: Broker disappeared, updating <dbname>.lk file. (4194)

 
Error Message: Broker disappeared, updating <file-name>.lk file. (4194)
Resolution
The key step is to determine what is causing the Broker process (_mprosrv) to die, and then to resolve that underlying problem.

As per the description of this message: Broker disappeared, updating <dbname>.lk file. (4194)
"The watchdog noticed that the broker process is no longer active on the system. The watchdog is updating the .lk file to prevent database corruption and will shutdown the database"

Only the Watchdog process checks whether the Database Broker is still alive; this is one of the many functions it performs. For further detail refer to Article:
Investigative measures:

This Article aims to help find the reason for the Watchdog process's actions.

The first step in gathering evidence is to check the database log file for error messages from a few minutes before the time of the failure up to the end of the log, then correlate these with any protrace files that may have been created at the time, together with the operating system logs.
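
As a minimal sketch of this first evidence-gathering step, the Python below selects every database log line from a few minutes before the first 4194 message up to the end of the log. It assumes that each log entry begins with a timestamp of the form [YYYY/MM/DD@HH:MM:SS.mmm+ZZZZ]; older log formats would need a different pattern. The function names and the 5-minute default window are illustrative choices, not Progress tooling.

```python
import re
from datetime import datetime, timedelta

# Assumed entry prefix: [YYYY/MM/DD@HH:MM:SS.mmm+ZZZZ]
STAMP = re.compile(r"^\[(\d{4}/\d{2}/\d{2})@(\d{2}:\d{2}:\d{2})")


def parse_stamp(line):
    """Return the entry timestamp (timezone offset ignored), or None."""
    m = STAMP.match(line)
    if not m:
        return None
    return datetime.strptime(m.group(1) + " " + m.group(2), "%Y/%m/%d %H:%M:%S")


def lines_around_4194(lines, minutes_before=5):
    """Return log lines from `minutes_before` the first (4194) entry to the end."""
    lines = list(lines)
    crash_time = None
    for line in lines:
        if "(4194)" in line:
            crash_time = parse_stamp(line)
            if crash_time is not None:
                break
    if crash_time is None:
        return []  # no timestamped 4194 entry found

    start = crash_time - timedelta(minutes=minutes_before)
    selected = []
    keep = False
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is not None:
            keep = stamp >= start
        # Lines without a timestamp follow the entry that precedes them.
        if keep:
            selected.append(line)
    return selected
```

The function accepts any iterable of lines, so it can be fed an open <dbname>.lg file directly. The selected lines can then be compared by time against protrace files and operating system logs.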

1.   Cron Jobs:
  • Are there KILL commands in the scripts in use?
  • Is the cron job disconnecting the WDOG user?
  • If there is more than one Progress installation on the machine:
    • Are all the environment variables used pointing to the correct version?
    • Is the cron job first checking for Active Transactions against users before it disconnects them?
    • How are the cron jobs differentiating between a userid on a database running version 9.1E of Progress and another on OpenEdge 11.6 where the database names are the same, for example?
Review the cron jobs, with particular reference to the above and points 2, 3 and 4 below.
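
For the cron job review in point 1, a short Python sketch such as the following can list the script lines that deserve a manual look. The patterns are assumptions chosen for illustration: kill-style commands, plus the "proshut <dbname> -C disconnect" form of disconnecting a user. A match only means "inspect this line", not that the line is wrong.

```python
import re

# Patterns worth a manual review; this list is an illustrative assumption.
RISKY = [
    ("kill", re.compile(r"\b(?:p?kill|killall)\b")),
    ("disconnect", re.compile(r"\bproshut\b.*-C\s+disconnect\b", re.IGNORECASE)),
]


def flag_risky_lines(script_text):
    """Return (line_number, label, text) for each non-comment line that matches."""
    hits = []
    for number, raw in enumerate(script_text.splitlines(), start=1):
        text = raw.strip()
        if not text or text.startswith("#"):
            continue  # skip blank lines and shell comments
        for label, pattern in RISKY:
            if pattern.search(text):
                hits.append((number, label, text))
    return hits
```

Each flagged line can then be checked against the questions above, for example whether the PID or user number it acts on could ever resolve to the Broker or the WDOG user.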

2.   Could the Login Broker have been terminated manually?
  • Could the Login Broker have been terminated by a user on the server?
  • When disconnecting a process, could they actually be disconnecting the Broker instead?
  • When a process is disconnected, does it still hold locks in shared memory?
3.   Are the 4194 errors following a "time pattern" in the database .lg file?
  • Is there a specific time or day/time associated?
  • Are there related errors in the system log files (Event Viewer or syslog)?
Examine the system log file on the server experiencing this problem for the times at which the 4194 errors occurred.
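
To look for the "time pattern" in point 3, the following Python sketch counts 4194 entries per weekday and hour. It assumes that log entries begin with a timestamp such as [2025/01/28@02:00:12.000+0000]; the function name is hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed entry prefix: [YYYY/MM/DD@HH:...]
STAMP = re.compile(r"^\[(\d{4}/\d{2}/\d{2})@(\d{2}):")
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def crash_pattern(lines, message_number="4194"):
    """Count entries carrying (message_number) per (weekday, hour)."""
    counts = Counter()
    tag = "(" + message_number + ")"
    for line in lines:
        m = STAMP.match(line)
        if m and tag in line:
            day = datetime.strptime(m.group(1), "%Y/%m/%d")
            counts[(WEEKDAYS[day.weekday()], int(m.group(2)))] += 1
    return counts
```

A count concentrated in one slot, for example the same hour on the same weekday, suggests a scheduled job or backup window to check in the cron tables and system logs.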

4.   Who owns the .lk file?
  • What are the permissions on the Progress executables and specifically the .lk file when created?
  • Is another user accessing the database in single-user mode and then abnormally terminating their session, which leads to a "KEEPALIVE timeout" message some time later?
  • Is a different user starting the database and then terminating their terminal session? It could be that the user who starts the database logs off. In this case the _mprosrv needs to be started as a Service or set up with a (restricted) Administrator Account.
5.   Could a high number of "ungraceful" terminations on the system be forcing the Broker to shut down?
  • HANGUPS (ungraceful terminations) + dead users + cron job running to disconnect users > TIME LIMIT.
6.   Could the filesystem be running out of disk space around the time this is happening?
  • For example, because of client temporary files?
  • Are the proc-per-user, subproc-peruser, sem-mni and sem-msl kernel settings adequate?
7.   Does the terminal emulation send a kill signal to a process when the terminal is closed?
  • For example, the default configuration of a terminal emulation package called "FACETERM" sends a "kill -9" to a process on exit.
8.   Are there resource depletion problems on the server?
  • Look for messages referring to the PID of the _mprosrv broker process that disappeared.
  • For example, a message may indicate that the kernel experienced a low memory condition, in which case the cause of that condition should be determined. On Linux/Unix systems check the messages files under /var/log. These may contain something similar to:
kernel: Out of memory: Kill process
kernel: invoked oom-killer
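
As a hedged illustration of the check in point 8, the Python sketch below scans system log lines for out-of-memory kills and, optionally, keeps only those that name the PID of the vanished Broker. The regular expression assumes the Linux kernel wording "Out of memory: Kill process <pid> (<name>)" or "Killed process"; other kernels and platforms word this differently, and the function name is hypothetical.

```python
import re

# Assumed Linux kernel wording; adjust for the platform in use.
OOM_KILL = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def oom_events(lines, broker_pid=None):
    """Return (pid, command, line) for OOM kills, optionally only for broker_pid."""
    events = []
    for line in lines:
        m = OOM_KILL.search(line)
        if not m:
            continue
        pid, command = int(m.group(1)), m.group(2)
        if broker_pid is None or pid == broker_pid:
            events.append((pid, command, line.rstrip("\n")))
    return events
```

The Broker's PID can be taken from the database log file and passed as broker_pid to confirm whether the kernel terminated that specific process.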
 
Although not strictly related to the Broker process terminating, are there any error messages that point to shared memory or buffer latch problems? For example, the database log file may show:
User <num> died with <num> buffers locked. (2523)
User <num> died holding <num> shared memory locks. (2522)
System Error: redundant lwake user <n> latch <x>
Begin ABNORMAL shutdown code (2249)

 
Scripts that parse for the PID of a process in order to terminate clients have been known to terminate the wrong process, which in this instance would be the Database Broker process. It may be worth running clients that need to be terminated in this way (for example radio terminal devices in a warehouse) as "remote" clients, so that the client accesses shared memory through the remote server process rather than directly. The server process then manages the shared memory and buffer latches on behalf of the client and cleans up any remaining latches when the client dies.
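
One way a termination script can guard against hitting the Broker is to confirm the executable name behind each PID before acting on it. The Python sketch below takes the text output of a process listing (assumed here to come from "ps -eo pid,args") and keeps only candidate PIDs whose executable is the expected client. The client name "_progres" and the function name are assumptions for illustration; the script never returns a PID belonging to _mprosrv.

```python
import os

PROTECTED = {"_mprosrv"}  # Broker/server executable named in this Article


def safe_targets(ps_lines, candidate_pids, client_name="_progres"):
    """Keep only candidate PIDs whose command is the expected client executable."""
    commands = {}
    for line in ps_lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            executable = parts[1].split()[0]
            commands[int(parts[0])] = os.path.basename(executable)

    safe = []
    for pid in candidate_pids:
        name = commands.get(pid)  # None if the PID is no longer running
        if name == client_name and name not in PROTECTED:
            safe.append(pid)
    return safe
```

Matching on the exact executable name, rather than searching the whole command line for a database name, avoids selecting the Broker simply because its arguments mention the same database.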

A similar message, 4195, results when the database .lk file itself cannot be accessed to perform content verification checks. The database engine then forces an abnormal database shutdown:
<dbname>.lk is missing, shutting down... (4195)
For further troubleshooting advice, refer to Article What causes ".lk is missing, shutting down... (4195)"
Last Modified Date: 1/28/2025 9:17 PM
