
What is the OOM Killer (Out of Memory Killer)?


Information

 
Title: What is the OOM Killer (Out of Memory Killer)?
URL Name: What-is-the-OOM-Killer-Out-of-Memory-Killer
Article Number: 000141887
Environment:
Product: Progress
Product: OpenEdge
Version: All supported versions
OS: Linux
Question/Problem Description

What is the Linux Out of Memory Killer?
What is the Linux OOM Killer?
Where does the Linux Out of Memory Killer report killing processes?
Where does the Linux OOM Killer report killing processes?
What killed my process?
Can we disable the OOM Killer?
Can we exclude a process from being killed by the OOM Killer?

Steps to Reproduce
Clarifying Information
Error Message
/var/log/messages may contain an entry similar to the following:

kernel: <process name> invoked oom-killer

kernel: Out of memory: Kill process <#> (<process name>) score # or sacrifice child
kernel: Killed process <#>(<process name>)

Db log:

(4194) Broker disappeared, updating <path/db>.lk
Defect Number
Enhancement Number
Cause
When the Linux kernel detects that the system is running short on RAM, it uses an algorithm (the OOM Killer) to try to free memory by killing processes that use large amounts of RAM.

The operating system will report this activity in the /var/log/messages file with a message similar to the following format:

host kernel: Out of Memory: Killed process PID#HERE (processnamehere).

This does not necessarily mean that the process in question is using memory incorrectly, only that the kernel killed it to make more RAM available for the rest of the system.


 
Resolution
It is advisable to periodically scan the /var/log/messages or /var/log/syslog file to look for OOM Killer messages.
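
The periodic scan can be scripted. The following is a minimal sketch, not a definitive implementation: the function name scan_oom_log is hypothetical, and the search patterns are taken from the sample kernel messages quoted in this article, so they may need adjusting for the wording used by a particular kernel version.

```shell
#!/bin/sh
# Print OOM Killer entries found in a log file.
# $1 = log file to scan (default: /var/log/messages).
# The patterns are based on the sample messages in this article (assumption:
# the local kernel uses similar wording).
scan_oom_log() {
    log_file="${1:-/var/log/messages}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # -i: ignore case, since the capitalisation of "Out of memory" varies.
    grep -E -i 'invoked oom-killer|out of memory: kill|killed process' "$log_file"
}

# Example: scan_oom_log /var/log/syslog
```

The function returns grep's exit status, so a calling script can treat status 0 as "OOM messages found" and status 1 as "none found".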

If the messages file contains OOM Killer messages, monitor the RAM usage of all processes on the system to ensure that no process is leaking memory or is configured to use more memory than it needs.

Logs alone will not be able to tell the whole story of how the OOM event occurred, only that it happened and which processes were sacrificed. 

Monitoring could be done via a script that runs periodically via a cron job and collects ps, top output, etc. The goal is to capture statistical data before the OOM event, not after. The output should then be analyzed by the system administrator.
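
As an illustration of such a collection script, the sketch below appends a timestamped snapshot to a daily file. The function name snapshot_memory and the default output directory /var/tmp/oom-monitor are hypothetical, and the ps options assume the procps version of ps that most Linux distributions ship.

```shell
#!/bin/sh
# Append one timestamped memory snapshot to a daily file.
# $1 = output directory (default /var/tmp/oom-monitor, a hypothetical location).
snapshot_memory() {
    out_dir="${1:-/var/tmp/oom-monitor}"
    mkdir -p "$out_dir" || return 1
    out_file="$out_dir/mem-$(date +%Y%m%d).log"
    {
        echo "=== snapshot $(date '+%Y-%m-%d %H:%M:%S') ==="
        # System-wide memory counters (first lines of /proc/meminfo).
        if [ -r /proc/meminfo ]; then
            head -n 5 /proc/meminfo
        fi
        # Header plus the twenty largest processes by resident set size (RSS).
        # Assumes procps ps; skipped when ps is not installed.
        if command -v ps >/dev/null 2>&1; then
            ps -eo pid,ppid,rss,vsz,comm --sort=-rss | head -n 21
        fi
    } >> "$out_file"
}

# Example: snapshot_memory /var/tmp/oom-monitor
```

Saved as a script that ends with a call to snapshot_memory, this could be scheduled from a file in /etc/cron.d. Comparing the snapshots taken before an OOM event shows which processes were growing.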

Several articles are listed below on searching for potential memory leaks related to ABL or Progress Application Server (PASOE) code usage.

Keep in mind that an OOM event can also be triggered by the combined memory usage of all running processes, for example when increased user activity or newly started processes push the system over the threshold.

If the dstat command is available, the syntax below can be used to determine the top candidates to be killed by OOM killer in case of an Out Of Memory event. 
dstat --top-oom
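
Where dstat is not installed, similar information can be read directly from procfs, because the kernel exposes each process's current badness score in /proc/<pid>/oom_score. The following is a rough sketch: the function name top_oom_candidates is hypothetical, and the second parameter (an alternative proc root) is only there so the logic can be tried against a copy of the directory layout.

```shell
#!/bin/sh
# List the processes with the highest oom_score, highest first.
# Output columns: score, PID, command name.
# $1 = number of rows to print (default 10)
# $2 = proc root (default /proc; hypothetical parameter for trying the logic)
top_oom_candidates() {
    limit="${1:-10}"
    proc_root="${2:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        [ -r "$dir/oom_score" ] || continue
        # A process may exit between the glob and the read; skip it quietly.
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        printf '%s %s %s\n' "$score" "${dir##*/}" "$name"
    done | sort -k1,1nr | head -n "$limit"
}

# Example: top_oom_candidates 5
```

The process at the top of the list is the one the kernel would currently consider first if an OOM event occurred.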
Alternatively, the OOM killer can be disabled in some Linux versions until the cause is determined or further troubleshooting can be performed. 

Red Hat Enterprise Linux 4.2 and later RHEL 4 releases have the /proc/sys/vm/oom-kill tunable. Set this to 0 to disable the OOM Killer.

Red Hat Enterprise Linux 5, 6, 7, 8 and 9 do not have the ability to completely disable the OOM Killer. Please refer to the following solution provided by Red Hat for tuning OOM Killer operation within RHEL 5, RHEL 6, RHEL 7 and RHEL 8.

https://access.redhat.com/solutions/20985

Telling the OOM killer to ignore a process:

Disabling OOM killer is done on a process by process basis, so you’ll need to know the PID of the running process that you want to protect. This is far from ideal, as process IDs can change frequently, but we can script around it.

As documented by http://linux-mm.org/OOM_Killer: “Any particular process leader may be immunized against the oom killer if the value of its /proc/$pid/oom_adj is set to the constant OOM_DISABLE (currently defined as -17).”

This means we can disable OOM killer on an individual process, if we know its PID, using the command below:

# OOM_DISABLE on $PID
echo -17 > /proc/$PID/oom_adj

Using pgrep we can run this knowing only the name of the process. For example, let’s ensure that the ssh listener doesn’t get OOM killed:

pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

Here we used pgrep to search for the full command line (-f) matching “/usr/sbin/sshd” and then echoed -17 into the procfs entry for each matching PID.

To automate this, you could run a cron job regularly to update the oom_adj entry. This is a simple way to ensure that sshd is excluded from the OOM killer after restarting the daemon or the server.

# /etc/cron.d/oom_disable
*/1 * * * * root pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

The above job will run every minute, updating the oom_adj of every current process matching /usr/sbin/sshd. Of course this could be extended to include any other processes you wish to exclude from the OOM killer.
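
The oom_adj file quoted above is the older interface. Later kernels deprecate it in favour of /proc/<pid>/oom_score_adj, which accepts values from -1000 to 1000, with -1000 exempting the process from the OOM Killer. This point is not part of the original article, so treat the following as a sketch under that assumption: it prefers oom_score_adj when the file exists and falls back to oom_adj otherwise. The function names and the optional proc-root parameter are hypothetical, and the commands must run as root.

```shell
#!/bin/sh
# Exempt one PID from the OOM Killer.
# $1 = PID
# $2 = proc root (default /proc; hypothetical parameter for trying the logic)
exempt_pid() {
    target_pid="$1"
    proc_root="${2:-/proc}"
    if [ -w "$proc_root/$target_pid/oom_score_adj" ]; then
        # Newer interface: -1000 disables OOM killing for this process.
        printf '%s\n' "-1000" > "$proc_root/$target_pid/oom_score_adj"
    elif [ -w "$proc_root/$target_pid/oom_adj" ]; then
        # Older interface: -17 is OOM_DISABLE, as quoted in this article.
        printf '%s\n' "-17" > "$proc_root/$target_pid/oom_adj"
    else
        echo "no writable OOM tunable for PID $target_pid" >&2
        return 1
    fi
}

# Exempt every process whose full command line matches a pattern.
# $1 = pattern passed to pgrep -f
protect_from_oom() {
    pgrep -f "$1" | while read -r found_pid; do
        exempt_pid "$found_pid"
    done
}

# Example (as root): protect_from_oom "/usr/sbin/sshd"
```

As with the cron job above, the setting applies only to the PIDs that exist when the command runs, so it has to be reapplied after the process restarts.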

         

Workaround
Notes
Keyword Phrase
Last Modified Date: 11/15/2023 5:12 PM
