Information

Title	What is the OOM Killer (Out of Memory Killer)?

URL Name	What-is-the-OOM-Killer-Out-of-Memory-Killer

Article Number	000141887

Environment	Product: Progress Product: OpenEdge Version: All supported versions OS: Linux

Question/Problem Description

What is the Linux Out of Memory Killer?
What is the Linux OOM Killer?
Where does the Linux Out of Memory Killer report killing processes?
Where does the Linux OOM Killer report killing processes?
What killed my process?
Can we disable the OOM Killer?
Can we exclude a process from being killed by the OOM Killer?

Steps to Reproduce

Clarifying Information

Error Message	/var/log/messages may contain entries similar to the following:

kernel: <process name> invoked oom-killer
kernel: Out of memory: Kill process <#> (<process name>) score # or sacrifice child
kernel: Killed process <#> (<process name>)

Db log:

(4194) Broker disappeared, updating <path/db>.lk

Defect Number

Enhancement Number

Cause

When the Linux kernel detects that the system is running critically short of RAM, its OOM Killer tries to free memory by killing processes, generally favoring those that use large amounts of RAM.

The operating system will report this activity in the /var/log/messages file with a message similar to the following format:

host kernel: Out of Memory: Killed process PID#HERE (processnamehere).

This does not mean the process in question is incorrectly using memory but that the system is killing it to make more RAM available for the rest of the system.

Resolution

It is advisable to periodically scan the /var/log/messages or /var/log/syslog file to look for OOM Killer messages.
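A scan of this kind could be sketched as follows. This is a minimal example, not a definitive implementation: find_oom_events is a hypothetical helper name, and the search patterns are assumed from the sample messages quoted in this article, so they may need adjusting for the message format of a particular kernel.

```shell
#!/bin/sh
# Sketch: list OOM Killer entries in a system log.
# find_oom_events is a hypothetical helper; $1 is the log file to scan
# (defaults to /var/log/messages, use /var/log/syslog where applicable).
find_oom_events() {
    log_file="${1:-/var/log/messages}"
    # -i: ignore case, since the capitalization of the messages varies
    # -E: extended regular expression, needed for the alternation
    grep -iE 'invoked oom-killer|out of memory|killed process' "$log_file"
}
```

For example, find_oom_events /var/log/syslog prints every matching line. The "out of memory" pattern is deliberately broad, so it can also match application messages that are unrelated to the kernel.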

If the messages file contains any OOM Killer messages, monitor the RAM usage of all processes on the system to ensure that no process is leaking memory or is configured to use more memory than it needs.

Logs alone will not be able to tell the whole story of how the OOM event occurred, only that it happened and which processes were sacrificed.

Monitoring could be done via a script that runs periodically via a cron job and collects ps, top output, etc. The goal is to capture statistical data before the OOM event, not after. The output should then be analyzed by the system administrator.
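One possible shape for such a script is sketched below. The function name collect_mem_snapshot and the default output path are hypothetical, and the ps options assume the procps version of ps that most Linux distributions ship.

```shell
#!/bin/sh
# Sketch: append a timestamped memory snapshot to a log file.
# collect_mem_snapshot and the default output path are hypothetical.
collect_mem_snapshot() {
    out_file="${1:-/tmp/mem_snapshot.log}"
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        # System-wide memory counters reported by the kernel
        grep -E '^(MemTotal|MemFree|MemAvailable|SwapTotal|SwapFree):' /proc/meminfo
        # Header line plus the 15 processes with the largest resident set (RSS, in KiB)
        ps -eo pid,ppid,rss,vsz,comm --sort=-rss | head -n 16
    } >> "$out_file"
}
```

If the function is saved in a script that calls it, a cron entry can run that script every few minutes. The snapshots taken shortly before an OOM event then show which processes were growing.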

Several articles are listed below on searching for potential memory leaks related to ABL or Progress Application Server (PASOE) code usage.

Keep in mind that the combined memory usage of all running processes can also trigger an OOM event, for example when increased user activity or newly started processes push the system over the threshold.
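To see how much memory each kind of process uses in total, the per-process figures can be summed by command name. The sketch below assumes input lines of the form "RSS command", as printed by ps -eo rss=,comm=; sum_rss_by_command is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: total the resident memory (RSS, in KiB) per command name.
# sum_rss_by_command is a hypothetical helper that reads "RSS command" lines on stdin.
sum_rss_by_command() {
    awk '{
             rss = $1          # first field is the RSS value
             $1 = ""           # drop it so that $0 holds only the command name
             sub(/^ +/, "")    # trim the separator left behind
             total[$0] += rss
         }
         END { for (name in total) printf "%d %s\n", total[name], name }' |
        sort -rn               # largest total first
}
```

A typical invocation would be ps -eo rss=,comm= | sum_rss_by_command | head. RSS counts shared pages once for every process that maps them, so the totals overstate usage for processes attached to the same shared memory.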

If the dstat command is available, the syntax below can be used to determine the top candidates to be killed by OOM killer in case of an Out Of Memory event.

dstat --top-oom
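Where dstat is not installed, a similar ranking can be approximated from the oom_score value that the kernel exposes for every process under /proc. This is a sketch under that assumption; top_oom_candidates is a hypothetical helper, and its second argument exists only so that a different procfs root can be supplied.

```shell
#!/bin/sh
# Sketch: rank processes by oom_score (a higher score is killed first).
# top_oom_candidates is a hypothetical helper: $1 = rows to print, $2 = procfs root.
top_oom_candidates() {
    count="${1:-10}"
    proc_root="${2:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Processes can exit while the loop runs, so skip entries that vanished
        [ -r "$dir/oom_score" ] || continue
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        # Output columns: score, PID, command name
        printf '%s %s %s\n' "$score" "${dir##*/}" "$name"
    done | sort -rn | head -n "$count"
}
```

Calling top_oom_candidates 5 lists the five processes that the OOM Killer would currently consider first.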

Alternatively, the OOM killer can be disabled in some Linux versions until the cause is determined or further troubleshooting can be performed.

Red Hat Enterprise Linux 4.2 and newer 4.x releases have the /proc/sys/vm/oom-kill tunable. Set this to 0 to disable the OOM killer.

Red Hat Enterprise Linux 5, 6, 7, 8 and 9 do not have the ability to completely disable the OOM killer. Please refer to the following solution provided by Red Hat for tuning OOM killer operation within RHEL 5, RHEL 6, RHEL 7 and RHEL 8.

https://access.redhat.com/solutions/20985

Telling the OOM killer to ignore a process:

Disabling the OOM killer is done on a process-by-process basis, so you’ll need to know the PID of the running process that you want to protect. This is far from ideal, as process IDs can change frequently, but we can script around it.

As documented by http://linux-mm.org/OOM_Killer: “Any particular process leader may be immunized against the oom killer if the value of its /proc/$pid/oom_adj is set to the constant OOM_DISABLE (currently defined as -17).”

This means we can disable OOM killer on an individual process, if we know its PID, using the command below:

# OOM_DISABLE on $PID

echo -17 > /proc/$PID/oom_adj

Using pgrep we can run this knowing only the name of the process. For example, let’s ensure that the ssh listener doesn’t get OOM killed:

pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

Here we used pgrep to search for the full command line (-f) matching “/usr/sbin/sshd” and then echo -17 into the procfs entry for each matching pid.

To automate this, you could run a cron job regularly to update the oom_adj entry. This is a simple way to ensure that sshd is excluded from the OOM killer after the daemon or the server is restarted.

#/etc/cron.d/oom_disable

*/1 * * * * root pgrep -f "/usr/sbin/sshd" | while read PID; do echo -17 > /proc/$PID/oom_adj; done

The above job will run every minute, updating the oom_adj of every running process that matches /usr/sbin/sshd. Of course this could be extended to include any other processes you wish to exclude from the OOM killer.
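On current kernels the oom_adj file is deprecated in favor of /proc/$PID/oom_score_adj, which accepts values from -1000 to 1000, with -1000 exempting the process. The sketch below prefers the newer file and falls back to oom_adj; set_oom_exempt is a hypothetical helper name, its second argument exists only so that a different procfs root can be supplied, and lowering the value for a real process normally requires root.

```shell
#!/bin/sh
# Sketch: exempt one process from the OOM killer.
# set_oom_exempt is a hypothetical helper: $1 = PID, $2 = procfs root (default /proc).
set_oom_exempt() {
    pid="$1"
    proc_root="${2:-/proc}"
    if [ -w "$proc_root/$pid/oom_score_adj" ]; then
        # Newer interface: -1000 exempts the process
        echo -1000 > "$proc_root/$pid/oom_score_adj"
    elif [ -w "$proc_root/$pid/oom_adj" ]; then
        # Legacy interface used above: -17 is OOM_DISABLE
        echo -17 > "$proc_root/$pid/oom_adj"
    else
        echo "cannot adjust OOM setting for PID $pid" >&2
        return 1
    fi
}
```

It can be combined with pgrep in the same way as the commands above, for example: pgrep -f "/usr/sbin/sshd" | while read -r PID; do set_oom_exempt "$PID"; done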

Workaround

Notes

Progress Article(s):

How to detect ABL Memory Leaks with Dynamic Objects Logging
Detect PASOE memory leaks with Dynamic Objects Logging
Sample Code for Analyzing PASOE Agent Log for Possible Memory Leaks

Keyword Phrase

Last Modified Date	11/15/2023 5:12 PM
