Preventing Out-Of-Memory (OOM) Issues — The Smarter Way With Oomd

Ever had your Linux system slow to a crawl or crash under memory pressure? That is where oomd comes in: a userspace OOM killer that relieves memory pressure more gracefully, before the system reaches its breaking point.

What is oomd?

When your system runs out of memory, the kernel steps in to kill processes with little flexibility or warning. oomd shifts this logic to userspace, giving you more control and smarter decision-making.

How Does It Help?

oomd monitors memory pressure using PSI (Pressure Stall Information) and cgroupv2, keeping an eye on how stressed your system is. Before memory exhaustion happens at the kernel level, oomd takes action—like terminating processes that cross certain thresholds—all based on customizable rules and plugins.

Why Use It?

– Avoids long host lockups by acting early
– Flexible, workload-specific configuration
– Actively used in production at scale (e.g., Facebook)
– Reduces time spent in kernelspace memory thrashing

Learn more here: GitHub – facebookincubator/oomd

You Should Know:

1. Installing oomd on Linux (Debian/Ubuntu)

sudo apt update 
sudo apt install oomd

2. Enabling cgroupv2

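# Check that the kernel supports cgroup v2
grep cgroup2 /proc/filesystems

# Enable cgroup v2 if it is not active. grubby ships with Fedora/RHEL; on
# Debian/Ubuntu, add the parameter to GRUB_CMDLINE_LINUX in /etc/default/grub
# and run "sudo update-grub" instead.
sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
sudo reboot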
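
Finding cgroup2 in /proc/filesystems only shows that the kernel supports it. One way to see whether the unified hierarchy is actually mounted is to look at the filesystem type at /sys/fs/cgroup. The sketch below assumes GNU `stat`; `cgroup_mode` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Classify the cgroup setup from the filesystem type mounted at /sys/fs/cgroup.
# cgroup_mode is a hypothetical helper, not part of oomd or systemd.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2" ;;            # unified hierarchy is mounted
        tmpfs)     echo "v1-or-hybrid" ;;  # legacy layout with per-controller mounts
        *)         echo "unknown" ;;
    esac
}

# Only query the live system if the mount point exists.
if [ -d /sys/fs/cgroup ]; then
    fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null) || fstype=""
    echo "cgroup mode: $(cgroup_mode "$fstype")"
fi
```

If this reports anything other than v2, the boot parameter from this step has not taken effect yet.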

3. Configuring oomd Rules

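Create a config file (`/etc/oomd/oomd.yaml`) with custom rules. The upstream oomd documentation describes a JSON configuration built from rulesets and plugins, so check the file location, key names, and plugin names below against the version you install:

rules:
  - name: "high_memory_pressure"
    detectors:
      - name: "memory_pressure"
        args:
          threshold: 60
          duration: 10s
    actions:
      - name: "kill_by_memory"
        args:
          cgroup: "/system.slice/"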
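
The rule above pairs a threshold with a duration, which suggests that pressure has to stay above the limit for the whole window before the action fires. A minimal sketch of that idea in shell, assuming one avg10 sample per line on standard input (at one sample per second, 10 samples would roughly correspond to `duration: 10s`). `sustained_over` is a hypothetical helper for illustration and is not how oomd itself is implemented.

```shell
#!/bin/sh
# sustained_over THRESHOLD COUNT
# Reads one pressure sample per line from stdin. Prints "trigger" as soon as
# COUNT consecutive samples exceed THRESHOLD, otherwise prints "ok".
# Hypothetical helper; it only illustrates "threshold held for a duration".
sustained_over() {
    awk -v t="$1" -v n="$2" '
        { if ($1 + 0 > t + 0) run++; else run = 0 }   # a dip resets the streak
        run >= n + 0 { print "trigger"; fired = 1; exit }
        END { if (!fired) print "ok" }
    '
}

# Three samples in a row above 60: the condition holds.
printf '%s\n' 61 72 80 | sustained_over 60 3

# A dip to 20 in the middle breaks the streak: no action.
printf '%s\n' 61 20 80 | sustained_over 60 3
```

Requiring a sustained breach rather than a single spike is what keeps a short burst of reclaim activity from killing a healthy workload.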

4. Starting oomd

sudo systemctl enable oomd 
sudo systemctl start oomd

5. Monitoring Memory Pressure with PSI

# Check current memory pressure
cat /proc/pressure/memory
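
The PSI file contains a `some` line and a `full` line, each with `avg10`, `avg60`, `avg300`, and `total` fields written as key=value pairs. To pull out a single number for scripting, a small awk wrapper is enough. `psi_value` is a hypothetical helper name; the optional third argument lets it read a saved copy instead of the live file.

```shell
#!/bin/sh
# psi_value KIND FIELD [FILE]
# Prints one field (avg10, avg60, avg300 or total) from the "some" or "full"
# line of a PSI file. FILE defaults to /proc/pressure/memory.
# Hypothetical helper for illustration.
psi_value() {
    awk -v kind="$1" -v field="$2" '
        $1 == kind {
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")          # "avg10=1.50" -> kv[1]="avg10", kv[2]="1.50"
                if (kv[1] == field) print kv[2]
            }
        }
    ' "${3:-/proc/pressure/memory}"
}

# Only read the live file where PSI is available.
if [ -r /proc/pressure/memory ]; then
    echo "some avg10: $(psi_value some avg10)"
    echo "full avg10: $(psi_value full avg10)"
fi
```

The `some` line counts time in which at least one task was stalled on memory, while `full` counts time in which all non-idle tasks were stalled at once, so a rising `full` value is the stronger warning sign.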

6. Testing oomd Behavior

Simulate memory pressure using `stress-ng`:

sudo apt install stress-ng 
stress-ng --vm 4 --vm-bytes 2G --timeout 60s

7. Logging oomd Actions

journalctl -u oomd -f

8. Excluding Critical Processes

Modify the config to exclude essential services (e.g., sshd, dbus):

actions:
  - name: "kill_by_memory"
    args:
      cgroup: "/system.slice/"
      avoid: "sshd|dbus"
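
Before relying on a pattern like `sshd|dbus`, it helps to check what it would match. The sketch below assumes the pattern is treated as an extended regular expression applied to unit names, which is an assumption about the `avoid` argument and should be confirmed against the oomd documentation. `is_avoided` and the unit names are hypothetical.

```shell
#!/bin/sh
# is_avoided PATTERN NAME
# Succeeds (exit status 0) if NAME matches the extended regex PATTERN.
# Hypothetical helper; it only previews what a pattern would cover.
is_avoided() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Preview the pattern against a few example unit names.
for unit in sshd.service dbus.service nginx.service; do
    if is_avoided 'sshd|dbus' "$unit"; then
        echo "$unit: protected"
    else
        echo "$unit: eligible"
    fi
done
```

An unanchored pattern matches substrings, so `dbus` would also cover a unit such as `dbus-broker.service`; anchor it (for example `^dbus\.service$`) if that is not intended.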

9. Adjusting Kill Strategies

Use recursive killing for containerized workloads:

actions:
  - name: "kill_by_memory_recursive"
    args:
      cgroup: "/docker/"
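
To see what a recursive action would affect, you can list every process under a cgroup subtree. In cgroup v2 each cgroup directory exposes a `cgroup.procs` file with one PID per line, so walking the tree is enough. This is a sketch: `cgroup_pids_recursive` is a hypothetical helper, and the `/sys/fs/cgroup/docker` path is an assumption that depends on the cgroup driver Docker uses.

```shell
#!/bin/sh
# cgroup_pids_recursive DIR
# Prints every PID listed in cgroup.procs files under DIR, including nested
# child cgroups, sorted numerically. Hypothetical helper for illustration.
cgroup_pids_recursive() {
    find "$1" -type f -name cgroup.procs -exec cat {} + 2>/dev/null | sort -n
}

# Assumed location; with the systemd cgroup driver containers usually live
# under system.slice instead.
root=/sys/fs/cgroup/docker
if [ -d "$root" ]; then
    cgroup_pids_recursive "$root"
fi
```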

10. Integrating with Kubernetes

For Kubernetes nodes, apply oomd rules to pods:

rules:
  - name: "k8s_oom_protection"
    detectors:
      - name: "memory_pressure"
        args:
          threshold: 70
          duration: 15s
    actions:
      - name: "kill_by_memory"
        args:
          cgroup: "/kubepods.slice/"
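
A memory-based kill action has to pick a victim, and the largest consumers are the obvious candidates. To preview which cgroups under the pod slice are using the most memory, you can rank them by the cgroup v2 `memory.current` file (bytes in use). This is a sketch: `top_memory_cgroups` is a hypothetical helper, and the `/sys/fs/cgroup/kubepods.slice` path assumes the systemd cgroup driver.

```shell
#!/bin/sh
# top_memory_cgroups DIR [N]
# Lists the N (default 5) largest cgroups below DIR by memory.current,
# as "<bytes> <cgroup path>", largest first. DIR itself is skipped because
# its own total would always rank first. Hypothetical helper.
top_memory_cgroups() {
    find "$1" -mindepth 2 -type f -name memory.current 2>/dev/null |
    while IFS= read -r f; do
        printf '%s %s\n' "$(cat "$f")" "$(dirname "$f")"
    done | sort -rn | head -n "${2:-5}"
}

# Assumed location of the pod hierarchy on a node.
root=/sys/fs/cgroup/kubepods.slice
if [ -d "$root" ]; then
    top_memory_cgroups "$root" 5
fi
```

Because nested cgroups are counted in their parents, a pod-level cgroup and the container cgroups inside it can both appear in the list.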

What Undercode Says:

Managing OOM issues is critical for system stability, especially in high-load environments. oomd provides a smarter alternative to the kernel’s OOM killer by acting preemptively and allowing fine-tuned control. Key takeaways:
– Use PSI metrics to monitor memory pressure.
– Configure cgroupv2 for granular process control.
– Test rules in staging before production.
– Combine oomd with Kubernetes for cloud-native resilience.

For advanced users, explore eBPF-based OOM prevention or integrate with Prometheus for real-time alerts.

Expected Output:

A fully configured oomd setup that prevents system crashes by intelligently killing processes under memory pressure, with logs and metrics for observability.

References:

Reported By: Tania Duggal – Hackers Feeds