LLMjacking 2.0: Why Your AI Infrastructure Is the Hacker's New Favorite Piggy Bank

Introduction

The cybersecurity landscape is witnessing a paradigm shift as attackers pivot from traditional ransomware and data exfiltration toward a more insidious target: artificial intelligence infrastructure. LLMjacking—the unauthorized hijacking of large language model resources—has evolved from a mere financial nuisance into a sophisticated attack vector that threatens both cloud budgets and corporate intellectual property. What began in 2024 as credential theft targeting Amazon Bedrock instances has matured into a multi-pronged assault methodology that includes direct resource compromise, session token replay, and even the wholesale theft of authenticated AI agent installations【8†L1-L4】.

Learning Objectives

Understand the four distinct LLMjacking attack vectors and differentiate between direct and indirect compromise techniques
Master practical detection and mitigation strategies for cloud-hosted and self-hosted LLM deployments
Implement hardened authentication, logging, and access control measures to prevent AI resource hijacking

You Should Know

1. Direct LLMjacking: The Assault on Cloud-Hosted and Self-Hosted Models

Direct LLMjacking represents the most straightforward attack path: adversaries go straight for the model resource itself. This manifests in two primary scenarios.

Cloud-Hosted Model Compromise

Attackers leverage stolen cloud credentials—often obtained through phishing, credential stuffing, or infostealer malware—to gain unauthorized access to managed AI services like Amazon Bedrock or Google Vertex AI. The most concerning aspect is that attackers frequently disable invocation logging to conceal their activities, making detection extraordinarily difficult for security teams【8†L8-L10】. The financial impact is staggering: early incidents in 2024 generated bills reported as high as $46,000 per day【8†L6】.
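
Because logging is often the first thing an attacker switches off, one hedged way to surface this is to scan CloudTrail management events for calls that change Bedrock's invocation logging configuration. The sketch below assumes the records have already been exported as dictionaries in CloudTrail's JSON shape (`eventSource`, `eventName`, `userIdentity`); the function name and the sample records are hypothetical and only illustrate the idea.

```python
# Bedrock API calls that change model invocation logging.
LOGGING_TAMPER_EVENTS = {
    "DeleteModelInvocationLoggingConfiguration",
    "PutModelInvocationLoggingConfiguration",
}

def find_logging_tamper_events(records):
    """Return CloudTrail records that modify Bedrock invocation logging."""
    hits = []
    for record in records:
        if (record.get("eventSource") == "bedrock.amazonaws.com"
                and record.get("eventName") in LOGGING_TAMPER_EVENTS):
            hits.append(record)
    return hits

# Hypothetical sample records, for illustration only.
sample_records = [
    {"eventSource": "bedrock.amazonaws.com",
     "eventName": "ListFoundationModels",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/app"}},
    {"eventSource": "bedrock.amazonaws.com",
     "eventName": "DeleteModelInvocationLoggingConfiguration",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/app"}},
]

for event in find_logging_tamper_events(sample_records):
    print("ALERT:", event["eventName"], "by", event["userIdentity"]["arn"])
```

Any hit from an identity that normally only invokes models is worth treating as a high-priority signal.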

Self-Hosted Open Model Exploitation

The proliferation of self-hosted runtimes for open models, such as Ollama, has created a sprawling attack surface. Servers left exposed to the internet with no authentication have become prime targets. In a notable incident documented by Sysdig, a compromised Ollama server was repurposed into an AI-powered hacking pipeline capable of target enumeration, CVE identification, and automated exploit development【8†L12-L15】【10†L4-L7】.

Step-by-Step: Securing Your Self-Hosted Ollama Deployment

Audit your exposure: Run `nmap -p 11434 your-server-ip` to check if the Ollama port is accessible externally. If open, immediate action is required.
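
Where `nmap` is not available, the same exposure check can be approximated with a plain TCP connection attempt. This is a minimal sketch, assuming it is run from a machine outside your network; the function name `is_port_reachable` and the placeholder address are not from any Ollama tooling.

```python
import socket

def is_port_reachable(host, port=11434, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, or timed out: treat as not reachable.
        return False

if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; substitute the server's public address.
    print(is_port_reachable("127.0.0.1"))
```

A `True` result from an external vantage point means the port is answering connections and should be restricted.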

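Implement authentication: Ollama does not natively support authentication, so place it behind a reverse proxy that enforces it. The `caddy reverse-proxy` command has no basic-auth flag, so define the credentials in a Caddyfile:

# Caddyfile: reverse proxy with basic auth in front of Ollama
# Generate the hash with `caddy hash-password`; Caddy releases before 2.8 spell the directive `basicauth`
your-domain.com {
    basic_auth {
        username hashed-password
    }
    reverse_proxy localhost:11434
}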

Restrict network access: Configure firewall rules to allow only trusted IP ranges:

# Linux iptables example: allow the trusted subnet, then drop everything else
iptables -A INPUT -p tcp --dport 11434 -s 192.168.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 11434 -j DROP

Enable comprehensive logging: Run Ollama with debug logging, keep the output on disk, and monitor it for anomalous patterns:
```
# Run Ollama with debug logging enabled and append all output to a log file
OLLAMA_DEBUG=1 ollama serve 2>&1 | tee -a /var/log/ollama.log
```

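Implement rate limiting: Use a tool like `fail2ban` to ban clients that send excessive requests:

# fail2ban jail for Ollama (for example in /etc/fail2ban/jail.local)
# It expects a matching filter definition in /etc/fail2ban/filter.d/ollama.conf
[ollama]
enabled = true
port = 11434
filter = ollama
logpath = /var/log/ollama.log
maxretry = 10
bantime = 3600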
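
If the limit is enforced in an application layer rather than by log-based banning, a sliding-window counter is one common approach. The sketch below is an illustration under that assumption; the class name `SlidingWindowLimiter` and the client address are hypothetical, and the `now` parameter exists only to make the behavior easy to demonstrate.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests=10, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
decisions = [limiter.allow("203.0.113.7", now=t) for t in (0, 1, 2, 3, 61)]
print(decisions)  # [True, True, True, False, True]
```

The fourth request is rejected because three requests already sit inside the window; by the fifth, the earliest ones have expired.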

2. Indirect LLMjacking: The Credential and Session Theft Epidemic

Indirect LLMjacking represents a more subtle but equally devastating approach. Instead of attacking the model directly, adversaries inherit legitimate user access through stolen credentials or hijacked sessions.

Stolen Access via Token Replay

Attackers harvest session and refresh tokens—often through infostealer malware or browser cookie theft—and replay them to gain unauthorized access to victim AI accounts like Claude or ChatGPT. This attack is already occurring at an alarming scale; IBM X-Force observed approximately 300,000 stolen ChatGPT credentials in circulation【8†L18-L20】. In the OALABS incident, attackers successfully copied authentication tokens across multiple hosts and used them to establish persistent access【8†L20-L22】.

Hijacked Agent Installations

Perhaps the most sophisticated variant involves attackers copying entire authenticated agent installations—tools, session history, and all. The OALABS incident demonstrated this technique vividly: attackers stole Claude Code and Codex installation directories, subsequently breaching at least 14 companies. Notably, the attackers appeared relatively low-skilled, relying on the stolen AI agents to conduct professional-grade offensive operations, down to drafting the final reports【8†L22-L26】.

Step-by-Step: Detecting and Preventing Token Theft

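1. Implement short-lived sessions: Configure your AI platforms to use short-lived access tokens (15-30 minutes) with refresh token rotation:
```
# Python example using PyJWT with a short expiration
import datetime

import jwt  # PyJWT

user_id = "example-user"  # placeholder subject
secret_key = "replace-with-a-long-random-secret-value"  # placeholder; load from a secret store

token = jwt.encode({
    'user': user_id,
    # Expire 15 minutes from now (timezone-aware UTC)
    'exp': datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(minutes=15)
}, secret_key, algorithm='HS256')
```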
2. Monitor for token replay anomalies: Implement detection rules for:
– Multiple geographic locations using the same token
– Unusual user-agent strings
– Access patterns inconsistent with historical behavior
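3. Enforce device binding: Bind sessions to specific device fingerprints and invalidate them when a mismatch occurs. At a minimum, log the session attributes so mismatches can be detected:
```
# Example: log all session attributes for anomaly detection
# ($IP, $UA, and $TOKEN_HASH are assumed to be set by the calling script)
echo "$(date) - User: $USER - IP: $IP - User-Agent: $UA - Token: $TOKEN_HASH" >> /var/log/session_audit.log
```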
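
The binding check itself can be sketched as a keyed fingerprint that is stored at login and recomputed on every request. This is an illustration only: `SERVER_KEY`, the chosen attributes, and both function names are assumptions, and attributes an attacker can copy (such as the user agent) give weaker guarantees than hardware-backed keys.

```python
import hashlib
import hmac

SERVER_KEY = b"replace-with-a-random-server-side-key"  # placeholder secret

def device_fingerprint(user_agent, client_id):
    """Derive a keyed fingerprint from attributes the client presents."""
    material = f"{user_agent}|{client_id}".encode("utf-8")
    return hmac.new(SERVER_KEY, material, hashlib.sha256).hexdigest()

def session_matches_device(stored_fingerprint, user_agent, client_id):
    """Compare in constant time; a mismatch should invalidate the session."""
    presented = device_fingerprint(user_agent, client_id)
    return hmac.compare_digest(stored_fingerprint, presented)

# At login: store the fingerprint alongside the session record.
stored = device_fingerprint("Mozilla/5.0 (X11; Linux x86_64)", "device-1234")

# On each request: recompute and compare.
print(session_matches_device(stored, "Mozilla/5.0 (X11; Linux x86_64)", "device-1234"))  # True
print(session_matches_device(stored, "curl/8.5.0", "device-1234"))  # False
```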
4. Conduct regular token inventory audits: Periodically review active sessions and force re-authentication for suspicious entries:
```
-- SQL query to identify sessions with unusual activity
SELECT user_id, COUNT(DISTINCT ip_address) as ip_count, 
COUNT(DISTINCT user_agent) as ua_count
FROM session_logs
WHERE timestamp > NOW() - INTERVAL 1 DAY
GROUP BY user_id
HAVING ip_count > 3 OR ua_count > 2;
```
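5. Deploy credential monitoring: Use tools like the HaveIBeenPwned API or commercial threat intelligence feeds to detect compromised credentials:
```
# Check whether an account appears in known breaches
# (user@example.com is a placeholder; the address must be URL-encoded, and the API expects a user-agent header)
curl -X GET "https://haveibeenpwned.com/api/v3/breachedaccount/user%40example.com" \
-H "hibp-api-key: YOUR_API_KEY" \
-H "user-agent: llmjacking-credential-monitor"
```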
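
The replay-detection rule listed in the steps above (one token used from multiple geographic locations) can be sketched as follows. The event tuple layout, the one-hour window, and the function name are assumptions for illustration; in practice the country would come from a GeoIP lookup in your logging pipeline.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def find_replayed_tokens(events, window=timedelta(hours=1)):
    """Flag token hashes seen from more than one country within the window.

    events: iterable of (timestamp, token_hash, country) tuples.
    """
    by_token = defaultdict(list)
    for timestamp, token_hash, country in events:
        by_token[token_hash].append((timestamp, country))

    flagged = set()
    for token_hash, sightings in by_token.items():
        sightings.sort()
        for i, (first_seen, first_country) in enumerate(sightings):
            for later_seen, later_country in sightings[i + 1:]:
                if later_seen - first_seen > window:
                    break  # later sightings are even further away
                if later_country != first_country:
                    flagged.add(token_hash)
    return flagged

# Hypothetical events, for illustration only.
events = [
    (datetime(2025, 1, 1, 9, 0), "tok-a", "DE"),
    (datetime(2025, 1, 1, 9, 20), "tok-a", "BR"),
    (datetime(2025, 1, 1, 9, 0), "tok-b", "US"),
    (datetime(2025, 1, 1, 15, 0), "tok-b", "CA"),
]
print(find_replayed_tokens(events))  # {'tok-a'}
```

Here `tok-a` is flagged because it appears in two countries within 20 minutes, while `tok-b` changes country only after six hours.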
3. Cloud Credential Hardening: Protecting Your AI Spend

Given that cloud-hosted LLMs remain a primary target, hardening cloud credentials is non-negotiable. Attackers who gain access to AWS, GCP, or Azure credentials can invoke expensive model endpoints at will, often disabling logging to avoid detection.

Step-by-Step: Implementing Least-Privilege Access for AI Services
1. Create dedicated service accounts: Never use root or administrative credentials for AI operations. Create service accounts with minimal permissions:
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowModelInvocationOnly",
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/*"
    },
    {
      "Sid": "DenyLoggingRemoval",
      "Effect": "Deny",
      "Action": "bedrock:DeleteModelInvocationLoggingConfiguration",
      "Resource": "*"
    }
  ]
}
```
2. Enable logging for all AI services: Ensure CloudTrail and Bedrock model invocation logging are enabled and cannot be disabled by the same credentials used for model invocation:
```
# AWS CLI command to enable Bedrock model invocation logging
aws bedrock put-model-invocation-logging-configuration \
--logging-config '{"cloudWatchConfig":{"logGroupName":"bedrock-logs","roleArn":"arn:aws:iam::ACCOUNT:role/bedrock-logging"},"s3Config":{"bucketName":"bedrock-logs"}}'
```
3. Implement budget alerts: Configure AWS Budgets to alert on abnormal spending patterns:
```
# Create a daily cost budget scoped to Bedrock, alerting at 80% of the limit
# (security@example.com is a placeholder address)
aws budgets create-budget --account-id 123456789012 \
--budget '{"BudgetName":"Bedrock-Spend-Alert","BudgetLimit":{"Amount":"1000","Unit":"USD"},"TimeUnit":"DAILY","BudgetType":"COST","CostFilters":{"Service":["Amazon Bedrock"]}}' \
--notifications-with-subscribers '[{"Notification":{"NotificationType":"ACTUAL","ComparisonOperator":"GREATER_THAN","Threshold":80,"ThresholdType":"PERCENTAGE"},"Subscribers":[{"SubscriptionType":"EMAIL","Address":"security@example.com"}]}]'
```
4. Rotate credentials automatically: Use AWS Secrets Manager or HashiCorp Vault to manage and rotate credentials. For IAM access keys, a manual rotation looks like this:
```
# Rotate IAM access keys: create the new key, switch workloads to it, then delete the old key
aws iam create-access-key --user-name ai-service-user
aws iam delete-access-key --user-name ai-service-user --access-key-id OLD_KEY_ID
```
5. Deploy anomaly detection: Use machine learning-based services like Amazon GuardDuty to detect unusual API calls:
```
# Enable GuardDuty for your account
aws guardduty create-detector --enable
```
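
To decide which keys the rotation step should target, an age check over key metadata is one simple option. The sketch below assumes the input has the shape of the `AccessKeyMetadata` entries returned by IAM's `list_access_keys` (`AccessKeyId`, `Status`, `CreateDate`); the 90-day threshold, the function name, and the sample key IDs are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def keys_due_for_rotation(access_keys, max_age_days=90, now=None):
    """Return IDs of active keys older than max_age_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in access_keys
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]

# Hypothetical metadata, for illustration only.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
sample_keys = [
    {"AccessKeyId": "AKIAEXAMPLEOLDKEY", "Status": "Active",
     "CreateDate": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLENEWKEY", "Status": "Active",
     "CreateDate": datetime(2025, 5, 20, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLEOFFKEY", "Status": "Inactive",
     "CreateDate": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
print(keys_due_for_rotation(sample_keys, max_age_days=90, now=now))  # ['AKIAEXAMPLEOLDKEY']
```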
4. Self-Hosted Model Security: Beyond Authentication

While authentication is critical, self-hosted models present additional security challenges. Attackers can exploit model vulnerabilities, poison training data, or extract sensitive information through prompt injection.

Step-by-Step: Hardening Self-Hosted LLM Deployments
1. Implement input sanitization: Filter and validate all prompts before they reach the model:
```
import re

def sanitize_prompt(prompt):
    # Remove potential injection attempts (embedded script tags)
    prompt = re.sub(r'<script.*?>.*?</script>', '', prompt, flags=re.DOTALL)
    # Limit prompt length
    if len(prompt) > 4096:
        prompt = prompt[:4096]
    return prompt
```
2. Deploy a WAF (Web Application Firewall): Use ModSecurity or AWS WAF to filter malicious requests:
```
# ModSecurity rule to block suspicious prompts
SecRule ARGS "(\bignore\b.*\bprevious\b|\bprint\b.*\bsecret\b)" \
"id:10001,phase:2,deny,status:403,msg:'Potential prompt injection detected'"
```
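3. Isolate model execution: Run models in containers with limited resources and no network access:
```
# Run Ollama in a container with network restrictions
# With --network none the API is reachable only from inside the container (for example via docker exec),
# and models must already be present in the mounted volume
docker run -d --name ollama \
--network none \
--memory 8g \
--cpus 4 \
-v ollama-data:/root/.ollama \
ollama/ollama
```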
4. Monitor model output: Implement content filtering to prevent data exfiltration:
```
import re

def filter_output(response):
    # Check for sensitive data patterns
    if re.search(r'\b\d{3}-\d{2}-\d{4}\b', response):  # SSN pattern
        return "Response contains sensitive information and has been blocked"
    return response
```
5. Regular security assessments: Conduct penetration testing on your AI infrastructure:
```
# Use OWASP ZAP to scan for vulnerabilities
zap-cli quick-scan --spider -r http://your-ollama-server:11434
```
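
Output monitoring can also cover credential-like strings, redacting them instead of blocking the whole response. The sketch below is an illustration with two assumed patterns (the documented AWS access key ID prefix format and a PEM private key header); the pattern set and the function name `redact_secrets` are hypothetical and would need tuning for real traffic.

```python
import re

SECRET_PATTERNS = {
    # AWS access key IDs: AKIA (long-term) or ASIA (temporary) plus 16 characters.
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # Header line of a PEM-encoded private key.
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def redact_secrets(response):
    """Replace credential-like substrings and report which patterns matched."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        response, count = pattern.subn(f"[REDACTED:{name}]", response)
        if count:
            findings.append(name)
    return response, findings

# AKIAIOSFODNN7EXAMPLE is the example key ID used in AWS documentation.
text, found = redact_secrets("key=AKIAIOSFODNN7EXAMPLE done")
print(text)   # key=[REDACTED:aws_access_key_id] done
print(found)  # ['aws_access_key_id']
```

Logging the `findings` list alongside the request gives the detection program an exfiltration signal even when the response is still delivered.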
5. Detection and Incident Response for LLMjacking

Early detection is critical to minimize financial and reputational damage. Organizations must develop specific detection capabilities for AI infrastructure compromise.

Step-by-Step: Building an LLMjacking Detection Program
1. Establish baseline behavior: Document normal usage patterns including:
– Average number of daily invocations
– Typical token consumption per user
– Normal geographic distribution of access
– Standard times of day for model usage
2. Deploy SIEM alerts: Create alerts for suspicious activity, for example a Splunk query that flags anomalous invocation patterns:
```
index=ai_logs source=bedrock
| stats count, avg(tokens) as avg_tokens by user, hour
| where count > 1000 OR avg_tokens > 10000
| eval anomaly=if(count > 1000, "High volume", "High tokens")
```
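3. Implement real-time cost monitoring: Use cloud provider APIs to track spending as close to real time as the billing data allows:
```
import datetime

import boto3

def alert_security_team(message):
    # Placeholder: replace with your paging or ticketing integration
    print(message)

def check_bedrock_spending(threshold_usd=1000):
    # Query Cost Explorer for yesterday's Bedrock spend (the End date is exclusive)
    today = datetime.date.today()
    start = today - datetime.timedelta(days=1)
    client = boto3.client('ce')
    response = client.get_cost_and_usage(
        TimePeriod={'Start': start.isoformat(), 'End': today.isoformat()},
        Granularity='DAILY',
        Metrics=['UnblendedCost'],
        Filter={'Dimensions': {'Key': 'SERVICE', 'Values': ['Amazon Bedrock']}}
    )
    cost = response['ResultsByTime'][0]['Total']['UnblendedCost']['Amount']
    if float(cost) > threshold_usd:
        alert_security_team(f"Abnormal Bedrock spend detected: ${cost}")
```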
4. Create an incident response playbook: Document procedures for:
– Credential revocation
– Model access suspension
– Forensic evidence collection
– Communication with cloud providers
– Regulatory notification requirements
5. Conduct tabletop exercises: Regularly simulate LLMjacking scenarios to test response readiness.
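
The baseline step in this program can be turned into a simple statistical check. The sketch below assumes daily invocation counts are already collected per user or account and flags a day that sits far above the historical mean; the z-score threshold of 3, the function name, and the sample counts are illustrative assumptions, not tuned values.

```python
from statistics import mean, stdev

def is_anomalous(history, today, z_threshold=3.0):
    """Return True if today's count is far above the historical baseline."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline_mean = mean(history)
    baseline_std = stdev(history)
    if baseline_std == 0:
        return today > baseline_mean
    return (today - baseline_mean) / baseline_std > z_threshold

# Hypothetical daily invocation counts for one account.
daily_invocations = [980, 1020, 1005, 990, 1010, 995, 1000]
print(is_anomalous(daily_invocations, 1015))   # False
print(is_anomalous(daily_invocations, 25000))  # True
```

The same function can be applied to token consumption or hourly counts once those baselines exist.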
What Undercode Say
- LLMjacking has evolved beyond simple credential theft into a multi-vector threat encompassing direct resource attacks, token replay, and agent hijacking, each requiring distinct defensive strategies.
- The attacker skill gap is narrowing as AI agents themselves become weapons. Low-skilled adversaries can now execute sophisticated operations by leveraging stolen authenticated AI installations, democratizing advanced offensive capabilities.

The evolution of LLMjacking represents a fundamental shift in the threat landscape. What began as a financial nuisance (runaway cloud bills) has matured into a sophisticated attack methodology capable of full-spectrum compromise. Organizations can no longer treat AI security as an afterthought or rely solely on cloud provider defaults. The OALABS incident demonstrates that attackers are actively targeting AI infrastructure for operational advantage, not just financial gain. The ability to steal an entire authenticated AI agent and repurpose it for offensive operations represents a new class of threat that traditional security controls are ill-equipped to handle.

The most concerning aspect is the asymmetry: defenders must secure every access point, every model endpoint, and every credential, while attackers need only find a single vulnerability. This reality demands a proactive, defense-in-depth approach that combines technical controls with continuous monitoring and rapid incident response capabilities【8†L2-L4】.
    
Prediction
- +1 The LLMjacking threat will catalyze the development of AI-specific security frameworks and standards, driving innovation in identity management, anomaly detection, and automated response systems tailored to AI workloads.
- -1 The commoditization of AI-powered hacking tools through stolen agent installations will lower the barrier to entry for cybercriminals, leading to a surge in AI-enabled attacks across all sectors.
- -1 Cloud providers will face increasing pressure to implement stronger default security controls for AI services, potentially leading to friction for legitimate users but ultimately raising the baseline security posture.
- +1 Organizations that invest early in AI security capabilities, including credential hardening, continuous monitoring, and incident response playbooks, will gain a competitive advantage in resilience and trust.
- -1 The financial impact of LLMjacking will escalate as models become more capable and expensive to operate, with potential for six-figure daily losses becoming commonplace for unprepared organizations.
    Reported By: Aondona Llmjacking – Hackers Feeds