Run AI Pen Testing Without The Data Leak: A Local LLM Setup Guide + Video

Introduction:

The integration of Artificial Intelligence into offensive security workflows is rapidly shifting from a novelty to a necessity. However, this evolution brings a critical paradox: the same tools used to find vulnerabilities can inadvertently create them by exposing sensitive client data to public cloud-based AI providers. When a penetration tester feeds proprietary application traffic or proof-of-concept exploit code into a public AI model, they risk violating compliance mandates and non-disclosure agreements. This article explores a secure alternative: running a local Large Language Model (LLM) with the Caido Shift plugin to automate vulnerability discovery and exploitation without ever sending data to the cloud, ensuring data sovereignty during sensitive engagements.

Learning Objectives:

Understand the security risks associated with using public AI APIs during penetration testing.
Learn how to deploy a local LLM (Ollama) and integrate it with the Caido web proxy via the Shift plugin.
Configure and prompt a local model to autonomously identify and exploit SQL injection vulnerabilities in a sandboxed environment.

You Should Know:

1. Why Local AI is Non-Negotiable for Modern Pentesting
In a standard engagement, a tester might copy a suspicious HTTP request into ChatGPT to ask for exploitation assistance. This action potentially exposes the target’s architecture, API endpoints, and session tokens to a third-party server. For regulated industries (finance, healthcare, government), this can amount to an unauthorized disclosure of client data. By using a local stack of Caido (a lightweight web security testing suite) and Ollama (a tool to run LLMs locally), testers keep complete control over the data. Requests sent to the model never leave the local machine, which removes the third-party exposure that compliance mandates and NDAs prohibit while still providing AI-driven automation.

2. Setting Up the Local Environment (Caido + Ollama)
To begin, you need to establish the pipeline that allows your proxy to “talk” to your AI.

Install Ollama: First, install Ollama on your testing machine (Linux/macOS/Windows).

# Linux (on macOS and Windows, use the installer from ollama.com instead)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a capable coding model (e.g., CodeLlama or Mistral)
ollama pull codellama:13b

# Verify the model was downloaded
ollama list

Install Caido: Download and install Caido from their official website. It serves as a Burp Suite alternative for intercepting and modifying web traffic.
Configure the Shift Plugin: In Caido, navigate to the Marketplace/Plugins section and install the “Shift” plugin. This plugin acts as the bridge, forwarding selected HTTP requests from Caido to your local LLM endpoint (Ollama) and returning the AI-generated payloads.

3. Configuring the Shift Plugin and Prompt Engineering

The magic lies in how you instruct the AI. Shift allows you to define custom prompts that tell the local model what to do with the intercepted request.

Plugin Configuration: Within Caido’s Shift settings, set the endpoint to `http://localhost:11434` (Ollama's default API address). Select the model you downloaded (e.g., `codellama:13b`).
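
Before pointing Shift at the endpoint, it can help to confirm that the address really is a loopback one and that Ollama answers there. The following is a minimal sketch, not part of Shift or Caido: the helper names `is_loopback_endpoint` and `list_ollama_models` are hypothetical, and the loopback check is deliberately loose (it accepts `localhost`, `::1`, and dotted numeric addresses starting with `127.`). `/api/tags` is Ollama's endpoint for listing pulled models.

```shell
# Returns 0 when the URL's host is a loopback name or address (loose check).
is_loopback_endpoint() (
  hostport="${1#*://}"         # drop the scheme, if any
  hostport="${hostport%%/*}"   # drop any path
  case "$hostport" in
    \[*) host="${hostport#\[}"; host="${host%%\]*}" ;;  # bracketed IPv6, e.g. [::1]:11434
    *)   host="${hostport%%:*}" ;;                      # hostname or IPv4, port removed
  esac
  case "$host" in
    localhost|::1) return 0 ;;
    127.*.*.*)
      # Reject names that merely start with "127." (letters, dashes, and so on).
      case "$host" in *[!0-9.]*) return 1 ;; *) return 0 ;; esac ;;
    *) return 1 ;;
  esac
)

# Prints the JSON list of models Ollama has pulled; fails if nothing is listening.
list_ollama_models() {
  curl -fsS "${1:-http://localhost:11434}/api/tags"
}

# Usage (assumes Ollama is running):
#   is_loopback_endpoint "http://localhost:11434" && list_ollama_models "http://localhost:11434"
```

If `list_ollama_models` prints a JSON document that names `codellama:13b`, the endpoint and model values entered in Shift should match what Ollama is serving.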

The “SQLi Hunter” Prompt: You need to engineer a prompt that makes the model act as an exploiter, not just a detector. A sample prompt structure might be:

You are a professional SQL injection tester. Analyze the following HTTP request. 
Identify the parameter likely vulnerable to SQLi. 
Generate a sophisticated time-based or union-based payload to exploit this specific parameter, considering the backend might be MySQL. 
Return ONLY the modified HTTP request with the payload injected. Do not add explanations.

Shift automatically replaces the request placeholder in the prompt with the request you right-click on in Caido.
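
To see what the model returns for this prompt outside of Caido, the same idea can be sketched against Ollama's `/api/generate` endpoint. This is an illustration under assumptions, not how Shift is implemented: the function names and the `OLLAMA_MODEL` variable are hypothetical, the request is simply appended after the instructions, and `jq` is assumed to be installed to handle JSON escaping.

```shell
# The sample prompt from this section; the raw request is appended below it.
SQLI_PROMPT='You are a professional SQL injection tester. Analyze the following HTTP request.
Identify the parameter likely vulnerable to SQLi.
Generate a sophisticated time-based or union-based payload to exploit this specific parameter, considering the backend might be MySQL.
Return ONLY the modified HTTP request with the payload injected. Do not add explanations.'

# Joins the instructions and one raw HTTP request into a single prompt string.
build_prompt() {
  printf '%s\n\n%s\n' "$SQLI_PROMPT" "$1"
}

# Sends a prompt to the local Ollama API and prints only the model's reply.
# Requires curl and jq; "stream": false asks Ollama for one JSON object.
ask_ollama() {
  jq -n --arg model "${OLLAMA_MODEL:-codellama:13b}" --arg prompt "$1" \
      '{model: $model, prompt: $prompt, stream: false}' |
    curl -fsS http://localhost:11434/api/generate -d @- |
    jq -r '.response'
}

# Usage (assumes a request saved from Caido as request.txt):
#   ask_ollama "$(build_prompt "$(cat request.txt)")"
```

Running the prompt by hand like this is a quick way to tune its wording before saving it as a custom Shift prompt.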

4. Step-by-Step: Exploiting Juice Shop Locally

Using the OWASP Juice Shop (a deliberately vulnerable web application) as a target, we can see the local AI in action.

Step 1: Intercept Traffic. Configure your browser to proxy through Caido (default: `127.0.0.1:8080`). Navigate to the Juice Shop login page and attempt a fake login to capture the `POST /rest/user/login` request.

Step 2: Send to AI. In Caido's history, right-click the login request. Navigate to `Extensions > Shift > Send to Custom`. Select the SQLi prompt you created earlier.

Step 3: Receive the Exploit. The Ollama model processes the request locally. Within seconds, Shift will return a modified request. For example, it might modify the JSON body from `{"email":"[email protected]", "password":"test"}` to a payload attempting to bypass authentication:

{"email":"' OR 1=1--", "password":"whatever"}

Step 4: Verify the Result. Send this modified request to the server using Caido's Replay feature (its counterpart to Burp's Repeater). A successful login confirms the authentication bypass, while a database error message indicates a likely injection point that needs a refined payload. In both cases, the local AI produced the payload without your data ever leaving your machine.
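
The same check can be scripted outside Caido. A minimal sketch, assuming Juice Shop is listening on its default `http://localhost:3000` and that it answers a successful login with HTTP 200 and a rejected one with 401; the helper names and the `JUICE_URL` variable are hypothetical.

```shell
# Posts a JSON body to the login endpoint and prints only the HTTP status code.
replay_login() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H 'Content-Type: application/json' \
    -d "$1" "${JUICE_URL:-http://localhost:3000}/rest/user/login"
}

# Maps the status code to a rough verdict about the payload.
classify_login_status() {
  case "$1" in
    200) echo "login accepted: authentication bypass likely" ;;
    401) echo "login rejected: payload had no effect" ;;
    5??) echo "server error: possible injection point, inspect the response body" ;;
    *)   echo "inconclusive: status $1" ;;
  esac
}

# Usage (assumes the AI-generated JSON body was saved as payload.json):
#   classify_login_status "$(replay_login "$(cat payload.json)")"
```

A status code is only a first signal. The response body in Caido still shows whether a token was issued or which database error was raised.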

5. Expanding the Capability: Command Injection and Reverse Shells
The same principle applies to more complex vulnerabilities. By changing the prompt, you can instruct the local LLM to generate OS command injection payloads or even obfuscated reverse shell one-liners based on the context of the web application. Because the model is running locally, you can safely test highly aggressive payloads without an external AI flagging your activity as malicious or storing your techniques in a cloud database.

6. Verifying the Data Flow (The “No-Phone-Home” Check)

It is crucial to verify that the plugin is actually using the local model and not falling back to a cloud API. Use tools like `tcpdump` or Wireshark to monitor outbound traffic while using the Shift plugin.

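# Linux: monitor traffic to common cloud AI endpoints (example)
# tcpdump resolves each name to IP addresses once, when the filter is compiled,
# so API subdomains served from other addresses may not match.
sudo tcpdump -i any -n host openai.com or host api.openai.com or host googleapis.com or host azure.com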

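Run this command in a terminal while you send a request to the AI via Shift. If the capture shows no packets to these hosts, Shift is not calling those providers during generation. Because the filter only covers the hosts listed, treat it as a spot check: the only connection Shift should open is the one to Ollama on `localhost:11434`.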
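
As a complement to the packet capture, the open connections themselves can be inspected. A minimal sketch, assuming a Linux host with `ss` from iproute2; the function name is hypothetical, and it reports every established TCP peer that is not a loopback address, so unrelated applications (a browser, an updater) will also appear.

```shell
# Reads `ss -Htn state established` output on stdin and prints each peer
# address (the last column) that is not a loopback address.
non_loopback_peers() {
  awk '{ peer = $NF }
       peer !~ /^127\./ && peer !~ /^\[::1\]/ && peer !~ /^\[::ffff:127\./ { print peer }'
}

# Usage: run while Shift is waiting on a model response.
#   ss -Htn state established | non_loopback_peers
```

An empty result while the model is generating suggests that nothing on the machine, Shift included, holds an open TCP connection to an external host at that moment.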

What Undercode Say:

Sovereignty is the new vector: The primary takeaway here isn’t just that AI helps hack things; it’s that where the AI runs defines the security perimeter. By running models locally, we collapse the attack surface, preventing the exfiltration of client data through the “back door” of an LLM chatbot.
Prompt engineering is the exploit: The effectiveness of the test relies entirely on the tester’s ability to write a prompt that translates a raw HTTP request into a specific exploitation command. The AI is no longer just a spell-checker; it’s an autonomous payload generator, but it still requires a human operator to define the rules of engagement.
Accessibility vs. Control: While tools like Caido and Ollama lower the barrier to entry for AI-assisted hacking, they also raise the bar for defense. Blue teams must now assume that adversaries are running personalized, fine-tuned local models that leave no forensic trace in third-party cloud logs.

Prediction:

Within the next 18 months, local AI agents will become a standard part of the penetration tester’s toolkit, much like Metasploit is today. We will see the emergence of “agent swarms” where one local LLM handles recon, another handles exploitation, and a third handles reporting—all running on a tester’s laptop. This will force defensive security tools to shift focus from detecting specific exploit payloads to detecting the behavioral patterns of automated, AI-driven scanning traffic.

Reported By: Scomurr Pentesting – Hackers Feeds