35 Claude Code Agents for Penetration Testing: The AI-Powered Red Team That Works While You Sleep + Video

Introduction:

Claude Code, Anthropic’s agentic AI coding assistant, has evolved far beyond simple code completion. In 2026, security researchers have turned it into an offensive security platform by deploying specialized AI subagents. Projects like `pentest-ai-agents` (35 agents), `Threatswarm` (27 scope-enforced agents), `Cyber Neo`, and `Huntress` (29 vulnerability hunters) aim to automate the full penetration testing kill chain, from reconnaissance and vulnerability discovery to exploitation and reporting. This article explores how to use the 35 `pentest-ai-agents` agents for authorized security assessments, with step-by-step implementation guides, commands, and critical operational security considerations.

Learning Objectives:

  • Install and configure Claude Code with the 35-agent penetration testing suite
  • Deploy autonomous AI agents for reconnaissance, web testing, Active Directory attacks, and cloud security assessments
  • Understand the scope enforcement, safety guardrails, and ethical considerations when running AI-powered offensive tools

1. Installing Claude Code and the 35-Agent Pentest Suite

The `pentest-ai-agents` project by 0xSteph turns Claude Code into a specialized red team assistant. The installation is intentionally simple to lower the barrier for authorized testers.

Step‑by‑step installation (Linux / macOS / WSL):

# One-command install: copies 35 agents to ~/.claude/agents/
curl -fsSL https://raw.githubusercontent.com/0xSteph/pentest-ai-agents/main/install.sh | bash

# Optional (run from a local clone of the repository): install the underlying CLI tools (nmap, nuclei, ffuf, BloodHound, Impacket, etc.)
./install.sh --tools

# Verify installation and audit tool availability per agent
db/doctor.sh
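
The internals of `db/doctor.sh` are not shown in this article, so the following is only a minimal sketch of the same idea: report which of the CLI tools named above are missing from `PATH`. The function name `missing_tools` is hypothetical and is not part of `pentest-ai-agents`.

```shell
#!/bin/sh
# Hypothetical helper (not part of pentest-ai-agents): print every tool
# from the argument list that cannot be found on PATH.
# Returns 0 when all tools are present, 1 when at least one is missing.
missing_tools() (
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      missing=1
    fi
  done
  return "$missing"
)
```

A possible use is `missing_tools nmap nuclei ffuf || echo "install the tools listed above first"`, run before starting an agent session.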

Windows (WSL2) setup:

Use WSL2 with Ubuntu 22.04 or Kali Linux. Install Claude Code via the official CLI, then run the curl command above inside the WSL terminal.

Claude Code first‑time authentication:

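# Install the Claude Code CLI (requires Node.js and an Anthropic account or API key)
npm install -g @anthropic-ai/claude-code
# Start Claude Code; the first launch walks through authentication (or run /login inside the session)
claude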

Once installed, open Claude Code in any project directory and simply describe your objective. Claude automatically routes to the appropriate specialist agent.

2. Running Autonomous Reconnaissance with the Recon Advisor

The `recon-advisor` and `osint-collector` agents execute the discovery phase, leveraging tools like Nmap, Masscan, Subfinder, Amass, and theHarvester.

Step‑by‑step autonomous recon inside Claude Code:

# Launch Claude Code in your working directory
claude

# Within Claude Code, give a natural language prompt:
"Run a full reconnaissance sweep against target.com. Use the recon-advisor agent. Perform subdomain enumeration, port scanning, service detection, and OSINT gathering. Respect scope boundaries."

The agent will autonomously execute commands such as:

# Example commands the agent may run (with your approval if not in auto-mode)
subfinder -d target.com -o subdomains.txt
amass enum -passive -d target.com
nmap -sV -sC -p- target.com -oA nmap_full
whatweb target.com

Scope enforcement: All agents include a hard‑refusal list that blocks DoS attacks, mass scanning of out‑of‑scope ranges, and operations against safety‑of‑life systems. The `_scope-guard.md` file explicitly defines these boundaries.
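
The format of `_scope-guard.md` is not shown here, so the sketch below does not parse it. Instead it assumes a separate, operator-maintained scope file (hypothetical name `scope.txt`) with one in-scope domain per line, and filters a host list such as `subdomains.txt` before anything is handed to an active scanner. Both function names are illustrative.

```shell
#!/bin/sh
# Hypothetical operator-side check: succeed if HOST equals, or is a
# subdomain of, any domain listed in SCOPE_FILE (one domain per line).
in_scope() (
  host=$1
  scope_file=$2
  while IFS= read -r domain; do
    case $domain in ''|'#'*) continue ;; esac   # skip blank lines and comments
    case $host in
      "$domain"|*."$domain") return 0 ;;        # exact match or true subdomain
    esac
  done < "$scope_file"
  return 1
)

# Print only the in-scope hosts from HOST_FILE (one host per line).
filter_in_scope() (
  while IFS= read -r host; do
    if in_scope "$host" "$2"; then printf '%s\n' "$host"; fi
  done < "$1"
)
```

Matching on `*."$domain"` rather than a plain suffix means a look-alike such as `eviltarget.com` is not treated as part of `target.com`. A possible use is `filter_in_scope subdomains.txt scope.txt > in_scope_hosts.txt`.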

3. Web Application and API Security Testing

The `web-hunter`, `api-security`, `bug-bounty`, and `bizlogic-hunter` agents automate discovery of OWASP Top 10 vulnerabilities by orchestrating tools such as ffuf, sqlmap, Dalfox, and Commix.

Prompt example inside Claude Code:

"I need to test the API at https://api.target.com/v1. Use the api-security agent. Perform a full fuzzing of all endpoints, check for IDOR, test for SQL injection on all parameters, and validate JWT handling."

Underlying commands the agent may execute (for manual reference):

# Fuzzing endpoints
ffuf -u https://api.target.com/v1/FUZZ -w /usr/share/wordlists/dirb/common.txt

# SQL injection testing
sqlmap -u "https://api.target.com/v1/user?id=1" --batch --level=3

# XSS hunting (URL quoted so the shell does not treat ? as a glob character)
dalfox url "https://target.com/search?q=test"

# Commix for command injection
commix --url="https://target.com/ping?ip=127.0.0.1"

Findings database: Every discovered vulnerability is automatically logged to a SQLite findings database (vulns.db). Use `findings.sh stats` to track progress and `findings.sh export` to generate JSON reports.
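
The schema of `vulns.db` and the layout of the JSON export are not shown in this article. As a rough illustration of progress tracking, the sketch below assumes the findings have already been flattened to a tab-separated file whose first column is the severity. That file format and the function name `severity_counts` are assumptions, not features of `findings.sh`.

```shell
#!/bin/sh
# Count findings per severity from a TSV file (column 1 = severity).
# Severity labels are lower-cased so "High" and "high" are counted together.
severity_counts() {
  awk -F '\t' 'NF { count[tolower($1)]++ }
       END { for (s in count) printf "%s\t%d\n", s, count[s] }' "$1" | sort
}
```

Called as `severity_counts findings.tsv`, it prints one `severity<TAB>count` line per severity, sorted alphabetically.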

4. Active Directory and Credential Attacks

The `ad-attacker` and `credential-tester` agents specialize in internal network assessments. They drive Impacket, BloodHound, NetExec (the successor to CrackMapExec), Certipy, kerbrute, and Responder.

Step‑by‑step AD attack simulation (authorized lab only):

"Simulate an internal penetration test against the domain corp.local. I have a low‑privilege domain user account. Use the ad-attacker agent to enumerate users, hunt for Kerberoastable accounts, check for AS‑REP roasting, and identify misconfigured ACLs."

Example commands the agent may run:

# Kerberoasting
impacket-GetUserSPNs -request -dc-ip 10.0.0.1 corp.local/lowuser

# AS-REP roasting
impacket-GetNPUsers -dc-ip 10.0.0.1 corp.local/ -usersfile users.txt

# BloodHound collection
bloodhound-python -d corp.local -u lowuser -p password -ns 10.0.0.1 -c All

# Pass-the-hash with NetExec
nxc smb 10.0.0.0/24 -u administrator -H <NTLM_hash>

The agent understands AD attack paths and can prioritize them based on your access level. It also integrates with `credential-tester` for password cracking using Hashcat and John the Ripper.
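
Because the NetExec example sweeps an entire /24, it can help to confirm that an address falls inside the authorized range before approving a command. The sketch below is a plain-shell IPv4 membership check. The function names are hypothetical, and it assumes well-formed dotted-quad input without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to its 32-bit integer value.
ip_to_int() (
  IFS=.
  set -- $1          # split the address into four octets
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if IP ($1) lies inside the CIDR block ($2, e.g. 10.0.0.0/24).
ip_in_cidr() (
  ip=$1
  net=${2%/*}        # network part before the slash
  bits=${2#*/}       # prefix length after the slash
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)
```

For example, `ip_in_cidr 10.0.0.57 10.0.0.0/24` succeeds, while `ip_in_cidr 10.0.1.5 10.0.0.0/24` fails.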

5. Cloud Security, Container Breakout, and C2 Operations

Modern infrastructures demand cloud and container testing. The `cloud-security`, `cicd-redteam`, and `container-breakout` agents assess AWS, Azure, GCP, Kubernetes, and Docker environments. The `c2-operator` agent designs command-and-control infrastructure.

Step‑by‑step cloud misconfiguration scan:

"Audit our AWS production environment for security misconfigurations. Use the cloud-security agent with Prowler and ScoutSuite. Check for open S3 buckets, IAM privilege escalation paths, and unencrypted RDS instances."

Underlying command examples:

# AWS Prowler assessment
prowler aws --services s3,iam,rds

# Kubernetes breakout assessment
kube-hunter --remote 10.0.0.2:6443
peirates -kubeconfig /path/to/kubeconfig

# Docker escape vector scanning (CDK runs from inside the container under test, not against a remote target)
cdk evaluate --full

Container breakout detection: The agent will attempt known runc and CRI‑O escape techniques (only on consenting test systems) and correlate findings with Falco detection rules.
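
Before an AWS audit like the one prompted above, it is worth confirming that the active credentials belong to the account you are authorized to assess. The sketch below only parses an ARN string and calls nothing in AWS itself. The function names and the account ID are placeholders.

```shell
#!/bin/sh
# Extract the account ID (the fifth colon-separated field) from an ARN.
arn_account() {
  printf '%s\n' "$1" | cut -d: -f5
}

# Succeed only if the ARN ($1) belongs to the expected account ID ($2).
same_account() {
  [ "$(arn_account "$1")" = "$2" ]
}
```

One way to use it as a guard, with `123456789012` standing in for the authorized account: `same_account "$(aws sts get-caller-identity --query Arn --output text)" 123456789012 && prowler aws --services s3,iam,rds`.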

6. LLM Red Teaming and Prompt Injection

The new `llm-redteam` agent (added in v3.2) tests LLM-powered applications for risks drawn from the OWASP Top 10 for LLM Applications and related agentic threats: prompt injection, RAG poisoning, MCP server abuse, and agent tool abuse.

Step‑by‑step AI application penetration test:

"Test the LLM chatbot at https://chat.target.com. Use the llm-redteam agent. Run prompt injection payloads from Garak and PyRIT. Attempt to leak system prompts, bypass content filters, and extract training data."

Example test commands:

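# Garak LLM vulnerability scanner
garak --model_type openai --model_name gpt-3.5-turbo --probes all

# PyRIT framework (illustrative only: PyRIT is a Python library driven from scripts or notebooks, so treat pyrit.py as a placeholder for your own wrapper script)
python pyrit.py --endpoint https://chat.target.com --strategy "prompt_injection"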

The agent can also test MCP (Model Context Protocol) servers for injection vulnerabilities and assess whether an LLM agent can be tricked into calling malicious tools.
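
One common way to score system-prompt leak attempts is to plant a unique canary string in the system prompt of the test deployment and then search saved responses for it. The sketch below assumes responses have been written to plain-text files. It is not how the `llm-redteam` agent is known to work, and the function names are hypothetical.

```shell
#!/bin/sh
# Succeed if the canary ($2) appears in the response file ($1).
# -F treats the canary as a literal string; -i ignores case.
leaked_canary() {
  grep -qiF -- "$2" "$1"
}

# Count how many response files in a directory ($1) contain the canary ($2).
count_leaks() (
  n=0
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    if leaked_canary "$f" "$2"; then n=$((n + 1)); fi
  done
  echo "$n"
)
```

A canary hit is strong evidence of a leak, but a miss does not prove the prompt is safe, since a model can paraphrase its instructions without repeating the marker.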

7. Reporting, Handoff, and Operational Security

The `opsec-anonymizer` agent (new in v3.2) provides operator‑side identity hygiene: source IP design, JA3 fingerprint management, and burner infrastructure checklists. After testing, generate professional reports using the built‑in reporting agents.

Generate a handoff report between sessions:

bash handoff.sh

This creates a Markdown report containing all findings, commands run, and next steps.

Export findings in JSON for integration with vulnerability management platforms:

findings.sh export --format json > pentest_report.json

OPSEC checklist before any engagement:

  • Run `opsec-anonymizer` to review your source IP and fingerprint exposure
  • Use dedicated burner infrastructure for external testing
  • Verify all agents respect the `_scope-guard.md` boundaries
  • Never run autonomous agents against production systems without explicit written authorization

What Undercode Says

  • AI agents are not replacing human pentesters — they are force multipliers. The 35 Claude Code agents excel at repetitive, time‑intensive tasks (recon, fuzzing, enumeration), allowing human experts to focus on complex logic flaws, business logic abuse, and strategic decision‑making.
  • Scope enforcement and safety guardrails are mandatory. Suites such as `pentest-ai-agents` ship with hard-refusal lists and explicit boundary files; without them, autonomous AI tools can act well outside the intended scope. Always run agentic tools in isolated environments (containers or dedicated VMs).

The rise of AI-powered penetration testing represents a fundamental shift in offensive security. In February 2026, Anthropic reported that Claude Code Security identified over 500 vulnerabilities across production open-source codebases using LLM-based reasoning. By March 2026, researchers had cataloged 70 open-source AI penetration testing tools, whereas fewer than five existed before GPT-4’s release in March 2023. The 35 Claude Code agents discussed here are part of a broader movement: agentic AI is democratizing advanced security testing while simultaneously introducing new classes of vulnerabilities (prompt injection, agent tool abuse, MCP server compromise). Defenders must learn to test AI systems with AI agents, just as attackers will.

Prediction: By Q4 2026, autonomous AI penetration testing will be a standard offering from major security vendors. The bottleneck will shift from “finding vulnerabilities” to “validating false positives and chaining exploits.” Organizations that fail to integrate AI agents into their red teaming and bug bounty programs will fall behind attackers who already use these tools. At the same time, the weaponization of Claude Code and similar agents, which turns them into nation-state-level attack tools with no coding required, is already a reality. The next 12 months will see an arms race between AI-powered attackers and AI-powered defenders, with scope enforcement and behavioral monitoring becoming critical control points.

Reported By: Omar Aljabr – Hackers Feeds