Introduction:
Claude Code, Anthropic’s agentic AI coding assistant, has evolved far beyond simple code completion. In 2026, security researchers are extending it into an offensive security toolkit by deploying specialized AI subagents. Projects like `pentest-ai-agents` (35 agents), `Threatswarm` (27 scope-enforced agents), `Cyber Neo`, and `Huntress` (29 vulnerability hunters) are designed to automate the full penetration testing kill chain, from reconnaissance and vulnerability discovery to exploitation and reporting. This article explains how to use the 35 `pentest-ai-agents` agents for authorized security assessments, with step-by-step implementation guides, commands, and operational security considerations.
Learning Objectives:
- Install and configure Claude Code with the 35-agent penetration testing suite
- Deploy autonomous AI agents for reconnaissance, web testing, Active Directory attacks, and cloud security assessments
- Understand the scope enforcement, safety guardrails, and ethical considerations when running AI-powered offensive tools
1. Installing Claude Code and the 35-Agent Pentest Suite
The `pentest-ai-agents` project by 0xSteph turns Claude Code into a specialized red team assistant. The installation is intentionally simple to lower the barrier for authorized testers.
Step‑by‑step installation (Linux / macOS / WSL):
    # One-command install: copies 35 agents to ~/.claude/agents/
    curl -fsSL https://raw.githubusercontent.com/0xSteph/pentest-ai-agents/main/install.sh | bash

    # Optional: install the underlying CLI tools (nmap, nuclei, ffuf, BloodHound, Impacket, etc.)
    ./install.sh --tools

    # Verify installation and audit tool availability per agent
    db/doctor.sh
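As a quick sanity check after the installer has run, the sketch below counts the agent definition files in the target directory. It rests on assumptions not stated in the article: that each agent is a single Markdown file directly under `~/.claude/agents/`, and that files starting with an underscore (such as `_scope-guard.md`) are shared includes rather than agents. The function names `count_agents` and `check_agent_count` are illustrative and not part of the project.

```shell
# Count agent definition files (*.md) directly inside a directory.
# Assumption: one Markdown file per agent; files named _*.md are shared
# includes (for example _scope-guard.md) and are not counted.
count_agents() {
  local dir="${1:-$HOME/.claude/agents}"
  if [ ! -d "$dir" ]; then
    echo 0
    return 0
  fi
  find "$dir" -maxdepth 1 -type f -name '*.md' ! -name '_*' | wc -l | tr -d ' '
}

# Compare the count with the expected number of agents (35 per the article).
check_agent_count() {
  local dir="$1" expected="${2:-35}" found
  found="$(count_agents "$dir")"
  if [ "$found" -eq "$expected" ]; then
    echo "OK: $found agents found in $dir"
  else
    echo "WARN: expected $expected agents, found $found in $dir" >&2
    return 1
  fi
}
```

Calling `check_agent_count "$HOME/.claude/agents"` after installation gives a coarse signal only; the project's own `db/doctor.sh` remains the authoritative check.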
Windows (WSL2) setup:
Use WSL2 with Ubuntu 22.04 or Kali Linux. Install the Claude Code CLI inside WSL, then run the curl command above in the same WSL terminal.
Claude Code first‑time authentication:
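    # Install the Claude Code CLI (requires an Anthropic API key or a Claude subscription)
    npm install -g @anthropic-ai/claude-code

    # The first launch starts the sign-in flow; /login inside a session re-authenticates
    claude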
Once installed, open Claude Code in any project directory and simply describe your objective. Claude automatically routes to the appropriate specialist agent.
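Claude Code chooses a subagent largely from the `description` field in the YAML frontmatter at the top of each agent file, so it helps to see what the router has to work with. The sketch below lists the name and description of every installed agent. It assumes each file starts with a `---` delimited frontmatter block containing single-line `name:` and `description:` fields; the helper name `list_agents` is hypothetical.

```shell
# Print "name<TAB>description" for every agent file in a directory.
# Assumption: frontmatter is delimited by '---' lines and uses
# single-line 'name:' and 'description:' fields.
list_agents() {
  local dir="${1:-$HOME/.claude/agents}" f
  for f in "$dir"/*.md; do
    [ -f "$f" ] || continue
    awk '
      NR == 1 && $0 != "---" { exit }   # no frontmatter: skip this file
      NR > 1 && $0 == "---"  { exit }   # closing delimiter: stop reading
      /^name:/        { sub(/^name:[ \t]*/, "");        name = $0 }
      /^description:/ { sub(/^description:[ \t]*/, ""); desc = $0 }
      END { if (name != "") printf "%s\t%s\n", name, desc }
    ' "$f"
  done
}
```

If a prompt keeps landing on the wrong specialist, comparing it against these descriptions is usually the fastest way to see why; naming the agent explicitly in the prompt, as the examples in this article do, sidesteps routing altogether.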
2. Running Autonomous Reconnaissance with the Recon Advisor
The `recon-advisor` and `osint-collector` agents execute the discovery phase, leveraging tools like Nmap, Masscan, Subfinder, Amass, and theHarvester.
Step‑by‑step autonomous recon inside Claude Code:
    # Launch Claude Code in your working directory
    claude

Within Claude Code, give a natural-language prompt:
"Run a full reconnaissance sweep against target.com. Use the recon-advisor agent. Perform subdomain enumeration, port scanning, service detection, and OSINT gathering. Respect scope boundaries."
The agent will autonomously execute commands such as:
    # Example commands the agent may run (with your approval if not in auto-mode)
    subfinder -d target.com -o subdomains.txt
    amass enum -passive -d target.com
    nmap -sV -sC -p- target.com -oA nmap_full
    whatweb target.com
Scope enforcement: All agents include a hard‑refusal list that blocks DoS attacks, mass scanning of out‑of‑scope ranges, and operations against safety‑of‑life systems. The `_scope-guard.md` file explicitly defines these boundaries.
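The article does not show the format of `_scope-guard.md`, so the following is not a reproduction of it. It is a minimal sketch of an independent, operator-side second check: filtering a discovered host list against an allowlist before anything is scanned. The file `scope.txt` (one authorized domain per line, `#` for comments) and the function names `in_scope` and `filter_in_scope` are assumptions made for illustration.

```shell
# Return 0 if the host equals an allowed domain or is a subdomain of one.
# Assumption: scope file holds one domain per line; '#' lines are comments.
in_scope() {
  local host="$1" scope_file="$2" domain
  host="$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')"
  while IFS= read -r domain || [ -n "$domain" ]; do
    domain="$(printf '%s' "$domain" | tr '[:upper:]' '[:lower:]')"
    case "$domain" in ''|'#'*) continue ;; esac
    case "$host" in
      "$domain"|*".$domain") return 0 ;;
    esac
  done < "$scope_file"
  return 1
}

# Read hosts on stdin; print in-scope hosts, report the rest on stderr.
filter_in_scope() {
  local scope_file="$1" host
  while IFS= read -r host || [ -n "$host" ]; do
    [ -n "$host" ] || continue
    if in_scope "$host" "$scope_file"; then
      printf '%s\n' "$host"
    else
      printf 'skipped (out of scope): %s\n' "$host" >&2
    fi
  done
}
```

A typical use would be `filter_in_scope scope.txt < subdomains.txt > in_scope_hosts.txt`. The match is on a leading dot, so `evil-target.com` does not pass as a subdomain of `target.com`. Subdomain enumeration often surfaces third-party hosts, which is why a filter of this kind is worth running even when the agents enforce scope themselves.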
3. Web Application and API Security Testing
The `web-hunter`, `api-security`, `bug-bounty`, and `bizlogic-hunter` agents automate discovery of OWASP Top 10 vulnerabilities by orchestrating tools like ffuf, sqlmap, Dalfox, and Commix.
Prompt example inside Claude Code:
"I need to test the API at https://api.target.com/v1. Use the api-security agent. Perform a full fuzzing of all endpoints, check for IDOR, test for SQL injection on all parameters, and validate JWT handling."
Underlying commands the agent may execute (for manual reference):
    # Fuzz endpoints
    ffuf -u https://api.target.com/v1/FUZZ -w /usr/share/wordlists/dirb/common.txt

    # SQL injection testing
    sqlmap -u "https://api.target.com/v1/user?id=1" --batch --level=3

    # XSS hunting (URL quoted so the shell does not interpret the '?')
    dalfox url "https://target.com/search?q=test"

    # Command injection testing with Commix
    commix --url="https://target.com/ping?ip=127.0.0.1"
Findings database: Every discovered vulnerability is automatically logged to a SQLite findings database (vulns.db). Use `findings.sh stats` to track progress and `findings.sh export` to generate JSON reports.
4. Active Directory and Credential Attacks
The `ad-attacker` and `credential-tester` agents specialize in internal network assessments. They drive Impacket, BloodHound, NetExec (the successor to CrackMapExec), Certipy, kerbrute, and Responder.
Step‑by‑step AD attack simulation (authorized lab only):
"Simulate an internal penetration test against the domain corp.local. I have a low‑privilege domain user account. Use the ad-attacker agent to enumerate users, hunt for Kerberoastable accounts, check for AS‑REP roasting, and identify misconfigured ACLs."
Example commands the agent may run:
    # Kerberoasting
    impacket-GetUserSPNs -request -dc-ip 10.0.0.1 corp.local/lowuser

    # AS-REP roasting
    impacket-GetNPUsers -dc-ip 10.0.0.1 corp.local/ -usersfile users.txt

    # BloodHound collection
    bloodhound-python -d corp.local -u lowuser -p password -ns 10.0.0.1 -c All

    # Pass-the-hash with NetExec
    nxc smb 10.0.0.0/24 -u administrator -H <NTLM_hash>
The agent understands AD attack paths and can prioritize them based on your access level. It also hands captured hashes to `credential-tester` for password cracking with Hashcat and John the Ripper.
5. Cloud Security, Container Breakout, and C2 Operations
Modern infrastructures demand cloud and container testing. The `cloud-security`, `cicd-redteam`, and `container-breakout` agents assess AWS, Azure, GCP, Kubernetes, and Docker environments. The `c2-operator` agent designs command-and-control infrastructure.
Step‑by‑step cloud misconfiguration scan:
"Audit our AWS production environment for security misconfigurations. Use the cloud-security agent with Prowler and ScoutSuite. Check for open S3 buckets, IAM privilege escalation paths, and unencrypted RDS instances."
Underlying command examples:
    # AWS Prowler assessment (service names are space-separated)
    prowler aws --services s3 iam rds

    # Kubernetes breakout assessment
    kube-hunter --remote 10.0.0.2:6443
    peirates -kubeconfig /path/to/kubeconfig

    # Docker escape vector scanning
    cdk evaluate --target 10.0.0.3
Container breakout detection: The agent will attempt known runc and CRI‑O escape techniques (only on consenting test systems) and correlate findings with Falco detection rules.
6. LLM Red Teaming and Prompt Injection
The new `llm-redteam` agent (added in v3.2) tests LLM-powered applications for risks in the spirit of the OWASP Top 10 for LLM Applications, including prompt injection, RAG poisoning, MCP server abuse, and agent tool abuse.
Step‑by‑step AI application penetration test:
"Test the LLM chatbot at https://chat.target.com. Use the llm-redteam agent. Run prompt injection payloads from Garak and PyRIT. Attempt to leak system prompts, bypass content filters, and extract training data."
Example test commands:
    # Garak LLM vulnerability scanner
    garak --model_type openai --model_name gpt-3.5-turbo --probes all

    # PyRIT framework (illustrative invocation; PyRIT is normally driven as a Python library)
    python pyrit.py --endpoint https://chat.target.com --strategy "prompt_injection"
The agent can also test MCP (Model Context Protocol) servers for injection vulnerabilities and assess whether an LLM agent can be tricked into calling malicious tools.
7. Reporting, Handoff, and Operational Security
The `opsec-anonymizer` agent (new in v3.2) provides operator‑side identity hygiene: source IP design, JA3 fingerprint management, and burner infrastructure checklists. After testing, generate professional reports using the built‑in reporting agents.
Generate a handoff report between sessions:
bash handoff.sh
This creates a Markdown report containing all findings, commands run, and next steps.
Export findings in JSON for integration with vulnerability management platforms:
findings.sh export --format json > pentest_report.json
OPSEC checklist before any engagement:
- Run `opsec-anonymizer` to review your source IP and fingerprint exposure
- Use dedicated burner infrastructure for external testing
- Verify all agents respect the `_scope-guard.md` boundaries
- Never run autonomous agents against production systems without explicit written authorization
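Part of this checklist can be turned into a gate that runs before any agent session starts. The sketch below refuses to proceed unless an engagement directory contains non-empty authorization and scope files. The file names `authorization.pdf` and `scope.txt` and the function name `preflight` are assumptions made for illustration; they are not defined by `pentest-ai-agents`.

```shell
# Refuse to proceed unless the engagement directory holds the required paperwork.
# Assumption: authorization.pdf is the signed authorization and scope.txt
# lists the authorized targets; both names are hypothetical.
preflight() {
  local dir="$1" missing=0 f
  for f in authorization.pdf scope.txt; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Chaining it as `preflight ./engagement && claude` means the session only starts when both files are present. The check confirms that the paperwork exists, not that it is valid, so it complements rather than replaces a human review of the authorization.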
What Undercode Says
- AI agents are not replacing human pentesters — they are force multipliers. The 35 Claude Code agents excel at repetitive, time‑intensive tasks (recon, fuzzing, enumeration), allowing human experts to focus on complex logic flaws, business logic abuse, and strategic decision‑making.
- Scope enforcement and safety guardrails are mandatory. Every production‑ready agent suite now includes hard‑refusal lists and explicit boundary files. Without these, autonomous AI tools can become uncontrollable weapons. Always run agentic tools in isolated environments (containers or dedicated VMs).
The rise of AI-powered penetration testing represents a fundamental shift in offensive security. In February 2026, Anthropic reported that Claude Code Security identified over 500 vulnerabilities across production open-source codebases using LLM-based reasoning. By March 2026, researchers had cataloged 70 open-source AI penetration testing tools, whereas fewer than five existed before GPT-4’s release in March 2023. The 35 Claude Code agents discussed here are part of a broader movement: agentic AI is democratizing advanced security testing while simultaneously introducing new classes of vulnerabilities (prompt injection, agent tool abuse, MCP server compromise). Defenders must learn to test AI systems with AI agents, just as attackers will.
Prediction: By Q4 2026, autonomous AI penetration testing will be a standard offering from major security vendors. The bottleneck will shift from “finding vulnerabilities” to “validating false positives and chaining exploits.” Organizations that fail to integrate AI agents into their red teaming and bug bounty programs will fall behind attackers who already use these tools. However, the weaponization of Claude Code and similar agents — turning them into nation‑state‑level attack tools with no coding required — is already a reality. The next 12 months will see an arms race between AI‑powered attackers and AI‑powered defenders, with scope enforcement and behavioral monitoring becoming critical control points.
Reported By: Omar Aljabr – Hackers Feeds