Hades PyPI Attack: How a 10MB JavaScript Blob Completely Blinds AI Security Scanners

Introduction:

Attackers have discovered a deceptively simple method to evade AI-powered security scanners by exploiting a fundamental weakness in how machine learning models analyze code. The Hades PyPI supply chain campaign weaponized large, non-executing JavaScript payloads—massive blocks of inert code that overwhelm natural language processing (NLP) based security tools without ever executing malicious logic.

Learning Objectives:

  • Understand how oversized, non-executing code blobs can blind AI/ML security scanners through token limits and entropy flooding.
  • Detect PyPI packages containing suspiciously large JavaScript or other inert payloads using command-line forensics.
  • Implement manual and automated mitigation strategies to block supply chain attacks that abuse scanner blind spots.

You Should Know:

1. Anatomy of the Hades Blinding Technique

Attackers upload malicious PyPI packages that include a `setup.py` or `__init__.py` containing a multiline string of random-looking JavaScript code (e.g., 10–20 MB). This JavaScript is never imported or executed during normal installation—but AI security scanners that parse all files in a package will choke on the massive token count, time out, or discard the file entirely. The actual malicious payload (e.g., reverse shell, credential stealer) lives in a separate, small, innocuous-looking file that the scanner never reaches.
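Because the blob is described as a multiline string inside `setup.py` or `__init__.py`, a size check on `.js` files alone could miss it. The sketch below parses a Python source file and reports oversized string or bytes literals. The function name and the 100 KB threshold are illustrative assumptions, not values reported for the campaign.

```python
import ast

# Hypothetical threshold: literals above this size deserve a manual look.
MAX_LITERAL_BYTES = 100 * 1024

def find_large_string_literals(source, threshold=MAX_LITERAL_BYTES):
    """Return sorted (line_number, size_in_bytes) for each oversized literal."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, (str, bytes)):
            value = node.value
            if isinstance(value, str):
                # surrogatepass keeps unusual escapes from raising on encode
                size = len(value.encode("utf-8", "surrogatepass"))
            else:
                size = len(value)
            if size > threshold:
                hits.append((node.lineno, size))
    return sorted(hits)

if __name__ == "__main__":
    # Synthetic example: a tiny module carrying a 200 KB inert string.
    sample = 'x = 1\nBLOB = """' + "a" * (200 * 1024) + '"""\nprint(x)\n'
    for lineno, size in find_large_string_literals(sample):
        print(f"line {lineno}: {size} byte literal")
```

Parsing with `ast` never executes the file, so it is safe to run on untrusted source, although a file that is not valid Python will raise `SyntaxError` and should be treated as a finding in its own right.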

Step‑by‑step detection on Linux/macOS:

# Download a suspicious PyPI package tarball.
# Note: pip may invoke the package's build backend to read sdist metadata,
# so run this in a disposable VM or container.
pip download --no-deps --no-binary :all: <package-name>
tar -xzf <package-name>-*.tar.gz

# Find all JavaScript files larger than 1MB
find ./<package-name>-*/ -type f \( -name "*.js" -o -name "*.javascript" \) -size +1M -exec ls -lh {} \;

# Extract entropy of suspicious files (high entropy = random-looking data)
entropy <file.js>  # if an 'entropy' utility is installed; otherwise use:
python3 -c "import sys, math; data=open(sys.argv[1],'rb').read(); freq=[data.count(b) for b in set(data)]; print(-sum((f/len(data))*math.log2(f/len(data)) for f in freq))" <file.js>

On Windows (PowerShell):

# Find large .js files recursively
Get-ChildItem -Path .\package-name -Recurse -Include *.js | Where-Object {$_.Length -gt 1MB} | Select-Object FullName, Length

# Calculate file entropy
$file = ".\suspicious.js"; $bytes = [System.IO.File]::ReadAllBytes((Resolve-Path $file).Path); $freq = $bytes | Group-Object | ForEach-Object {$_.Count}; $len = $bytes.Length; $entropy = -($freq | ForEach-Object {($_/$len)*[Math]::Log($_/$len,2)} | Measure-Object -Sum).Sum; Write-Host "Entropy: $entropy"

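What this does: Shannon entropy is measured in bits per byte, with a maximum of 8. Values close to 8 indicate compressed, encrypted, or random data, which is typical of non-executing blobs used to blind scanners; ordinary source code usually scores well below that.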
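To make the scale concrete, here is a minimal sketch of the same entropy calculation as a reusable function, run against three synthetic inputs. The sample data is invented for illustration; only the formula comes from the commands above.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(p) over the observed byte values, then negate.
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

if __name__ == "__main__":
    samples = {
        "repeated byte": b"a" * 4096,
        "JavaScript-like text": b"function add(a, b) { return a + b; }\n" * 100,
        "random bytes": os.urandom(4096),
    }
    for label, data in samples.items():
        print(f"{label}: {shannon_entropy(data):.2f} bits/byte")
```

The repeated byte scores 0, the random bytes score close to 8, and the readable text lands in between, which is why a threshold such as 7.0 separates packed or encrypted blobs from plain source but will not flag a large blob of ordinary-looking JavaScript. Size checks are still needed for that case.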

2. Bypassing AI Scanner Token Limits

Most AI/ML security scanners (e.g., Semgrep ML, GitGuardian, Socket.dev) reportedly work within hard token limits (often 512–4096 tokens per file). Attackers exploit this by embedding a large comment block or string literal of gibberish JavaScript that exceeds the token budget, causing the scanner to truncate the file or skip it entirely.
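The failure mode is the silent skip. A hedged sketch of the alternative: estimate each file's token count and label over-budget files explicitly so they are routed to another check instead of being dropped. The four-characters-per-token ratio is a rough rule of thumb, and `estimate_tokens` and `triage` are hypothetical names, not part of any scanner's API.

```python
import math

# Assumption: roughly 4 characters per token. Real tokenizers will differ.
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    """Cheap upper-ish estimate of token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def triage(files: dict, token_budget: int = 4096) -> dict:
    """Map each file name to 'scan' or 'oversized' rather than skipping silently."""
    result = {}
    for name, text in files.items():
        within_budget = estimate_tokens(text) <= token_budget
        result[name] = "scan" if within_budget else "oversized"
    return result

if __name__ == "__main__":
    package = {
        "utils/__init__.py": "import os\n",
        "assets/blob.js": "x" * 100_000,  # synthetic stand-in for the inert blob
    }
    for name, verdict in triage(package).items():
        print(f"{name}: {verdict}")
```

Anything labeled `oversized` should fail closed: send it to a size and entropy check or to manual review, and never let it count as a clean result.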

Step‑by‑step mitigation – custom scanning script:

#!/usr/bin/env python3
# scan_pypi_blind.py - Detects oversized inert payloads
import math
import sys
import tarfile
from collections import Counter

SUSPECT_EXTENSIONS = ('.js', '.html', '.txt', '.json')
MAX_SAFE_SIZE = 500 * 1024  # 500KB

def shannon_entropy(data):
    """Return Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def check_package(package_tar):
    """Print findings and return True if any oversized suspect file exists."""
    found = False
    with tarfile.open(package_tar, 'r:gz') as tar:
        for member in tar.getmembers():
            if member.size > MAX_SAFE_SIZE and member.name.endswith(SUSPECT_EXTENSIONS):
                found = True
                print(f"[!] Oversized {member.name} ({member.size} bytes) - possible scanner blind")
                # Extract and check for high entropy
                f = tar.extractfile(member)
                if f:
                    entropy = shannon_entropy(f.read())
                    if entropy > 7.0:
                        print(f"    High entropy ({entropy:.2f}) - likely obfuscated payload")
    return found

if __name__ == "__main__":
    # Exit non-zero on a finding so that CI jobs can fail the build
    sys.exit(1 if check_package(sys.argv[1]) else 0)

Usage: `python3 scan_pypi_blind.py malicious-package.tar.gz`

3. Hardening AI Scanners Against Payload Flooding

AI tools can be reconfigured to sample or chunk large files rather than skipping them. For OpenAI‑based scanners, implement a sliding window tokenizer that processes files in overlapping segments.

Example chunking strategy (pseudo‑configuration for custom scanner):

# scanner_config.yml
blinding_mitigation:
  large_file_policy: "chunk_and_sample"
  chunk_size_tokens: 4000
  chunk_overlap: 500
  max_chunks_per_file: 20
  fallback_entropy_check: true
  entropy_threshold: 7.0
  action_on_high_entropy: "manual_review_required"
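A minimal sketch of the sliding-window step this configuration describes, using the same default values. It works on any pre-tokenized sequence; how tokens are produced (for example, by whitespace splitting) is left as an assumption, since real scanners use model-specific tokenizers.

```python
def sliding_windows(tokens, chunk_size=4000, overlap=500, max_chunks=20):
    """Split tokens into overlapping chunks, capped at max_chunks."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        if len(chunks) == max_chunks:
            break  # budget exhausted; the rest of the file is not covered
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks

if __name__ == "__main__":
    tokens = "a b c d e f g h i j".split()
    for chunk in sliding_windows(tokens, chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means logic that straddles a chunk boundary still appears whole in at least one window. The `max_chunks` cap still leaves the tail of a very large file unscanned, which is presumably why the configuration pairs chunking with an entropy fallback.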

For cloud-based scanners (AWS Macie, Google Chronicle): use pre-processor Lambdas that truncate each file to its first and last 1MB, and also randomly sample 10KB from the middle to catch malicious code that has been split across the file.
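The head, tail, and random-middle sampling can be sketched as a plain function that a pre-processor could call. This is an illustration of the sampling logic only; `sample_regions` is a hypothetical name and no cloud service API is involved.

```python
import random

def sample_regions(data: bytes, edge=1024 * 1024, middle=10 * 1024, rng=None):
    """Return (head, middle_sample, tail) slices of data.

    Files no larger than two edges are returned whole as the head.
    """
    rng = rng or random.Random()
    if len(data) <= 2 * edge:
        return data, b"", b""
    head, tail = data[:edge], data[-edge:]
    span = len(data) - 2 * edge      # bytes between head and tail
    size = min(middle, span)
    start = edge + rng.randrange(span - size + 1)
    return head, data[start:start + size], tail

if __name__ == "__main__":
    blob = bytes(range(256)) * 16    # 4096 synthetic bytes
    head, mid, tail = sample_regions(blob, edge=1000, middle=100)
    print(len(head), len(mid), len(tail))
```

A single 10KB sample covers a tiny fraction of a 10MB blob, so this reduces the chance of missing split code but does not remove it. Taking several samples, or feeding the full middle region to an entropy check, would be a reasonable extension.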

4. Detecting the Real Payload After the Blinding Layer

The Hades campaign hid malicious code in `utils/__init__.py` or `helpers/__init__.py` with innocuous names. After the large JavaScript blob causes a scanner timeout, the real backdoor installs a cryptominer or steals AWS keys.

Linux command to find Python files that use network, process, or dynamic-execution calls:

grep -r --include="*.py" -E "(requests\.|urllib\.|socket\.|subprocess\.|os\.system|eval|exec|__import__)" ./package-name/ | grep -v "test_" | grep -v "__pycache__"

Windows equivalent (findstr):

findstr /s /i /m "requests. urllib. socket. subprocess. os.system eval exec __import__" *.py
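Text matching like the commands above also fires on comments, docstrings, and identifiers such as `executor`. As a cross-platform alternative, the sketch below uses Python's `ast` module to report only real imports and calls. The module and call lists mirror the patterns above; the function name is a hypothetical choice.

```python
import ast

SUSPICIOUS_MODULES = {"requests", "urllib", "socket", "subprocess"}
SUSPICIOUS_CALLS = {"eval", "exec", "__import__"}

def suspicious_nodes(source):
    """Return sorted (line, description) pairs for risky imports and calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    hits.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in SUSPICIOUS_MODULES:
                hits.append((node.lineno, f"from {node.module} import ..."))
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in SUSPICIOUS_CALLS:
                hits.append((node.lineno, f"call {func.id}()"))
            elif (isinstance(func, ast.Attribute) and func.attr == "system"
                  and isinstance(func.value, ast.Name) and func.value.id == "os"):
                hits.append((node.lineno, "call os.system()"))
    return sorted(hits)

if __name__ == "__main__":
    sample = "import socket\n# exec is only mentioned here\nos.system('id')\n"
    for lineno, what in suspicious_nodes(sample):
        print(f"line {lineno}: {what}")
```

The source is parsed, never executed. Aliased or obfuscated calls (for example `getattr(os, "sys" + "tem")`) will evade this check, so treat it as a triage aid and not a verdict.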

5. Mitigating Supply Chain Risk in CI/CD Pipelines

Add a pre-install check that blocks PyPI packages containing any file larger than 1MB that has high entropy or an extension that is never executed at install time (for example `.js` or `.html`).

Sample `.gitlab-ci.yml` job:

scan-pypi-blind:
  stage: test
  script:
    # Quoted because an unquoted ":all: " would be parsed as a YAML mapping
    - 'pip download --no-deps --no-binary :all: "$PACKAGE_NAME"'
    - python3 scan_pypi_blind.py "$PACKAGE_NAME"-*.tar.gz || exit 1
  only:
    - merge_requests
  variables:
    PACKAGE_NAME: "${CI_PROJECT_NAME}"

For Docker builds, add a multistage check:

FROM python:3.11-slim AS downloader
RUN pip download --no-deps --no-binary :all: some-package
COPY scan_pypi_blind.py /scan.py
RUN python3 /scan.py /some-package-*.tar.gz || exit 1
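The jobs above only inspect `.tar.gz` source distributions, while pip commonly installs wheels, which are zip archives. A hedged sketch of the same size-and-entropy gate for wheels follows. The thresholds reuse the 1MB and 7.0 values from the text; `scan_wheel` and `build_demo_wheel` are hypothetical helpers, and the demo archive is synthetic.

```python
import io
import math
import os
import zipfile
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def scan_wheel(wheel, max_size=1024 * 1024, entropy_threshold=7.0):
    """Return (name, size, entropy) for each large, high-entropy member.

    `wheel` may be a path or a binary file object.
    """
    findings = []
    with zipfile.ZipFile(wheel) as archive:
        for info in archive.infolist():
            if info.is_dir() or info.file_size <= max_size:
                continue
            # read() returns the decompressed bytes, so entropy is
            # measured on the file content, not on the zip stream.
            entropy = shannon_entropy(archive.read(info))
            if entropy > entropy_threshold:
                findings.append((info.filename, info.file_size, round(entropy, 2)))
    return findings

def build_demo_wheel():
    """Build a synthetic in-memory archive: one small module, one random blob."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/__init__.py", "print('hello')\n")
        archive.writestr("demo/blob.js", os.urandom(64 * 1024))
    buffer.seek(0)
    return buffer

if __name__ == "__main__":
    # Lower max_size so the small demo blob trips the check.
    for name, size, entropy in scan_wheel(build_demo_wheel(), max_size=16 * 1024):
        print(f"[!] {name}: {size} bytes, entropy {entropy}")
```

In a pipeline, a non-empty result would fail the job in the same way as the sdist check. Compiled extension modules inside legitimate wheels can also be large and fairly high in entropy, so expect to maintain an allowlist.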

What Undercode Say:

  • Key Takeaway 1: AI security scanners are vulnerable to adversarial input flooding—attackers don’t need complex exploits, just size and entropy.
  • Key Takeaway 2: Manual inspection using entropy and file size thresholds remains effective even when AI fails.

Analysis:

The Hades campaign exposes a critical blind spot in ML‑based security tooling: token limits and file skipping policies are not adversarial‑resistant. This is analogous to traditional DoS attacks but applied to AI parsers. Most vendors assume that scanning all files is sufficient, but they don’t account for an attacker’s ability to force a timeout or context window overflow. The simplicity—embedding inert JavaScript—means even low‑skill attackers can replicate this across PyPI, npm, and RubyGems. Defenders must move beyond “trust the scanner” and implement layered checks: file size caps, entropy analysis, and network call detection in CI/CD. Longer‑term, AI models need dynamic token allocation and anomaly detection on file structures, not just code semantics. Until then, human‑reviewed blocklists and manual auditing of large blobs will remain essential.

Prediction:

  • -1 Attackers will port the Hades technique to npm and RubyGems within 3 months, using large HTML comments or Markdown files instead of JavaScript to blind scanners.
  • -1 AI security scanner vendors will temporarily raise token limits to 50,000, but attackers will respond with 100MB blobs, creating performance degradation and cost spikes for SaaS scanners.
  • +1 Open-source community tools like `pip-audit` and `safety` will add `--max-file-size` and entropy checks by Q3 2026, closing the gap without AI.
  • -1 Supply chain attacks targeting AI scanner blind spots will increase 400% YoY as threat actors share evasion cookbooks on darknet forums.
  • +1 Manual code review checklists will be updated to include “find largest file in package” as a mandatory step, reducing risk for mature DevSecOps teams.

Reported By: Azubuike Ibe – Hackers Feeds