
Introduction:
The ZIP file format, ubiquitous for compression and archiving, harbors a dark secret: its specification allows for inconsistencies between the actual compressed data and the metadata describing it. This structural ambiguity has given rise to a new evasion technique dubbed “Zombie ZIP,” where attackers craft malicious archives that appear benign to security scanners but unleash malware upon extraction. By exploiting how different tools parse ZIP files—some reading the central directory, others scanning local file headers—cybercriminals can slip malicious payloads past antivirus, sandboxes, and email filters, making this a critical concern for defenders.
Learning Objectives:
- Understand the structural vulnerabilities in the ZIP format that enable the “Zombie ZIP” attack.
- Learn practical methods to create, detect, and analyze malicious ZIP files using command-line tools and custom scripts.
- Implement robust defense strategies for developers, security teams, and end-users to mitigate this and similar archive-based threats.
You Should Know:
1. Anatomy of a ZIP File: Local Headers vs. Central Directory
A ZIP archive is not a simple stream of compressed data; it consists of two main parts: a series of local file headers (each followed by the compressed data of a file) and a central directory at the end of the archive. The central directory contains metadata for every file (name, compression method, offset to the local header, etc.) and is used by most tools to list contents quickly. However, the ZIP specification does not require the central directory to match the local headers perfectly. Attackers exploit this by creating discrepancies—for example, a central directory entry might point to a benign file, while the actual data at that offset is malicious, or two different files may be assigned the same name.
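To make the layout concrete, here is a minimal sketch that locates the end-of-central-directory record, which is where a reader learns the central directory's position and entry count. The function names `read_eocd` and `build_demo_archive` are hypothetical, the demo archive is built in memory, and the sketch assumes a small single-disk archive without ZIP64 records and without the signature bytes appearing inside a trailing comment.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"   # end-of-central-directory signature
EOCD_FMT = "<4sHHHHIIH"    # fixed 22-byte part of the record


def read_eocd(data: bytes) -> dict:
    """Locate the end-of-central-directory record and decode its fixed fields."""
    # Assumption: the last occurrence of the signature is the real record.
    pos = data.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    (_sig, _disk, _cd_disk, _entries_here, total_entries,
     cd_size, cd_offset, _comment_len) = struct.unpack_from(EOCD_FMT, data, pos)
    return {"eocd_pos": pos, "entries": total_entries,
            "cd_size": cd_size, "cd_offset": cd_offset}


def build_demo_archive() -> bytes:
    """Hypothetical two-file archive, built in memory for the demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", b"alpha")
        zf.writestr("b.txt", b"beta")
    return buf.getvalue()


if __name__ == "__main__":
    print(read_eocd(build_demo_archive()))
```

Everything before `cd_offset` is the sequence of local headers and file data; a tool that trusts only this record never looks at that region except through the offsets the central directory supplies.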
Step‑by‑step inspection with `zipinfo`:
To examine a ZIP file’s structure, use the `zipinfo` utility on Linux/macOS:
zipinfo -v suspicious.zip
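This command displays detailed information from the central directory, including file names, offsets, and compression methods. The following is equivalent, since `unzip -Z` simply runs `unzip` in `zipinfo` mode and therefore also reports central directory data:
unzip -Z suspicious.zip
To look at the local headers themselves, dump the raw bytes with `od` or a hex editor and search for the local header signature `PK\x03\x04` (hex 50 4b 03 04). A healthy ZIP should have consistent file names and offsets between the central directory and the local headers.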
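The consistency check described here can also be scripted. The sketch below is one possible approach, not a complete validator: it takes the central directory view from Python's `zipfile` module, then decodes the fixed 30-byte local header at each recorded offset and reports fields that disagree. The function name `compare_headers` is hypothetical, and the sketch assumes the whole archive fits in memory and that every offset points inside the file.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"
LOCAL_FMT = "<4sHHHHHIIIHH"   # fixed 30-byte local file header


def compare_headers(data: bytes) -> list:
    """Return (filename, field, central_value, local_value) for each mismatch."""
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():          # central directory view
            off = info.header_offset
            (sig, _ver, flags, method, _time, _date, crc,
             _csize, _usize, name_len, _extra_len) = struct.unpack_from(
                LOCAL_FMT, data, off)
            if sig != LOCAL_SIG:
                mismatches.append((info.filename, "signature", LOCAL_SIG, sig))
                continue
            raw_name = data[off + 30: off + 30 + name_len]
            # Flag bit 11 marks a UTF-8 name; otherwise assume CP437.
            encoding = "utf-8" if flags & 0x800 else "cp437"
            local_name = raw_name.decode(encoding, errors="replace")
            if local_name != info.filename:
                mismatches.append(
                    (info.filename, "name", info.filename, local_name))
            if method != info.compress_type:
                mismatches.append(
                    (info.filename, "method", info.compress_type, method))
            # With flag bit 3 set, the CRC is stored in a trailing data
            # descriptor, so the local header field is not comparable.
            if not flags & 0x08 and crc != info.CRC:
                mismatches.append((info.filename, "crc", info.CRC, crc))
    return mismatches
```

An empty list means the fields checked here agree; it does not prove the archive is benign, since sizes, extra fields, and unlisted entries are not examined.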
2. The Zombie ZIP Technique: How It Works
The “Zombie ZIP” technique typically involves creating a ZIP archive containing two entries with the identical filename but different content. Security tools that scan the archive might only see the first entry (if they parse sequentially) or rely solely on the central directory, which may list only one of them. When a user extracts the archive with a standard tool (like Windows’ built-in ZIP handler or unzip), the last occurrence usually overwrites the first, potentially delivering a malicious file while the scanner only saw a benign one.
Creating a malicious ZIP with Python:
The following Python script uses the `zipfile` module to create a ZIP with duplicate filenames:
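import zipfile

with zipfile.ZipFile('zombie.zip', 'w') as zf:
    # First entry: benign file
    zf.writestr('document.pdf', b'%PDF-1.4 benign...')
    # Second entry: same name, but malicious content (placeholder bytes here).
    # zipfile emits a "Duplicate name" UserWarning but still writes the entry.
    zf.writestr('document.pdf', b'#!/bin/bash\nmalicious_payload')

Some antivirus engines scanning this ZIP might inspect only the first document.pdf. What lands on disk depends on the extraction tool: Info-ZIP `unzip` typically prompts before replacing a file it has already written unless it is run with `-o`, while tools that silently let the last entry win leave the second (malicious) version in place. The payload still has to be opened or executed by the user to cause harm.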
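The parser disagreement can be reproduced inside a single library. The sketch below builds a duplicate-name archive in memory with inert placeholder bytes and shows three ways of reading it with Python's `zipfile`. The function names are hypothetical, and the by-name behaviour shown reflects current CPython releases rather than anything the ZIP specification guarantees.

```python
import io
import warnings
import zipfile


def build_duplicate_archive() -> bytes:
    """Build an in-memory archive with two entries that share one name."""
    buf = io.BytesIO()
    with warnings.catch_warnings():
        # zipfile warns about the duplicate name; silence it for the demo.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(buf, "w") as zf:
            zf.writestr("document.pdf", b"%PDF-1.4 benign placeholder")
            zf.writestr("document.pdf", b"inert stand-in for the second entry")
    return buf.getvalue()


def views(data: bytes) -> dict:
    """Return what a first-entry, last-entry, and by-name reader would see."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        entries = zf.infolist()
        return {
            "first": zf.read(entries[0]),       # read via the ZipInfo object
            "last": zf.read(entries[-1]),
            "by_name": zf.read("document.pdf"),  # lookup by filename
        }


if __name__ == "__main__":
    for label, content in views(build_duplicate_archive()).items():
        print(label, content)
```

A scanner that behaves like the `first` view and an extractor that behaves like the `by_name` view would reach different conclusions about the same file, which is the gap the technique relies on.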
3. Real-World Impact: Bypassing Security Scanners
Many security products, including antivirus and email gateways, parse ZIP files differently. Some extract only the first file with a given name, others rely on the central directory, and a few may decompress all entries but scan only a subset. Attackers have successfully used such discrepancies to deliver malware strains like Emotet and QakBot, hiding them behind benign decoys.
Testing detection with ClamAV:
Create a test ZIP using the Python script above and scan it with ClamAV:
clamscan zombie.zip
If ClamAV only sees the first (benign) PDF, it may report the file as clean. To test extraction, use the following (`-o` overwrites existing files without prompting, so the later entry replaces the earlier one):
unzip -o zombie.zip -d test_extract/
Check which file was written:
ls -l test_extract/
file test_extract/document.pdf
The second version will likely be the one left on disk, which demonstrates the potential for evasion.
4. Detecting Zombie ZIPs: Tools and Techniques
Defenders need to identify anomalous ZIPs before they reach end users. Manual inspection can be supplemented with automated checks for duplicate filenames and mismatched offsets.
Using `zipdetails` (Perl utility):
`zipdetails` provides a low-level dump of ZIP structures, revealing inconsistencies:
zipdetails zombie.zip | grep -i "filename"
Look for duplicate file names or unexpected local header offsets.
Python script to detect duplicates:
import zipfile
from collections import Counter

def detect_duplicates(zip_path):
    """Print any filename that appears more than once in the central directory."""
    with zipfile.ZipFile(zip_path, 'r') as zf:
        counts = Counter(zf.namelist())
    duplicates = {name for name, count in counts.items() if count > 1}
    if duplicates:
        print(f"Duplicate filenames found: {duplicates}")
        # Further check: compare the entries' data; in real scenarios, also check offsets.
    else:
        print("No duplicate filenames.")

detect_duplicates('zombie.zip')
Advanced detection could compare the CRC32, sizes, and compression method recorded in each local file header with the corresponding central directory entry, and verify that a valid local header actually sits at every offset the central directory points to.
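One complementary check, sketched below under stated assumptions, is to scan the raw bytes for local header signatures and flag any that the central directory never references. The function name `unlisted_local_headers` is hypothetical. This is a heuristic: stored or compressed data can contain the same four bytes by coincidence (a nested ZIP stored without compression is one example), so hits should prompt closer inspection instead of automatic rejection.

```python
import io
import zipfile

LOCAL_SIG = b"PK\x03\x04"


def unlisted_local_headers(data: bytes) -> list:
    """Offsets that look like local headers but are absent from the central directory."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        listed = {info.header_offset for info in zf.infolist()}
    found = []
    pos = data.find(LOCAL_SIG)
    while pos != -1:
        if pos not in listed:
            found.append(pos)
        pos = data.find(LOCAL_SIG, pos + 1)
    return found


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("visible.txt", b"hello")
    print(unlisted_local_headers(buf.getvalue()))
```

Pairing this with a duplicate-name check covers both entries that are listed twice and entries that are not listed at all.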
5. Mitigation Strategies for Developers and Security Teams
For developers parsing ZIPs: Never trust the central directory blindly. Use a parser that validates each local header and checks it against the central directory. Reject archives with duplicate file names or mismatched metadata. Libraries like `libzip` offer ways to verify archive integrity; for example, its `zip_open` function accepts a `ZIP_CHECKCONS` flag that enables stricter consistency checks.
For security tool vendors: Implement multi-parser analysis. Scan the archive using different extraction methods (e.g., first entry, last entry, all entries) and flag any discrepancies. Sandbox extraction can also reveal malicious behavior regardless of filename tricks.
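A simple form of the multi-view idea is to read every entry individually and flag any filename whose entries do not all carry the same content. The sketch below is an illustration under that assumption, not a vendor implementation; `make_archive` and `conflicting_entries` are hypothetical helper names, and the archive is read fully into memory.

```python
import hashlib
import io
import warnings
import zipfile
from collections import defaultdict


def make_archive(entries) -> bytes:
    """Build an in-memory archive from (name, payload) pairs, duplicates allowed."""
    buf = io.BytesIO()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", UserWarning)  # duplicate-name warning
        with zipfile.ZipFile(buf, "w") as zf:
            for name, payload in entries:
                zf.writestr(name, payload)
    return buf.getvalue()


def conflicting_entries(data: bytes) -> dict:
    """Map each filename to its content hashes when same-named entries disagree."""
    digests = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            # Passing the ZipInfo object reads that exact entry,
            # not whichever entry a by-name lookup would pick.
            content = zf.read(info)
            digests[info.filename].append(hashlib.sha256(content).hexdigest())
    return {name: hashes for name, hashes in digests.items()
            if len(set(hashes)) > 1}


if __name__ == "__main__":
    demo = make_archive([("document.pdf", b"benign"),
                         ("document.pdf", b"different")])
    print(conflicting_entries(demo))
```

Any non-empty result means a first-entry reader and a last-entry reader would see different bytes for the same name, which is a reasonable trigger for quarantine or deeper analysis.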
For system administrators and end-users:
- Use extraction tools that warn about duplicate entries (e.g., 7-Zip's overwrite prompt can alert users), or better, tools that refuse to extract ambiguous archives.
- On Linux, consider using `bsdtar` (libarchive) which may handle duplicates more strictly.
- In cloud environments, validate uploaded ZIPs with a strict parser before processing.
6. Command-Line Analysis on Windows and Linux
Linux – using `7z` to list contents:
7z l -slt zombie.zip | grep "Path = " | sort | uniq -c
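This counts file names; any count >1 indicates duplicates. The first `Path = ` line that `7z` prints refers to the archive itself, so expect one extra line for `zombie.zip`.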
Windows – PowerShell to inspect ZIP:
Add-Type -AssemblyName System.IO.Compression.FileSystem
$zip = [System.IO.Compression.ZipFile]::OpenRead('C:\path\to\zombie.zip')
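$zip.Entries | Group-Object FullName | Where-Object { $_.Count -gt 1 }
$zip.Dispose()
This script groups entries by their full path inside the archive and outputs any duplicate groups. Grouping on `FullName` instead of `Name` avoids false positives from files that share a name but live in different folders.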
7. Advanced Exploitation: Combining Duplicates with Path Traversal
Attackers often pair duplicate filenames with path traversal (e.g., ../../etc/passwd) to increase impact. A ZIP could contain a benign `notes.txt` and a second, malicious entry whose stored name also ends in `notes.txt` but is prefixed with `..` components, so that it extracts to a system directory instead of the target folder. Defenders must also check for absolute paths or relative paths with `..` sequences, especially when duplicates exist.
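Mitigation: Normalize paths and reject any entry that resolves outside the target directory. In Python, join the entry name with the extraction root, resolve the result with `os.path.realpath` or `Path.resolve()`, and confirm that it still lies under the resolved root. Note that `os.path.abspath` collapses `..` components but does not follow symlinks.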
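The containment check can be sketched in a few lines. This is a minimal example assuming POSIX-style paths; `is_safe_member` is a hypothetical helper name, and treating backslashes as separators is a conservative choice made here, not something the ZIP format requires.

```python
import os


def is_safe_member(extract_root: str, member_name: str) -> bool:
    """True if member_name, joined to extract_root, stays inside extract_root."""
    root = os.path.realpath(extract_root)
    # ZIP names use forward slashes; normalise backslashes to be conservative.
    normalised = member_name.replace("\\", "/")
    # os.path.join discards root when the name is absolute, which the
    # comparison below then rejects.
    candidate = os.path.realpath(os.path.join(root, normalised))
    try:
        return os.path.commonpath([root, candidate]) == root
    except ValueError:
        # Raised, for example, when the paths are on different Windows drives.
        return False


if __name__ == "__main__":
    for name in ("notes.txt", "docs/../notes.txt",
                 "../../etc/passwd", "/etc/passwd"):
        print(name, is_safe_member("/tmp/extract_root", name))
```

Running the check on every entry before writing anything to disk, and rejecting the whole archive on the first failure, avoids leaving a partially extracted archive behind.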
What Undercode Say:
- Key Takeaway 1: The ZIP format’s inherent ambiguity—allowing inconsistent metadata—is a persistent attack vector. Understanding this structure is essential for anyone building or using archive-handling tools.
- Key Takeaway 2: Security defenses must evolve from simple signature matching to behavioral and structural analysis. Detecting anomalies like duplicate filenames, offset mismatches, and path traversal attempts should become standard in all security products.
Analysis: The Zombie ZIP technique is a stark reminder that format complexity often leads to security gaps. As file formats grow more feature-rich, the potential for parser discrepancies widens. This attack is not limited to ZIP; similar issues exist in RAR, 7z, and even PDF. The security community must push for stricter specifications, better parser testing (including fuzzing), and cross-tool collaboration to define canonical representations. Until then, defenders must assume that any archive could be malicious and employ layered detection—combining static analysis, sandbox extraction, and user education. The rise of such evasion tactics underscores the importance of defense in depth, where no single tool is trusted completely.
Prediction:
In the coming months, we can expect a surge in archive-based evasion techniques as attackers capitalize on the publicity of Zombie ZIP. We will likely see similar exploits targeting other archive formats (RAR, ISO, tar) and even compound documents like Office files. Security vendors will rush to update their parsers, but the cat-and-mouse game will continue. In the long term, we may witness the emergence of “safe archive” standards or mandatory validation layers in cloud storage services. Moreover, machine learning models trained on structural anomalies could become a frontline defense, flagging archives that deviate from expected patterns. For now, every organization should audit its archive-handling pipelines and prepare for a wave of creatively malformed files.
Reported By: Martijngrooten The – Hackers Feeds


