How This OSINT Engineer Cracked Substack’s Top 100 – And How You Can Build The Same Investigative Arsenal + Video

Introduction:

Open Source Intelligence (OSINT) has evolved far beyond simple Google searches. Today, it’s a disciplined engineering practice that combines data science, cybersecurity, and investigative journalism to turn publicly available information into actionable intelligence. Maxim Marshak, an OSINT Engineer who recently earned a spot in Substack’s Top 100 Rising in Technology, exemplifies this new breed of practitioner—someone who bridges the gap between software development and deep investigative work. His OSINTech publication demonstrates that mastering OSINT isn’t just about knowing a few tools; it’s about building systematic, repeatable workflows that scale across people, companies, and incident data.

Learning Objectives:

Master the core OSINT investigation lifecycle—from passive reconnaissance to active data correlation and breach analysis.
Build a production-ready OSINT environment using Python, Bash, and CLI-first tools across Linux and Windows.
Apply AI-enhanced techniques for identity triangulation, cognitive profiling, and automated intelligence gathering.

Building Your OSINT Lab: Linux, Windows, and the Hybrid Approach

Every serious OSINT engineer needs a controlled environment where they can run reconnaissance safely and legally. The most common approach is a Kali Linux virtual machine, but Windows users can equally participate using WSL2 (Windows Subsystem for Linux) or native PowerShell tools.

Step‑by‑step guide:

1. Set up Kali Linux (or Ubuntu) in a VM or as your primary OS. Install essential packages:

sudo apt update && sudo apt upgrade -y
sudo apt install git python3 python3-pip python3-venv nmap whois dnsrecon theharvester -y

2. For Windows users, enable WSL2 and install Ubuntu:
```
wsl --install -d Ubuntu
```
Then follow the same installation steps inside the WSL terminal.

3. Create a Python virtual environment to isolate your OSINT tool dependencies:

python3 -m venv osint-env
source osint-env/bin/activate  # Linux/macOS
# or on Windows PowerShell: .\osint-env\Scripts\activate

4. Install common OSINT libraries:

pip install requests beautifulsoup4 shodan python-whois dnspython pandas

This setup gives you a foundation to run most open-source OSINT tools without polluting your system environment.
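Before moving on, it can help to confirm that the lab actually has what the later steps rely on. The following is a minimal sketch, not part of the original workflow: the tool and module names are taken from the install commands above (with `bs4`, `whois`, and `dns` being the import names of `beautifulsoup4`, `python-whois`, and `dnspython`), and the function name `missing_components` is hypothetical.

```python
import importlib.util
import shutil

# Names mirror the install commands above; adjust them to your own setup.
CLI_TOOLS = ["git", "nmap", "whois", "dnsrecon"]
PY_MODULES = ["requests", "bs4", "shodan", "whois", "dns", "pandas"]


def missing_components(cli_tools=CLI_TOOLS, py_modules=PY_MODULES):
    """Return (missing_cli, missing_modules) without importing any module."""
    # shutil.which returns None when the executable is not on PATH.
    missing_cli = [t for t in cli_tools if shutil.which(t) is None]
    # find_spec returns None when a top-level module cannot be located.
    missing_mods = [m for m in py_modules if importlib.util.find_spec(m) is None]
    return missing_cli, missing_mods


if __name__ == "__main__":
    cli, mods = missing_components()
    print("Missing CLI tools:", cli or "none")
    print("Missing Python modules:", mods or "none")
```

Run it inside the activated virtual environment so that the module check reflects the packages installed there rather than the system Python.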

Passive Reconnaissance: Amass, theHarvester, and DNS Enumeration

Passive reconnaissance means gathering information without directly touching the target’s infrastructure. Tools like Amass and theHarvester are industry standards for this phase.

Step‑by‑step guide:

1. Install Amass (if not already present):

sudo apt install amass -y
# Or from source: go install -v github.com/OWASP/Amass/v3/...@master

2. Run a passive subdomain enumeration against a target domain:
```
amass enum -passive -d example.com -o amass_output.txt
```
This queries multiple data sources (search engines, certificate transparency logs, DNS datasets) without sending probes to the target.

3. Use theHarvester to gather emails, subdomains, and employee names:

theharvester -d example.com -b google,bing,linkedin -l 500 -f harvest_results.html

The `-b` flag specifies sources, `-l` limits the number of results, and `-f` writes the report to a file.

4. Combine with DNS enumeration using `dnsrecon`:

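dnsrecon -d example.com -t axfr  # test for zone transfer vulnerability
dnsrecon -d example.com -t brt  # brute-force subdomains

These commands build a surface-level intelligence picture that can later be deepened with active scanning. Note that zone-transfer tests and brute forcing query the target's name servers directly, so unlike the Amass and theHarvester runs they are not strictly passive.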
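Each tool in this phase produces its own list of hostnames, so a small merge step keeps the results usable. The sketch below assumes each input is an iterable of lines with one hostname per line (check the output format of your Amass version, since it has changed between releases); `merge_subdomains` is a hypothetical helper, not part of any of the tools above.

```python
def merge_subdomains(domain, *sources):
    """Merge hostname lists, keeping unique names that fall under `domain`."""
    base = domain.lower()
    suffix = "." + base
    found = set()
    for lines in sources:
        for line in lines:
            # Normalize case, surrounding whitespace, and a trailing DNS dot.
            host = line.strip().lower().rstrip(".")
            # Keep the apex and true subdomains; drop look-alikes such as
            # "evil-example.com", which do not end with ".example.com".
            if host == base or host.endswith(suffix):
                found.add(host)
    return sorted(found)


if __name__ == "__main__":
    amass_lines = ["www.example.com", "MAIL.example.com."]
    other_lines = ["www.example.com", "evil-example.com"]
    print(merge_subdomains("example.com", amass_lines, other_lines))
```

In practice the inputs would be open file handles, for example `open("amass_output.txt")`, rather than the inline lists used here.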

Active Scanning and Network Fingerprinting with Nmap and Custom Scripts

Once passive data is collected, active scanning helps verify findings and discover live services. Nmap remains the most widely used network scanning tool in 2026, with its scripting engine (NSE) enabling deep service fingerprinting.

Step‑by‑step guide:

1. Perform a basic SYN scan on discovered IP ranges (SYN scans need raw-socket privileges, hence `sudo`):

sudo nmap -sS -Pn -p- -T4 -oA active_scan <target_ip_or_range>

2. Run service and version detection on open ports:
```
nmap -sV -sC -p 80,443,22,21 <target_ip>
```
3. Use NSE scripts for OSINT-specific checks (e.g., HTTP titles, SSL certs):
```
nmap --script http-title,ssl-cert -p 443 <target_ip>
```
4. For Windows environments, use PowerShell’s `Test-NetConnection` for quick connectivity checks:
```
Test-NetConnection -ComputerName example.com -Port 443
```

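5. Automate fingerprinting with a Python script that parses Nmap XML output and correlates with Shodan data:

import subprocess
import xml.etree.ElementTree as ET
# Run a version scan and capture Nmap's XML report from stdout
result = subprocess.run(['nmap', '-sV', '-oX', '-', 'example.com'], capture_output=True, text=True, check=True)
# Parse the XML; each host and port can then be enriched with Shodan API data
root = ET.fromstring(result.stdout)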

Active scanning must always be conducted with proper authorization to avoid legal repercussions.
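The XML parsing step mentioned above can be sketched as follows. This is an illustrative helper under stated assumptions, not the author's script: `parse_nmap_xml` is a hypothetical name, and it reads only the `host`, `address`, `port`, `state`, and `service` elements of Nmap's XML report.

```python
import xml.etree.ElementTree as ET


def parse_nmap_xml(xml_text):
    """Return one dict per open port found in an Nmap XML report."""
    findings = []
    root = ET.fromstring(xml_text)
    for host in root.iter("host"):
        addr_el = host.find("address")
        addr = addr_el.get("addr") if addr_el is not None else None
        for port in host.iter("port"):
            state = port.find("state")
            # Skip closed and filtered ports; only open ones are worth enriching.
            if state is None or state.get("state") != "open":
                continue
            service = port.find("service")
            findings.append({
                "host": addr,
                "port": int(port.get("portid")),
                "protocol": port.get("protocol"),
                "service": service.get("name") if service is not None else None,
                "product": service.get("product") if service is not None else None,
            })
    return findings
```

Each resulting `host` value could then be looked up with the `shodan` package installed earlier, for example via `shodan.Shodan(api_key).host(ip)`, and the two views compared before drawing conclusions.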

Identity Triangulation and Breach Analysis with AI-Powered OSINT

Modern OSINT engineering increasingly leverages AI to correlate disparate data points. Tools like OSINT-D2 use agentic AI to transform usernames and emails into structured identity dossiers, performing breach analysis and cognitive profiling from a single CLI command.

Step‑by‑step guide:

1. Install OSINT-D2 (or similar CLI-first toolkit):

git clone https://github.com/Doble-2/osint-d2
cd osint-d2
pip install -r requirements.txt

2. Run a basic identity lookup:

python osint-d2.py --username "target_user" --platforms all

This enumerates profiles across 30+ platforms in five languages.

3. Perform breach analysis using Have I Been Pwned or similar APIs:

curl -X GET "https://haveibeenpwned.com/api/v3/breachedaccount/<target_email>" -H "hibp-api-key: YOUR_KEY"

4. For local breach corpus analysis, use `zgrep` and `awk` to search through leaked datasets (legally obtained):
```
zgrep -i "<target_email>" breaches/*.csv.gz | awk -F',' '{print $1,$2,$3}'
```

5. Leverage Python for automated correlation:

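import requests
# Endpoint as given in the source; replace <target_email> and confirm the URL against the provider's documentation
response = requests.get('https://api.breachdirectory.org/v1/<target_email>', timeout=10)
response.raise_for_status()
data = response.json()
# Cross-reference with social media profiles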

AI-enhanced OSINT reduces manual effort from hours to seconds, but always verify AI-generated leads with independent sources.
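One simple way to act on that verification advice is to promote a lead only when at least two independent sources agree on it. The sketch below is an assumption about how such a rule might look, not a description of OSINT-D2 or any other tool; `correlate` and the `(source, identifier)` input shape are hypothetical.

```python
from collections import defaultdict


def correlate(findings):
    """Group (source, identifier) pairs by normalized identifier.

    Returns {identifier: sorted list of distinct sources}, keeping only
    identifiers that were seen in at least two different sources.
    """
    by_id = defaultdict(set)
    for source, identifier in findings:
        # Normalize so "Target_User " and "target_user" count as one lead.
        by_id[identifier.strip().lower()].add(source)
    return {ident: sorted(srcs) for ident, srcs in by_id.items() if len(srcs) >= 2}


if __name__ == "__main__":
    leads = [
        ("hibp", "Target_User"),
        ("github", "target_user "),
        ("forum", "someone_else"),
    ]
    print(correlate(leads))
```

Matching on a normalized string is deliberately crude: it flags candidates for human review and does not prove that two accounts belong to the same person.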

Investigative Journalism Workflows: From Lead to Verified Story

OSINT for journalism follows a different rhythm than penetration testing. It starts with a single lead—a name, phone number, email, or photo—and expands outward through cross-referencing and geolocation.

Step‑by‑step guide:

1. Reverse image search using TinEye or Google Images. From the CLI, googlesearch-python can at least run a text search for pages that reference the image URL (it does not perform a true reverse image search):

pip install googlesearch-python
python -c "from googlesearch import search; print(list(search('image_url', num_results=10)))"

2. Geolocate photos by extracting EXIF metadata:

exiftool image.jpg | grep -i gps
# Or use: identify -verbose image.jpg | grep -i gps

3. Cross-reference social media using tools like `ghostlink.py` for username enumeration across 100+ platforms:
```
python ghostlink.py -u target_username
```

4. Document findings in a structured format (CSV or JSON) for collaborative investigation:

import csv
# Write the header row of a timeline CSV; findings are appended as further rows
with open('timeline.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Timestamp', 'Source', 'Artifact', 'Confidence'])

5. Use osquery for forensic system interrogation across Windows, macOS, and Linux endpoints, treating the OS as a relational database:
```
SELECT * FROM processes WHERE name LIKE '%suspicious%';
SELECT * FROM file_events WHERE target_path LIKE '%/etc/passwd%';
```
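Returning to the EXIF step: exiftool prints GPS coordinates in degrees, minutes, and seconds by default, while most mapping tools expect decimal degrees. The conversion is `degrees + minutes/60 + seconds/3600`, negated for southern and western hemispheres. The sketch below assumes exiftool's default text format (for example `40 deg 26' 46.32" N`); `dms_to_decimal` is a hypothetical helper.

```python
import re

# Matches exiftool's default coordinate text, e.g. 40 deg 26' 46.32" N
DMS = re.compile(r"""(\d+)\s*deg\s*(\d+)'\s*([\d.]+)"\s*([NSEW])""")


def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds coordinate to signed decimal degrees."""
    match = DMS.search(text)
    if match is None:
        raise ValueError(f"no DMS coordinate in {text!r}")
    degrees, minutes, seconds, ref = match.groups()
    # Example: 40 + 26/60 + 46.32/3600 = 40.4462
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -value if ref in "SW" else value
```

Feeding it the latitude and longitude lines from the `grep -i gps` output yields a pair that can be pasted directly into a map search.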

Journalistic OSINT demands rigorous chain-of-custody and transparent workflows, especially when AI is involved.
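A lightweight way to support chain-of-custody is to record a cryptographic hash of every artifact at the moment it is logged, so later copies can be checked against the timeline. The sketch below extends the timeline columns from step 4 with a SHA-256 column; that extra column and the function names `sha256_of` and `log_artifact` are assumptions made for illustration.

```python
import csv
import hashlib
from datetime import datetime, timezone


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large captures do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_artifact(timeline_path, source, artifact_path, confidence):
    """Append one row: UTC timestamp, source, artifact path, confidence, SHA-256."""
    row = [
        datetime.now(timezone.utc).isoformat(),
        source,
        artifact_path,
        confidence,
        sha256_of(artifact_path),
    ]
    # Append mode preserves earlier entries; newline="" avoids blank rows on Windows.
    with open(timeline_path, "a", newline="") as handle:
        csv.writer(handle).writerow(row)
    return row
```

Re-running `sha256_of` on an artifact at publication time and comparing it with the logged value shows whether the file has changed since it was collected.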

API Security and Cloud Hardening in OSINT Workflows

OSINT engineers frequently interact with third-party APIs (Shodan, VirusTotal, Hunter.io). Securing API keys and hardening cloud instances is non-negotiable.

Step‑by‑step guide:

1. Store API keys securely using environment variables (never hard-code):

export SHODAN_API_KEY="your_key_here"
export VT_API_KEY="your_vt_key"

On Windows (PowerShell):

$env:SHODAN_API_KEY="your_key_here"

2. Use a `.env` file with `python-dotenv`:

from dotenv import load_dotenv
load_dotenv()
import os
api_key = os.getenv('SHODAN_API_KEY')

3. Implement rate limiting to avoid API bans:

import time
def rate_limited_call(api_func, delay=1):
    # Pause before every call so requests stay under the provider's rate limit
    time.sleep(delay)
    return api_func()

4. Harden your cloud VPS (if used for OSINT scraping). Allow SSH before enabling the firewall so you do not lock yourself out:

sudo ufw allow 22/tcp  # restrict to your IP if possible
sudo ufw enable
sudo fail2ban-client start

5. Monitor outgoing traffic to detect data exfiltration:

sudo tcpdump -i eth0 -n -c 100 'port 443'

Securing your OSINT infrastructure is as important as the intelligence you gather—compromised tools lead to compromised investigations.
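The rate-limiting helper in step 3 sleeps before every call, even when the previous call happened long ago. A slightly more economical variant, sketched here as an assumption rather than a recommended standard, waits only for the time still remaining in the interval. The class name `MinIntervalLimiter` is hypothetical, and the `clock` and `sleep` parameters exist so the behavior can be tested without real delays.

```python
import time


class MinIntervalLimiter:
    """Ensure at least `interval` seconds pass between consecutive calls."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def call(self, func, *args, **kwargs):
        now = self._clock()
        if self._last is not None:
            # Sleep only for whatever part of the interval has not yet elapsed.
            wait = self.interval - (now - self._last)
            if wait > 0:
                self._sleep(wait)
        self._last = self._clock()
        return func(*args, **kwargs)
```

A single limiter instance per API, for example `shodan_limiter = MinIntervalLimiter(1.0)`, keeps unrelated services from slowing each other down.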

What Undercode Say:

OSINT is engineering, not just searching. Maxim Marshak’s rise on Substack underscores that modern OSINT demands software development skills—Python, Bash, and API integrations are just as critical as knowing which website to visit.
AI is reshaping the field, but human verification remains king. Agentic tools can accelerate identity triangulation, but they also introduce false positives. The best OSINT engineers treat AI as a force multiplier, not a replacement for critical thinking.

The OSINT market is projected to grow from $2.9 billion in 2025 to nearly $6 billion by 2032, reflecting its expanding role in cybersecurity, journalism, and corporate intelligence. Practitioners who can code, automate, and think like investigators will dominate this space. Marshak’s Substack ranking is a signal that the industry values depth over breadth—those who publish systematic, repeatable methodologies gain credibility faster than those who simply share tool lists.

Prediction:

+1 The integration of large language models into OSINT workflows will automate 60% of routine data correlation tasks by 2027, allowing investigators to focus on high-value pattern recognition and narrative construction.
+1 Substack and similar platforms will become primary talent signals for OSINT hiring, as engineers who publicly document their methodologies demonstrate both technical competence and communication skills—a rare combination.
-1 The democratization of AI-powered OSINT tools will lower the barrier to entry for malicious actors, leading to a surge in automated social engineering and identity theft campaigns that exploit publicly correlated data.
-1 Regulatory frameworks (GDPR, CCPA) will increasingly restrict passive data collection, forcing OSINT engineers to navigate a complex legal landscape where once-public information becomes contested territory.
+1 Cross-disciplinary OSINT—merging cybersecurity, data science, and investigative journalism—will become a formal academic discipline, with universities offering dedicated degrees by 2028.
-1 The Substack breach of 700,000 accounts in 2025 serves as a warning: OSINT repositories themselves are high-value targets. Engineers must treat their own investigative data with the same rigor they apply to client work.

▶️ Related Video (72% Match):

https://www.youtube.com/watch?v=aD-U2kP0vNk

IT/Security Reporter URL:

Reported By: Osintech And – Hackers Feeds
Extra Hub: Undercode MoN
Basic Verification: Pass ✅
