
Introduction
Threat intelligence is a critical component of modern cybersecurity, helping organizations identify, analyze, and mitigate threats from advanced persistent threat (APT) groups and other malicious actors. Open-source intelligence (OSINT) provides a wealth of unstructured and semi-structured data that, when properly curated, can form a robust threat actor dataset. This article explores key resources and methodologies for compiling such datasets, along with practical commands and tools to automate the process.
Learning Objectives
- Identify key OSINT sources for threat actor data.
- Learn how to consolidate and cross-reference threat intelligence.
- Automate data collection and enrichment using scripting and APIs.
You Should Know
1. Extracting Threat Actor Data from Malpedia
Malpedia provides structured data on 821 adversaries, including aliases and technical indicators.
Command (Python – `requests` library):
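import requests

# Endpoint path and field names are as given in this article; confirm them against the Malpedia API documentation.
response = requests.get("https://malpedia.caad.fkie.fraunhofer.de/api/get/actors", timeout=30)
response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error page
actors = response.json()
for actor in actors:
    print(actor["name"], actor["description"])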
Steps:
1. Install Python’s `requests` library (`pip install requests`).
2. Use the Malpedia API to fetch actor data.
3. Parse JSON output to extract names and descriptions.
2. Querying MITRE ATT&CK for Threat Group Tactics
MITRE ATT&CK provides structured threat group profiles with Tactics, Techniques, and Procedures (TTPs).
Command (curl):
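# attack.mitre.org does not serve a JSON API; ATT&CK is published as STIX bundles in the mitre/cti GitHub repository.
curl -s "https://raw.githubusercontent.com/mitre/cti/master/enterprise-attack/enterprise-attack.json" | jq '.objects[] | select(.type == "intrusion-set") | {name, description}'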
Steps:
1. Use `curl` to fetch MITRE’s group data.
2. Pipe output to `jq` for JSON parsing.
3. Extract group names and descriptions for analysis.
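The same filtering can be done in Python when the group data needs further processing. The sketch below assumes the input is a STIX 2.x bundle, the format MITRE publishes ATT&CK in, where groups are objects of type `intrusion-set`. The function name `extract_groups` and the sample bundle are illustrative only and not part of any MITRE tooling.

```python
def extract_groups(bundle):
    """Return name, aliases and description for each active ATT&CK group."""
    groups = []
    for obj in bundle.get("objects", []):
        if obj.get("type") != "intrusion-set":
            continue  # skip techniques, software, relationships, ...
        if obj.get("revoked") or obj.get("x_mitre_deprecated"):
            continue  # skip entries that ATT&CK has retired
        groups.append({
            "name": obj["name"],
            "aliases": obj.get("aliases", []),
            "description": obj.get("description", ""),
        })
    return groups


# Hypothetical, heavily trimmed bundle for illustration only.
sample_bundle = {
    "type": "bundle",
    "objects": [
        {"type": "intrusion-set", "name": "APT28", "aliases": ["APT28", "Fancy Bear"]},
        {"type": "attack-pattern", "name": "Phishing"},
        {"type": "intrusion-set", "name": "Old Group", "revoked": True},
    ],
}

groups = extract_groups(sample_bundle)
print(groups)
```

In practice the bundle would be loaded with `json.load` from a downloaded ATT&CK file rather than defined inline.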
3. Automating MISP Galaxy Data Ingestion
MISP Galaxy aggregates threat actor aliases and relationships.
Command (Python – `pymisp`):
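from pymisp import PyMISP

misp = PyMISP("https://your-misp-instance.com", "API_KEY")
# NOTE: galaxy_clusters() is the call as given in this article. Current PyMISP releases
# document galaxies(), get_galaxy() and search_galaxy_clusters() instead, so adjust to your version.
galaxy_clusters = misp.galaxy_clusters()
for cluster in galaxy_clusters:
    # The response layout depends on the MISP version; .get() avoids a KeyError when synonyms are absent.
    print(cluster["value"], cluster.get("meta", {}).get("synonyms", []))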
Steps:
1. Install `pymisp` (`pip install pymisp`).
2. Authenticate with your MISP instance.
3. Extract threat actor clusters and aliases.
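Once aliases are extracted, they are most useful as a lookup table. The sketch below maps every normalized name or synonym to a cluster's primary value. It assumes clusters are plain dicts with a `value` key and an optional `meta.synonyms` list, which is the layout of the public misp-galaxy JSON files; a live MISP API response may be shaped differently. The helper names `normalize` and `build_alias_index` are hypothetical.

```python
import re


def normalize(name):
    """Lowercase and drop spaces, hyphens and underscores so 'Fancy-Bear' matches 'fancy bear'."""
    return re.sub(r"[\s\-_]+", "", name).lower()


def build_alias_index(clusters):
    """Map every normalized name or synonym to the cluster's primary value."""
    index = {}
    for cluster in clusters:
        primary = cluster["value"]
        synonyms = cluster.get("meta", {}).get("synonyms", [])
        for alias in [primary, *synonyms]:
            # setdefault keeps the first owner if two clusters claim the same alias
            index.setdefault(normalize(alias), primary)
    return index


# Illustrative clusters, not real MISP output.
clusters = [
    {"value": "APT28", "meta": {"synonyms": ["Fancy Bear", "Sofacy"]}},
    {"value": "Lazarus Group", "meta": {}},
]

alias_index = build_alias_index(clusters)
print(alias_index.get(normalize("fancy-bear")))
```

Keeping the first owner of a contested alias is a simplification; a production pipeline would more likely log such collisions for an analyst to review.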
4. Enriching Data with APTMap
APTMap combines multiple datasets for cross-referencing.
Command (Python – `pandas`):
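import pandas as pd

# URL and column names are as given in this article; confirm them against the APTMap project before relying on them.
df = pd.read_csv("https://aptmap.net/data/apt_groups.csv")
print(df[["name", "aliases", "suspected_origin"]].head())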
Steps:
1. Use `pandas` to load APTMap’s CSV data.
2. Filter relevant columns (name, aliases, origin).
3. Merge with other datasets for enrichment.
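The merge step is where differing naming conventions across sources cause trouble. The following is a minimal sketch, in plain Python so the logic stays visible, of grouping records that share at least one normalized name or alias. The record layout (`source`, `name`, `aliases`) and the function names are assumptions made for illustration, not the schema of any of the datasets above.

```python
def norm(name):
    """Keep only lowercase letters and digits so 'APT 28' and 'apt28' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def merge_records(records):
    """Group records that share at least one normalized name or alias."""
    merged = []
    for rec in records:
        names = {rec["name"], *rec.get("aliases", [])}
        keys = {norm(n) for n in names}
        for entry in merged:
            if entry["keys"] & keys:  # any shared alias means same actor
                entry["keys"] |= keys
                entry["names"] |= names
                entry["sources"].add(rec["source"])
                break
        else:
            merged.append({"keys": keys, "names": names, "sources": {rec["source"]}})
    return merged


# Illustrative records only.
records = [
    {"source": "mitre", "name": "APT28", "aliases": ["Fancy Bear"]},
    {"source": "aptmap", "name": "Sofacy", "aliases": ["fancy bear", "APT 28"]},
    {"source": "misp", "name": "Lazarus Group", "aliases": []},
]

merged = merge_records(records)
for entry in merged:
    print(sorted(entry["names"]), sorted(entry["sources"]))
```

This is a single pass: a later record that bridges two already separate groups will join only the first one it matches. A union-find structure would handle that case if it matters for your data.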
5. Bulk Exporting from Mandiant (Google Threat Intelligence)
Mandiant’s reports provide deep insights into APT groups.
Command (wget for bulk download):
wget --recursive --accept pdf --no-parent https://www.mandiant.com/resources/reports
Steps:
1. Use `wget` to download Mandiant reports.
2. Extract threat actor details using PDF parsers like `pdftotext`.
3. Combine with structured datasets.
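After converting a report to text, a first pass at extracting actor details can be a simple pattern match. The sketch below counts designations that follow the APT, FIN and UNC numbering style used in Mandiant reporting. The regular expression, the function name `count_actor_mentions`, and the sample text are assumptions for illustration; other vendors' naming schemes would need their own patterns.

```python
import re
from collections import Counter

# Matches designations such as APT41, FIN7 or UNC2452.
ACTOR_PATTERN = re.compile(r"\b(?:APT|FIN|UNC)\d{1,4}\b")


def count_actor_mentions(text):
    """Count how often each actor designation appears in the text."""
    return Counter(ACTOR_PATTERN.findall(text))


# Made-up text standing in for pdftotext output.
sample_text = (
    "This report covers APT41 and FIN7. "
    "APT41 is discussed in section 2; UNC2452 in section 3."
)

mentions = count_actor_mentions(sample_text)
print(mentions.most_common())
```

Mention counts give a rough signal of which actors a report is about, which helps decide which structured records to attach it to.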
6. Automating Threat Intel with SOCRadar API
SOCRadar offers an API for threat actor profiling.
Command (curl with API key):
curl -X GET "https://api.socradar.com/threat/actors" -H "Authorization: Bearer YOUR_API_KEY"
Steps:
1. Obtain an API key from SOCRadar.
2. Query threat actor endpoints.
3. Store results in a structured format (JSON/CSV).
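Storing results as CSV mostly comes down to flattening list fields. The sketch below serializes a list of actor dicts to CSV text, joining list values with `;`. The field names and the sample payload are hypothetical; the real SOCRadar response schema may differ and should be checked against its API documentation.

```python
import csv
import io
import json


def actors_to_csv(actors, fieldnames=("name", "aliases", "country")):
    """Serialize actor dicts to CSV text; list values are joined with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for actor in actors:
        row = {}
        for field in fieldnames:
            value = actor.get(field, "")  # missing fields become empty cells
            row[field] = ";".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical response payload, not real API output.
payload = json.loads(
    '[{"name": "ExampleActor", "aliases": ["EA-1", "Example Group"],'
    ' "country": "unknown", "extra": 1}]'
)

csv_text = actors_to_csv(payload)
print(csv_text)
```

To write to disk instead, pass an open file object to `csv.DictWriter` in place of the `StringIO` buffer.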
7. Cloud-Focused Threat Actors from Wiz
Wiz tracks cloud-specific threats.
Command (AWS CLI for cross-checking):
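# --detector-id is required; find yours with: aws guardduty list-detectors --region us-east-1
aws guardduty list-threat-intel-sets --detector-id YOUR_DETECTOR_ID --region us-east-1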
Steps:
1. Use AWS GuardDuty to compare with Wiz’s cloud threat list.
2. Identify overlaps in IOCs (Indicators of Compromise).
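Finding the overlap is a set intersection once both indicator lists are normalized. The sketch below assumes both lists have already been loaded into Python (the GuardDuty command above lists threat intel set IDs, not the indicators themselves, so the underlying list files must be fetched separately). The helper names and sample indicators are hypothetical; the addresses come from ranges reserved for documentation.

```python
def refang(indicator):
    """Normalize an indicator: trim, lowercase, and undo common defanging."""
    value = indicator.strip().lower()
    value = value.replace("[.]", ".").replace("(.)", ".")
    value = value.replace("hxxp://", "http://").replace("hxxps://", "https://")
    return value


def ioc_overlap(list_a, list_b):
    """Return indicators present in both lists after normalization."""
    return {refang(i) for i in list_a} & {refang(i) for i in list_b}


# Placeholder indicators for illustration only.
guardduty_iocs = ["198.51.100.7", "Evil.Example[.]com ", "203.0.113.9"]
vendor_iocs = ["evil.example.com", "203.0.113.9", "192.0.2.55"]

shared = ioc_overlap(guardduty_iocs, vendor_iocs)
print(sorted(shared))
```

Normalizing before comparing matters because vendor reports often publish defanged indicators such as `example[.]com`, which would otherwise never match.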
What Undercode Says
- Key Takeaway 1: Consolidating OSINT sources reduces manual effort and improves threat visibility.
- Key Takeaway 2: Automation (APIs, scripting) is essential for scalable threat intelligence.
Analysis:
The fragmentation of threat actor data across vendors necessitates automated aggregation. While no single source provides complete coverage, combining structured (MITRE, MISP) and unstructured (Mandiant reports) data yields a comprehensive dataset. Future improvements may involve AI-driven clustering to resolve aliases and track evolving TTPs.
Prediction
As APT groups increasingly leverage AI for attacks, threat intelligence platforms will adopt machine learning for real-time actor attribution and behavior prediction. Open-source datasets will remain vital but require stricter standardization for interoperability.
For further exploration, check the Awesome Threat Actor Resources repository.
Reported By: Ysergeev Threatintel – Hackers Feeds


