Ubikron Graphs: The Free OSINT Tool That Turns Browsing Into Intelligence Goldmines – Here’s How To Build Your Own Investigation Graph + Video

Introduction:

Open-source intelligence (OSINT) investigations often drown in fragmented data – scattered emails, names, domains, and coordinates hidden across dozens of visited web pages. Ubikron, a free self‑hosted tool described by investigators, solves this by automatically extracting entities from saved pages and visualizing them as interactive graphs. This article dives into the tool’s graph‑based workflow, replicates its core functionality using open‑source alternatives, and provides step‑by‑step commands for Linux and Windows to supercharge your own OSINT pipeline.

Learning Objectives:

Understand how automated entity extraction from web pages accelerates link analysis and investigation backtracking.
Build a local, privacy‑friendly graph database using Python, Neo4j, and browser automation.
Apply data enrichment and report generation to connect disparate clues from social media and public sources.

You Should Know:

1. Extracting Entities from Web Pages – Command‑Line & Browser Automation

The post highlights Ubikron’s ability to save any browsed page and instantly extract emails, names, phone numbers, URLs, and geocoordinates. This is achieved through regex patterns and HTML parsing. Below is a Python script that does the same – run it on Linux or Windows after installing dependencies.

Step‑by‑step: Build your own entity extractor

Linux/macOS/Windows – Install Python and required libraries:
```
pip install requests beautifulsoup4 ipwhois folium
```
Save a page locally (e.g., target.html) or fetch via URL:
```
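import requests
from bs4 import BeautifulSoup
import re

url = "https://example.com/news"
# A timeout keeps the script from hanging on an unresponsive server
response = requests.get(url, timeout=10)
soup = BeautifulSoup(response.text, 'html.parser')
# Flatten the page to plain text for the regex passes
text = soup.get_text()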
```
– Extract emails:
```
# The dot before the top-level domain must be escaped to match a literal "."
emails = set(re.findall(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', text))
print("Emails:", emails)
```
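– Extract domains from `<a>` tags:
```
domains = set()
for a in soup.find_all('a', href=True):
    # Only absolute http(s) links carry a host name
    if a['href'].startswith('http'):
        # "https://host/path".split('/') -> ['https:', '', 'host', 'path']
        domain = a['href'].split('/')[2]
        domains.add(domain)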
```
– Windows alternative: Use PowerGREP or a simple PowerShell regex:
```
(Get-Content page.html -Raw) | Select-String -Pattern '\b[\w.-]+@[\w.-]+\.\w{2,}\b' -AllMatches | ForEach-Object { $_.Matches.Value }
```
How to use: Save HTML files from your browser (Ctrl+S), then run the script on each file and collect the printed entities, for example in a CSV file. For live browsing, use a browser extension like “SingleFile” to archive pages and a local server to process them.
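
The extraction and CSV steps described above can be combined in one small script. The following is a minimal sketch, not Ubikron's actual extractor: the regex patterns, the function names `extract_entities` and `entities_to_csv`, and the sample text are all assumptions made for illustration. It covers emails, URLs, phone numbers, and coordinates; names are left out because they need more than a regex.

```python
import csv
import io
import re

# Illustrative patterns only (assumptions); real-world data needs stricter validation.
PATTERNS = {
    "email": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "url": re.compile(r"https?://[^\s\"'<>]+"),
    "phone": re.compile(r"\+\d[\d\s().-]{6,}\d"),
    "coordinates": re.compile(r"(?<![\d.])-?\d{1,2}\.\d+,\s*-?\d{1,3}\.\d+"),
}


def extract_entities(text):
    """Return a sorted list of (entity_type, value) pairs found in text."""
    found = set()
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Drop trailing punctuation that belongs to the sentence, not the entity
            found.add((entity_type, match.group().rstrip(".,;)")))
    return sorted(found)


def entities_to_csv(source, entities):
    """Serialize entities as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "type", "value"])
    for entity_type, value in entities:
        writer.writerow([source, entity_type, value])
    return buffer.getvalue()


# Hypothetical sample text standing in for soup.get_text()
sample = (
    "Contact jane.doe@example.org or +1-555-1234. "
    "Seen at 40.7128, -74.0060 via https://example.com/news"
)
entities = extract_entities(sample)
print(entities_to_csv("target.html", entities))
```

Keeping the source file name in every row means each entity can later be traced back to the page it came from.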

2. Building the Graph – Connecting Entities with Neo4j (Self‑Hosted)
Ubikron’s core innovation is the graph where nodes (emails, names, URLs) link to visited pages. You can replicate this using Neo4j, a free graph database. The steps below assume Ubuntu 22.04, but Windows versions exist.

Step‑by‑step: Local graph database for investigations
- Connect via Cypher: Use Python’s `neo4j` driver. After entity extraction, add nodes and relationships:
```
from neo4j import GraphDatabase

uri = "bolt://localhost:7687"
driver = GraphDatabase.driver(uri, auth=("neo4j", "password"))

def add_page(tx, url, title):
    # MERGE creates the Page node only if it does not already exist
    tx.run("MERGE (p:Page {url: $url, title: $title})", url=url, title=title)

def add_entity(tx, entity_type, value):
    # Labels cannot be passed as query parameters, so entity_type
    # must come from a trusted list, never from scraped input
    tx.run("MERGE (e:" + entity_type + " {value: $value})", value=value)

def relate(tx, url, entity_value):
    # Link the page to every node carrying this value
    tx.run("MATCH (p:Page {url: $url}), (e {value: $entity}) "
           "MERGE (p)-[:CONTAINS]->(e)", url=url, entity=entity_value)
```
  – To show all pages where an entity appears (as mentioned in the post):
```
MATCH (e {value: '[email protected]'})<-[:CONTAINS]-(p:Page) RETURN p.url
```
  What this does: Every visited page becomes a node, every extracted entity becomes another node, and edges show mentions. You can instantly backtrack to the source page – exactly the “stage of investigation” feature.
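
To make the backtracking idea concrete without a running database, the page-to-entity model can be imitated with two dictionaries. This is only an in-memory stand-in for the `Page -[:CONTAINS]-> entity` structure, not Neo4j's or Ubikron's implementation; the class name `InvestigationGraph` and the sample URLs and values are hypothetical.

```python
from collections import defaultdict


class InvestigationGraph:
    """Tiny in-memory stand-in for the Page -[:CONTAINS]-> entity model."""

    def __init__(self):
        self.page_entities = defaultdict(set)  # url -> {(type, value)}
        self.entity_pages = defaultdict(set)   # (type, value) -> {url}

    def relate(self, url, entity_type, value):
        """Record that the page at `url` mentions the entity."""
        self.page_entities[url].add((entity_type, value))
        self.entity_pages[(entity_type, value)].add(url)

    def pages_for(self, entity_type, value):
        """Backtrack from an entity to every page that mentions it."""
        return sorted(self.entity_pages.get((entity_type, value), set()))


graph = InvestigationGraph()
# Hypothetical sample data
graph.relate("https://example.com/news", "Email", "jane.doe@example.org")
graph.relate("https://example.com/about", "Email", "jane.doe@example.org")
graph.relate("https://example.com/about", "Domain", "example.org")
print(graph.pages_for("Email", "jane.doe@example.org"))
```

The reverse index (`entity_pages`) plays the role of the Cypher query above: it answers "where did I see this?" in a single lookup.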

3. Enrichments & OSINT Automation – Over 100 Data Points
  The post mentions “over 100 data enrichments.” These include reverse WHOIS, geolocation of IPs, social media profiling, and hashtag tracking. Here’s how to enrich a domain name using free APIs.
  
  Step‑by‑step: Enrich domains with IP geolocation and threat intel
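– Linux command line for WHOIS and DNS lookups (on Windows, `nslookup` is built in, while `whois` and `grep` need extra tools such as Sysinternals Whois or WSL):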
```
whois example.com | grep -E "Registrant|Creation Date"
nslookup example.com
```
  – Python enrichment script (install requests, ipwhois):
```
import socket
from ipwhois import IPWhois

domain = "example.com"
ip = socket.gethostbyname(domain)
obj = IPWhois(ip)
results = obj.lookup_rdap(depth=1)
print(f"ASN: {results['asn']}, Country: {results['asn_country_code']}")
```
– Add coordinates to the graph: Create a `Location` node with lat/lon, then link the IP or domain to it. For phone numbers, use the `phonenumbers` library to validate them and derive the country.
  – API security tip: When using public enrichment APIs (e.g., VirusTotal, Shodan), never hardcode API keys. Use environment variables:
```
export VT_API_KEY="your_key"
```
In Python: `import os; key = os.getenv("VT_API_KEY")`
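
Because `os.getenv` silently returns `None` when the variable is missing, it can help to fail early and to avoid printing the key in logs. The sketch below assumes two hypothetical helpers, `require_api_key` and `mask`; only `VT_API_KEY` comes from the text above.

```python
import os


def require_api_key(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    key = os.getenv(name)
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running enrichments")
    return key


def mask(secret, visible=4):
    """Mask a secret for logging, keeping only the last `visible` characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


try:
    print("Loaded key:", mask(require_api_key("VT_API_KEY")))
except RuntimeError as err:
    # No key configured: report it instead of calling the API with None
    print(err)
```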

4. Self‑Hosted Privacy & Data Control – Disable Saving Any Time
  Ubikron offers a self‑hosted version and toggleable page saving. Implement this by using a local proxy or browser automation with granular controls.
  
  Step‑by‑step: Privacy‑aware browsing pipeline with Playwright
  - Install Playwright (Linux/Windows):
```
pip install playwright
playwright install
```
  - Write a script that saves pages only when a flag is enabled:
```
from playwright.sync_api import sync_playwright

SAVE_ENABLED = True  # Toggle this variable

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.new_page()
    page.goto("https://linkedin.com/in/example")
    if SAVE_ENABLED:
        content = page.content()
        # Explicit UTF-8 avoids encoding errors on Windows
        with open("saved_page.html", "w", encoding="utf-8") as f:
            f.write(content)
    browser.close()
```
    – For Windows, create a batch toggle:
```
@echo off
set SAVE=1
if "%SAVE%"=="1" python save_page.py
```
    – Hardening: Run the extraction and graph database inside a Docker container to isolate from your main OS. Sample Dockerfile:
```
FROM python:3.10
RUN pip install beautifulsoup4 requests neo4j
COPY extractor.py /app/
WORKDIR /app
CMD ["python", "extractor.py"]
```
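
The `SAVE` flag from the batch toggle above can also be read inside Python, so the same switch works on Linux and Windows. This is a sketch under stated assumptions: the helper names `saving_enabled` and `save_page`, the hashed file naming, and the rule that saving is off unless `SAVE` equals `1` are illustrative choices, not Ubikron behavior.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def saving_enabled(env=os.environ):
    """Saving is on only when SAVE is explicitly set to "1"."""
    return env.get("SAVE", "0") == "1"


def save_page(content, url, out_dir, env=os.environ):
    """Write page HTML to out_dir when saving is enabled; return the path or None."""
    if not saving_enabled(env):
        return None
    # Hash the URL so the file name is filesystem-safe and stable across runs
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16] + ".html"
    path = Path(out_dir) / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    # Passing env explicitly keeps the demo independent of the real environment
    print(save_page("<html></html>", "https://example.com/news", tmp, env={"SAVE": "1"}))
    print(save_page("<html></html>", "https://example.com/news", tmp, env={"SAVE": "0"}))
```

Defaulting to "off" means a page is never stored just because the variable was forgotten.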

5. Integrating Images, Reports & Beta Graphs – Full Investigation Suite
    The tool also allows adding images (clippings) and writing reports. You can simulate this by generating a Markdown report that embeds graph screenshots and entity lists.
    
    Step‑by‑step: Generate an investigation report from your graph
    - Export graph visualization using Neo4j Browser’s PNG export or use `pyvis` to create an interactive HTML report:
```
from pyvis.network import Network
net = Network()
net.add_node(1, label="Page: article.html")
net.add_node(2, label="Email: [email protected]")
net.add_edge(1, 2)
# write_html saves the file without needing a notebook; open it in a browser afterwards
net.write_html("investigation.html")
```
    - To add images, store local file paths as node properties:
```
CREATE (c:Clipping {image_path: '/screenshots/post.png', description: 'LinkedIn post'})
```
    - Write a report using Python’s `reportlab` (PDF) or simply output a structured JSON:
```
import json
# `emails` is the set built in the extraction step; the other values are placeholders
report = {"entities": sorted(emails), "graph_nodes": 42, "pages_visited": ["url1","url2"]}
with open("osint_report.json", "w") as f:
    json.dump(report, f, indent=2)
```
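
The Markdown report mentioned at the start of this section can be produced with plain string formatting. The sketch below is one possible layout, not the tool's report format; the function name `build_markdown_report`, its parameters, and the sample values are assumptions (the screenshot path reuses the one from the `Clipping` example).

```python
from datetime import date


def build_markdown_report(title, entities_by_page, screenshots=()):
    """Render a Markdown report from {page_url: [entity, ...]} and (path, caption) pairs."""
    lines = [f"# {title}", "", f"Generated: {date.today().isoformat()}", ""]
    lines.append("## Pages and extracted entities")
    for url in sorted(entities_by_page):
        lines.append(f"- {url}")
        for entity in sorted(entities_by_page[url]):
            lines.append(f"  - {entity}")
    if screenshots:
        lines += ["", "## Clippings"]
        for path, caption in screenshots:
            # Standard Markdown image syntax: ![alt text](path)
            lines.append(f"![{caption}]({path})")
    return "\n".join(lines) + "\n"


# Hypothetical sample data
report = build_markdown_report(
    "Investigation report",
    {"https://example.com/news": ["jane.doe@example.org", "example.org"]},
    screenshots=[("/screenshots/post.png", "LinkedIn post")],
)
print(report)
```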

6. Mitigating Risks – When Graph OSINT Goes Wrong
    Graph‑based investigations can accidentally expose sensitive data or violate terms of service. Always respect robots.txt, avoid aggressive crawling, and use self‑hosted tools to keep data local.
    
    Step‑by‑step: Ethical scraping and cloud hardening
    - Check robots.txt before any automation:
```
curl https://example.com/robots.txt
```
- Rate limiting to avoid IP blocks: pause between requests inside the script, or space out whole runs with Linux `cron` or Windows Task Scheduler:
```
import time
time.sleep(2)  # 2 seconds between requests
```
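- Cloud hardening for self‑hosted Neo4j: Use firewall rules (UFW on Linux, Windows Defender Firewall) to keep the Bolt port (7687) closed to everything except localhost or a trusted network, and reach it remotely only through a VPN. The rule below admits a single local subnet: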
```
sudo ufw allow from 192.168.1.0/24 to any port 7687
```
- If you must deploy in the cloud (AWS, Azure):
  - Store credentials in AWS Secrets Manager or Azure Key Vault.
  - Place the instance in a private network (VPC or VNet) and restrict inbound traffic to your IP.
  - Use TLS for Neo4j (`dbms.connector.bolt.tls_level=REQUIRED` in Neo4j 4.x; the setting is named `server.bolt.tls_level` in Neo4j 5).
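
The robots.txt check and the pause between requests from the steps above can be combined into one gate that runs before every fetch. This is a minimal sketch using the standard library's `urllib.robotparser`; the `Throttle` class, the `osint-bot` user agent, and the sample robots.txt content are assumptions for illustration.

```python
import time
from urllib.robotparser import RobotFileParser


def load_robots(robots_txt):
    """Parse robots.txt content that was already downloaded (e.g. with curl)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_request = None

    def wait(self):
        """Sleep just long enough to respect min_interval, then record the time."""
        now = self.clock()
        if self.last_request is not None:
            remaining = self.min_interval - (now - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
        self.last_request = self.clock()


# Hypothetical robots.txt content
ROBOTS = """User-agent: *
Disallow: /private/
"""

robots = load_robots(ROBOTS)
throttle = Throttle(min_interval=2.0)
for url in ["https://example.com/news", "https://example.com/private/staff"]:
    if robots.can_fetch("osint-bot", url):
        throttle.wait()
        print("fetch", url)  # the real request would go here
    else:
        print("skip ", url)
```

Injecting `clock` and `sleep` keeps the throttle testable without real waiting.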

7. OSINT Workflow Example – From LinkedIn Post to Graph
The post’s screenshot includes names like Alex Lozano, Mario Santella, Dan Ramey. Here’s how you’d investigate them using a Ubikron‑style graph.
    
    Step‑by‑step: Real‑world mini‑investigation
    - Collect pages: Save LinkedIn profiles, Twitter posts, and news articles mentioning those names.
- Run entity extraction (script from section 1) across all saved HTML files. Output might be:
`["[email protected]", "twitter.com/mariosantella", "+1-555-1234"]`
    - Import into Neo4j (section 2). Query to find connections:
```
MATCH (p:Person {name: 'Alex Lozano'})--(page:Page)--(other:Person)
RETURN other.name, page.url
```
    - If a phone number appears on two different pages that both mention Dan Ramey and a specific geocoordinate, you have a strong link.
    - Add enrichment (section 3) – resolve domain `lozano.com` to IP, geolocate, check VirusTotal reports.
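
The "strong link" reasoning in the steps above amounts to finding entities that co-occur on several pages. Below is a small sketch of that check on plain Python data, independent of Neo4j; the function names `shared_entities` and `page_links`, the page file names, and the placeholder people are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_entities(page_entities, min_pages=2):
    """Return {entity: sorted pages} for entities that appear on at least min_pages pages."""
    index = defaultdict(set)
    for url, entities in page_entities.items():
        for entity in entities:
            index[entity].add(url)
    return {e: sorted(urls) for e, urls in index.items() if len(urls) >= min_pages}


def page_links(page_entities):
    """Return {(page_a, page_b): shared entities} for every pair of pages with overlap."""
    links = {}
    for a, b in combinations(sorted(page_entities), 2):
        common = set(page_entities[a]) & set(page_entities[b])
        if common:
            links[(a, b)] = sorted(common)
    return links


# Hypothetical extraction results, keyed by saved page
pages = {
    "page1.html": {"+1-555-1234", "Person A"},
    "page2.html": {"+1-555-1234", "Person A", "40.7128, -74.0060"},
    "page3.html": {"Person B"},
}
print(shared_entities(pages))
print(page_links(pages))
```

The more independent entities two pages share, the stronger the case that they describe the same subject; a single shared value is only a lead to verify.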
    What Undercode Say:
    - Key Takeaway 1: Ubikron’s graph approach transforms chaotic OSINT data into an explorable structure, enabling rapid backtracking and hidden connection discovery – a must‑have for investigators.
- Key Takeaway 2: Self‑hosting and toggleable saving are critical for privacy and legal compliance; replicating this with open‑source tools (Python + Neo4j) is entirely feasible in under 100 lines of code.
    - Analysis: The tool’s beta graphs represent a shift from siloed data collection to relational intelligence. However, investigators must implement rate limiting and data minimization to avoid crossing ethical boundaries. The inclusion of image clippings and reports suggests a trend toward all‑in‑one OSINT workbenches, reducing reliance on disconnected tools like Maltego and browser extensions.
    Prediction:
    
    Within 18 months, AI‑powered entity extraction and graph recommendation engines will become standard in free OSINT tools. Ubikron’s approach will likely inspire forks that integrate large language models to auto‑tag entities (e.g., “this phone number belongs to a known scam pattern”) and even hypothesize links before the investigator finds them. Commercial vendors will push cloud‑based graphs, but the self‑hosted, privacy‑first movement will grow – especially after high‑profile leaks from centralized OSINT platforms. Expect law enforcement and corporate security teams to adopt internal graph solutions based on this exact browser‑to‑graph pipeline.
    
    Reported By: Logan Woodward – Hackers Feeds