Introduction:
Security Information and Event Management (SIEM) is not a magical attack-detection engine—it is a decision‑support system that organizes chaotic, high‑volume log data into structured evidence for human analysts. Many learners rush to tools like Splunk or QRadar without grasping core concepts such as normalization, correlation windows, and alert validation, leading to false confidence and missed threats. This article strips away the tool‑centric hype and builds a fundamentals‑first understanding of how SIEM transforms raw logs into actionable incidents.
Learning Objectives:
- Differentiate logs from events and apply parsing/normalization techniques to unify data from Windows, Linux, and firewalls.
- Build correlation rules that use time windows and severity scoring to distinguish true incidents from noise.
- Validate alerts through evidence‑based methods, reducing false positives and correctly escalating to incident response.
You Should Know:
1. Logs → Events: Parsing and Field Extraction
Raw logs are just text lines—they become evidence only after being parsed into structured fields. The SIEM must extract meaningful attributes like source IP, username, action, and timestamp. Without this step, correlation and alerting are impossible.
Step‑by‑step guide for manual parsing (Linux & Windows):
- Linux – using `awk` and `grep`:
Assume a raw SSH log entry:
`Accepted password for john from 192.168.1.100 port 54322 ssh2`
# Extract username and source IP
echo "Accepted password for john from 192.168.1.100 port 54322 ssh2" | awk '{print $4, $6}'
# Output: john 192.168.1.100
For a failed attempt:
`Failed password for invalid user admin from 10.0.0.5 port 23456 ssh2`
# Print the tokens on either side of "from" (username and IP). Fixed positions such as $9 and $11
# shift with the syslog prefix and the "invalid user" wording, so anchoring on a keyword is safer.
grep "Failed password" /var/log/auth.log | awk '{for (i = 2; i < NF; i++) if ($i == "from") print $(i-1), $(i+1)}'
- Windows PowerShell – parsing security logs:
Using `Get-WinEvent` to extract failed logon events (Event ID 4625) into fields:
Get-WinEvent -LogName Security | Where-Object { $_.Id -eq 4625 } | ForEach-Object {
    $_.Properties[5].Value    # Target user name
    $_.Properties[19].Value   # Source IP address
}
# Indices 5 (TargetUserName) and 19 (IpAddress) are the usual positions for event 4625; confirm them against the event's XML view on your system.
- Normalizing fields across platforms:
Create a common schema (e.g., `src_ip`, `user`, `action`) by mapping Linux field positions and Windows property indices to a unified JSON output. This mapping is the core of SIEM parsing.
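The manual extraction above can also be sketched in Python with named regex groups, which avoids counting whitespace-separated fields. This is a minimal sketch, not a production parser: the syslog prefix layout (`Jan 10 12:00:00 host sshd[1234]: `), the host name `web01`, and the function name `parse_ssh_line` are assumptions made for illustration.

```python
import re

# Optional syslog prefix ("Jan 10 12:00:00 host sshd[123]: "), then the sshd message.
# "(?:invalid user )?" covers attempts against accounts that do not exist.
SSH_PASSWORD_EVENT = re.compile(
    r"(?:(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) sshd\[\d+\]: )?"
    r"(?P<result>Accepted|Failed) password for (?:invalid user )?"
    r"(?P<user>\S+) from (?P<src_ip>\S+) port (?P<src_port>\d+)"
)


def parse_ssh_line(line):
    """Return the extracted fields as a dict, or None for unrelated lines."""
    m = SSH_PASSWORD_EVENT.search(line)
    if m is None:
        return None
    return {
        "timestamp": m.group("timestamp"),  # None when the line has no syslog prefix
        "host": m.group("host"),
        "user": m.group("user"),
        "src_ip": m.group("src_ip"),
        "src_port": int(m.group("src_port")),
        "action": "success" if m.group("result") == "Accepted" else "fail",
    }


# Sample lines: the first is from the text, the second adds an assumed syslog prefix.
for sample in (
    "Accepted password for john from 192.168.1.100 port 54322 ssh2",
    "Jan 10 12:00:00 web01 sshd[1234]: Failed password for invalid user admin from 10.0.0.5 port 23456 ssh2",
):
    print(parse_ssh_line(sample))
```

Anchoring on the words of the message rather than on field positions keeps the parser working when the prefix changes, which is the same reason SIEM parsers are usually written as patterns instead of column offsets.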
2. Normalization Across Platforms: Making Different Logs Speak the Same Language
Firewalls, Windows, and Linux describe the same event (e.g., failed authentication) in different words. Normalization maps all of them to a standard field like `action=failed_authentication`. Without this, correlation rules break.
Step‑by‑step guide to build a normalizer:
1. Collect sample logs from three sources:
- Windows: `4625 – An account failed to log on`
- Linux: `Failed password for root from 192.168.1.10`
- Firewall (iptables): `DPT=22 SRC=192.168.1.10 DROP`
2. Define target schema: `{timestamp, src_ip, dst_ip, protocol, action, username}`
3. Write parsers (using Python for clarity):
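import re

def normalize_windows(log):
    # Simplified regex for the Windows 4625 message text. It anchors on the
    # "Account For Which Logon Failed" section so the Subject account is skipped.
    match = re.search(
        r'Account For Which Logon Failed:.*?Account Name:\s+(\S+).*?Source Network Address:\s+(\S+)',
        log, re.DOTALL)
    if match:
        return {'username': match.group(1), 'src_ip': match.group(2), 'action': 'failed_authentication'}
    return None

def normalize_linux(log):
    # "(?:invalid user )?" also covers attempts against accounts that do not exist.
    match = re.search(r'Failed password for (?:invalid user )?(\S+) from (\S+)', log)
    if match:
        return {'username': match.group(1), 'src_ip': match.group(2), 'action': 'failed_authentication'}
    return None
4. Combine the parsers into a normalizer that outputs unified JSON.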
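One way the combining step could look is sketched below: a single `normalize` function dispatches on the log source and always emits the full target schema, leaving fields the source does not provide as `null`. The function name, the `connection_blocked` action for the firewall sample, and the abbreviated Windows message are assumptions for illustration, not part of any SIEM product.

```python
import json
import re

SCHEMA = ("timestamp", "src_ip", "dst_ip", "protocol", "action", "username")

LINUX_FAIL = re.compile(
    r"Failed password for (?:invalid user )?(?P<username>\S+) from (?P<src_ip>\S+)"
)
WINDOWS_FAIL = re.compile(
    r"Account For Which Logon Failed:.*?Account Name:\s+(?P<username>\S+)"
    r".*?Source Network Address:\s+(?P<src_ip>\S+)",
    re.DOTALL,
)
FIREWALL_SRC = re.compile(r"\bSRC=(?P<src_ip>\S+)")


def normalize(source, raw):
    """Map one raw log from a known source onto SCHEMA; unknown fields stay None."""
    if source == "linux":
        match, action = LINUX_FAIL.search(raw), "failed_authentication"
    elif source == "windows":
        match, action = WINDOWS_FAIL.search(raw), "failed_authentication"
    elif source == "firewall" and "DROP" in raw:
        # Assumed label: a dropped packet is not an authentication failure.
        match, action = FIREWALL_SRC.search(raw), "connection_blocked"
    else:
        match, action = None, None
    if match is None:
        return None
    event = dict.fromkeys(SCHEMA)  # every schema field present, default None
    event.update(match.groupdict(), action=action)
    return event


samples = [
    ("linux", "Failed password for root from 192.168.1.10"),
    ("firewall", "DPT=22 SRC=192.168.1.10 DROP"),
    # Abbreviated, hand-written stand-in for a 4625 message body.
    ("windows", "Account For Which Logon Failed:\n  Account Name:  alice\n"
                "Network Information:\n  Source Network Address:  192.168.1.10"),
]
for source, raw in samples:
    print(json.dumps(normalize(source, raw)))
```

Emitting every schema field, even when empty, means downstream correlation rules can reference `src_ip` or `username` without first checking which source produced the event.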
3. Correlation vs Aggregation with Time Windows
Aggregation counts similar events (e.g., “10 failed logins from 10.0.0.1”). Correlation links different event types across time and entities (e.g., “failed logins followed by a successful login from the same IP”). Time windows define how far back the SIEM looks for relationships.
Step‑by‑step rule example (pseudo‑SIEM query):
Rule: Detect a successful login that occurs within 2 minutes after 5+ failed logins from the same source IP.
1. Collect all failed authentication events with field `src_ip` and timestamp.
2. Aggregate failures per `src_ip` over a sliding window of 5 minutes.
3. Collect success events (`action=successful_authentication`).
4. For each success, search back 2 minutes for a failure count ≥5 from the same `src_ip`.
5. Emit alert “Possible brute force success”.
Implementation with Splunk search (illustrative):
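index=main (sourcetype=linux_secure OR sourcetype=WinEventLog:Security)
| eval action=if(match(_raw, "Failed password"), "fail", if(match(_raw, "Accepted password"), "success", null()))
| where isnotnull(action)
| sort 0 _time
| streamstats time_window=5m count(eval(action="fail")) as fail_count by src_ip
| where action="success" AND fail_count>=5
| table _time, src_ip, user, fail_count
The `sort 0 _time` step puts events in ascending time order so that the 5-minute window counts failures that precede each success. The `match` patterns cover Linux sshd messages only; Windows events would need a similar mapping on `EventCode` (4625 for failure, 4624 for success).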
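The same rule can be sketched outside any SIEM to make the window logic explicit. The sketch below takes one reasonable reading of the steps: failures per `src_ip` are remembered for 5 minutes, and a success raises an alert when at least 5 failures are still in that window and the most recent one is no more than 2 minutes old. The function name `correlate` and the tuple layout of the events are assumptions for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

FAIL_WINDOW = timedelta(minutes=5)       # how long failures are remembered
SUCCESS_LOOKBACK = timedelta(minutes=2)  # how recent the last failure must be
THRESHOLD = 5


def correlate(events):
    """events: iterable of (timestamp, src_ip, action), sorted by timestamp."""
    recent_failures = defaultdict(deque)  # src_ip -> timestamps of recent failures
    alerts = []
    for ts, src_ip, action in events:
        fails = recent_failures[src_ip]
        # Drop failures that have aged out of the aggregation window.
        while fails and ts - fails[0] > FAIL_WINDOW:
            fails.popleft()
        if action == "fail":
            fails.append(ts)
        elif action == "success":
            if len(fails) >= THRESHOLD and ts - fails[-1] <= SUCCESS_LOOKBACK:
                alerts.append({
                    "time": ts,
                    "src_ip": src_ip,
                    "fail_count": len(fails),
                    "alert": "Possible brute force success",
                })
    return alerts


# Illustrative data: five failures ten seconds apart, then a success.
start = datetime(2024, 1, 1, 12, 0, 0)
demo = [(start + timedelta(seconds=10 * i), "10.0.0.1", "fail") for i in range(5)]
demo.append((start + timedelta(seconds=60), "10.0.0.1", "success"))
print(correlate(demo))
```

Keeping a per-IP deque means each event is appended and removed at most once, so the cost stays linear in the number of events even with a long window.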
4. Alert Validation: From Alert to Incident
Alerts are claims generated by correlation rules, not facts. Incident creation requires human validation: checking whether the alert aligns with actual evidence, asset criticality, and known false‑positive patterns.
Step‑by‑step validation workflow:
1. Gather full context around the alert time window – pull all logs (at least ±30 seconds) from the affected systems using commands like:
– Linux: `journalctl --since "2 minutes ago" --until "now" -u ssh` (the unit is named `sshd` on some distributions)
– Windows: `Get-WinEvent -FilterHashtable @{LogName='Security'; StartTime=(Get-Date).AddMinutes(-2)}`
2. Check for known false‑positive indicators:
- Source IP is an internal vulnerability scanner or admin jumpbox.
- The “failed” events come from a service account that rotates passwords.
- Time pattern matches scheduled tasks.
3. Determine asset criticality – a brute force attempt on a domain controller (critical) vs. a public web server (medium).
4. Make the decision:
- False positive – suppress alert, add tuning rule.
- True positive – escalate as incident, assign to IR team.
- Suspicious – gather more data, extend time window.
Example validation using `jq` on JSON logs:
# Select "Brute Force" alerts from a SIEM export and keep only the fields needed for triage
jq '.[] | select(.rule_name=="Brute Force") | {timestamp, src_ip, username, asset_criticality}' siem_alerts.json
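The workflow above can be partly encoded as a helper that suggests a disposition and records why, leaving the final call to the analyst. This is a sketch under stated assumptions: the allowlists, the `login_succeeded` field, and the function name `suggest_disposition` are hypothetical, while `src_ip`, `username`, and `asset_criticality` follow the fields used in the `jq` example.

```python
# Hypothetical allowlists; in practice these would come from an asset inventory.
KNOWN_SCANNERS_AND_JUMPBOXES = {"10.0.0.50", "10.0.0.60"}
ROTATING_SERVICE_ACCOUNTS = {"svc_backup"}


def suggest_disposition(alert):
    """Suggest a disposition for one alert dict; the analyst makes the final call."""
    reasons = []
    if alert.get("src_ip") in KNOWN_SCANNERS_AND_JUMPBOXES:
        reasons.append("source is a known scanner or jumpbox")
    if alert.get("username") in ROTATING_SERVICE_ACCOUNTS:
        reasons.append("service account with rotating password")
    if reasons:
        return {"decision": "false_positive", "reasons": reasons}

    criticality = alert.get("asset_criticality", "unknown")
    if alert.get("login_succeeded"):  # hypothetical evidence flag
        return {
            "decision": "true_positive",
            "reasons": ["failures followed by a successful login"],
            "asset_criticality": criticality,
        }
    return {
        "decision": "suspicious",
        "reasons": ["no benign explanation yet; gather more data"],
        "asset_criticality": criticality,
    }


example = {"src_ip": "203.0.113.7", "username": "admin",
           "asset_criticality": "critical", "login_succeeded": True}
print(suggest_disposition(example))
```

Returning the reasons alongside the decision gives the analyst something to verify against the raw logs, and it doubles as the justification text for a later tuning rule.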
5. Building Detection Rules Using MITRE ATT&CK
MITRE ATT&CK provides a common taxonomy for adversary behavior. Instead of guessing what to detect, map rules to techniques like T1110 (Brute Force) or T1078 (Valid Accounts). This ensures coverage and helps prioritisation.
Step‑by‑step Sigma rule for brute force (T1110):
Sigma is a generic rule format that converts to Splunk, QRadar, or ELK.
title: Linux Brute Force via SSH
status: experimental
logsource:
  product: linux
  service: auth
detection:
  failures:
    - "Failed password"
  timeframe: 2m
  # Legacy aggregation syntax; newer Sigma versions express thresholds as correlation rules.
  condition: failures | count() > 5
level: medium
tags:
  - attack.t1110
  - attack.credential_access
Convert to Splunk query (illustrative):
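sourcetype=linux_secure "Failed password" | bin _time span=2m | stats count by _time, src_ip, user | where count>5
Add suppression for known safe sources, for example `NOT (src_ip=10.0.0.0/8 OR src_ip=192.168.0.0/16)` in the base search. Scope the exclusion as narrowly as possible, since excluding whole private ranges also hides attacks that originate inside the network.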
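The threshold and suppression logic can be checked offline with a short Python sketch before committing it to a SIEM. It counts failures per source IP in fixed 2-minute buckets (matching the rule's `2m` timeframe) and skips sources inside the suppressed networks. The function names and the `(epoch_seconds, src_ip)` input layout are assumptions for illustration; the sample addresses come from documentation ranges.

```python
import ipaddress
from collections import Counter

# Networks to suppress, taken from the example in the text.
SUPPRESSED = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_suppressed(ip):
    """True when the address falls inside any suppressed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in SUPPRESSED)


def brute_force_sources(failures, threshold=5, bucket_seconds=120):
    """failures: iterable of (epoch_seconds, src_ip) for failed logins.

    Returns {(bucket_start, src_ip): count} for buckets with more than
    `threshold` failures from a non-suppressed source.
    """
    counts = Counter()
    for ts, src_ip in failures:
        if is_suppressed(src_ip):
            continue
        bucket_start = int(ts) - int(ts) % bucket_seconds
        counts[(bucket_start, src_ip)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


sample = [(t, "203.0.113.7") for t in range(0, 60, 10)]  # 6 failures, external
sample += [(t, "10.1.2.3") for t in range(10)]           # suppressed range
print(brute_force_sources(sample))
```

Fixed buckets are simpler than a sliding window but can split a burst across two buckets, so a burst that straddles a boundary may stay under the threshold in both.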
6. Essential Linux & Windows Commands for Log Collection and Monitoring
You cannot rely solely on SIEM agents; hands‑on log collection is critical for troubleshooting and custom detections.
Linux – systemd journal and auditd:
– `journalctl -f` – follow real‑time logs.
– `journalctl -u sshd --since "1 hour ago"` – filter by unit and time.
– `auditctl -w /etc/passwd -p wa -k passwd_changes` – watch file writes.
– `ausearch -k passwd_changes` – search audit logs.
Windows – native tools:
– `wevtutil qe Security /f:text /c:20` – query Security log with formatting.
– `wevtutil gl Security` – get log configuration.
– `Get-WinEvent -FilterHashtable @{LogName='Microsoft-Windows-Sysmon/Operational'; ID=1} -MaxEvents 50` – Sysmon process creation.
– `Get-Service | Where-Object {$_.Status -eq 'Running' -and $_.StartType -eq 'Automatic'}` – enumerate services for baseline.
Centralised collection tip: Configure `rsyslog` on Linux to forward to a SIEM collector, and use Windows Event Forwarding (which runs over `WinRM`) or `NXLog` on Windows to send events.
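For custom detections it can help to read the journal in a structured form rather than as text. `journalctl -o json` prints one JSON object per line, and the sketch below converts such a line into a small event dict. The function name `parse_journal_line`, the output field names, and the sample entry are assumptions for illustration; the sketch also assumes `MESSAGE` is plain text.

```python
import json
from datetime import datetime, timezone


def parse_journal_line(line):
    """Convert one line of `journalctl -o json` output into a small event dict."""
    entry = json.loads(line)
    # __REALTIME_TIMESTAMP is microseconds since the Unix epoch, stored as a string.
    micros = int(entry["__REALTIME_TIMESTAMP"])
    return {
        "timestamp": datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc).isoformat(),
        "host": entry.get("_HOSTNAME"),
        "unit": entry.get("_SYSTEMD_UNIT"),
        "message": entry.get("MESSAGE"),
    }


# Hand-written sample entry with only the fields used above.
sample = json.dumps({
    "__REALTIME_TIMESTAMP": "1700000000000000",
    "_HOSTNAME": "web01",
    "_SYSTEMD_UNIT": "ssh.service",
    "MESSAGE": "Failed password for root from 192.168.1.10 port 40000 ssh2",
})
print(parse_journal_line(sample))
```

Converting the timestamp to UTC ISO 8601 at collection time avoids time zone mismatches later, when events from Linux and Windows hosts are compared inside the same correlation window.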
7. Time Windows and Severity Prioritization
Not all alerts deserve the same response. Risk scoring combines event count, time window length, asset criticality, and CVSS‑like severity. Use a simple formula:
`Risk Score = (Event count within window) × AssetWeight × TechniqueSeverity`
Step‑by‑step prioritisation:
1. Assign AssetWeight (1 = low, 3 = medium, 5 = critical).
2. Assign TechniqueSeverity based on MITRE (e.g., T1190 – Exploit Public‑Facing Application: 4, T1059 – Command and Scripting Interpreter: 3).
3. Time window relevance – anomalies concentrated in a short window (1 minute) score higher than the same events spread over 1 hour.
4. Example: 10 failed logins (Event count = 10) on a Domain Controller (AssetWeight = 5) using Brute Force (Severity = 4) → Score = 10 × 5 × 4 = 200 (critical). The same 10 attempts on a printer (AssetWeight = 1) → Score = 10 × 1 × 4 = 40 (low).
5. Actionable thresholds:
- 0‑30: Informational, log only.
- 31‑100: Low – monitor, do not page.
- 101‑199: Medium – create ticket, investigate next shift.
- 200+: Critical – page on‑call analyst.
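The formula and thresholds translate directly into code. The sketch below treats a score of exactly 200 as Critical, as the worked example does; the function names are assumptions for illustration.

```python
def risk_score(event_count, asset_weight, technique_severity):
    """Risk Score = event count within window x AssetWeight x TechniqueSeverity."""
    return event_count * asset_weight * technique_severity


def priority(score):
    """Map a risk score onto the response tiers listed above."""
    if score >= 200:
        return "Critical"       # page on-call analyst
    if score > 100:
        return "Medium"         # create ticket, investigate next shift
    if score > 30:
        return "Low"            # monitor, do not page
    return "Informational"      # log only


# Worked example from the text: 10 brute-force failures (severity 4).
domain_controller = risk_score(10, asset_weight=5, technique_severity=4)
printer = risk_score(10, asset_weight=1, technique_severity=4)
print(domain_controller, priority(domain_controller))
print(printer, priority(printer))
```

Keeping the weights as plain arguments makes the policy easy to review: changing how a printer or a domain controller is weighted is a one-line change that can be audited.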
What Undercode Say:
- Key takeaway 1: Alerts are not facts – tools cannot replace human validation. The best SIEM is worthless if analysts accept every alert as an incident without examining raw logs and context.
- Key takeaway 2: Fundamentals like normalisation and time‑window correlation are the real force multipliers. Jumping to dashboard creation without parsing logs properly leads to missed lateral movement and false positives that desensitise the SOC.
Analysis: The post correctly attacks the “tool‑first” mindset prevalent in cybersecurity training. Many candidates memorise Splunk SPL or QRadar AQL but cannot explain why a log parser fails or how to tune a correlation rule against a business application’s normal behaviour. By focusing on logs→events→normalisation→correlation→validation, the article builds a mental model that works across any SIEM platform. The inclusion of concrete Linux/Windows commands bridges theory to practice, enabling learners to test parsing and alert validation on their own machines. The risk scoring formula adds a missing quantitative layer, turning prioritisation from guesswork into policy. This fundamentals‑first approach directly addresses the 80‑90% false positive rate that plagues under‑tuned SIEMs, ultimately making SOC analysts more effective and less burned out.
Prediction:
- +1 SIEM platforms will increasingly embed automated normalisation and validation assistants using small language models, but analysts who understand underlying parsing logic will debug these AI suggestions faster than those who only click “accept”.
- +1 Open‑source detection engineering (e.g., Sigma rules, uncoder.io) will replace vendor‑specific syntax as the industry standard, making cross‑platform correlation fundamentals even more valuable.
- -1 The flood of “learn SIEM in 2 hours” courses will continue producing candidates who can generate alerts but cannot validate them, leading to more SOCs drowning in false positives and analyst burnout.
- -1 Without strong emphasis on time‑window concepts, cloud SIEMs with massive log volumes will cause cost explosions as junior analysts set overly broad correlation windows, wasting budget on meaningless “incidents”.
- +1 Organisations that mandate fundamentals training (parsing, normalisation, validation) before tool certification will see mean‑time‑to‑detect (MTTD) drop by up to 40%, as their analysts stop chasing ghosts and focus on real adversary behaviour.
Reported By: Gude Venkata – Hackers Feeds