Heimdall vs OpenClaw scanner: the full comparison

I built Heimdall a few days ago after reading about skill-based attacks in the OpenClaw ecosystem. When OpenClaw announced their VirusTotal partnership, I wanted to know how my scanner compared to their built-in one (the deterministic TypeScript scanner that ships with the platform, separate from the VirusTotal integration).

I didn’t trust my own bias. So I pulled both codebases, put them side by side, and asked two different AI models to do independent reviews:

Opus 4.6 (Anthropic’s Claude) for a security-focused editorial analysis
Codex 5.3 (OpenAI) for a structured, methodical 8-dimension comparison

Both had access to the full source code of both scanners. Neither saw the other’s output. Here’s what they found.

Part 1: Opus 4.6 analysis

Analyst: Ada (Opus 4.6) | Independent review of both scanner codebases

Overview

Aspect	Heimdall	OpenClaw Official
Language	Python	TypeScript
Lines	691	441
Version	v4.0	v2026.2.6 (first release)
Integration	Standalone CLI	Built into OpenClaw core

Pattern coverage

Heimdall detects across 15+ categories with 50+ unique patterns:

Credential access (8 patterns): .env reading, secrets dirs, hardcoded passwords/API keys/tokens, private keys
Network exfiltration (6): curl, wget, requests, fetch, webhook.site, ngrok
Remote code execution (6): eval, exec, subprocess, os.system, curl|bash, reverse shells
Privilege escalation (4): sudo, chmod 777, setuid, chown root
Prompt injection (5): IGNORE PREVIOUS, system prompt override, jailbreak patterns
Crypto mining (4): xmrig, coinhive, stratum+tcp, mining pool URLs
Data destruction (3): rm -rf, shred, dd if=/dev/zero
Obfuscation (3): base64 decode, hex encoding, char code assembly
Plus: remote fetch detection, heartbeat injection, MCP tool abuse, unicode tag injection, agent impersonation, data pre-fill exfiltration, crypto wallet extraction
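
To make one of the less familiar categories concrete: unicode tag injection hides instructions in invisible characters from the Unicode Tags block (U+E0000 to U+E007F), each of which mirrors an ASCII character. The Python sketch below shows how such a check could work. The function name is hypothetical, and this is an illustration of the idea, not Heimdall's actual pattern.

```python
import re

# Unicode "Tags" block: invisible characters that can carry hidden ASCII text.
TAG_CHARS = re.compile("[\U000E0000-\U000E007F]+")


def find_hidden_tags(text):
    """Return the decoded hidden payloads found in text (hypothetical helper)."""
    payloads = []
    for match in TAG_CHARS.finditer(text):
        # Each tag character is an ASCII code point offset by 0xE0000.
        decoded = "".join(chr(ord(ch) - 0xE0000) for ch in match.group())
        payloads.append(decoded)
    return payloads


# Build a hidden payload the same way an attacker might.
hidden = "".join(chr(0xE0000 + ord(c)) for c in "ignore previous")
print(find_hidden_tags("Totally normal skill description." + hidden))
```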

OpenClaw detects with 8 rules across 5 categories:

Line rules (4):

dangerous-exec: child_process exec/spawn (requires child_process import context)
dynamic-code-execution: eval(), new Function()
crypto-mining: stratum+tcp, coinhive, cryptonight, xmrig
suspicious-network: WebSocket to non-standard ports

Source rules (4):

potential-exfiltration: readFile + network send combo
obfuscated-code (hex): \x sequences (6+ chars)
obfuscated-code (base64): large base64 with decode
env-harvesting: process.env + network send combo

Opus verdict: Heimdall covers roughly 6x more patterns.

Detection approach

OpenClaw uses a two-pass heuristic. Line rules catch direct dangerous constructs; source rules catch multi-signal behavior where a pattern only fires if a secondary context exists (e.g., exec() only flags if child_process is also imported). This is clean engineering. It also deduplicates findings to one per rule per file.
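
The context gating and per-rule deduplication described above can be sketched in a few lines. This is a Python approximation for illustration only: OpenClaw's scanner is TypeScript, and the regexes and the `LineRule` and `scan_source` names here are assumptions, not its real rule definitions.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LineRule:
    rule_id: str
    pattern: re.Pattern
    # If set, the rule only fires when this also matches somewhere in the file.
    requires_context: Optional[re.Pattern] = None


# Hypothetical rules, loosely modeled on two of the rule IDs listed earlier.
RULES = [
    LineRule("dangerous-exec",
             re.compile(r"\b(exec|spawn)\s*\("),
             re.compile(r"child_process")),
    LineRule("dynamic-code-execution",
             re.compile(r"\beval\s*\(|new\s+Function\s*\(")),
]


def scan_source(source):
    """Return (rule_id, line_number) pairs, at most one per rule."""
    findings = []
    lines = source.splitlines()
    for rule in RULES:
        # Context gate: skip the rule entirely if the secondary signal is absent.
        if rule.requires_context and not rule.requires_context.search(source):
            continue
        for lineno, line in enumerate(lines, start=1):
            if rule.pattern.search(line):
                findings.append((rule.rule_id, lineno))
                break  # deduplicate: one finding per rule per file
    return findings
```

With this gating, a harmless `regex.exec("abc")` in a file that never mentions `child_process` produces no finding, while repeated `exec(...)` calls in a file that imports it produce exactly one.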

Heimdall uses exhaustive per-line regex matching across a 92-pattern catalog, then applies context-aware post-processing that adjusts severity based on file type (CODE vs CONFIG vs DOCS vs STRING), string literal detection, blocklist/pattern-definition detection, and security tool path indicators. It supports strict mode to disable all context awareness.
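
As a rough illustration of file-type-aware severity adjustment, here is a minimal Python sketch. The extension-to-context mapping and the number of levels dropped per context are assumptions made for the example; Heimdall's real tables and rules may differ.

```python
from pathlib import Path

# Ordered from least to most severe.
SEVERITIES = ["SAFE", "MEDIUM", "HIGH", "CRITICAL"]

# Assumed mapping from extension to file context.
CONTEXT_BY_EXT = {
    ".py": "CODE", ".js": "CODE", ".sh": "CODE",
    ".yaml": "CONFIG", ".json": "CONFIG",
    ".md": "DOCS",
}

# Assumed: how many severity levels each context drops.
DOWNGRADE = {"CODE": 0, "CONFIG": 1, "DOCS": 2}


def adjust_severity(path, severity, strict=False):
    """Lower severity based on file context; strict mode disables this."""
    if strict:
        return severity
    context = CONTEXT_BY_EXT.get(Path(path).suffix.lower(), "CODE")
    index = SEVERITIES.index(severity) - DOWNGRADE[context]
    return SEVERITIES[max(index, 0)]
```

Under these assumed tables, a CRITICAL match in README.md is reported as MEDIUM, the same match in a shell script stays CRITICAL, and passing strict=True returns the original severity everywhere.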

Net difference: OpenClaw produces fewer, behavior-correlated signals. Heimdall casts a wider net and uses context to control the noise.

False positive handling

Heimdall has a 5-layer suppression system:

File type context (docs get heavily reduced severity)
String literal detection (inside quotes = suppressed)
Blocklist definition detection (security tool patterns = suppressed)
Security tool indicator (known security tools = reduced)
Manual severity adjustment with tracked reasons

It reports suppressed findings separately for transparency, and --strict mode ignores all context for paranoid scanning.

OpenClaw has 2 layers: context requirements (a secondary pattern must exist) and port allowlisting (for the WebSocket rule). It skips hidden dirs and node_modules, and it has no suppression tracking.

Opus verdict: Heimdall handles false positives substantially better.

Integration

OpenClaw wins here. It’s a typed TypeScript module with reusable exported functions (scanSource, scanDirectory, scanDirectoryWithSummary), typed results, bounded scanning behavior, and native integration into the OpenClaw install flow. It runs automatically when installing skills from ClawdHub. If the scanner crashes, installs continue with a warning.
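
The failure-safe behavior can be pictured as a wrapper around the scan step. The Python sketch below is a stand-in only: `install_with_scan` and its `scan` and `install` callables are hypothetical and do not reflect OpenClaw's actual install code.

```python
import logging


def install_with_scan(skill_dir, scan, install):
    """Run scan before install, but never let a scanner crash block the install."""
    try:
        findings = scan(skill_dir)
    except Exception as exc:
        # Fail open: warn and carry on, mirroring the behavior described above.
        logging.warning("scanner failed, continuing install: %s", exc)
        findings = None
    install(skill_dir)
    return findings
```

Failing open keeps installs reliable, at the cost that a scanner bug silently removes the safety check, which is why the warning matters.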

Heimdall is CLI-first. It offers JSON output, verbose mode, AI analysis, and exit codes for CI, but it must be run manually or via cron. It has no integration with the install flow.

Opus verdict: OpenClaw’s integration is superior. Heimdall has the AI analysis edge.

AI analysis capability

Heimdall’s --analyze flag sends findings to an LLM for a narrative security report. The report explains what each finding means in context, describes attack scenarios, and gives actionable recommendations. The flag falls back from the oracle CLI to OpenRouter, and finally to deterministic reporting.

OpenClaw has no AI analysis. Static pattern matching only, no explanations of risk.

Opus verdict: Heimdall’s AI analysis is unique. Most people installing skills aren’t security researchers, and narrative reports bridge that gap.

File type coverage

Heimdall scans: .py, .js, .ts, .sh, .bash, .mjs, .cjs, .md, .yaml, .yml, .json

OpenClaw scans: .js, .ts, .mjs, .cjs, .mts, .cts, .jsx, .tsx

Opus identified three blind spots in OpenClaw:

Cannot detect prompt injection in SKILL.md files
Cannot detect malicious shell scripts
Cannot detect malicious Python code

What each should steal from the other

OpenClaw should take from Heimdall:

Native install integration hook for Heimdall’s broader detection
Context-aware severity (code vs docs vs strings)
Suppression tracking with transparency
Multi-language scanning

Heimdall should take from OpenClaw:

Behavior-coupled multi-signal rules (pattern + requiresContext)
Per-rule deduplication
File size and count limits for predictable runtime
Failure-safe design (scanner crash doesn’t break installs)

Opus final verdict

Heimdall is the better scanner overall. OpenClaw’s advantage is integration (built into the core). Heimdall’s advantage is everything else: coverage, context awareness, AI analysis, multi-language support. Anyone relying solely on the native scanner is still vulnerable to the attacks that matter most in the OpenClaw ecosystem.

Part 2: Codex 5.3 analysis

Analyst: Codex 5.3 (OpenAI) | Structured 8-dimension comparison of both scanner codebases

1. Pattern coverage counts

Scanner	Rule model	Total detection patterns	Severity mix
OpenClaw	LINE_RULES + SOURCE_RULES	8 (4 line + 4 source)	4 critical, 4 warn, 0 info
Heimdall	PATTERNS regex list	92	47 CRITICAL, 33 HIGH, 12 MEDIUM

Category breadth:

OpenClaw: 7 unique rule IDs (dangerous-exec, dynamic-code-execution, crypto-mining, suspicious-network, potential-exfiltration, obfuscated-code, env-harvesting)
Heimdall: 19 explicit categories (credential access, network exfil, shell exec, filesystem, obfuscation, data exfil, privilege, persistence, crypto, remote fetch, heartbeat injection, MCP abuse, unicode injection, auto-approve, crypto wallet, impersonation, prefill exfil, supply chain, telemetry)

Codex conclusion: Heimdall is much broader and deeper in static signature coverage. OpenClaw is intentionally compact and precision-oriented.

2. Detection approach differences

OpenClaw: TypeScript module-first scanner designed for embedding. Two-pass heuristic with line rules for direct dangerous constructs and source rules for multi-signal behavior. Context coupling means dangerous-exec only fires if child_process appears in source, and exfil/env harvesting rules require network context. Noise control via one finding per line-rule per file, source-rule dedupe by ruleId + message, and standard port exclusion for WebSocket alerts.

Heimdall: Python CLI scanner with broad signature matching. Exhaustive per-line regex matching across a 92-pattern catalog. Context-aware post-processing adjusts severity based on file type context (CODE, CONFIG, DOCS, STRING), string-literal detection, blocklist/pattern-definition detection, and security tool path indicators. Supports strict mode to disable context suppression.

Codex conclusion: OpenClaw produces fewer, behavior-correlated signals with lower output volume. Heimdall casts a larger signature net with contextual dampening to control alert flood.

3. False positive handling

OpenClaw strengths: Requires context for key risky patterns. Special-case filtering for WebSocket ports. Rule-level dedupe prevents repeated spam.

OpenClaw limitations: No explicit doc/config/string semantic downgrading. No concept of suppressed findings. No strict/non-strict operating mode.

Heimdall strengths: Explicit FP mitigation pipeline detecting likely string literals, blocklist/regex-definition context, file-context aware severity reduction, with suppression to SAFE with reasons. Tracks both original and adjusted severity. --show-suppressed provides transparency.

Heimdall limitations: String parsing heuristic is approximate (quote counting can misclassify complex syntax). Broad regex catalog still produces significant candidate matches before suppression.
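
To see why quote counting is only approximate, consider this minimal Python sketch of a parity-based check. It is a simplified stand-in written for this comparison, not Heimdall's actual heuristic, and the function name is hypothetical.

```python
def looks_inside_string(line, match_start):
    """Guess whether a match sits inside a string literal by quote parity.

    An odd number of quote characters before the match is taken to mean
    a string is still open. Escapes and comments are deliberately ignored.
    """
    prefix = line[:match_start]
    return any(prefix.count(quote) % 2 == 1 for quote in ('"', "'"))


# Correct: the pattern really is inside a string literal.
in_string = 'print("never call os.system here")'
print(looks_inside_string(in_string, in_string.index("os.system")))

# Misclassified: the apostrophe in "it's" looks like an opening quote,
# so a real call would be treated as string content and suppressed.
real_call = "# it's fine: os.system(cmd)"
print(looks_inside_string(real_call, real_call.index("os.system")))
```

Both calls report that the match is inside a string, but only the first is right. A tokenizer-based check would avoid this class of error at the cost of per-language parsing.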

Codex verdict: Heimdall has far stronger explicit false-positive management. OpenClaw has lighter but cleaner precision controls.

4. Integration points

OpenClaw: Exports reusable functions (scanSource, scanDirectory, scanDirectoryWithSummary). Exposes typed results. Embedding-focused controls with includeFiles, maxFiles, maxFileBytes, safe include path enforcement, directory walker skipping hidden dirs and node_modules.
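
A bounded directory walk of this kind might look like the following Python sketch. The parameter names echo maxFiles and maxFileBytes from the description above, but the default values and the `collect_files` function are assumptions, not OpenClaw's implementation.

```python
import os


def collect_files(root, extensions, max_files=500, max_file_bytes=1_000_000):
    """Collect scannable files under root with hard limits on count and size."""
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(
            d for d in dirnames
            if not d.startswith(".") and d != "node_modules"
        )
        for name in sorted(filenames):
            if os.path.splitext(name)[1] not in extensions:
                continue
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) > max_file_bytes:
                continue  # oversized files are skipped, not truncated
            selected.append(path)
            if len(selected) >= max_files:
                return selected  # stop early for predictable runtime
    return selected
```

The limits trade completeness for a predictable worst case: a skill that ships thousands of files or a very large bundle cannot stall the install.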

Heimdall: CLI-first via argparse flags (--json, --verbose, --strict, --show-suppressed, --analyze). Programmatic functions exist but primarily UX/report oriented. Exit codes tied to severity thresholds for CI gate scripts.
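
For readers wiring a scanner into CI, a severity-threshold exit code can be as simple as the Python sketch below. The 0/1 codes, the default threshold, and the `exit_code` function are assumptions for illustration; Heimdall's real exit codes are not documented here.

```python
# Assumed ranking; SAFE covers suppressed findings.
SEVERITY_RANK = {"SAFE": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def exit_code(severities, fail_on="HIGH"):
    """Return 1 if any finding meets the threshold, otherwise 0."""
    threshold = SEVERITY_RANK[fail_on]
    worst = max((SEVERITY_RANK[s] for s in severities), default=0)
    return 1 if worst >= threshold else 0
```

A CI wrapper would then end with something like `sys.exit(exit_code(found_severities))` so that the pipeline step fails when the gate trips.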

Codex verdict: For direct product/runtime embedding, OpenClaw is cleaner. For operator tooling and review workflows, Heimdall is richer.

5. AI analysis capability

Heimdall’s --analyze builds a findings summary from active medium+ findings (up to 30 entries), loads truncated skill content (up to 50k chars total, per-file chunk limits), and sends a structured analyst prompt requesting a formatted security report.
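
The caps described above can be sketched as follows in Python. The 30-entry and 50k-character limits come from the Codex description; the per-file limit value, the data shapes, and the `build_prompt_inputs` name are assumptions.

```python
MAX_FINDINGS = 30          # from the description above
MAX_TOTAL_CHARS = 50_000   # from the description above
MAX_FILE_CHARS = 10_000    # assumed per-file chunk limit


def build_prompt_inputs(findings, files):
    """Trim findings and file contents to fit a bounded LLM prompt.

    findings: list of (severity, message) tuples
    files: dict mapping file name to file content
    """
    active = [f for f in findings if f[0] in ("MEDIUM", "HIGH", "CRITICAL")]
    active = active[:MAX_FINDINGS]

    chunks, used = [], 0
    for name, content in files.items():
        budget = min(MAX_FILE_CHARS, MAX_TOTAL_CHARS - used)
        if budget <= 0:
            break  # total character budget exhausted
        chunk = content[:budget]
        chunks.append((name, chunk))
        used += len(chunk)
    return active, chunks
```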

Provider chain: oracle CLI with claude-sonnet-4, fallback to OpenRouter, fallback to deterministic basic report.
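
A fallback chain like this is straightforward to express as an ordered list of callables. The Python sketch below uses stand-in providers; the function names are hypothetical and it does not call the oracle CLI or OpenRouter.

```python
def basic_report(findings):
    """Deterministic last-resort report that needs no external service."""
    lines = ["%d finding(s):" % len(findings)]
    lines += ["- %s" % f for f in findings]
    return "\n".join(lines)


def analyze_with_fallback(findings, providers):
    """Try each provider in order; fall back to the deterministic report."""
    for provider in providers:
        try:
            report = provider(findings)
        except Exception:
            continue  # provider unavailable or failed: try the next one
        if report:
            return report
    return basic_report(findings)
```

In practice the providers list would hold one wrapper per backend, in order of preference, so that the scan always produces some report even with no network access.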

Codex noted: AI output is non-deterministic and can over/understate risk. Must remain advisory on top of deterministic findings, not a replacement for them.

OpenClaw has no equivalent.

6. File type coverage

Scanner	Extensions	Count
OpenClaw	.js, .ts, .mjs, .cjs, .mts, .cts, .jsx, .tsx	8
Heimdall	.py, .js, .ts, .sh, .bash, .mjs, .cjs, .md, .yaml, .yml, .json	11

Codex noted that Heimdall covers more non-JS skill artifacts (docs + config + shell + Python), while OpenClaw covers TS/React variants better (.mts/.cts/.jsx/.tsx). Heimdall also has a minor internal mismatch where it defines context for some extensions (.txt, .rst, .adoc, .toml, .ini) not in its scan extension allowlist.
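
A mismatch like this is easy to catch with a set difference. In the Python sketch below, the scan allowlist is taken from the table above, while the context set is an assumption built from the extensions Codex mentioned, since the full context table is not shown.

```python
# Scan allowlist, as listed in the table above.
SCAN_EXTENSIONS = {
    ".py", ".js", ".ts", ".sh", ".bash", ".mjs", ".cjs",
    ".md", ".yaml", ".yml", ".json",
}

# Assumed: context is defined for everything scanned plus the extras noted above.
CONTEXT_EXTENSIONS = SCAN_EXTENSIONS | {".txt", ".rst", ".adoc", ".toml", ".ini"}

# Extensions with context rules that the scanner will never actually visit.
unreachable = sorted(CONTEXT_EXTENSIONS - SCAN_EXTENSIONS)
print(unreachable)
```

A check like this could run as a unit test so the two tables cannot drift apart unnoticed.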

7. Which is better and why

Overall winner for security detection depth: Heimdall. 92-pattern coverage across 19 categories vs 8 patterns. Broader threat model. More mature false-positive framework. Optional AI-assisted narrative analysis.

Better for embeddable, low-complexity product integration: OpenClaw. Smaller, typed, deterministic TypeScript module. Cleaner API surface. Lower operational complexity.

Codex practical recommendation: If the objective is maximum malicious-skill detection, Heimdall is currently better. If the objective is minimal, maintainable in-product scanning, OpenClaw is better engineered for embedding.

8. What each should steal from the other

OpenClaw should steal from Heimdall:

File-context-aware severity adjustment (code/config/docs/string)
Suppression accounting and explainable suppression reasons
Expanded pattern taxonomy beyond the current core 8 signals
Strict mode toggle (precision vs paranoid scan profiles)
Richer CLI/report output mode for analyst workflows

Heimdall should steal from OpenClaw:

Behavior-coupled multi-signal rules (pattern + requiresContext) more consistently
Per-rule dedupe controls to reduce repeated findings noise
Size/scan limits (maxFiles, maxFileBytes) for predictable runtime
Path safety constraints for forced includes (OpenClaw-style path containment)
Small typed library API layer (not just CLI script ergonomics)
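
Of these, path containment is the easiest to show in isolation. The Python sketch below rejects forced-include paths that escape the skill directory; `is_contained` is a hypothetical helper, not code from either scanner.

```python
from pathlib import Path


def is_contained(root, candidate):
    """Return True only if candidate resolves to a path inside root."""
    root_path = Path(root).resolve()
    # Joining with an absolute candidate discards root_path, which is what
    # we want: the absolute path is then checked on its own merits.
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents
```

Resolving both sides before comparing is what defeats `..` segments, since a plain string prefix check would accept a path such as `skill/../other`.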

Codex final verdict

Heimdall is the stronger security scanner today (coverage + context + analyst UX). OpenClaw is the stronger embeddable scanner component (simplicity + typed integration). The ideal next-generation scanner would combine Heimdall’s breadth and context model with OpenClaw’s disciplined API and deterministic scan controls.

Summary: where both models agree

Opus 4.6 and Codex 5.3 reached the same conclusions independently, across every dimension:

Dimension	Winner	Agreement
Pattern coverage	Heimdall (92 vs 8)	Both models flagged this as the biggest gap
Threat categories	Heimdall (19 vs 7)	Both noted heartbeat injection, MCP abuse, unicode tricks as critical misses
Detection sophistication	Heimdall (context-aware pipeline)	Both credited OpenClaw’s multi-signal approach but called Heimdall more mature
False positive handling	Heimdall (5-layer suppression)	Both said Heimdall is substantially stronger here
File type coverage	Heimdall (11 vs 8 extensions)	Both flagged .md, .py, .sh blind spots in OpenClaw as serious
Integration architecture	OpenClaw	Both agreed OpenClaw is better engineered for embedding
AI analysis	Heimdall (unique)	Both called this a meaningful differentiator for non-expert users
Overall verdict	Heimdall for detection, OpenClaw for integration	Unanimous

The interesting thing isn’t that they agreed on the winner. It’s that they agreed on where the gaps are and what each project should borrow from the other. When two models independently converge on largely the same recommendations in both directions, that’s a reasonably strong signal.

Neither scanner is complete alone. OpenClaw’s native integration catches the basics at install time. Heimdall catches the stuff that falls through the cracks. Install both.

Links: