Documentation
This manual explains current runtime behaviour as implemented. All sections reflect the live codebase, not planned features.
Prerequisites
Runtime Guard requires Python to run. The version matters more than it might appear.
| Requirement | Version | Notes |
|---|---|---|
| Python (required) | ≥ 3.10 | Hard minimum — older versions will fail |
| Python (recommended) | 3.12+ | Smoother dependency installs |
| macOS system Python | 3.9 | Too old — use Homebrew or python.org |
macOS ships with Python 3.9, which causes dependency install failures. Install a newer Python via Homebrew (brew install python@3.12) or from python.org, then create a fresh venv from that version.
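A wrapper script could verify the interpreter before anything is installed. The following is a minimal sketch, not part of Runtime Guard: meets_minimum and MIN_VERSION are hypothetical names chosen for illustration.

```python
import sys

# Hard minimum taken from the requirements table; the name is illustrative.
MIN_VERSION = (3, 10)


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True when the (major, minor) version is at least `minimum`."""
    current = tuple((version_info or sys.version_info)[:2])
    return current >= minimum
```

Calling meets_minimum() with no arguments checks the running interpreter, so the macOS system Python 3.9 would return False.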
What the server does
Runtime Guard is an MCP server that sits between your AI agent and your system. It exposes the following tools, all subject to policy evaluation before execution:
| Tool | Description |
|---|---|
| server_info | Returns server status, version, and policy hash |
| execute_command | Runs shell commands; most policy enforcement applies here |
| read_file | Reads file contents with path and extension policy checks |
| write_file | Writes files; creates backups before overwriting |
| delete_file | Deletes files; creates a backup before deletion |
| list_directory | Lists directory contents within workspace boundaries |
| restore_backup | Restores files from backup with dry-run support |
Scope (intentional)
Understanding what Runtime Guard is not designed to do is as important as understanding what it does.
Runtime Guard prevents accidental damage from hallucinated deletes, wrong-path writes, broad wildcard actions, and accidental secret access. It is not a full malicious-actor containment boundary.
Core controls in scope:
- Block severe destructive and exfiltration actions by policy
- Enforce workspace and path boundaries
- Gate mass and wildcard actions through simulation and budget controls
- Optionally require operator approval for selected risky commands
- Automatically create backups before destructive or overwrite operations
- Audit all allowed and blocked actions and operator decisions
Runtime Guard (abbreviated AIRG below) controls MCP tool calls only. Native client tools outside MCP, for example Claude Code's built-in Bash tool, are outside AIRG enforcement scope.
Runtime environment setup
The recommended approach for most users is the packaged CLI flow.
Packaged CLI (recommended)
| Command | Purpose |
|---|---|
| airg-setup | Guided interactive setup: workspace, paths, policy, agent config |
| airg-server | Start the MCP server (stdio) |
| airg-ui | Start the Flask backend and control plane GUI |
| airg-up | Start Flask backend as sidecar, then start MCP server in one command |
| airg-doctor | Diagnose environment, paths, permissions, and UI build status |
| airg-service | GUI service management for macOS/Linux user sessions |
| airg-init | Low-level manual bootstrap fallback |
airg-doctor first
Run airg-doctor and resolve any warnings before your first MCP client connection. It checks paths, permissions, UI build status, and reports DB health.
airg-setup options
# interactive guided setup
airg-setup
# unattended defaults
airg-setup --defaults --yes
# setup with GUI service configured and started
airg-setup --gui
# fully silent: defaults + gui, auto-creates one agent profile
airg-setup --silent
After setup, airg-setup prints a ready-to-copy MCP config env block containing all resolved AIRG_* paths including AIRG_REPORTS_DB_PATH.
Direct source runs (alternative)
For direct source or manual runs without the packaged CLI, source the runtime env script first:
source scripts/setup_runtime_env.sh
This exports AIRG_APPROVAL_DB_PATH and AIRG_APPROVAL_HMAC_KEY_PATH and enforces restrictive permissions (700 for directories, 600 for files).
Runtime paths
| Platform | Default state directory |
|---|---|
| macOS | ~/Library/Application Support/ai-runtime-guard/ |
| Linux | ${XDG_STATE_HOME:-~/.local/state}/ai-runtime-guard/ |
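The table above can be expressed as a small resolver. This is a sketch under stated assumptions: default_state_dir is a hypothetical helper, and the real CLI may resolve paths differently (for example through AIRG_* environment variables).

```python
import os
import sys
from pathlib import Path

APP_DIR = "ai-runtime-guard"


def default_state_dir(platform=None, env=None, home=None):
    """Return the default state directory for macOS or Linux."""
    platform = platform or sys.platform
    env = os.environ if env is None else env
    home = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / APP_DIR
    # Mirrors ${XDG_STATE_HOME:-~/.local/state}: unset or empty falls back.
    xdg = env.get("XDG_STATE_HOME")
    base = Path(xdg) if xdg else home / ".local" / "state"
    return base / APP_DIR
```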
Backup root diagnostics
airg-doctor prints the resolved backup_root. If it points to site-packages or the project directory, treat it as misconfiguration and move it to a user-local runtime state path.
Workspace model
AIRG_WORKSPACE defines the operational sandbox root for all agent actions.
How it works
- execute_command runs with AIRG_WORKSPACE as its working directory
- File and tool path checks are evaluated against this root and policy path rules
- Traversal outside the workspace root is blocked
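A traversal check of this kind is commonly written by resolving both paths and comparing ancestors. The sketch below illustrates the idea only; is_inside_workspace is a hypothetical name, and the runtime additionally applies policy path rules that are not modelled here.

```python
from pathlib import Path


def is_inside_workspace(candidate, workspace):
    """Return True when `candidate` resolves to the workspace root or below it."""
    root = Path(workspace).resolve()
    target = Path(candidate)
    if not target.is_absolute():
        # Relative paths are interpreted from the workspace root.
        target = root / target
    target = target.resolve()  # collapses ".." segments and follows symlinks
    return target == root or root in target.parents
```

Comparing resolved path components rather than string prefixes avoids treating a sibling such as ~/airg-workspace-old as part of ~/airg-workspace.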
Recommended setup
Do not use the Runtime Guard install folder as the agent workspace. Mixing them causes confusing side effects during destructive command testing.
# install location
~/Documents/Projects/runtime-guard/
# workspace for agent operations (separate)
~/airg-workspace/
To allow multiple workspace roots, add extra paths under policy.allowed.paths_whitelist.
Per-agent policy overrides
Runtime Guard supports optional per-agent policy overlays keyed by AIRG_AGENT_ID. This enables different guardrails per agent without maintaining separate policy files.
How overrides work
Overrides live under policy.agent_overrides.<agent_id>.policy and are deep-merged on top of the base policy at startup. Dictionary values merge recursively; scalar and list values replace base values.
Supported override sections
| Section | Overridable |
|---|---|
| blocked | Yes |
| requires_confirmation | Yes |
| requires_simulation | Yes |
| allowed | Yes |
| network | Yes |
| execution | Yes |
| reports.* | No |
| audit.* | No |
| backup_access.* | No |
| restore.* | No |
| AIRG_WORKSPACE | No (configured via env/MCP) |
Effective policy is resolved at startup. Restart the MCP server after editing override entries.
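The merge rule (dictionaries merge recursively, scalars and lists replace) can be sketched as follows. This is an illustration, not the shipped implementation: deep_merge and effective_policy are hypothetical names, and the nested keys used in any example policy (such as commands or mode) are assumptions.

```python
import copy

# Sections the table above marks as overridable.
OVERRIDABLE = {"blocked", "requires_confirmation", "requires_simulation",
               "allowed", "network", "execution"}


def deep_merge(base, overlay):
    """Merge overlay onto base: dicts recurse, scalars and lists replace."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def effective_policy(policy, agent_id):
    """Apply policy.agent_overrides.<agent_id>.policy on top of the base policy."""
    base = {k: v for k, v in policy.items() if k != "agent_overrides"}
    overlay = policy.get("agent_overrides", {}).get(agent_id, {}).get("policy", {})
    permitted = {k: v for k, v in overlay.items() if k in OVERRIDABLE}
    return deep_merge(base, permitted)
```

Because lists replace rather than append, an override that sets a list under blocked discards the base list for that key; the overlay must restate any base entries it wants to keep.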
Per-agent overrides can be authored in the GUI under Policy → Agent Overrides. Saved overrides are diff-style overlays, not full copies of baseline sections.
Policy tier precedence
Command checks run in strict precedence order. If a command matches multiple tiers, the highest tier wins.
| Priority | Tier | Effect |
|---|---|---|
| 1 (highest) | blocked | Denied immediately — no approval path |
| 2 | requires_confirmation | Paused — operator must approve out-of-band |
| 3 | requires_simulation | Blast radius evaluated — blocked if over threshold |
| 4 (lowest) | allowed | Executes immediately |
Runtime Guard ships in basic-protection mode: severe commands and paths are blocked, all other actions are allowed. Advanced tiers are available in policy for opt-in hardening.
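Tier precedence amounts to picking the first match from an ordered list. A minimal sketch, assuming the set of matching tiers has already been computed (resolve_tier is a hypothetical name):

```python
# Highest priority first, following the precedence table.
TIER_ORDER = ["blocked", "requires_confirmation", "requires_simulation", "allowed"]


def resolve_tier(matched_tiers):
    """Return the highest-priority tier among those a command matched."""
    for tier in TIER_ORDER:
        if tier in matched_tiers:
            return tier
    # Assumption: nothing matched, so the basic-protection default applies.
    return "allowed"
```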
Action options
Basic options
| Option | Behaviour |
|---|---|
| allowed | Command passes policy and executes immediately. No human checkpoint required. |
| blocked | Command is denied immediately. No approval path exists for that request. Blocked outcomes consume server-side retry budget until final block. |
Advanced options
| Option | Behaviour |
|---|---|
| requires_simulation | Runtime simulates blast radius for wildcard and bulk operations. Blocked if simulation exceeds threshold or cannot safely resolve wildcard targets. |
| requires_confirmation | Returns an approval token. Requires human operator approval via GUI/API before the exact command can be retried. Approval is one-time, session-scoped, and time-bounded. Does not consume server retry counter. |
If both requires_simulation and requires_confirmation match, confirmation wins by tier precedence. Simulation context is still included in the confirmation response and audit log.
Command matching
Matching is not strict full-string equality. Runtime Guard uses token-aware pattern matching with normalization applied before any policy check.
How patterns match
- Single-token patterns (e.g. rm) are token-aware: they match rm as a command token, not a substring
- Multi-token patterns (e.g. rm -rf) match normalized command sequences
- Whitespace, case, and tab characters are normalized before matching
Pattern examples
| Command | Matches pattern | Notes |
|---|---|---|
| rm -rf /tmp/x | rm -rf | Standard match |
| RM -RF /tmp/x | rm -rf | Case-insensitive normalization |
| rm -rf /tmp | rm -rf | Whitespace normalization |
| rm *.txt | does not match rm -rf | Caught by simulation tier if configured |
| find docs -delete | find -delete | Multi-token with intervening arg |
| find docs -exec rm {} + | find -exec rm | Nested command match |
| printf 'a\n' \| xargs rm | xargs rm | Piped command match |
| for f in *.tmp; do rm "$f"; done | do rm | Loop body match |
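One way to reproduce the rows of this table is to normalize the command into lowercase tokens and test whether the pattern tokens appear in order. The sketch below is an approximation for illustration: normalize and matches are hypothetical names, and in-order token matching is an assumption that is looser than whatever the runtime actually does.

```python
import re


def normalize(command):
    """Lowercase, then split on whitespace and common shell separators."""
    return [tok for tok in re.split(r"[\s;|&]+", command.lower()) if tok]


def matches(pattern, command):
    """True when every pattern token appears, in order, as a whole command token."""
    tokens = iter(normalize(command))
    # `p in tokens` advances the iterator, so order is preserved.
    return all(p in tokens for p in normalize(pattern))
```

Because whole tokens are compared, the pattern rm does not match rmdir or firmware, which is the point of token-aware matching.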
Beyond pattern matching, runtime enforces workspace boundary checks, protected runtime path guards, control-character sanitization, and optional shell containment — regardless of policy tier.
Approvals
The confirmation handshake ensures that high-impact commands require explicit human sign-off before execution proceeds.
Flow
- execute_command matches requires_confirmation: it returns a token and blocks
- Human operator approves out-of-band via the control plane GUI or API (/approvals/approve) using the exact command + token
- Agent retries the exact command, which now proceeds
Storage model
Pending approvals are persisted in approvals.db (SQLite) so separate processes can read and update the same queue. Each pending record includes:
- token, command, session_id
- requested_at, expires_at
- affected_paths (optional)
Approved commands are persisted as one-time session+command grants and consumed on retry.
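The handshake and storage model can be illustrated with a small SQLite-backed queue. This is a sketch under stated assumptions, not the actual approvals.db schema: the status column, the 300-second default TTL, and all function names are hypothetical, and affected_paths is omitted for brevity.

```python
import secrets
import sqlite3
import time


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS approvals ("
        "token TEXT PRIMARY KEY, command TEXT, session_id TEXT, "
        "requested_at REAL, expires_at REAL, status TEXT)"
    )
    return db


def request_approval(db, command, session_id, ttl=300, now=None):
    """Agent side: record a pending approval and return its token."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(16)
    db.execute("INSERT INTO approvals VALUES (?, ?, ?, ?, ?, 'pending')",
               (token, command, session_id, now, now + ttl))
    db.commit()
    return token


def approve(db, command, token, now=None):
    """Operator side: approve only an exact, pending, unexpired command + token."""
    now = time.time() if now is None else now
    cur = db.execute(
        "UPDATE approvals SET status = 'approved' WHERE token = ? "
        "AND command = ? AND status = 'pending' AND expires_at > ?",
        (token, command, now))
    db.commit()
    return cur.rowcount == 1


def consume_grant(db, command, session_id, now=None):
    """Agent retry: consume one approved grant for this session + command."""
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT token FROM approvals WHERE command = ? AND session_id = ? "
        "AND status = 'approved' AND expires_at > ? LIMIT 1",
        (command, session_id, now)).fetchone()
    if row is None:
        return False
    db.execute("UPDATE approvals SET status = 'consumed' WHERE token = ?", row)
    db.commit()
    return True
```

Passing a file path to open_store instead of :memory: lets a separate process (for example a GUI backend) read and update the same queue.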
Security model
The MCP tool surface does not expose an approval tool. Approval decisions come exclusively from out-of-band operator channels (GUI/API) and are logged as source: "human-operator" in activity.log.
Retry behaviour
- Retries are server-side, not client-authoritative
- Retry key is scoped to (normalized_command + decision_tier + matched_rule)
- Different blocked command/rule combinations maintain independent retry counters
- Confirmation-tier blocks do not consume the server retry counter
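The retry rules above can be pictured as a counter keyed by that tuple. A minimal sketch, assuming a hypothetical RetryTracker class and a hypothetical limit of three attempts (the real limit is a policy setting not shown here):

```python
from collections import Counter


class RetryTracker:
    """Server-side retry counters keyed by (command, tier, rule)."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.counts = Counter()

    def record_block(self, normalized_command, decision_tier, matched_rule):
        # Confirmation-tier blocks do not consume the retry counter.
        if decision_tier == "requires_confirmation":
            return {"attempt": 0, "final_block": False}
        key = (normalized_command, decision_tier, matched_rule)
        self.counts[key] += 1
        attempt = self.counts[key]
        return {"attempt": attempt, "final_block": attempt >= self.max_retries}
```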
Simulation
Simulation evaluates wildcard blast radius before a command executes, preventing mass file operations that exceed safe thresholds.
How it works
- Configured command families (wildcards, bulk actions) trigger simulation under requires_simulation.commands
- Runtime resolves the wildcard against real files in the workspace
- If the resolved target count exceeds bulk_file_threshold, the command is blocked
- If the wildcard cannot be safely resolved to concrete targets, the command is blocked
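The steps above can be approximated with a glob expansion followed by a count check. This sketch is illustrative only: simulate_wildcard and its return shape are hypothetical, and treating an empty expansion as "cannot be safely resolved" is an assumption.

```python
import glob
import os


def simulate_wildcard(pattern, workspace, bulk_file_threshold):
    """Expand a wildcard inside the workspace and compare the count to a threshold."""
    root = os.path.realpath(workspace)
    # glob.escape keeps special characters in the workspace path literal.
    found = glob.glob(os.path.join(glob.escape(root), pattern))
    targets = [os.path.realpath(path) for path in found]
    if not targets:
        return {"allowed": False, "reason": "unresolved_wildcard", "count": 0}
    if any(os.path.commonpath([root, t]) != root for t in targets):
        return {"allowed": False, "reason": "outside_workspace", "count": len(targets)}
    if len(targets) > bulk_file_threshold:
        return {"allowed": False, "reason": "over_threshold", "count": len(targets)}
    return {"allowed": True, "reason": "within_threshold", "count": len(targets)}
```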
execute_command telemetry can undercount affected_paths_count for some shell-expanded forms (certain wildcard/wrapper move/delete commands). Policy enforcement still applies correctly — only the count metrics in logs and budget metadata may be lower than actual impact.
Budget behaviour
The cumulative budget tracks operations, paths, and bytes across a session to prevent slow-drip mass operations that individually appear safe.
Budget fields
| Field | Tracks | Config key |
|---|---|---|
| budget ops | Cumulative total operations | cumulative_total_operations |
| budget paths | Cumulative unique affected paths | cumulative_unique_paths |
| budget bytes | Estimated bytes touched | cumulative_total_bytes_estimate |
Reset behaviour
- reset.idle_reset_seconds: full budget reset after inactivity beyond this threshold
- reset.window_seconds: prunes path-timestamp history for unique-path accounting (sliding window)
- reset.reset_on_server_restart: effectively redundant while budget state is in-memory; process restart always resets counters
Operations spaced beyond idle_reset_seconds avoid meaningful cumulative budget growth — for example, one operation every 901 seconds with idle_reset_seconds=900. This is acceptable for accidental-safety scope but should not be treated as malicious-intent containment.
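The interaction between the idle reset and the sliding window is easier to see in code. The sketch below is an in-memory model for illustration; CumulativeBudget and its method names are hypothetical, and the 3600-second window default is an assumption.

```python
class CumulativeBudget:
    """Tracks cumulative operations, unique paths, and bytes for one session."""

    def __init__(self, idle_reset_seconds=900, window_seconds=3600):
        self.idle_reset_seconds = idle_reset_seconds
        self.window_seconds = window_seconds
        self.total_operations = 0
        self.total_bytes = 0
        self.path_seen_at = {}  # path -> last-seen timestamp
        self.last_activity = None

    def record(self, paths, bytes_estimate, now):
        idle = None if self.last_activity is None else now - self.last_activity
        if idle is not None and idle > self.idle_reset_seconds:
            # Full reset after inactivity beyond the idle threshold.
            self.total_operations = 0
            self.total_bytes = 0
            self.path_seen_at.clear()
        # Sliding window: forget paths last touched before the cutoff.
        cutoff = now - self.window_seconds
        self.path_seen_at = {p: t for p, t in self.path_seen_at.items() if t >= cutoff}
        for path in paths:
            self.path_seen_at[path] = now
        self.total_operations += 1
        self.total_bytes += bytes_estimate
        self.last_activity = now
        return {
            "cumulative_total_operations": self.total_operations,
            "cumulative_unique_paths": len(self.path_seen_at),
            "cumulative_total_bytes_estimate": self.total_bytes,
        }
```

With idle_reset_seconds=900, recording one operation every 901 seconds leaves the operation count at 1 each time, which is the slow-drip gap described above.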
Not currently enforced
- Per-command budget overrides from the GUI (metadata only — visible in UI, not runtime-enforced)
- Budget override tied to confirmation approvals (temporarily disabled during durable approval migration)
Allowed limits semantics
| Limit | Applies per |
|---|---|
| allowed.max_file_size_mb | Per file, not cumulative across an operation |
| allowed.max_files_per_operation | Per default-allowed multi-target operation (safety cap) |
| allowed.max_directory_depth | Relative to deepest matching allowed root, not filesystem root |
Backup & Restore
Runtime Guard automatically creates backups before destructive and overwrite operations.
Backup behaviour
- Backups are timestamped directories with a manifest.json
- audit.backup_on_content_change_only=true deduplicates by content hash (SHA256) and skips redundant snapshots
- Version and day pruning is event-driven during backup operations, not a background scheduler
- audit.max_versions_per_file and audit.backup_retention_days govern cleanup
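Content-hash deduplication can be sketched with hashlib. The helpers below are hypothetical (sha256_of, needs_backup) and assume the caller keeps the hash of the most recent snapshot; how the runtime stores that hash is not described here.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_backup(path, last_hash, on_content_change_only=True):
    """Return (should_backup, current_hash) for the file at `path`."""
    current = sha256_of(path)
    if not on_content_change_only:
        return True, current
    return current != last_hash, current
```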
Restore flow
- Run restore_backup with dry_run=true, which returns a restore_token and planned item count
- Run restore_backup with the token to apply
When restore.require_dry_run_before_apply=true, the apply step requires a valid token from a prior dry-run. Tokens expire after restore.confirmation_ttl_seconds. After expiry, a new dry-run is required. This is an operation-safety gate, not a human approval workflow.
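The dry-run gate behaves like a short-lived token check. A minimal sketch, assuming a hypothetical RestoreGate class, a hypothetical backup_id argument, and an illustrative 300-second TTL default:

```python
import secrets
import time


class RestoreGate:
    """Requires a valid dry-run token before a restore is applied."""

    def __init__(self, confirmation_ttl_seconds=300, require_dry_run_before_apply=True):
        self.ttl = confirmation_ttl_seconds
        self.require_dry_run = require_dry_run_before_apply
        self._tokens = {}  # token -> (backup_id, expires_at)

    def dry_run(self, backup_id, planned_items, now=None):
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._tokens[token] = (backup_id, now + self.ttl)
        return {"restore_token": token, "planned_items": len(planned_items)}

    def apply(self, backup_id, restore_token=None, now=None):
        """Return True when the apply step may proceed."""
        if not self.require_dry_run:
            return True
        now = time.time() if now is None else now
        entry = self._tokens.get(restore_token)
        if entry is None:
            return False
        token_backup, expires_at = entry
        if now > expires_at:
            del self._tokens[restore_token]  # expired: a new dry-run is needed
            return False
        return token_backup == backup_id
```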
Audit notes
- audit.redact_patterns applies to log output only, not backup file payloads
- audit.log_level is configuration metadata with limited runtime differentiation currently
- Pruning does not currently emit a dedicated event for every removed backup artifact
Network policy
Network policy evaluates outbound network commands against domain rules before execution.
Enforcement modes
| Mode | Behaviour |
|---|---|
| off | Skip all network policy checks |
| monitor | Evaluate policy, emit warnings, but do not block execution |
| enforce | Evaluate policy and block when domain rules fail |
Domain rules
| Setting | Effect |
|---|---|
| blocked_domains | Explicit deny list. Matching domains blocked in enforce mode. |
| allowed_domains | Explicit allow list. Matching domains allowed when not blocked. |
| block_unknown_domains: false | Default. Domains not in either list are allowed. |
| block_unknown_domains: true | Default-deny. Domains not in allowed_domains are blocked. |
If a domain appears in both allowed_domains and blocked_domains, the blocklist wins. Subdomains are matched — example.com also matches api.example.com.
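The mode and domain rules combine into a small decision function. The sketch below is illustrative: evaluate_domain, domain_matches, and the "warn" return value for monitor mode are hypothetical names, not the runtime's API.

```python
def domain_matches(host, rule):
    """True when host equals rule or is a subdomain of it."""
    host = host.lower().rstrip(".")
    rule = rule.lower().rstrip(".")
    return host == rule or host.endswith("." + rule)


def evaluate_domain(host, mode, blocked_domains, allowed_domains,
                    block_unknown_domains=False):
    """Return "allow", "warn", or "block" for one destination host."""
    if mode == "off":
        return "allow"
    if any(domain_matches(host, d) for d in blocked_domains):
        verdict = "block"  # the blocklist wins over the allowlist
    elif any(domain_matches(host, d) for d in allowed_domains):
        verdict = "allow"
    else:
        verdict = "block" if block_unknown_domains else "allow"
    if verdict == "block" and mode == "monitor":
        return "warn"  # monitor mode reports but does not block
    return verdict
```

Matching on a leading dot means example.com covers api.example.com but not notexample.com.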
network.commands
network.commands is intent classification, not a deny list. Listing a command such as curl here decides whether network-domain policy should evaluate — it does not block the command by itself.
Runtime evaluates domains parsed from command tokens and URLs. Redirect chains and out-of-band destination changes are not deeply inspected.
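Intent classification followed by token parsing might look like the sketch below. It is a simplification under stated assumptions: domains_for is a hypothetical helper, only tokens containing :// are treated as URLs, and (as noted above) redirects are not followed.

```python
import shlex
from urllib.parse import urlsplit


def domains_for(command, network_commands):
    """Return hosts found in URL tokens, or None if no network command is present."""
    tokens = shlex.split(command)
    if not any(tok in network_commands for tok in tokens):
        return None  # not classified as network intent: domain policy is skipped
    hosts = []
    for tok in tokens:
        if "://" in tok:
            host = urlsplit(tok).hostname
            if host:
                hosts.append(host)
    return hosts
```

A None result means the command was never classified as a network command, which is different from an empty list (classified, but no destination found).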
Control plane GUI
The local web interface runs at http://127.0.0.1:5001 when started with airg-ui or airg-up.
Navigation structure
- Approvals — polls backend, supports approve/deny actions against shared SQLite approval store
- Policy — Commands, Paths, Extensions, Network, Agent Overrides, Advanced Policy tabs
- Reports — Dashboard and Log tabs
- Settings — Agents and Advanced tabs
Policy tabs
| Tab | Controls |
|---|---|
| Commands | Search, tier radios, custom commands, custom categories, command-info modal, advanced JSON editor |
| Paths | Runtime path display (read-only), policy-managed path rules with absolute-path validation, allowed/blocked/requires-approval mapping |
| Extensions | File extension policy rules |
| Network | Enforcement mode, network.commands list, domain allowlist/blocklist |
| Agent Overrides | Per-agent diff-style overlays with section editors and baseline info cards |
| Advanced Policy | Simulation thresholds, cumulative budget, shell workspace containment mode |
Shared policy actions
Available across all policy tabs:
- Reload: reload from disk
- Validate: validate current state without applying
- Apply: validate and atomically write to policy.json
- Revert Last Apply: restore from policy.json.last-applied snapshot
- Reset to Defaults: restore from policy.json.defaults snapshot
Runtime policy reload is startup-based. After applying policy changes in the GUI, restart the MCP server and reconnect the agent client for changes to take effect.
Settings → Agents
Profile-based MCP config generation with copy-assist modal flows for CLI and JSON formats. Generates and stores MCP config in the runtime state directory under mcp-configs/.
Serving model
- Flask backend serves both REST API endpoints and built frontend assets from ui_v3/dist
- Prebuilt assets are committed and packaged, so a frontend rebuild is only needed for local UI development
- If the frontend build is missing, API routes still work and / returns a build-missing hint
- Override the built UI path with AIRG_UI_DIST_PATH when needed
Reports & logs
All actions are written to activity.log and indexed into reports.db for dashboard and log analytics.
Dashboard tab
- Total events, blocked events, backups created, confirmations (pending/approved/denied)
- 7-day event and blocked trends with bar charts
- Top commands and top paths with allowed/blocked split
- Blocked by rule breakdown
- Clickable stat cards and table rows navigate to the Log tab with pre-applied filters
- Last indexed freshness indicator: turns amber when ingest lag exceeds threshold or last_error is non-null
Log tab
Paginated event view with filters:
| Filter | Field |
|---|---|
| Agent | agent_id |
| Session | agent_session_id |
| Source | source |
| Tool | tool |
| Decision | policy_decision, decision_tier |
| Rule | matched_rule |
| Command / Path | command, path |
| Time range | Today, 7 days, 30 days, all time |
Ingestion behaviour
- Automatic ingestion from activity.log into reports.db with byte-offset checkpointing
- Rotation and truncation detection with reconciliation
- Ingest sync runs on manual refresh and scheduled refresh intervals
- Filter changes query existing indexed data without re-ingesting
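Byte-offset checkpointing with truncation detection can be sketched as follows. This is a simplified illustration: read_new_lines is a hypothetical helper, it assumes activity.log is newline-delimited, and it detects truncation by file size only (a rotated file that has already grown past the old offset would need an additional check such as an inode comparison).

```python
import os


def read_new_lines(log_path, checkpoint):
    """Return (complete new lines, updated byte offset) since `checkpoint`."""
    size = os.path.getsize(log_path)
    if size < checkpoint:
        checkpoint = 0  # file shrank: treat as truncated or rotated
    with open(log_path, "rb") as fh:
        fh.seek(checkpoint)
        data = fh.read()
    # Only consume up to the last newline so a partial line is re-read later.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, checkpoint + end
```

Persisting the returned offset between runs is what lets filter changes query already-indexed data without re-reading the log.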
Known limitations
These are the current high-priority limitations that affect enforcement or security posture.
| Area | Limitation |
|---|---|
| Auth | Operator endpoint authentication is local-trust oriented. Should be hardened before broad deployment. |
| Shell execution | shell=True remains in the command execution path. |
| Shell containment | execution.shell_workspace_containment is heuristic. Does not replace OS-level sandboxing. |
| Budget defaults | Cumulative budget defaults may be too high to trigger in typical manual runs. |
| UI overrides | Per-command budget and retry overrides in the GUI are metadata only — not runtime-enforced. |
| Native tools | AIRG enforcement applies to MCP tool calls only. Native client shell and file tools (e.g. Claude Code Bash) bypass AIRG controls. |
| Claude Code | A sample MCP-only skill is at docs/mcp-only.md. Save as <workspace>/.claude/skills/mcp-only.md to guide strict MCP-only behaviour. |
Runtime Guard cannot enforce policy on actions taken through native client tools outside MCP. For Claude Code users, configuring .claude/settings.local.json to deny native Bash, Glob, Read, Write, and Edit tools is the strongest available mitigation.
Capabilities & caveats
Current capabilities
- Default basic profile blocks severe actions and allows non-severe actions
- Advanced tiers (requires_simulation, requires_confirmation) are policy-available and per-command configurable
- Approvals are out-of-band: agents cannot approve via MCP tools
- Runtime includes audit logging, backup/restore flows, normalization, and path/workspace hardening
- GUI supports policy editing, custom commands, custom categories, and agent profile-based MCP config generation
Current caveats
- Policy reload is startup-based — restart the MCP server after applying policy changes
- "Basic/Advanced" are policy conventions, not hard runtime modes
- Redaction and obfuscation defences are pattern-based and not exhaustive
- Some blast-radius and target inference for complex shell patterns is heuristic
- Cumulative budget efficacy depends on threshold tuning