Runtime Guard v2.2

Documentation

This manual explains current runtime behaviour as implemented. All sections reflect the live codebase, not planned features.

Prerequisites

Runtime Guard requires Python to run. The version matters more than it might appear.

Requirement            Version   Notes
Python (required)      ≥ 3.10    Hard minimum; older versions will fail
Python (recommended)   3.12+     Smoother dependency installs
macOS system Python    3.9       Too old; use Homebrew or python.org

macOS system Python

macOS ships with Python 3.9, which is below the 3.10 minimum and causes dependency install failures. Install a newer Python via Homebrew (brew install python@3.12) or from python.org, then create a fresh venv from that version.

Installation

Runtime Guard should be installed in an isolated Python environment. This prevents conflicts with system packages, keeps upgrades and uninstalls predictable, and avoids permission issues on Linux and macOS.

Choosing an isolation method

Method               Best for
pipx (recommended)   Operators running AIRG as a local tool. Installs in isolation with global CLI shims.
venv                 Development or source-based workflows.

Why isolation matters

Installing into the system Python risks version conflicts, makes uninstalls messy, and can require sudo on Linux. An isolated environment keeps AIRG self-contained and easy to remove or upgrade.

Quick start with pipx (recommended)

pipx may not be installed by default. Install it first if needed:

Platform           Command
Ubuntu / Debian    sudo apt install pipx
Fedora / RHEL      sudo dnf install pipx
macOS (Homebrew)   brew install pipx

Then install and set up Runtime Guard:

pipx install ai-runtime-guard
pipx ensurepath   # run once if airg* commands are not found
# open a new terminal after ensurepath
airg-setup
airg-doctor

Open a new terminal after pipx ensurepath

The PATH change does not apply to your current shell session. Open a new terminal before running any airg* commands.

Alternative: venv install

python3 -m venv .venv-airg
source .venv-airg/bin/activate
python -m pip install --upgrade pip
python -m pip install ai-runtime-guard
airg-setup
airg-doctor

Source install

git clone --branch main https://github.com/runtimeguard/runtime-guard.git
cd runtime-guard
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install .
airg-setup
airg-doctor

For unattended automation or CI only:

airg-setup --defaults --yes --workspace /absolute/path/to/workspace

Runtime model

Understanding the three separate folders AIRG uses avoids common setup confusion:

Folder                       Purpose
Install folder               Where the package or cloned repo lives.
Runtime state folder         Policy, approvals DB, HMAC key, activity log, reports DB, and backups.
Workspace (AIRG_WORKSPACE)   Where guarded agent operations run.

Keep these folders separate

Do not use the install folder as the agent workspace. Mixing them causes confusing side effects during testing.

Default runtime state locations:

Platform         Path
macOS            ~/Library/Application Support/ai-runtime-guard/
Linux (config)   ${XDG_CONFIG_HOME:-~/.config}/ai-runtime-guard/
Linux (state)    ${XDG_STATE_HOME:-~/.local/state}/ai-runtime-guard/

After setup

Once airg-setup and airg-doctor complete, open http://127.0.0.1:5001 and add your first agent from Settings -> Agents.

Service commands

To run the GUI as a persistent background service:

airg-service install --workspace /absolute/path/to/airg-workspace
airg-service start
airg-service status
airg-service stop
airg-service restart
airg-service uninstall

What the server does

Runtime Guard is an MCP server that sits between your AI agent and your system. It exposes the following tools, all subject to policy evaluation before execution:

Tool              Description
server_info       Returns server status, version, and policy hash
execute_command   Runs shell commands; most policy enforcement applies here
read_file         Reads file contents with path and extension policy checks
write_file        Writes files; creates backups before overwriting
delete_file       Deletes files; creates a backup before deletion
list_directory    Lists directory contents within workspace boundaries
restore_backup    Restores files from backup with dry-run support

Scope (intentional)

Understanding what Runtime Guard is not designed to do is as important as understanding what it does.

Designed for accident prevention, not adversarial containment

Runtime Guard prevents accidental damage from hallucinated deletes, wrong-path writes, broad wildcard actions, and accidental secret access. It is not a full malicious-actor containment boundary.

Core controls in scope:

  • Block severe destructive and exfiltration actions by policy
  • Enforce workspace and path boundaries
  • Preserve policy intent with Script Sentinel across write and execute paths
  • Optionally require operator approval for selected risky commands
  • Automatically create backups before destructive or overwrite operations
  • Audit all allowed and blocked actions and operator decisions

Enforcement boundary

Runtime Guard controls MCP tool calls only. Native client tools outside MCP — for example Claude Code's built-in Bash tool — are outside AIRG enforcement scope.

Architecture brief: Why MCP

Runtime Guard is built as an MCP server because MCP provides the right interception point. When an agent issues a tool call, the request passes through Runtime Guard before execution, giving policy a chance to allow, block, or gate the action.

  • MCP tool calls are interceptable before they take effect on the host
  • Pre-tool hooks (supported by clients like Claude Code) can deny the agent's native file and shell tools, forcing risky operations through the AIRG MCP layer
  • No agent modification, retraining, or prompt engineering required
  • No system-level privileges required on the host

Why not kernel-level interception

Kernel-level enforcement (syscall interception, LSM modules, eBPF) is more comprehensive but requires elevated privileges, OS-specific engineering, and per-platform maintenance. MCP-layer enforcement combined with pre-tool hooks offers a close approximation without the operational complexity, and works across any MCP-compatible agent.

Why not containers

A common question is whether Runtime Guard should run the agent in a container instead. Containers solve a different problem and introduce a visibility gap that undermines this use case.

  • Real workflows require the agent to operate on real project files on the host. Full containerization removes that capability
  • With host mounts, policy still needs to see real host paths to apply workspace boundaries, path whitelists, and backup logic. A container boundary obscures that visibility
  • Backups, restore, and audit logs need to attach to real host paths so operators can recover what the agent actually touched
  • Container orchestration adds operational overhead that does not fit alongside a developer's existing agent workflow

Complementary, not competitive

Runtime Guard complements sandboxing and containerization rather than replacing them. A hardened setup can combine AIRG policy enforcement with client-native sandbox controls (for example, Claude Code sandbox) for additional isolation layers.

Runtime environment setup

The recommended approach for most users is the packaged CLI flow.

Packaged CLI (recommended)

Command        Purpose
airg-setup     Guided interactive setup: workspace, paths, policy, agent config
airg-server    Start the MCP server (stdio)
airg-ui        Start the Flask backend and control plane GUI
airg-up        Start Flask backend as sidecar then start MCP server in one command
airg-doctor    Diagnose environment, paths, permissions, and UI build status
airg-service   GUI service management for macOS/Linux user sessions
airg-init      Low-level manual bootstrap fallback

Run airg-doctor first

Run airg-doctor and resolve any warnings before your first MCP client connection. It checks paths, permissions, UI build status, and reports DB health.

airg-setup options

# interactive guided setup
airg-setup

# unattended defaults
airg-setup --defaults --yes

# setup with GUI service configured and started
airg-setup --gui

# fully silent: defaults + gui, auto-creates one agent profile
airg-setup --silent

After setup, airg-setup prints a ready-to-copy MCP config env block containing all resolved AIRG_* paths including AIRG_REPORTS_DB_PATH.

Direct source runs (alternative)

For direct source or manual runs without the packaged CLI, source the runtime env script first:

source scripts/setup_runtime_env.sh

This exports AIRG_APPROVAL_DB_PATH and AIRG_APPROVAL_HMAC_KEY_PATH and enforces restrictive permissions (700 for directories, 600 for files).

Runtime paths

Platform   Default state directory
macOS      ~/Library/Application Support/ai-runtime-guard/
Linux      ${XDG_STATE_HOME:-~/.local/state}/ai-runtime-guard/

Backup root diagnostics

airg-doctor prints the resolved backup_root. If it points to site-packages or the project directory, treat it as misconfiguration and move it to a user-local runtime state path.
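
The check described above can be approximated in a few lines. This is only a sketch, not the airg-doctor implementation: the function names are hypothetical, and only the default paths and the "site-packages or project directory" rule come from this manual.

```python
import os
import sys
from pathlib import Path

APP_DIR = "ai-runtime-guard"

def default_state_dir(platform: str = sys.platform, env=os.environ) -> Path:
    """Return the documented default state directory for macOS or Linux."""
    home = Path(env.get("HOME", str(Path.home())))
    if platform == "darwin":
        return home / "Library" / "Application Support" / APP_DIR
    # Linux: honour XDG_STATE_HOME, falling back to ~/.local/state
    xdg_state = env.get("XDG_STATE_HOME") or str(home / ".local" / "state")
    return Path(xdg_state) / APP_DIR

def backup_root_is_misconfigured(backup_root, project_dir) -> bool:
    """Flag a backup root inside site-packages or inside the project directory."""
    root, project = Path(backup_root), Path(project_dir)
    if "site-packages" in root.parts:
        return True
    return root == project or project in root.parents
```

A backup root that fails this check should be moved under the directory returned by default_state_dir (or an equivalent user-local path).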

Workspace model

AIRG_WORKSPACE defines the operational sandbox root for all agent actions.

How it works

  • execute_command runs with AIRG_WORKSPACE as its working directory
  • File and tool path checks are evaluated against this root and policy path rules
  • Traversal outside the workspace root is blocked
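
A minimal sketch of a workspace containment check, assuming paths are resolved (including symlinks and ".." segments) before comparison. The function name is hypothetical and the real check also applies policy path rules, which are omitted here.

```python
from pathlib import Path

def is_within_workspace(candidate: str, workspace: str) -> bool:
    """Return True if candidate resolves to a path inside the workspace root."""
    root = Path(workspace).resolve()
    # Relative paths are interpreted against the workspace root;
    # an absolute candidate replaces the root entirely (pathlib semantics).
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

Resolving before comparing is what catches traversal such as "../etc/passwd": the string starts inside the workspace, but the resolved path does not.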

Recommended setup

Keep install and workspace separate

Do not use the Runtime Guard install folder as the agent workspace. Mixing them causes confusing side effects during destructive command testing.

# install location
~/Documents/Projects/runtime-guard/

# workspace for agent operations (separate)
~/airg-workspace/

To allow multiple workspace roots, add extra paths under policy.allowed.paths_whitelist.

Per-agent policy overrides

Runtime Guard supports optional per-agent policy overlays keyed by AIRG_AGENT_ID. This enables different guardrails per agent without maintaining separate policy files.

How overrides work

Overrides live under policy.agent_overrides.<agent_id>.policy and are deep-merged on top of the base policy at startup. Dictionary values merge recursively; scalar and list values replace base values.
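
The merge rule (dictionaries merge recursively, scalars and lists replace) can be sketched as follows. This is an illustration of the stated semantics, not the shipped code, and deep_merge is a hypothetical name.

```python
from copy import deepcopy

def deep_merge(base: dict, overlay: dict) -> dict:
    """Return base with overlay applied; neither input is modified."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Dictionaries merge key by key
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists replace the base value outright
            merged[key] = deepcopy(value)
    return merged
```

Because lists replace rather than append, an override that sets requires_confirmation.commands must list every command the agent should be gated on, not just the additions.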

Supported override sections

Section                 Overridable
blocked                 Yes
requires_confirmation   Yes
script_sentinel         Yes
allowed                 Yes
network                 Yes
execution               Yes
reports.*               No
audit.*                 No
backup_access.*         No
restore.*               No
AIRG_WORKSPACE          No (env/MCP configured)

Restart required

Effective policy is resolved at startup. Restart the MCP server after editing override entries.

Per-agent overrides can be authored in the GUI under Policy → Agent Overrides. Saved overrides are diff-style overlays, not full copies of baseline sections.

Policy tier precedence

Command checks run in strict precedence order. If a command matches multiple tiers, the highest tier wins.

Priority      Tier                    Effect
1 (highest)   blocked                 Denied immediately; no approval path
2             requires_confirmation   Paused; operator must approve out-of-band
3 (lowest)    allowed                 Executes immediately

Default profile

Runtime Guard ships in basic-protection mode: severe commands and paths are blocked, all other actions are allowed. Confirmation gates and Script Sentinel are optional hardening controls.
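
The tier precedence described in this section amounts to a first-match lookup over an ordered list. A minimal sketch, assuming the set of matched tiers has already been computed (resolve_tier is a hypothetical name):

```python
# Highest priority first, as in the precedence table
TIER_ORDER = ("blocked", "requires_confirmation", "allowed")

def resolve_tier(matched_tiers) -> str:
    """Pick the highest-priority tier among those a command matched."""
    for tier in TIER_ORDER:
        if tier in matched_tiers:
            return tier
    # Nothing matched: the default basic profile allows the action
    return "allowed"
```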

Action options

Core policy options

Option                  Behaviour
allowed                 Command passes policy and executes immediately. No human checkpoint required.
blocked                 Command is denied immediately. No approval path exists for that request. Each blocked outcome consumes the server-side retry counter until the final block.
requires_confirmation   Returns an approval token. A human operator must approve via GUI/API before the exact command can be retried. Approval is one-time, session-scoped, and time-bounded. Does not consume the server-side retry counter.

Script Sentinel options

Option                  Behaviour
match_original          Preserves the original policy tier when a script hit is detected.
block                   Any script hit blocks execution.
requires_confirmation   Any script hit requires operator approval before execution.

Script decision continuity

When Script Sentinel is enabled, patterns detected during file write/edit are re-checked at execute time so policy intent is preserved across indirect execution flows.

Command matching

Matching is not strict full-string equality. Runtime Guard uses token-aware pattern matching with normalization applied before any policy check.

How patterns match

  • Single-token patterns (e.g. rm) are token-aware — they match rm as a command token, not a substring
  • Multi-token patterns (e.g. rm -rf) match normalized command sequences
  • Whitespace, case, and tab characters are normalized before matching

Pattern examples

Command                             Matches pattern         Notes
rm -rf /tmp/x                       rm -rf                  Standard match
RM -RF /tmp/x                       rm -rf                  Case-insensitive normalization
rm  -rf  /tmp                       rm -rf                  Whitespace normalization
rm *.txt                            does not match rm -rf   Requires explicit matching pattern to block
find docs -delete                   find -delete            Multi-token with intervening arg
find docs -exec rm {} +             find -exec rm           Nested command match
printf 'a\n' | xargs rm             xargs rm                Piped command match
for f in *.tmp; do rm "$f"; done    do rm                   Loop body match

Non-command safety invariants

Beyond pattern matching, runtime enforces workspace boundary checks, protected runtime path guards, control-character sanitization, and optional shell containment — regardless of policy tier.
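
One plausible reading of the pattern examples in this section is an ordered-subsequence match over normalized tokens. The sketch below is built on that assumption and reproduces the example table. The real matcher may tokenize differently (for instance around quotes and shell operators), and the function names are hypothetical.

```python
def normalize(command: str) -> list[str]:
    """Lowercase and split on any run of whitespace (spaces or tabs)."""
    return command.lower().split()

def matches(command: str, pattern: str) -> bool:
    """True if the pattern tokens appear, in order, among the command tokens."""
    pattern_tokens = normalize(pattern)
    if not pattern_tokens:
        return False
    tokens = iter(normalize(command))
    # `tok in tokens` consumes the iterator, so token order is respected
    return all(tok in tokens for tok in pattern_tokens)
```

Matching whole tokens rather than substrings is why a pattern of rm does not fire on a command such as "confirm order".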

Approvals

The confirmation handshake ensures that high-impact commands require explicit human sign-off before execution proceeds.

Flow

  1. execute_command matches requires_confirmation — returns a token and blocks
  2. Human operator approves out-of-band via the control plane GUI or API (/approvals/approve) using the exact command + token
  3. Agent retries the exact command — it now proceeds

Storage model

Pending approvals are persisted in approvals.db (SQLite) so separate processes can read and update the same queue. Each pending record includes:

  • token, command, session_id
  • requested_at, expires_at
  • affected_paths (optional)

Approved commands are persisted as one-time session+command grants and consumed on retry.
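
The storage model can be illustrated with a small SQLite sketch. The table layout, function names, and default TTL here are assumptions made for illustration (affected_paths is omitted); only the field names and the one-time, session-scoped grant behaviour come from this section.

```python
import secrets
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS pending (
    token TEXT PRIMARY KEY, command TEXT, session_id TEXT,
    requested_at REAL, expires_at REAL
);
CREATE TABLE IF NOT EXISTS grants (
    session_id TEXT, command TEXT, PRIMARY KEY (session_id, command)
);
"""

def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db

def request_approval(db, command, session_id, ttl=300, now=None):
    """Server side: record a pending approval and return its token."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(16)
    db.execute("INSERT INTO pending VALUES (?, ?, ?, ?, ?)",
               (token, command, session_id, now, now + ttl))
    db.commit()
    return token

def approve(db, token, command, now=None):
    """Operator side: turn a matching, unexpired pending row into a grant."""
    now = time.time() if now is None else now
    row = db.execute("SELECT session_id, expires_at FROM pending "
                     "WHERE token = ? AND command = ?", (token, command)).fetchone()
    if row is None or row[1] < now:
        return False
    db.execute("DELETE FROM pending WHERE token = ?", (token,))
    db.execute("INSERT OR REPLACE INTO grants VALUES (?, ?)", (row[0], command))
    db.commit()
    return True

def consume_grant(db, command, session_id):
    """Agent retry: succeeds once per grant, then the grant is gone."""
    cur = db.execute("DELETE FROM grants WHERE session_id = ? AND command = ?",
                     (session_id, command))
    db.commit()
    return cur.rowcount == 1
```

Using a file-backed database instead of ":memory:" is what lets the MCP server and the GUI backend, running as separate processes, share the same queue.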

Security model

Agents cannot self-approve

The MCP tool surface does not expose an approval tool. Approval decisions come exclusively from out-of-band operator channels (GUI/API) and are logged as source: "human-operator" in activity.log.

Retry behaviour

  • Retries are server-side, not client-authoritative
  • Retry key is scoped to (normalized_command + decision_tier + matched_rule)
  • Different blocked command/rule combinations maintain independent retry counters
  • Confirmation-tier blocks do not consume the server retry counter
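
A sketch of server-side retry accounting under the rules above. The retry limit and the class name are hypothetical; the manual does not state the actual limit.

```python
from collections import Counter

class RetryTracker:
    def __init__(self, max_retries: int = 3):  # limit is an assumed value
        self.max_retries = max_retries
        self.counts = Counter()

    def record_block(self, normalized_command, decision_tier, matched_rule) -> int:
        """Record one blocked attempt and return the retries remaining for its key."""
        key = (normalized_command, decision_tier, matched_rule)
        # Confirmation-tier blocks are not counted against the retry budget
        if decision_tier != "requires_confirmation":
            self.counts[key] += 1
        return max(self.max_retries - self.counts[key], 0)
```

Keying on the full tuple means a different command or a different matched rule starts with a fresh budget.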

Script Sentinel

Script Sentinel preserves policy intent across indirect execution patterns.

Purpose

  • Preserve policy intent when commands are written into scripts and executed later.

Model

  1. Flag at write time (write_file and edit_file): scan content against blocked and approval-gated policy patterns.
  2. Check at execute time (execute_command): detect script invocation targets and enforce decision continuity.

Modes

  1. match_original: keeps original pattern tier (blocked or requires_confirmation).
  2. block: any hit blocks execution.
  3. requires_confirmation: any hit requires approval.
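
The three modes reduce to a small mapping from (mode, original tier) to an effective decision. A minimal sketch with a hypothetical function name:

```python
def sentinel_decision(mode: str, original_tier: str) -> str:
    """Map a Script Sentinel hit to the tier that will be enforced."""
    if mode == "match_original":
        return original_tier  # blocked or requires_confirmation
    if mode == "block":
        return "blocked"
    if mode == "requires_confirmation":
        return "requires_confirmation"
    raise ValueError(f"unknown Script Sentinel mode: {mode!r}")
```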

Scan modes

  1. exec_context (default): executable-context signatures only.
  2. exec_context_plus_mentions: includes mention-only signatures for audit visibility.

Boundary

  1. Coverage is limited to content written through AIRG file-edit/write tools.
  2. This feature targets policy-enforcement evasion patterns, not malicious intent classification.

Detected tags example

Example output tags from Script Sentinel (sanitized path names):

PATH                                DETECTED CONTENT                                               EXEC CONTEXT  LAST SEEN     ACTIONS
/workspace/move_via_mv.py            policy_command: mv | wrapper_signature: subprocess             Yes           106 hours ago  Dismiss once | Trust
/workspace/random_mv_test.txt        policy_command: mv                                             Yes           106 hours ago  Dismiss once | Trust
/workspace/test_content_scan.py      policy_command: rm -rf | policy_command: rm -rf /              Yes           106 hours ago  Dismiss once | Trust

Backup & Restore

Runtime Guard automatically creates backups before destructive and overwrite operations.

Backup behaviour

  • Backups are timestamped directories with a manifest.json
  • audit.backup_on_content_change_only=true deduplicates by content hash (SHA256) and skips redundant snapshots
  • Version and day pruning is event-driven during backup operations — not a background scheduler
  • audit.max_versions_per_file and audit.backup_retention_days govern cleanup
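
The deduplication and event-driven pruning described above might look like the following sketch. It tracks only content hashes, assumes deduplication compares against the most recent snapshot, and assumes max_versions is at least 1; day-based retention is omitted. plan_backup is a hypothetical name.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def plan_backup(content: bytes, versions: list[str], max_versions: int,
                content_change_only: bool = True) -> list[str]:
    """Return the version list (oldest first) after one backup event."""
    digest = sha256_of(content)
    if content_change_only and versions and versions[-1] == digest:
        return versions  # identical to the latest snapshot: skip
    updated = versions + [digest]
    # Pruning happens here, as part of the backup event itself
    return updated[-max_versions:]
```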

Restore flow

  1. Run restore_backup with dry_run=true — returns a restore_token and planned item count
  2. Run restore_backup with the token to apply

Token expiry

When restore.require_dry_run_before_apply=true, the apply step requires a valid token from a prior dry-run. Tokens expire after restore.confirmation_ttl_seconds. After expiry, a new dry-run is required. This is an operation-safety gate, not a human approval workflow.
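
The dry-run and apply gate can be sketched as a token table with a TTL. The class, the backup_id parameter, and the choice to consume a token on first use are assumptions made for illustration.

```python
import secrets
import time

class RestoreGate:
    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds  # stands in for restore.confirmation_ttl_seconds
        self._tokens = {}  # token -> (backup_id, expires_at)

    def dry_run(self, backup_id: str, now=None) -> str:
        """Plan a restore and hand back a token that authorizes the apply step."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._tokens[token] = (backup_id, now + self.ttl_seconds)
        return token

    def apply(self, backup_id: str, token: str, now=None) -> bool:
        """Allow the apply only with an unexpired token from a matching dry-run."""
        now = time.time() if now is None else now
        entry = self._tokens.pop(token, None)  # tokens are single-use in this sketch
        return entry is not None and entry[0] == backup_id and now <= entry[1]
```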

Audit notes

  • audit.redact_patterns applies to log output only, not backup file payloads
  • audit.log_level is currently stored as configuration metadata and has limited effect on runtime behaviour
  • Pruning does not currently emit a dedicated event for every removed backup artifact

Network policy

Network policy evaluates outbound network commands against domain rules before execution.

Enforcement modes

Mode      Behaviour
off       Skip all network policy checks
monitor   Evaluate policy, emit warnings, but do not block execution
enforce   Evaluate policy and block when domain rules fail

Domain rules

Setting                        Effect
blocked_domains                Explicit deny list. Matching domains are blocked in enforce mode.
allowed_domains                Explicit allow list. Matching domains are allowed when not blocked.
block_unknown_domains: false   Default. Domains not in either list are allowed.
block_unknown_domains: true    Default-deny. Domains not in allowed_domains are blocked.

Blocklist precedence

If a domain appears in both allowed_domains and blocked_domains, the blocklist wins. Subdomains are matched — example.com also matches api.example.com.
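
The enforcement modes and domain rules combine into a short decision function. This is a sketch of the documented behaviour with hypothetical function names and return labels ("allow", "warn", "block"), not the shipped evaluator.

```python
def domain_matches(host: str, rule: str) -> bool:
    """A rule matches the domain itself and any of its subdomains."""
    host, rule = host.lower().rstrip("."), rule.lower().rstrip(".")
    return host == rule or host.endswith("." + rule)

def evaluate_domain(host, mode, blocked_domains, allowed_domains,
                    block_unknown_domains=False) -> str:
    """Return 'allow', 'warn', or 'block' for one outbound host."""
    if mode == "off":
        return "allow"
    if any(domain_matches(host, d) for d in blocked_domains):
        violation = True  # the blocklist is checked first, so it wins
    elif any(domain_matches(host, d) for d in allowed_domains):
        violation = False
    else:
        violation = block_unknown_domains
    if not violation:
        return "allow"
    return "block" if mode == "enforce" else "warn"
```

The leading dot in the suffix check matters: example.com matches api.example.com but not notexample.com.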

network.commands

network.commands is intent classification, not a deny list. Listing a command such as curl here decides whether network-domain policy should evaluate — it does not block the command by itself.

Current limitation

Runtime evaluates domains parsed from command tokens and URLs. Redirect chains and out-of-band destination changes are not deeply inspected.
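
To make the role of network.commands and the parsing limitation concrete, here is a simplified sketch. It assumes the first token decides whether a command is network-classified and extracts hosts only from URL-shaped tokens. The real parser may also handle bare hostnames, and extract_domains is a hypothetical name.

```python
import shlex
from urllib.parse import urlsplit

def extract_domains(command: str, network_commands: set[str]) -> list[str]:
    """Return hosts found in URL tokens, or [] if the command is not network-classified."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] not in network_commands:
        return []  # not listed in network.commands: domain policy is skipped
    hosts = []
    for tok in tokens[1:]:
        if "://" in tok:
            host = urlsplit(tok).hostname  # lowercased, port stripped
            if host:
                hosts.append(host)
    return hosts
```

Only hosts visible in the command text are seen; a redirect issued by the server at request time never appears in these tokens.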

Telemetry

AIRG supports optional anonymous telemetry to help improve product quality and prioritization.

What is collected

When enabled, AIRG sends one aggregate payload per UTC day to:

  • https://telemetry.runtime-guard.ai/v1/telemetry

Example payload:

{
  "airg_version": "2.2.0",
  "platform": "macos",
  "python_version": "3.12.3",
  "install_method": "unknown",
  "agents_bucket": "1",
  "agent_types": ["cursor"],
  "events_bucket": "11-50",
  "blocked_bucket": "2-5",
  "approvals_bucket": "0",
  "sentinel_enabled": true,
  "sentinel_flagged_bucket": "1",
  "sentinel_blocked_bucket": "0",
  "period_days": 1
}

What is not collected

  • No command text.
  • No file contents or file paths.
  • No prompt/completion text.
  • No usernames, emails, hostnames, or machine identifiers.
  • No install ID or persistent telemetry identifier.
  • No high-resolution timestamps (daily aggregate only).

Opt in / opt out

  • During setup/update, AIRG prompts for telemetry opt-in (default is Yes).
  • You can change telemetry preference at any time in GUI: Policy -> Advanced -> Anonymous telemetry.
  • GUI Enable/Disable writes directly to policy (telemetry.enabled) and is the runtime source of truth.

Payload preview

  • In Policy -> Advanced -> Anonymous telemetry, click See Payload.
  • AIRG shows the exact JSON shape/value that would be sent.

Endpoint and delivery

Field              Value
Default endpoint   https://telemetry.runtime-guard.ai/v1/telemetry
Method             POST JSON (Content-Type: application/json)
Timeout            5 seconds total
Retries            None
Failure handling   Failures are silently dropped (no queue/persist)
Success response   HTTP 204 No Content

To point to a different endpoint, set policy.telemetry.endpoint to a custom URL.
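
The delivery contract (one POST, short timeout, no retry, failures dropped) can be sketched with the standard library. This is an illustration rather than AIRG's sender: the opener parameter is a hypothetical hook so the call can be exercised without network access, and urlopen's timeout applies per blocking socket operation, so it only approximates a 5-second total.

```python
import json
import urllib.request

DEFAULT_ENDPOINT = "https://telemetry.runtime-guard.ai/v1/telemetry"

def send_telemetry(payload: dict, endpoint: str = DEFAULT_ENDPOINT,
                   opener=urllib.request.urlopen) -> bool:
    """POST the payload once; return True only on HTTP 204."""
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=5) as response:
            return response.status == 204
    except OSError:
        # URLError, HTTPError, and socket timeouts are OSError subclasses.
        # Dropped silently: no retry, no queue.
        return False
```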

Control plane UI

The local web interface runs at http://127.0.0.1:5001 when started with airg-ui or airg-up.

Navigation structure

  • Approvals — polls backend, supports approve/deny actions against shared SQLite approval store
  • Policy — Rules, Network, Script Sentinel, Agent Overrides, and Advanced tabs
  • Reports — Dashboard and Log tabs
  • Settings — Advanced tab

Policy tabs

Tab               Controls
Rules             Combined policy rule management for commands, paths, and file extensions in one view.
Network           Enforcement mode, network.commands list, domain allowlist/blocklist
Script Sentinel   Flag at write, enforce at execute; prevents policy-blocked command patterns from being run via scripts
Agent Overrides   Per-agent diff-style overlays with section editors and baseline info cards
Advanced          Advanced policy configuration

Shared policy actions

Available across all policy tabs:

  • Reload — reload from disk
  • Validate — validate current state without applying
  • Apply — validate and atomically write to policy.json
  • Revert Last Apply — restore from policy.json.last-applied snapshot
  • Reset to Defaults — restore from policy.json.defaults snapshot

Restart required after Apply

Runtime policy reload is startup-based. After applying policy changes in the GUI, restart the MCP server and reconnect the agent client for changes to take effect.

Serving model

  • Flask backend serves both REST API endpoints and built frontend assets from ui_v3/dist
  • Prebuilt assets are committed and packaged — frontend rebuild is only needed for local UI development
  • If the frontend build is missing, API routes still work and / returns a build-missing hint
  • Override the built UI path with AIRG_UI_DIST_PATH when needed

Reports & logs

All actions are written to activity.log and indexed into reports.db for dashboard and log analytics.

Dashboard tab

  • Total events, blocked events, backups created, confirmations (pending/approved/denied)
  • 7-day event and blocked trends with bar charts
  • Top commands and top paths with allowed/blocked split
  • Blocked by rule breakdown
  • Clickable stat cards and table rows navigate to the Log tab with pre-applied filters
  • Last indexed freshness indicator — turns amber when ingest lag exceeds threshold or last_error is non-null

Log tab

Paginated event view with filters:

Filter           Field
Agent            agent_id
Session          agent_session_id
Source           source
Tool             tool
Decision         policy_decision, decision_tier
Rule             matched_rule
Command / Path   command, path
Time range       Today, 7 days, 30 days, all time

Ingestion behaviour

  • Automatic ingestion from activity.log into reports.db with byte-offset checkpointing
  • Rotation and truncation detection with reconciliation
  • Ingest sync runs on manual refresh and scheduled refresh intervals
  • Filter changes query existing indexed data without re-ingesting
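
Byte-offset checkpointing with truncation detection can be sketched as below, assuming activity.log is line-oriented. The function name is hypothetical, and a fuller implementation would also need to detect a rotated file that has already grown past the old checkpoint (for example by comparing inode numbers).

```python
import os

def read_new_bytes(log_path: str, checkpoint: int) -> tuple[bytes, int]:
    """Return (new_complete_lines, new_checkpoint) for an append-only log."""
    size = os.path.getsize(log_path)
    if size < checkpoint:
        checkpoint = 0  # file shrank: treat as rotated or truncated and re-read
    with open(log_path, "rb") as fh:
        fh.seek(checkpoint)
        data = fh.read()
    # Advance only past complete lines so a partial write is picked up next time
    end = data.rfind(b"\n") + 1
    return data[:end], checkpoint + end
```

Persisting the returned checkpoint alongside the indexed rows lets a later sync resume without re-reading the whole log.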

Agent Settings

Use Agent Settings to generate and apply MCP configuration for supported clients, then verify enforcement posture per profile.

Common flow (all agent profiles)

  1. Open Settings -> Agents.
  2. Click + Add.
  3. Set Agent type, Agent ID, Workspace, and optional Scope (in Advanced).
  4. Save profile.
  5. Use Copy MCP JSON / Copy CLI command or Apply MCP Config.
  6. Use Security Posture panel to verify enforcement state.
  7. Optional: use Apply (enforcement) and Undo All (revert hardening-only changes, keep MCP).

Enforcement color model in UI

  1. Off (gray): AIRG MCP not detected.
  2. Standard (red): MCP detected, strict controls not fully in place.
  3. Strict (yellow): strict controls in place (agent-specific).
  4. Maximum (green): strongest configured posture (agent-specific).

Claude Code

Scope options

  1. Project (default)
  2. Local
  3. User

Apply MCP Config writes to

  1. Project scope: <workspace>/.mcp.json
  2. Local scope: ~/.claude.json under projects.<workspace>.mcpServers
  3. User scope: ~/.claude.json under mcpServers
  4. Also syncs <workspace>/.claude/settings.local.json AIRG MCP allowlist entries.
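
For orientation, a project-scope .mcp.json entry might be generated along these lines. The mcpServers layout follows the common MCP client config shape and is an assumption here. The exact keys and env variables AIRG writes may differ, and the agent ID in the usage line is a made-up value; prefer the block printed by airg-setup or the Copy MCP JSON button.

```python
import json

def build_mcp_config(workspace: str, agent_id: str) -> dict:
    """Build a minimal MCP server entry for the AIRG stdio server (illustrative)."""
    return {
        "mcpServers": {
            "ai-runtime-guard": {
                "command": "airg-server",  # stdio MCP server from the packaged CLI
                "args": [],
                "env": {
                    "AIRG_WORKSPACE": workspace,
                    "AIRG_AGENT_ID": agent_id,
                },
            }
        }
    }

print(json.dumps(build_mcp_config("/absolute/path/to/airg-workspace", "claude-code-1"), indent=2))
```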

Enforcement options

  1. Standard: MCP only (configured via Apply MCP Config).
  2. Strict: Hook active + Native tools restricted (Bash, Write, Edit, MultiEdit).
  3. Maximum: Sandbox enabled + Sandbox escape closed.
  4. Optional: Read/Glob/Grep hook coverage.

Posture logic

  1. Green when MCP + tier1 hook + native tool restriction + sandbox enabled + sandbox escape closed.
  2. Yellow when MCP + tier1 hook active but not full maximum set.
  3. Red when MCP is present but strict hook not active.
  4. Gray when MCP missing.
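
The posture rules above can be written as a single function. The parameter names are hypothetical labels for the five conditions listed; this is a reading of the rules, not the UI code.

```python
def claude_code_posture(mcp: bool, strict_hook: bool, native_restricted: bool,
                        sandbox_enabled: bool, sandbox_escape_closed: bool) -> str:
    """Return the posture color for a Claude Code profile."""
    if not mcp:
        return "gray"
    if not strict_hook:
        return "red"
    if native_restricted and sandbox_enabled and sandbox_escape_closed:
        return "green"
    return "yellow"
```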

Example

  1. Policy blocks rm -rf.
  2. Agent tries native Bash rm -rf ....
  3. Strict hook denies native Bash and instructs MCP use.
  4. If retried via mcp__ai-runtime-guard__execute_command, AIRG blocks by policy and logs event.

Codex

Scope options

  1. Global (default): ~/.codex/config.toml
  2. Project: <workspace>/.codex/config.toml

Apply MCP Config writes

  1. mcp_servers.ai-runtime-guard in the selected Codex config TOML.

Enforcement options

  1. Standard: MCP configured.
  2. Strict:
    • Guidance (managed AIRG block in ~/.codex/AGENTS.md)
    • Policy mirror (managed rules in ~/.codex/rules/default.rules)
    • Mirror Approvals mode: Allow | Deny | Require Approval.
  3. Maximum:
    • Sandbox mode: read-only | workspace-write | danger-full-access
    • Approval policy: untrusted | on-request | never
    • Workspace-write extras:
      • network_access
      • exclude_slash_tmp
      • exclude_tmpdir_env_var

Posture logic

  1. Green when strict is in sync and sandbox mode is read-only and approval policy is untrusted.
  2. Yellow when strict controls are in place but maximum criteria not met.
  3. Red when MCP exists but strict controls are incomplete.
  4. Gray when MCP missing.

Example

  1. Policy mirror enabled and Mirror Approvals = Require Approval.
  2. Command in AIRG requires_confirmation.commands is mirrored into Codex rule prompt.
  3. Codex prompts at rule layer; if command is sent through AIRG MCP, AIRG confirmation flow still applies and is audited.

Claude Desktop

Scope options

  1. Effective scope is desktop (single config location).

Apply MCP Config writes to

  1. macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  2. Linux: ~/.config/Claude/claude_desktop_config.json
  3. Windows: %APPDATA%\Claude\claude_desktop_config.json

Enforcement options

  1. MCP only (no hook/sandbox/native-tool controls exposed for Claude Desktop in current implementation).

Posture behavior

  1. Green when AIRG MCP is detected in Claude Desktop config.
  2. Gray when not detected.

Cursor

Scope options

  1. Default only.

Current support level

  1. Posture detection checks Cursor MCP files.
  2. No full managed enforcement panel like Claude Code/Codex.
  3. Use MCP JSON/manual setup path for now (advanced hardening not available in current UI flow).

Posture behavior

  1. Red when MCP found (MCP-layer protection active, limited advanced hardening).
  2. Gray when MCP missing.

Custom agent type

Scope options

  1. Default only.

Current support level

  1. Generic posture + MCP presence checks only.
  2. No dedicated managed hardening controls.
  3. Use generated JSON and manual client integration.

Important limitations

  1. AIRG enforces only operations routed through AIRG MCP tools.
  2. Native client tools outside MCP can bypass AIRG unless client-side restrictions/hooks are active.
  3. In STDIO MCP deployments, AIRG has limited ability to tell apart multiple instances of the same client in the same workspace; in practice, separate them by profile and environment (AIRG_AGENT_ID plus workspace config).

Known limitations

These are the current high-priority limitations that affect enforcement or security posture.

Area                       Limitation
Auth                       Operator endpoint authentication is local-trust oriented. Should be hardened before broad deployment.
Shell execution            shell=True remains in the command execution path.
Shell containment          execution.shell_workspace_containment is heuristic. Does not replace OS-level sandboxing.
Script Sentinel coverage   Detection coverage is limited to content written through AIRG file-edit/write tools.
Classification scope       Script Sentinel focuses on policy-enforcement evasion patterns, not malicious intent classification.
Native tools               AIRG enforcement applies to MCP tool calls only. Native client shell and file tools (e.g. Claude Code Bash) bypass AIRG controls.
Claude Code                A sample MCP-only skill is at docs/mcp-only.md. Save as <workspace>/.claude/skills/mcp-only.md to guide strict MCP-only behaviour.

MCP enforcement boundary

Runtime Guard cannot enforce policy on actions taken through native client tools outside MCP. For Claude Code users, configuring .claude/settings.local.json to deny native Bash, Glob, Read, Write, and Edit tools is the strongest available mitigation.

Capabilities & caveats

Current capabilities

  • Default basic profile blocks severe actions and allows non-severe actions
  • Approval gating and Script Sentinel modes are available and configurable by policy
  • Approvals are out-of-band — agents cannot approve via MCP tools
  • Runtime includes audit logging, backup/restore flows, normalization, and path/workspace hardening
  • GUI supports policy editing, custom commands, custom categories, and agent profile-based MCP config generation

Current caveats

  • Policy reload is startup-based — restart the MCP server after applying policy changes
  • "Basic/Advanced" are policy conventions, not hard runtime modes
  • Redaction and obfuscation defences are pattern-based and not exhaustive
  • Some blast-radius and target inference for complex shell patterns is heuristic
  • Native tool bypass remains possible unless client restrictions/hooks are enabled