Zero-cloud runtime intelligence for agents

Stop blind agent fan-out before it taxes the machine.

Axon is a local MCP server that helps agents understand when the machine is the bottleneck, before they spawn more tools, burn more credits, or debug the wrong problem.

Private by design · Works with local agents · Backed by A/B testing

The bottleneck moved below the prompt.

Modern coding agents can plan, edit, test, spawn tools, drive browsers, and call MCP servers. But they usually do it blind to the local machine they are running on.

Invisible cost

Tool servers accumulate

Old MCP servers, stale shells, renderers, and browser controllers quietly compete with every new agent task.

Wrong diagnosis

Agents blame the code

When hardware pressure causes flaky tests or slow builds, the agent may chase software bugs that do not exist instead of reducing local pressure.

Lost velocity

Users feel the slowdown

The visible outcome is lag, OOMs, fan noise, slow IDEs, failed browser automation, and wasted tokens.

What Axon gives an agent

workload_advice: Run/defer/degrade recommendation with safe parallelism for builds, tests, Docker, browser automation, GPU jobs, subagents, and MCP-heavy workflows.
agent_runtime_health: Live inventory of Codex, Claude, Cursor, MCP servers, stale sessions, duplicate server groups, renderer CPU, and business workflow impact.
process_blame: Top culprit process or process group, impact severity, anomaly type, and concrete fix.
hw_snapshot: CPU, memory, disk, thermal, GPU, and headroom fields for pre-task gating.

Example decision

$ axon query workload_advice
{
  "recommendation": "defer",
  "risk": "critical",
  "safe_parallelism": 1,
  "reasons": [
    "Disk at 96% (Critical)",
    "System impact is critical",
    "Current anomaly is memory_pressure"
  ],
  "suggested_actions": [
    "defer new heavy work",
    "cap parallelism at 1 instead of 4"
  ]
}
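
An agent could turn a response like the one above into a concrete worker count. The following is a minimal sketch, not Axon's own client code: the field names are taken from the sample output, while the function name `plan_parallelism` and the recommendation strings "run" and "degrade" are assumptions (only "defer" appears in the sample).

```python
import json

# Sample payload copied from the workload_advice example above.
SAMPLE_ADVICE = """
{
  "recommendation": "defer",
  "risk": "critical",
  "safe_parallelism": 1,
  "reasons": [
    "Disk at 96% (Critical)",
    "System impact is critical",
    "Current anomaly is memory_pressure"
  ],
  "suggested_actions": [
    "defer new heavy work",
    "cap parallelism at 1 instead of 4"
  ]
}
"""


def plan_parallelism(advice: dict, requested: int) -> int:
    """Return how many workers to start now; 0 means hold the work.

    Hypothetical helper: assumes "recommendation" is one of
    "run", "degrade", or "defer", and that "safe_parallelism"
    is an upper bound on concurrent workers.
    """
    recommendation = advice.get("recommendation", "run")
    safe = int(advice.get("safe_parallelism", requested))
    if recommendation == "defer":
        # Do not start new heavy work while the host is under pressure.
        return 0
    # For "run" or "degrade", never exceed the advised ceiling,
    # but always allow at least one worker.
    return max(1, min(requested, safe))


advice = json.loads(SAMPLE_ADVICE)
workers = plan_parallelism(advice, requested=4)
```

With the sample payload, the helper holds the work entirely; a response that recommends degrading with a safe parallelism of 1 would instead cap a four-worker request at one.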

Built for the agent infrastructure layer.

Axon is useful anywhere agents run on machines: laptops, workstations, developer VMs, CI-like local runners, GPU boxes, and hardware platforms.

Agent companies

Agent IDEs and app builders

Prevent runaway local fan-out, surface host constraints, and make “why is my agent slow?” answerable without cloud telemetry.

Hardware vendors

NVIDIA, AMD, Apple, workstation OEMs

Expose local capacity and bottlenecks as a standard primitive agents can use before GPU, compile, and browser workloads.

Developers

Faster feedback with fewer false failures

Keep useful work moving while avoiding memory storms, thermal throttling, stale MCP servers, and broken long sessions.

Install once. Every local agent gets hardware context.

Axon is private by architecture: no telemetry, no analytics, no cloud calls. It speaks MCP over stdio and runs locally.

# Homebrew
brew install rudraptpsingh/tap/axon

# Configure agents
axon setup
axon setup cursor
axon setup vscode

# Try it directly
axon query agent_runtime_health
axon query workload_advice
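
Because Axon speaks MCP over stdio, the queries above have a protocol-level counterpart: a JSON-RPC 2.0 `tools/call` request written to the server's stdin as a single line. The sketch below only builds that message. It assumes the tool is exposed under the name shown on this page and accepts an empty argument object; `build_tool_call` is a hypothetical helper, and a real client (or an MCP SDK) must complete the `initialize` handshake before sending it.

```python
import json


def build_tool_call(tool_name: str, arguments=None, request_id: int = 1) -> str:
    """Build one newline-delimited JSON-RPC 2.0 message for MCP tools/call.

    Assumption: the tool takes no required arguments, so an empty
    object is sent when none are given.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }
    # The stdio transport frames messages by newline, so emit compact
    # JSON on a single line followed by "\n".
    return json.dumps(request, separators=(",", ":")) + "\n"


message = build_tool_call("workload_advice")
```

In practice `axon setup` wires this up for supported agents, so hand-built messages are mainly useful for debugging or for custom integrations.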

Make local runtime a first-class agent signal.

Axon turns host pressure into execution policy, so agents can prove value with fewer failed runs, fewer unnecessary tool workers, and clearer user-facing impact.