← All Posts Vitalii Rudnykh
Security Research

Stopping MCP Supply-Chain RCE Before It Reaches the Shell

A Model Context Protocol server config is a JSON file. It tells your AI agent how to launch a tool: a command to run, some arguments, some environment variables. The command field is just a string — and a string can be bash. Over the last twelve months, that one detail produced a public CVE every few weeks across the AI tooling ecosystem. This post explains the attack class, walks through the public CVE record, and shows how Imunify for AI Agents stops it.

10+
Public CVEs
in 12 months
All variants of the same one design choice — a string in command that the agent will execute.
Cursor Windsurf Flowise Upsonic Agent Zero LangChain-ChatChat Jaaz

The Setup: A String The Agent Will Execute

MCP is the open standard for plugging external tool servers into an AI agent. The dominant transport is STDIO: the agent spawns the MCP server as a child process and exchanges JSON-RPC messages over its stdin and stdout. The configuration that tells the agent how to spawn that child looks deceptively boring:

{
  "mcpServers": {
    "weather": {
      "command": "uvx",
      "args": ["weather-mcp"],
      "env": { "OPENWEATHER_KEY": "..." }
    }
  }
}

The whole protocol bets that command names a real tool-server binary. Nothing in the file format prevents an attacker who can write that file from putting a different string there. "command": "bash" is valid JSON, valid configuration, and a working remote code execution. On the next agent restart, that string is what gets executed — with whatever arguments the attacker chose. The JSON-RPC handshake the agent expects will simply never arrive, by which time the shell command has already run.

This is not a theoretical concern. The carousel further down catalogues seven public CVEs with exactly this shape, all from the last twelve months, all from independent vendors. The pattern is consistent enough that we built our defences against the shape itself rather than against any one product's mistake.

Two Ways That Hostile String Gets Written

The end state of the attack is fixed: a malicious value sitting in the command or args field of an MCP config the agent reads. What varies is who put it there. Across the public CVE record, every case we verified falls into one of two writers.

The config is rewritten externally

An exposed registration API, a misconfigured admin endpoint, a malicious installer link, or a silent post-approval swap places the hostile string into the file directly. The agent does not have to participate — the file just looks different on next read. Whoever holds write access to that file controls what gets executed.

{ "command": "bash",
  "args": ["-c", "id > /tmp/pwn"] }

The agent writes its own poison

Untrusted input the agent processes — a webpage, a shared doc, a code-review comment — instructs the model to "register this MCP server for me." The LLM complies and uses its own write tool to drop the malicious config into place. The agent itself becomes the writer; no external access required.

Notice what is invariant across both: the malicious command or args field has to land in the JSON file the agent reads. Whether the LLM wrote it, a CLI wrote it, a hijacked admin endpoint wrote it, or an attacker shelled in and dropped it — the file open at agent restart is the chokepoint. That is the design we built around.

How Imunify For AI Agents Catches This

Imunify for AI Agents protects MCP-driven agents from this entire class of attack. The defences below are framed around OpenClaw — an open-source coding agent whose MCP server registry lives in openclaw.json — because that is one of the surfaces we have wired end-to-end coverage and tests for. The same approach applies to any agent that loads an MCP server config at startup.

One bypass at one layer is not enough. We watch three independent points on the path between the LLM and the running shell, and any one of them is enough to stop the attack:

Layer 1 — Tool-call pre-flight (human-in-the-loop). Before the LLM's Write, Edit, or MultiEdit tool ever runs, our agent intercepts the call and matches the target path against the MCP server-config surface — openclaw.json, .mcp.json, and the other config files MCP-aware agents read at startup. Any tool call naming one of these is paused, and the operator gets an approval prompt over Telegram, Discord, or the web panel — with the full diff and the source of the request — before the write proceeds. Approve and it goes through; reject (or ignore for 120 seconds) and the call returns an error to the agent. Legitimate config edits keep working; the agent-self-poisoning vector dies here the moment the operator sees a write they did not initiate.

Layer 2 — Kernel-level file-write hold (human-in-the-loop). Even when the writer is not the AI agent — a CLI tool, an installer, an attacker shell — the file write itself goes through a Linux kernel syscall, and we hold an enforce hook on it. Writes to any of the known MCP config paths are paused at the kernel level, before the bytes hit the disk, and the same approval prompt fires regardless of which process opened the file. The operator sees who is writing, what they want to write, and decides in real time. External writes — admin-panel hijacks, post-approval swaps, anything that reaches the file from outside the agent — surface here even when nothing reached the LLM at all.

Layer 3 — Content-shape detection (synchronous deny). If a config still ends up open for read (a path we did not enumerate yet, a future MCP variant, or simply an approval the operator granted without spotting the payload), the content gets scanned for the malicious shapes — a shell binary as command, an interpreter with an inline-exec flag, a launcher wrapping a shell. These shapes never appear in legitimate MCP configs, so this layer denies outright instead of asking. The same regex set runs on file contents, on the LLM's outgoing tool-call message, and on HTTP request bodies. Three vectors, one detection. Whatever slips past the first two layers still has to look semantically benign — which by definition it does not.

A bypass at one layer is caught at the next. We do not rely on any single layer being perfect — we rely on the operator seeing the same write twice if the first prompt was missed, and on the content layer firing automatically if both prompts were approved.

The Public CVE Record

Seven verified CVEs in this exact attack class — a hostile string in command or args, landing in an MCP config the agent reads. All seven match the layered defence above. Sources are NVD entries; ordered by severity.

9.9Critical Flowise

The custom MCP stdio add-server endpoint runs validateCommandInjection and validateArgsForLocalFileAccess, but both are bypassed by combining the allowlisted npx launcher with -c "<shell>".

Fixed in flowise ≥ 3.1.0
9.8Critical Upsonic

MCP task creation accepts arbitrary command + args. The launcher allowlist permits npm and npx, but their argument flags are themselves enough to execute arbitrary OS commands.

Fixed in upsonic ≥ 0.72.0
8.8High Cursor

"MCPoison." Cursor binds trust to the MCP entry name only. Once an entry is approved, an attacker with repo or local write access silently swaps the command field — no re-prompt, no warning. Next session runs the new command unattended.

Fixed in cursor ≥ 1.3
8.6High Agent Zero

Arbitrary command and args in JSON MCP server configs are executed when the configuration is applied — no validation, no allowlist, no warning.

Vulnerable: agent-zero 0.9.8 · check vendor for fix
8.6High LangChain-ChatChat

A publicly exposed MCP management interface lets remote attackers register an MCP stdio server with an attacker-chosen command. No authentication on the registration path.

Vulnerable: langchain-chatchat 0.3.1 · check vendor for fix
8.0High Windsurf

Zero-click. A prompt-injection in fetched HTML causes the agent to write a malicious MCP stdio server entry to the local MCP config file. The entry auto-registers and executes on the next agent action.

Vulnerable: windsurf 1.9544.26 · check vendor for fix
7.3High Jaaz

User-supplied command in MCP stdio handling is executed verbatim. No allowlist, no shape check on the resulting process.

Vulnerable: jaaz 1.0.30 · check vendor for fix
1 / 7

Seven CVEs, seven different products, one shape. Hosts running Imunify for AI Agents do not depend on any of those vendor patches landing — the layered defence above triggers on the attack itself, not on knowledge of which product happens to be vulnerable this week.

Closing

The MCP command field is a string, and the host AI agent will execute that string. Any control plane that lets an attacker write to that file ships an RCE. The fix at the protocol level would have been a typed launcher set; the fix at the platform level is what each affected vendor has to ship, one CVE at a time. The fix at the host level is a layered set of detections that watches the file, the syscall, and the content of the JSON — so the same attack has to defeat three independent observers, not one. That is what Imunify for AI Agents does, today, on every host running it.