Pipelock Blog

Security for AI agent systems — research, tools, and practical guidance.

View on GitHub
← Back to all posts

Lateral movement in multi-agent LLM systems

February 08, 2026 — luckyPipewrench

A security gap nobody is patching


The setup

I run two AI agents. One manages my infrastructure. The other writes code. They share a workspace: config files, memory, task lists. They talk to each other through a shared git repo and file drops.

This isn’t unusual anymore. OpenHands users pair it with Claude Code. Dev teams run multiple specialized agents. Homelab people (myself included) have agents managing different parts of their stack.

The problem is simple. If one agent gets compromised, it can silently take over every other agent it talks to.

The attack

Researchers have already shown this works. Lee and Tiwari published “Prompt Infection” in October 2024, showing that malicious prompts self-replicate across connected LLM agents. A compromised agent spreads the infection to other agents through their normal communication channels (arxiv.org/abs/2410.07283). Gu et al. showed in “Agent Smith” that a single poisoned image can jailbreak agents exponentially fast in multi-agent setups.

Those papers focus on direct message passing between LLMs. In the real world, the attack surface is bigger and harder to see.

How agents actually talk to each other

Real multi-agent setups don’t use clean protocols. They share:

None of these channels have integrity checking. None use signatures. There’s no way to tell the difference between a file written by a healthy agent and one written by a compromised agent.

What this looks like in practice

  1. Agent A visits a webpage with a hidden prompt injection
  2. Agent A gets compromised. It still looks normal, still responds correctly
  3. Agent A writes a “task update” to the shared workspace with embedded instructions
  4. Agent B reads the handoff as part of its normal routine
  5. Agent B follows the instructions because they came from a trusted source
  6. Both agents are compromised. The poisoned files stay in the workspace across restarts

That’s lateral movement. Same idea as in traditional network security, where an attacker hops from one compromised machine to another. Except here the hop goes through shared files instead of network connections.

Why this is worse than regular lateral movement

On a traditional network, moving laterally means exploiting vulnerabilities or stealing credentials at each step. With agents:

What’s missing from the ecosystem

People have responded to individual agent threats:

But nobody has built anything to secure the communication between cooperating agents in a dev or self-hosted environment. AutoGen, CrewAI, LangGraph, and similar frameworks have zero security for inter-agent communication. OWASP’s agentic AI guidance acknowledges the risk of prompt injection spreading between agents but doesn’t provide a technical fix for shared-workspace attacks.

Benchmarks confirm the problem is real. InjecAgent (Zhan et al., 2024) showed roughly 50% injection success rates against GPT-4 and Claude in agent scenarios. AgentDojo (Debenedetti et al., 2024) showed injections succeed even when agents use defensive prompting.

What we built

Pipelock now includes integrity monitoring for agent workspaces. It’s the first layer of defense against lateral movement through shared files.

How it works

# Hash all critical files in the workspace
pipelock integrity init ./workspace --exclude "logs/**" --exclude "temp/**"

# Verify nothing changed
pipelock integrity check ./workspace
# Exit 0 = clean, non-zero = something changed

# Re-hash after you approve changes
pipelock integrity update ./workspace

The manifest stores SHA256 hashes for every protected file. When an agent starts up, it checks that config files, skill definitions, and identity files haven’t been changed outside of a normal workflow.

This doesn’t stop every lateral movement attack. A compromised agent can still write to files that aren’t in the manifest, and we need signing to verify who actually made a change. But it catches the most dangerous thing: someone (or something) quietly editing the files that control how your agents behave.

Now available

Coming next

What you can do right now

If you run more than one agent on shared storage:

  1. Keep data separate from instructions. Agent notes and memory shouldn’t live next to config files and skill definitions.
  2. Use read-only mounts where you can. If Agent B only reads Agent A’s config, mount it read-only.
  3. Know your attack surface. List every way your agents communicate. Every channel is a potential path for lateral movement.
  4. Check for unexpected changes to behavioral files. Even running diff manually is better than nothing.

Or try Pipelock’s integrity monitoring: github.com/luckyPipewrench/pipelock.


References


← Back to all posts | Pipelock on GitHub