ASI06 - Memory & Context Poisoning

ASI06

📰 In The Wild

Gemini Memory Hack (Feb 2025) — Researchers used prompt injection to plant false memories in Google Gemini's persistent memory store that survived across sessions and kept influencing responses.

Source: Ars Technica / Google Security, Feb 2025

BONUS TECH DECODER

RAG (Retrieval-Augmented Generation):AI retrieves information from a knowledge store before responding — powerful, but only as trustworthy as the data it retrieves.

Vector Database:Stores information as mathematical embeddings for semantic search — the backbone of AI memory, and a prime target for poisoning.

Bootstrap Poisoning:When an agent's own outputs feed back into its memory, creating a self-reinforcing loop of corrupted content that grows over time.

🔗 LLM Top 10 Connections

LLM01LLM04LLM08

Prompt Injection · Data Poisoning · Vector Weaknesses

🧠 WHAT IS IT?

AI agents store information across tasks using conversation history, memory tools, RAG stores, and vector databases. Unlike a one-time prompt injection, memory poisoning corrupts the persistent layer — a single malicious input can warp the agent's reasoning across all future sessions. It's like someone rewriting your diary so you remember things that never happened.

🔍 HOW IT HAPPENS

Malicious data enters a vector database via poisoned sources, causing the agent to retrieve and act on false context
Crafted content injected into a conversation gets summarized and saved to long-term memory, contaminating future sessions
Incremental biased data gradually shifts stored knowledge or goal weighting without triggering any single alert
Contaminated context spreads between cooperating agents that share memory stores, compounding corruption system-wide

🚨 WHY IT MATTERS

Memory poisoning is persistent and self-reinforcing. Unlike prompt injection that affects one session, poisoned memory shapes every subsequent decision the agent makes — and detecting it requires inspecting the memory store itself, which most security tools never monitor.

🛡️ HOW TO PREVENT IT

Scan all new memory writes for malicious content; allow only authenticated, curated sources to contribute to memory
Isolate user sessions with strict namespace controls — prevent cross-tenant or cross-session memory bleed
Expire unverified memory entries over time; maintain rollback and quarantine for suspected poisoning events
Block automatic re-ingestion of an agent's own generated outputs to prevent self-reinforcing contamination