Andrej Karpathy's Wiki Pattern Gets Its First Real Implementation
Someone finally built the thing Andrej Karpathy sketched in a GitHub gist.
The former Tesla AI director's "LLM Wiki" concept has been floating around developer circles for months: a persistent knowledge base that AI agents maintain themselves, kept in markdown and Git instead of disappearing into chat history. Now we have wuphf, the first implementation that doesn't look like a weekend hackathon project.
Git Commits from an AI Librarian
The most fascinating detail? All AI-generated content gets committed under a distinct identity called "Pam the Archivist." Every wiki update, every entity brief, every synthesis - all tracked with full Git provenance. It's either brilliant or deeply unsettling that we're giving AIs their own Git identities now.
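The mechanics are mundane Git: identity can be overridden per commit, so an agent's changes carry their own author line without touching anyone's global config. A minimal sketch, assuming `git` is installed; the email address, file name, and commit message here are invented for illustration:

```python
import pathlib
import subprocess
import tempfile

# Sketch: commit a wiki update under a distinct AI identity
# ("Pam the Archivist", per the article) in a throwaway repo.
repo = pathlib.Path(tempfile.mkdtemp())
subprocess.run(["git", "init", "-q", str(repo)], check=True)

(repo / "wuphf.md").write_text("# Entity: wuphf\n")
subprocess.run(["git", "-C", str(repo), "add", "wuphf.md"], check=True)

# -c sets identity for this invocation only, so the archivist's
# authorship is recorded without changing the human user's config.
subprocess.run(
    ["git", "-C", str(repo),
     "-c", "user.name=Pam the Archivist",
     "-c", "user.email=pam@archivist.invalid",
     "commit", "-q", "-m", "wiki: add entity brief"],
    check=True,
)

author = subprocess.run(
    ["git", "-C", str(repo), "log", "-1", "--format=%an"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print(author)  # the archivist identity, with full provenance in git log
```

Every change is then attributable and revertible with ordinary Git tooling, which is the whole point of the provenance angle.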
The architecture follows Karpathy's three-layer blueprint exactly:
- Raw immutable sources at the bottom
- LLM-generated markdown wiki in the middle
- Query interface on top
But the implementation details reveal the real engineering thinking. Private agent notebooks live at `agents/{slug}/notebook/*.md`, while shared knowledge goes through a draft-to-wiki promotion workflow: human or agent review before anything becomes canonical. Smart.
> Rather than traditional retrieval-augmented generation (RAG) that re-derives answers on every query, Karpathy's pattern has the LLM incrementally build and maintain a persistent wiki.
This isn't just another RAG system with fancy marketing. The difference is persistence. Instead of re-deriving the same answers every time, the AI builds up institutional memory. Knowledge compounds instead of evaporating.
What Nobody Is Talking About
Everyone's obsessing over the Git angle, but the real innovation is the append-only JSONL fact logs for entities. Clean separation between raw facts and synthesized narrative. When the synthesis workers rebuild entity briefs, they're working from structured data, not trying to parse their own previous prose.
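Here is roughly what that separation buys, as a sketch; the field names ("subject", "predicate", "object") and the brief format are my assumptions, not wuphf's actual schema:

```python
import io
import json

# Append-only JSONL fact log for one entity: each line is a structured
# fact, written once and never rewritten.
log = io.StringIO()
for fact in [
    {"subject": "wuphf", "predicate": "implements",
     "object": "Karpathy's LLM Wiki pattern"},
    {"subject": "wuphf", "predicate": "search",
     "object": "BM25 via Bleve"},
]:
    log.write(json.dumps(fact) + "\n")

# A synthesis pass rebuilds the entity brief from structured facts,
# never by parsing its own previous prose.
facts = [json.loads(line) for line in log.getvalue().splitlines()]
brief = "\n".join(f"- {f['predicate']}: {f['object']}" for f in facts)
print(brief)
```

Because the log only grows, a bad synthesis run can be thrown away and regenerated; the facts underneath are never corrupted by the narrative on top.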
The tech stack choices reveal a developer who actually ships things:
- BM25 via Bleve for search (not vector embeddings)
- SQLite for indexing (not Neo4j or some graph database)
- Local-first in `~/.wuphf/wiki/` (you can literally `git clone` your knowledge)
No vector databases. No graph databases. No infrastructure complexity. Just search that works and data you can actually own.
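For a feel of how far plain SQLite plus BM25 goes, SQLite's built-in FTS5 extension scores matches with BM25 out of the box. A sketch with an invented schema, wuphf itself uses Bleve on the Go side rather than FTS5:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# FTS5 virtual table: full-text index with BM25 ranking built in.
con.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
con.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        ("wuphf", "persistent wiki maintained by agents"),
        ("rag-notes", "retrieval that re-derives answers per query"),
    ],
)
# bm25() returns lower scores for better matches, so ascending order
# puts the best hit first.
best = con.execute(
    "SELECT title FROM docs WHERE docs MATCH ? ORDER BY bm25(docs) LIMIT 1",
    ("wiki",),
).fetchone()[0]
print(best)  # → wuphf
```

One file, no server, and ranking quality that is entirely adequate for a personal-scale knowledge base.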
The Hype Reality Check
Karpathy's original gist spawned multiple implementations already - Agent Skills versions for Claude and Cursor, something called "OmegaWiki" that claims to be "fully realized," integrations with Logseq and Obsidian. The usual open-source fragmentation.
But most look like demos. This one has daily linting for contradictions, broken-link detection, and heuristic routing (BM25 for short queries, cited-answer loops for narrative ones). Details that matter when you're actually using the thing.
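That routing heuristic is simple enough to sketch. The threshold and function names below are guesses, the article only says short queries go to BM25 and narrative ones to a cited-answer loop:

```python
def route(query: str) -> str:
    """Guessed routing rule: short lookups hit BM25 directly;
    longer, narrative questions go to a cited-answer loop."""
    return "bm25" if len(query.split()) <= 4 else "cited_answer_loop"

print(route("wuphf entity brief"))  # → bm25
print(route("how does the draft promotion workflow keep the wiki trustworthy"))
# → cited_answer_loop
```

The appeal of a rule this dumb is latency: keyword lookups skip the LLM entirely, and the expensive loop only runs when a question actually needs synthesis.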
The /lookup slash command and MCP tool integration suggest the developer understands how this fits into real workflows. You're not switching to a new app - you're adding persistent memory to agents you already use.
The Bigger Picture
This represents the agentic AI systems everyone's building toward - persistent state across sessions, context that compounds rather than resets. The alternative is re-pasting the same context daily like some kind of digital Groundhog Day.
Whether Karpathy's pattern becomes the standard remains unclear. But at least now we have something concrete to evaluate instead of just gists and concept posts.
Pam the Archivist might be the most honest AI identity yet - an artificial librarian maintaining institutional memory while humans focus on higher-level thinking.
Or maybe I'm just impressed that someone shipped instead of tweeting about it.
