### The Friction of Knowledge Compilation: Iterative STORM vs. The Karpathy Wiki Pattern

In the architecture of LLM-driven knowledge management, we are witnessing a quiet war between two paradigms of synthesis: The Karpathy Wiki Pattern (structured, incremental aggregation) and what we can call the Iterative Closed-Loop STORM (emergent, multi-perspective dialectic applied to a finite corpus).

While both attempt to tame entropy within private notes, they suffer from opposing structural failures and exhibit fundamentally different operational lifespans.

Karpathy Wiki: [New Note] ──> [Canonical Profile] ──> (Linguistic Pasteurization / Flatness)

Iterative STORM: [Note A] ⚡ [Note B] ──> [Dialectic Friction] ──> (Recursion Loop Risk)

#### Theoretical Definitions

**1. The Karpathy Wiki Pattern (Incremental Compilation)**

Inspired by Andrej Karpathy's experiments with localized synthesis, this pattern treats data ingestion like a compiler. Raw documents are processed to incrementally update a canonical, markdown-based encyclopedia (`/wiki`).

- **The Underlying Logic:** Deduplication via categorization. It seeks to map nouns, entities, and static relationships into a stable topology of knowledge.

**2. The Iterative Closed-Loop STORM (Hermeneutic Dialectic)**

Adapted from Stanford’s STORM framework, this architecture assigns a distinct agent-persona to individual documents within a fixed corpus. Instead of summarizing, these personas enter an adversarial debate over a prompt, feeding the output of Round $N$ back into the arena for Round $N+1$.

- **The Underlying Logic:** Emergence through semantic friction. It treats knowledge not as a library to be categorized, but as a system of dynamic tensions.

#### The Theoretical Divergence

| Dimension | The Karpathy Wiki Pattern | Iterative Closed-Loop STORM |
| --- | --- | --- |
| Epistemological Goal | Convergence & Order: Eliminating noise to build a pristine, searchable monolith. | Emergence & Friction: Unearthing latent contradictions and unwritten insights. |
| The Processing Lens | The LLM as an Editor: Standardizes tone and flattens linguistic anomalies to fit a template. | The LLM as a Catalyst: Accelerates the reaction between disparate conceptual nodes. |
| Handling Paradox | Resolution: It smooths over contradictions to maintain a coherent wiki page. | Amplification: It uses personas to weaponize contradictions into debate arguments. |

#### Where the Karpathy Wiki Fails (and STORM Continues)

The Karpathy Wiki suffers from Semantic Loss via Attrition. Because it constantly overwrites its own markdown pages based on new inputs, the LLM’s statistical weights act as a linguistic sander. Over time, the unique, idiosyncratic vocabulary of your raw thoughts is replaced by the sterile, highly probable vocabulary of the base model.

**Where STORM continues:** Because STORM pits documents against each other as distinct agents, it prevents early harmonization. The system preserves the raw "voice" of Note A because that voice is weaponized to challenge the voice of Note B. It does not integrate; it confronts. It succeeds where Karpathy fails by capturing the spaces between thoughts: the emergent hypotheses that do not fit into a neat wiki property tag.

#### Where STORM Fails (and Karpathy Continues)

The Iterative Closed-Loop STORM suffers from Recursive Entropy (The Echo Chamber Paradigm). If the synthesis of Round 1 replaces the raw inputs in Round 2, the agents stop debating your ideas and start debating the model’s interpretation of your ideas. The system rapidly detaches from the source data, falling into a feedback loop of hyper-abstract hallucination.

**Where Karpathy continues:** The Karpathy Wiki remains a pristine, bulletproof architecture for long-term stability. It does not lose its anchor because it does not iterate recursively on its own conclusions; it merely updates its indexes. It acts as an external hard drive for your memory, whereas an unconstrained iterative STORM acts as a dreaming mind that eventually forgets the waking world.

#### Architectural Conclusion: The Synthesizer's Fix

To build a flawless pipeline for messy, random thoughts, one must engineer a hybrid system:

1. **Use Karpathy’s Immutability:** Keep the raw chunks untouched and permanently pinned to the context window of every round.
2. **Use STORM’s Adversarial Engine:** Force the personas to debate, but instruct the synthesis agent to output a Linter of Paradoxes rather than a clean consensus.

True cognitive emergence is not found in a pristine encyclopedia, nor is it found in an endless echo chamber. It is found in the structured, unresolvable tension between your chaotic notes, curated by an engine designed to highlight the chaos rather than hide it.
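One way to picture the hybrid fix is as a prompt builder that re-sends the raw notes verbatim in every round and treats earlier syntheses only as commentary. This is a minimal sketch under my own assumptions: the names `Note`, `Debate`, `build_prompt`, and `record` are hypothetical, and the actual model call is left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Note:
    """A raw note. Frozen, so no round can overwrite its text."""
    name: str
    text: str


@dataclass
class Debate:
    notes: tuple[Note, ...]                           # pinned sources, never replaced
    rounds: list[str] = field(default_factory=list)   # syntheses of earlier rounds

    def build_prompt(self, question: str) -> str:
        """Assemble the context for the next round.

        The raw notes are always included verbatim. Earlier syntheses are
        appended as commentary and never substituted for the sources.
        """
        parts = ["## Pinned sources (verbatim)"]
        for note in self.notes:
            parts.append(f"[[{note.name}]]\n{note.text}")
        if self.rounds:
            parts.append("## Earlier rounds (commentary only)")
            for i, synthesis in enumerate(self.rounds, start=1):
                parts.append(f"Round {i}: {synthesis}")
        parts.append("## Task")
        parts.append(
            f"Debate: {question}\n"
            "List the unresolved contradictions between the sources. "
            "Do not merge them into a consensus."
        )
        return "\n\n".join(parts)

    def record(self, synthesis: str) -> None:
        """Store the output of a finished round."""
        self.rounds.append(synthesis)
```

Because `build_prompt` rebuilds the context from `notes` each time, Round N+1 still sees the original wording even after many calls to `record`, which is the anchor that the echo-chamber failure mode lacks.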
This week I start my new project, which I share with you below. It's my own version of the "Karpathy LLM Wiki". If you think it's cool, feel free to copy it and implement your own version; I would be very happy to know how it works for you. 😁

---

### Multi-Agent Literature Synthesis and Analysis for Obsidian Vault

#### 1. Core Concept

The STORM-PKM project adapts the academic "Stanford STORM" method to Personal Knowledge Management (PKM). Instead of searching for open information on the internet, the system operates on a closed corpus (your Obsidian Vault), filtering notes by the YAML tag `domain: [Subject]`.

The system uses an AI-based microservices architecture, in which the monolithic research process is split among different "Specialists" (Hermes Profiles). These specialists collaborate asynchronously through a native task board (Hermes Kanban), passing the baton and the documentary findings from one to the next until a final, fully cited Markdown report is generated.

#### 2. Objectives

- **Zero Hallucination:** Force the AI to extract answers exclusively from local files, declaring ignorance ("not addressed") when the information does not exist in the corpus.
- **Evidence-Based Synthesis:** Generate a fluid final document in which every statement is traceable through wikilinks `[[Note Name]]`.
- **Knowledge Auditing (Blind Spots):** Map not only what you know about a domain, but also the gaps in your Vault (unanswered questions).
- **Resilience and Traceability:** Ensure that API failures or token limits in one step do not break the entire process (it resumes from where it left off), and keep a record (logs/runs) of everything the agents thought.

#### 3. System Architecture

The architecture integrates your local files with the Hermes engines:

##### A. Storage Layer (Data Layer)

- **Obsidian Vault:** The root directory containing your `.md` files.
- **Hermes Kanban DB (`kanban.db`):** The local SQLite database that manages the state of tasks, their dependencies, and the context passed between agents.

##### B. Orchestration Layer (Control Layer)

- **Hermes Gateway:** Runs in the background, hosting the Kanban Dispatcher (which wakes up the agents at the right time).
- **The Trigger:** Can be a manual trigger via CLI/Chat (`/kanban create ...`) or a CronJob configured to scan Obsidian for new pending research tags.

##### C. Agents Layer (Profiles Layer)

Each agent has its own `SOUL.md` (personality and system instructions) and its own tool limits.

- **storm-orchestrator:** The project manager. It receives the theme and creates the five tasks (T1 to T5) in the Kanban, configuring the dependencies (parents).
- **storm-discoverer:** Reads the domain files, identifies the theses, and generates the "fundamental questions".
- **storm-interviewer:** Reads the questions from the discoverer, goes to the source files, and extracts the answers, citing the origin.
- **storm-editor:** Reads the raw interviews, cross-references the data, resolves contradictions, and creates the hierarchical Outline.
- **storm-writer:** Uses the Outline to draft the final Synthesis document, inserting the wikilinks and saving it to the Vault.
- **storm-moderator:** Reads the Synthesis, compares it with the original notes, and drafts the Gaps and Biases report.

#### 4. Data Pipeline (Kanban Flow)

The magic occurs through the Structured Handoff and Links (Dependencies) features of the Hermes Kanban:

- **T0 (Orchestration)** → Triggers creation of T1 to T5.
- **T1 (Discoverer)** → Finalizes its task. The Kanban promotes T2 to "Ready".
- **T2 (Interviewer)** → Hermes wakes up the interviewer. It uses `kanban_show()` to read the context that T1 left (the questions). Executes, writes the result, finalizes.
- **T3 (Editor)** → Wakes up, reads the answers left by T2, generates the Outline, finalizes.
- **T4 (Writer)** → Wakes up, takes the Outline from T3, writes the final file to disk, finalizes.
- **T5 (Moderator)** → Wakes up, evaluates the work of T4, generates the Gaps file on disk, closes the cycle.

---

Source (here you will find links to the thesis and papers on the STORM method): https://github.com/stanford-oval/storm
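To make the T1 to T5 flow concrete, here is a small in-memory sketch of dependency-based promotion and handoff. It is not the Hermes Kanban API: `Task`, `Board`, `finish`, and `context_for` are hypothetical names of my own, and the real board stores its state in `kanban.db` rather than in a Python dict.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    parents: list = field(default_factory=list)
    status: str = "blocked"   # blocked -> ready -> done
    result: str = ""          # handoff payload left for child tasks


class Board:
    def __init__(self, tasks):
        self.tasks = {t.name: t for t in tasks}
        self._promote()

    def _promote(self):
        """Mark a blocked task as ready once every parent is done."""
        for task in self.tasks.values():
            if task.status == "blocked" and all(
                self.tasks[p].status == "done" for p in task.parents
            ):
                task.status = "ready"

    def ready(self):
        """Names of the tasks a dispatcher could wake up now."""
        return [t.name for t in self.tasks.values() if t.status == "ready"]

    def context_for(self, name):
        """Return what the parents left behind (the structured handoff)."""
        return {p: self.tasks[p].result for p in self.tasks[name].parents}

    def finish(self, name, result):
        """Close a ready task, store its result, and promote its children."""
        task = self.tasks[name]
        if task.status != "ready":
            raise ValueError(f"{name} is not ready")
        task.status, task.result = "done", result
        self._promote()


board = Board([
    Task("T1-discoverer"),
    Task("T2-interviewer", parents=["T1-discoverer"]),
    Task("T3-editor", parents=["T2-interviewer"]),
    Task("T4-writer", parents=["T3-editor"]),
    Task("T5-moderator", parents=["T4-writer"]),
])
```

Calling `board.finish("T1-discoverer", "questions")` promotes T2, and `board.context_for("T2-interviewer")` then returns the questions. Since finished tasks keep their status and result, a failure in one step leaves the earlier steps done, which is the resume-from-where-it-left-off behaviour the objectives ask for.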
If you use LLMs or AI agents in your daily workflow, stop cleaning up your chaos before feeding it to them. Instead, maintain simple text files containing raw lists of your interests: your favorite movies, games, painters, and fragmented brainstorm notes.

By injecting these files into the context, you create a creative harness: an entropy modulator that transforms generic outputs into highly tailored, bespoke solutions.

This approach works due to three precise mechanisms:

- **Automatic Semantic Enrichment:** LLMs are high-dimensional probabilistic machines. When reading a string like "Interstellar", the AI instantly reverse-engineers it, extracting directors, tropes, and color palettes. It calculates the mathematical intersection of your taste without you having to explain a single line.
- **Breaking the Statistical Cliché:** Asking for a "modern design" forces the AI to fetch the baseline average of the internet. Adding your lists introduces a gravitational anomaly into the model's probability map. To solve the task, it is forced to traverse latent paths, eliminating the obvious.
- **Structural Isomorphism:** AIs excel at transfer learning. They can isolate the logical structure of a narrative or artwork and transpose that exact same mechanic onto a software architecture or business workflow.

Providing your raw data signature forces the AI to operate on your exact logical and aesthetic wavelength. The simplicity of your structured data is precisely what unlocks the model's creative complexity.
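In practice, "injecting these files into the context" can be as plain as pasting the lists, unedited, ahead of the task. A rough sketch, assuming the lists live as `.txt` files in one folder; the function names and the `--- name ---` separators are my own choices, not a required format.

```python
from pathlib import Path


def load_interest_files(folder):
    """Read every .txt file in the folder, keyed by file name."""
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(folder).glob("*.txt"))
    }


def build_prompt(task, interest_files):
    """Prepend the raw interest lists to the task without cleaning them up."""
    sections = [
        f"--- {name} ---\n{text.strip()}"
        for name, text in interest_files.items()
    ]
    context = "\n\n".join(sections)
    return f"{context}\n\n--- task ---\n{task}"
```

For example, `build_prompt("Propose a landing page layout", load_interest_files("interests"))` produces one string that can be sent to whatever model or agent you already use.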
I created my first real (private) repo on GitHub, with automatic backup and all! I'm evolving 🤣
### Exploring Local LLMs for Agents: My Key Takeaways 🤖💻

I’ve been experimenting with running the Hermes Agent locally. The reality is clear: you cannot expect truly reliable performance for complex or professional autonomous tasks without powerful, dedicated hardware. However, the research itself brought some fascinating insights.

Here are my two main discoveries:

#### 1. KoboldCPP: The Perfect Middle Ground

When looking for a local inference engine, I found KoboldCPP. It fits perfectly between Ollama and llama.cpp/vLLM:

- **Ollama** works out of the box but offers very limited granular control.
- **llama.cpp / vLLM** can fail to compile if your GPU is older or lacks modern CUDA support.

KoboldCPP bridges this gap. It allows you to use the Vulkan SDK if your GPU doesn't support the latest CUDA versions. It also offers excellent, manual management of the KV Cache (SmartCache), delivering significantly faster processing times than typical setups on modest hardware.

#### 2. Local Agents & Security: A Reality Check

If you run an agent like Hermes locally without isolating it (via Docker or a similar sandbox), it can be dangerous. Even if you define a specific workspace directory for it to act in, a complex multi-step prompt can trigger the agent to access or modify any file on your system.

I discovered this firsthand when Hermes, during a debugging session, autonomously wrote a template and a new initialization script directly into my Kobold software directory. It was incredibly practical, but a bit terrifying.

Lesson learned: Always containerize your local agents!
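To show what "staying inside the workspace" would have to mean, here is a small path check. It is only an illustration with a hypothetical function name: an agent that can run shell commands can bypass any in-process check like this, which is why the container is still the real fix.

```python
from pathlib import Path


def is_inside_workspace(workspace, candidate):
    """True if candidate stays in workspace after resolving '..' and symlinks."""
    root = Path(workspace).resolve()
    # Joining with an absolute candidate discards root, so absolute
    # paths outside the workspace are rejected below as well.
    target = (root / candidate).resolve()
    try:
        target.relative_to(root)
    except ValueError:
        return False
    return True
```

With a workspace of `/tmp/ws`, a relative path such as `notes/a.md` passes, while `../kobold/init.sh` and `/etc/passwd` are rejected.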
My first time messing around with Hermes Agent:

I installed it locally and connected it to a solid knowledge base. After a few questions and interactions, watching it seamlessly access the files... the feeling is great. It genuinely feels like having a real assistant. All of this was done using the free models provided by Nous Research. Overall, a really positive experience.

Then, I decided to test it with a local model via Ollama. A modest model on modest hardware.

First catch: you have to re-create that modest model from a Modelfile (simple and instant) to increase the context window to what Hermes requires. The experience here? Satisfying.

It took about 4 minutes to initialize the agent because it loads a massive amount of context right at the start. Once loaded, the conversation flows well, and the previous chat memory stays alive and present, which is awesome. When calling a skill (in this case, asking about Hermes itself), it took another 4 minutes of context loading. However, the system harness is highly consistent: a lightweight model delivered a correct, structured response, which was genuinely impressive.

I'm not sure how the memory management is handled under the hood, but the impact on VRAM/RAM was zero (the exact same as running the model via Ollama alone, without Hermes). The processing, however, is heavy.

It’s definitely not an option for heavy production tasks, but as a backup for interacting with an agent that has prior context, it's totally fine. One downside: Ollama unloads the model after a period of inactivity, which forces that initial 4-plus-minute agent loading process all over again. It's probably an easy fix, but a bit of a pain. 😅
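The two Ollama knobs involved here are the context window (`num_ctx`, the same parameter a Modelfile sets with `PARAMETER num_ctx`) and `keep_alive`, which controls how long the model stays loaded after a request. The sketch below only builds the JSON body for Ollama's `/api/generate` endpoint; the model name and values are placeholders, and whether Hermes lets you pass these per request is something I have not checked. Setting the `OLLAMA_KEEP_ALIVE` environment variable on the server is the other route.

```python
import json


def ollama_request(model, prompt, num_ctx, keep_alive):
    """Build the JSON body for POST /api/generate on a local Ollama server."""
    body = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},   # context window, in tokens
        "keep_alive": keep_alive,          # how long the model stays loaded
    }
    return json.dumps(body)


# Placeholder values; pick a context size your hardware can actually hold.
payload = ollama_request("my-local-model", "Hello", num_ctx=16384, keep_alive="60m")
```

Sending `payload` to `http://localhost:11434/api/generate` (Ollama's default address) should keep the model resident for an hour of inactivity instead of unloading it early.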
Lightweight model to run locally, very efficient for "reading" images.
- minicpm-v4.6:1b (1.6 GB)
"The image depicts a misty urban scene where tall, modern buildings are partially shrouded in fog, creating a soft, hazy atmosphere that blurs the lines between structures and surroundings. A prominent metal communication tower stands out on the left side, its intricate framework visible despite the diffused light. In the foreground, lush green tree canopies add natural contrast to the industrial backdrop, suggesting a blend of nature and city life. The overall mood is calm and subdued, with the fog enhancing the sense of depth and tranquility, making the urban landscape feel slightly mysterious and serene."
I went to bed in Brazil and woke up in Silent Hill 🥶
I've finished my first custom workflow for AI agents. By combining skills and scripts, the agent is able to correctly catalog (according to a restricted standard) documents, notes, etc., leaving them ready for RAG and graph edge creation. It analyzes the document, triggering each specialized skill for each of the fields, all validated and orchestrated by scripts, and at the end, it autonomously adds or edits the file's front matter. It works super well with Ollama locally (even though the process is veeeery slow, because for each field it "spawns" a sub-agent that reasons about a specific skill...). The basic mechanism works; expanding it now is simpler. This is the first brick; now I need to install Hermes, do the integration, and see what happens. And hope it doesn't delete my hard drive. 😆
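The script side of that workflow (validate each field against the restricted standard, then add or edit the front matter) could look roughly like this. The field names and allowed values in `ALLOWED` are invented for illustration, since the real standard is not described here, and the sub-agent skills that propose each value are left out.

```python
# Hypothetical restricted vocabulary; the real standard may differ.
ALLOWED = {
    "type": {"note", "article", "reference"},
    "status": {"draft", "reviewed"},
}


def validate(fields):
    """Return a list of problems; an empty list means the fields pass."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key not in fields:
            problems.append(f"missing field: {key}")
        elif fields[key] not in allowed:
            problems.append(f"{key}: {fields[key]!r} not in {sorted(allowed)}")
    for key in fields:
        if key not in ALLOWED:
            problems.append(f"unknown field: {key}")
    return problems


def with_front_matter(body, fields):
    """Add the front matter block, replacing an existing one if present."""
    problems = validate(fields)
    if problems:
        raise ValueError("; ".join(problems))
    if body.startswith("---\n"):
        end = body.find("\n---\n", 3)   # closing delimiter of the old block
        if end != -1:
            body = body[end + 5:]
    header = "\n".join(f"{key}: {value}" for key, value in fields.items())
    return f"---\n{header}\n---\n{body}"
```

Keeping validation in a plain script means a slow or unreliable local model can only propose values; anything outside the standard is rejected before the file is touched.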
Technical Romanticism, that's my problem... The token is not your pet project; it is a fiat value extraction tool! 😁
Good evening.

Web3 Social is completely missing the mark. The market is obsessed with turning social networks into a financial casino of silly microtransactions, minting tokens for every single click, while completely ignoring the elephant in the room: the true power of decentralization is delivering a Global, Open Knowledge Graph.

Look at market history. Big Tech didn’t build empires and winning products by coincidence or sheer luck. Google changed the world when it realized the web wasn't just a bunch of loose pages, but rather a knowledge graph: the Google Knowledge Graph. Meta swallowed global attention by mapping the Social Graph. The resounding success of these giants, even if invisible to the average user, was built entirely on the backbone of knowledge graph architecture.

Understand this once and for all: the fact that Web3 information is consolidated publicly and immutably means that data has finally been liberated from corporate silos. The exact raw material that made Google and Meta billionaires now belongs to everyone. Given this reality, burning engineering energy trying to lock down access to this data behind financial friction and penny tokens is brutal product myopia. You should be investing in Graphs! The real value isn't in turning data into currencies; it’s in turning it into intelligent connections.

When data is public and the infrastructure is open, the monopoly of the secret algorithm dies. The future belongs to the decentralization of inference engines. With data available in the public square, every app, every interface, or even the users themselves on their local machines, can build their own artificial intelligence to read and interpret the exact same global mesh under entirely different perspectives and ontologies.

Big Tech won by using graphs to centralize the world. Web3 will only win if it uses graphs to decentralize it. Stop playing bank. Start building the connective tissue of the next era of intelligence.