Show HN: Hippo, biologically inspired memory for AI agents
github.com | 112 points by kitfunso 15 hours ago
Oh hey, something I know something about!
I've long held the belief that if you want to simulate human behaviour, you need human-like memory storage, because so much of our behaviour is influenced by how our memories work. Even something as stupid as walking between rooms and forgetting why you went there is a behaviour that would otherwise have to be simulated directly, but it can be simulated indirectly by giving the memory of why an agent is moving from room to room a chance of disappearing.
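A minimal sketch of that "forgetting in the doorway" idea, assuming a single per-transition forgetting probability. The `Agent` class, `forget_prob`, and the method names are hypothetical illustrations, not taken from Hippo or from the papers linked below.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Assumed parameter: chance of losing each memory on every room change.
    forget_prob: float
    rng: random.Random = field(default_factory=random.Random)
    goals: dict = field(default_factory=dict)  # goal name -> reason

    def remember(self, name, reason):
        self.goals[name] = reason

    def move_to(self, room):
        # Each held memory independently survives the transition or vanishes.
        self.goals = {
            name: reason
            for name, reason in self.goals.items()
            if self.rng.random() >= self.forget_prob
        }
        return room

    def why_am_i_here(self):
        # With nothing left in memory, the agent cannot explain its own trip.
        return list(self.goals.values()) or ["...no idea"]
```

The forgetting behaviour is never scripted directly here: it falls out of the memory store, which is the point the comment makes.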
Now, as for how useful this will be for something that isn't trying to directly simulate a human and is trying to be "superintelligent", I'm not entirely sure, but I am excited that someone is exploring it.
https://ieeexplore.ieee.org/abstract/document/5952114
https://ieeexplore.ieee.org/abstract/document/5548405
https://ieeexplore.ieee.org/abstract/document/5953964
I never did get many citations for these, maybe I just wasn't very good at "marketing" my papers.
We're exploring related ideas in embodied AI rather than LLM agents. MH-FLOCKE uses Izhikevich spiking neurons with R-STDP to control quadruped locomotion — the memory is in the synaptic weights, not in a vector store.
The brain persists across sessions: stop the robot, restart it, synaptic weights reload and it continues from where it left off. Decay happens naturally through R-STDP — synapses that don't contribute to reward weaken over time. No explicit forgetting mechanism needed.
Currently running on a Unitree Go2 (MuJoCo) and a 100€ Freenove robot dog (Raspberry Pi 4, real hardware). Same architecture, different bodies.
github.com/MarcHesse/mhflocke
The biggest issue I have with these systems is, I don't want a blanket memory. I want everything to be embedded in skills and progressively discovered when they are required.
I've been playing around with doing that with a cron job for a "dream" sequence.
I really want to get them out of main context asap, and where they belong, into skills.
Isn't this the idea behind holographic memory? Chopping the image in half gets you the same image at half the resolution? Or so I've heard...
What you want is a context mipmap.
Then there was the Claude article describing using filesystem hierarchy to organize markdown knowledge, which apparently beats RAG.
Interesting that the framing is about forgetting. Feels like forgetting is mostly only a problem because of the push model — context auto-injected at session start, budget-limited, ranked by relevance. In that world yes, you have to throw things away, the budget is finite. But Hippo already has an MCP server and stores everything as markdown in a folder. Most of the substrate for a pull model is there — the agent could just query memories when they're actually relevant to what it's doing, instead of having a budgeted blob shoved into context up front. In that world, decay and consolidation feel like they're solving a problem the architecture mostly doesn't need to have. Curious why you went push-first rather than pull-first?
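To make the pull model concrete, here is a rough sketch that searches a folder of markdown files only when the agent asks. The function `pull_memories`, the flat `*.md` layout, and the term-count scoring are assumptions for illustration, not Hippo's actual MCP interface or storage format.

```python
from pathlib import Path


def pull_memories(folder, query, limit=3):
    """Return up to `limit` (score, filename) pairs for markdown files matching the query."""
    terms = [t.lower() for t in query.split()]
    scored = []
    for path in sorted(Path(folder).glob("*.md")):
        text = path.read_text(encoding="utf-8").lower()
        # Naive relevance: total number of occurrences of the query terms.
        score = sum(text.count(t) for t in terms)
        if score > 0:
            scored.append((score, path.name))
    # Highest score first; ties broken by filename for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]
```

Nothing is injected up front in this shape, so no budget forces anything to be thrown away; the cost moves to the agent deciding when to query.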
Really cool to see someone model memory consolidation and decay for agents rather than just dumping everything into a vector store. The biggest problem I've run into with long-running agents is they drown in their own context, so having a forgetting mechanism feels counterintuitive but probably necessary. Would love to see benchmarks comparing this against plain RAG on tasks that require recalling something from 50+ interactions ago.
I think explicit post-training is going to be needed to make this kind of approach effective.
As this repo notes, "The secret to good memory isn't remembering more. It's knowing what to forget." But knowing what is likely to be important in the future implies a working model of the future and your place in it. It's a fully AGI-complete problem: "Given my current state and goals, what am I going to find important, conditioned on the likelihood of any particular future...". Anyone working with these agents knows they are hopelessly bad at modeling their own capabilities, much less projecting that forward.
Cool project. I like the neuroscience analogy with decay and consolidation.
I've been working on a related problem from the other direction: Claude Code and Codex already persist full session transcripts, but there's no good way to search across them. So I built ccrider (https://github.com/neilberkman/ccrider). It indexes existing sessions into SQLite FTS5 and exposes an MCP server so agents can query their own conversation history without a separate memory layer. Basically treating it as a retrieval problem rather than a storage problem.
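For readers unfamiliar with FTS5, a minimal sketch of indexing transcripts for full-text search with Python's built-in `sqlite3` module. The table name `transcripts`, its columns, and the helper functions are assumptions for illustration and not ccrider's actual schema; it also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def build_index(sessions):
    """sessions: iterable of (session_id, text). Returns an in-memory connection."""
    conn = sqlite3.connect(":memory:")
    # session_id is stored but not tokenized; body is the searchable text.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts USING fts5(session_id UNINDEXED, body)"
    )
    conn.executemany(
        "INSERT INTO transcripts (session_id, body) VALUES (?, ?)", sessions
    )
    conn.commit()
    return conn


def search(conn, query, limit=5):
    """Return (session_id, snippet) rows, best match first."""
    rows = conn.execute(
        "SELECT session_id, snippet(transcripts, 1, '[', ']', '...', 8) "
        "FROM transcripts WHERE transcripts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()
```

Wrapping `search` in an MCP tool is what would let an agent query its own history, as the comment describes.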
Thank you so much for all the feedback! I really appreciate it and have implemented the majority of them. Please check out v0.10.0!
Don't tools like Claude already store context by project in the file system? Also, any reason to use "capture" instead of "export" (the obvious opposite of import)?
> Don't tools like Claude already store context by project in the file system?
They do, the missing piece is a tool to access them. See comment about my tool that addresses this: https://news.ycombinator.com/item?id=47668270
wow, i checked the repo and we have similar ideas)
we're building swarm-like agent memory: agents share memories across rooms and nodes. Reading Steiner + Time Leap Capsules (yeah, Steins;Gate easter eggs lol).
your consolidation and decay mechanics are close to what we want. might integrate similar approach.
hmm, the repo doesn't mention this at all, but the name and problem domain bring up HippoRAG https://arxiv.org/abs/2405.14831 <- any relation? seems odd to leave out a paper with such a similar name and related techniques.
a working group of ~300 senior eng are experimenting with different skills for stuff like this: https://swg.fyi/mom
Cool to see others on this thread.
Here's a post I wrote about how we can start to potentially mimic mechanisms
https://n0tls.com/2026-03-14-musings.html
Would love to compare notes, I'm also looking at linguistic phenomena through an LLM lens
https://n0tls.com/2026-03-19-more-musings.html
Hoping to wrap up some of the kaggle eval work and move back to researching more neuropsych.
no open code plugin? This seems like something that should just run in the background. It's well documented that it should just be a skill agents can use when they get into various fruitless states.
The "biological" memory strength shouldn't just be a function of time, and even then, the agent's time should follow the agent's own lifetime and not the actual clock. Look up the monotonic clock: https://stackoverflow.com/questions/3523442/difference-betwe... If you want decay, it shouldn't be tied to the actual clock but to the agent's work time.
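One way to read "work time" is a counter that only advances when the agent does something, with decay computed from that counter. A minimal sketch, assuming exponential decay with a half-life measured in ticks; `WorkClock`, `strength`, and the choice of tick unit are hypothetical and not how Hippo computes decay.

```python
class WorkClock:
    """Counts units of agent work (e.g. turns or tool calls), not wall-clock seconds."""

    def __init__(self):
        self.ticks = 0

    def tick(self, n=1):
        # Advances only when the agent actually works; idle days cost nothing.
        self.ticks += n


def strength(initial, stored_at_tick, now_tick, half_life_ticks):
    """Exponential decay over elapsed work ticks: halves every `half_life_ticks`."""
    elapsed = now_tick - stored_at_tick
    return initial * 0.5 ** (elapsed / half_life_ticks)
```

With this, a memory stored at tick 0 with a half-life of 10 ticks is at half strength after 10 units of work, whether that took ten minutes or ten weeks of calendar time.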
But memory is more about triggers than it is about anything else. So you should absolutely have memory triggers based on location. Something like a path hash. So whenever an agent is working and remembering things, those memories should be tightly bound to that location; only when a "compaction" happens should they become more and more generalized across locations.
The types of memory that are often most prominent work like this: whether it's sports or GUIs, physical location triggers far more instinctive recall than conscious memory does. Focus on how to trigger recall based on project paths, directory names in the path, filenames, etc.
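A rough sketch of location-triggered recall as described above, assuming notes are keyed by a hash of the path and that "compaction" means moving notes up to the parent directory. `LocationMemory`, `path_key`, and `generalize` are hypothetical names for illustration only.

```python
import hashlib
from collections import defaultdict
from pathlib import PurePosixPath


def path_key(path):
    """Short, stable hash of a normalized path, used as the trigger key."""
    normalized = str(PurePosixPath(path))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


class LocationMemory:
    def __init__(self):
        self._by_key = defaultdict(list)

    def remember(self, path, note):
        self._by_key[path_key(path)].append(note)

    def recall(self, path):
        """Notes for this exact path first, then for each parent directory."""
        p = PurePosixPath(path)
        notes = []
        for location in [p, *p.parents]:
            notes.extend(self._by_key.get(path_key(location), []))
        return notes

    def generalize(self, path):
        """A 'compaction' step: move notes from a path up to its parent directory."""
        p = PurePosixPath(path)
        moved = self._by_key.pop(path_key(p), [])
        self._by_key[path_key(p.parent)].extend(moved)
```

Recall walks from the most specific location outward, so a note stays local to one file until a compaction step promotes it to the surrounding directory.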
Memory links to location, but that's largely because humans are localised. Isn't that also a weakness? We should be trying to exploit the benefits of non-locality [of ML models and training data] too.
I feel like much of my life is virtual, non-localised. Writing missives to the four corners of the wind here and elsewhere; gaming online; research/chats with LLMs or on the web, email with people.
My physical location is often not important - a continuing context from non-physical aspects of my existence matters more.
That said, one of the things that's hard for me about digital life is the lack of waymarks - I used to be quite "geographical" in my thinking. Like "oh the part I found interesting was on the left page after the RGB diagram", I'd find that and also find my train of thought and extend it. Now, information can be in any of a myriad of freeform places across at least 3 devices and in emails, notebooks, bookmarks, chat histories, and of course my brain. When some ready syncretism of those things happens, it feels like we'll make better advances. Personal agents can be a part of that.
yegge has a cool solution for this in gastown: the current agent is able to hold a seance with the previous one
How does it select what to forget? Let's say I land a PR that introduces a sharp change, migrating from one thing to another. An exponential decay won't catch this. Biological learning makes sense when we observe similar things repeatedly in order to learn patterns. I am skeptical that it applies to learning the commits of one code base.
cool project mate, gj