Hacker News | JosephjackJR's comments

Hit this building agent systems - context is gone on restart. I built a dedicated memory layer for agent workloads: in-process, survives restarts via WAL recovery. Happy to share: https://github.com/RYJOX-Technologies/Synrix-Memory-Engine


Concurrent agent memory is hard; most solutions aren't designed for multiple agents writing at once. I built something in-process. Happy to share: https://github.com/RYJOX-Technologies/Synrix-Memory-Engine


Ran into the same thing. SQLite works until you need cold-start recovery or hit WAL contention with concurrent agents. I built a dedicated memory layer for agent workloads - happy to share: https://github.com/RYJOX-Technologies/Synrix-Memory-Engine


Or when you have a Django project that started out on SQLite, then begrudgingly introduce M-to-N relationships, and suddenly notice that many things you might want to do with those relationships are not supported by SQLite. Then you wish you had started with Postgres right away.


There are definitely some caveats / tradeoffs with SQLite, but I can't think of any that are specifically related to many-to-many relationships. Which features did you find missing? Lateral joins, maybe?


I only remember from my last Django project that I started out thinking: "OK, I will do things properly, no many-to-many relationships..." Then at some point I saw the need for them, or for manually creating the intermediate relation, and at that point using the Django way was what I was supposed to do. But then I got errors about some of the things I wanted not being supported by SQLite.

The project is here: https://codeberg.org/ZelphirKaltstahl/web-app-vocabulary-tra... But I left it unfinished, and a quick grep does not yield any comments explaining why I work around the SQLite problems in some places. I do remember, though, that I basically swore to myself never to use SQLite in production with the Django ORM again. And if I am not using it in production, then testing also shouldn't use it, because one should test with the same RDBMS that runs in production, or risk unexpected issues that only happen in production. So SQLite is out for anything serious in Django projects for me.


Super easy; we are currently trialing it with people who have this issue. Check out our website or GitHub - we will happily let you try anything you want with it. We really want feedback at the moment.


I built a local-first AI memory engine for agents and edge systems. It uses a Binary Lattice instead of vectors: fixed-size nodes with arithmetic addressing, so lookups are O(1) by ID and O(k) by prefix, where k is the result count, not the corpus size. It scales to 50M+ nodes beyond RAM via mmap with no performance cliff.

Real numbers from my machine: direct node lookups in 19us; prefix queries over 10k nodes in 28-80us with no embedding model, about 280x faster than a local vector DB at 10k nodes. Full agent context rebuilt from cold start in under 1ms. ACID-durable via WAL, tested across 60 crash scenarios with zero data loss. Validated on a Jetson Orin Nano at 192ns hot reads.

The core idea is that most agent memory is structured, not fuzzy: user preferences, learned facts, task stores, conversation history. You know what you're looking for. Prefix-semantic naming replaces vector similarity entirely for these workloads. No embedding model, no GPU, no cloud call.
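As a toy illustration of the idea (not Synrix's actual API - the class and key names below are hypothetical), structured memory can be keyed by hierarchical names and queried by prefix via binary search, so query cost tracks the number of matches rather than the corpus size:

```python
import bisect

class PrefixMemory:
    """Sketch of prefix-semantic memory: structured names, no embeddings."""

    def __init__(self):
        self._data = {}   # exact lookup by ID: O(1) dict access
        self._keys = []   # sorted key list for prefix range scans

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)  # keep keys sorted (O(n) here; a real engine would index)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)          # O(1) by ID

    def prefix(self, p):
        # Binary-search to the first key >= p, then walk while the prefix
        # matches: cost scales with result count k, not total corpus size.
        i = bisect.bisect_left(self._keys, p)
        out = []
        while i < len(self._keys) and self._keys[i].startswith(p):
            out.append((self._keys[i], self._data[self._keys[i]]))
            i += 1
        return out

mem = PrefixMemory()
mem.put("user.prefs.theme", "dark")
mem.put("user.prefs.lang", "en")
mem.put("task.current", "triage")
print(mem.get("task.current"))                    # → triage
print([k for k, _ in mem.prefix("user.prefs.")])  # → ['user.prefs.lang', 'user.prefs.theme']
```

The point is only the access pattern: because the caller controls the naming, there is nothing fuzzy to approximate.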

The robotics use case is what I find most interesting. A robot learns its environment during operation. Which door sticks, which patient has a latex allergy, which corridor is slippery. Power cuts out. Robot reboots cold. Every memory restores in milliseconds via WAL recovery. No internet required. Works in a Faraday cage, underground, on a factory floor.
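The WAL-recovery pattern described above can be sketched minimally (illustrative only - this is not Synrix's actual log format): every write is appended and fsync'd before being acknowledged, so a cold start simply replays the log to rebuild state.

```python
import json
import os
import tempfile

class WalStore:
    """Toy write-ahead-log store: durable appends, replay on cold start."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        if os.path.exists(path):          # cold start: replay the log
            with open(path) as f:
                for line in f:
                    op = json.loads(line)
                    self.state[op["k"]] = op["v"]
        self._log = open(path, "a")

    def put(self, k, v):
        self._log.write(json.dumps({"k": k, "v": v}) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())      # durable before we acknowledge
        self.state[k] = v

path = os.path.join(tempfile.mkdtemp(), "wal.log")
s = WalStore(path)
s.put("door.3f", "sticks")
del s                                     # simulate a crash / cold reboot
s2 = WalStore(path)
print(s2.state["door.3f"])                # → sticks
```

A production engine would add checksums, log compaction, and snapshotting; the sketch only shows why a power cut cannot lose an acknowledged write.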

It is not a vector DB replacement. For fuzzy similarity search over unstructured documents Qdrant and Chroma are the right tools. Synrix is the memory layer for structured agent workloads where you control the naming.

Curious whether anyone has hit the structured vs fuzzy memory problem in production and how you solved it.


Over the past year I’ve been working on a local-first memory engine for AI systems.

This started from debugging agent workflows that behaved differently after restarts. In several cases the memory layer relied on embeddings and approximate search through a vector database. It worked, but recall was not deterministic and restarting the process sometimes changed behaviour in subtle ways.

I wanted something simpler and predictable.

So I built a restart-persistent local memory engine that behaves more like SQLite than a cloud vector database. It runs as a single binary, stores data locally, and retrieval is deterministic. If you kill the process and restart it, the same query returns the same IDs in the same order.
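As a trivial illustration of what deterministic retrieval means here (hypothetical in-memory store, not the engine's API): results are a pure function of the stored keys, with an explicit sort order, so identical queries over identical data return identical IDs in identical order, across restarts too.

```python
# Deterministic retrieval sketch: no approximate nearest-neighbor step,
# no randomness -- ordering comes from the keys themselves.
def query(store, prefix):
    return sorted(k for k in store if k.startswith(prefix))

store = {"fact.002": "b", "fact.001": "a", "note.001": "c"}
print(query(store, "fact."))   # → ['fact.001', 'fact.002']
assert query(store, "fact.") == query(store, "fact.")  # stable across calls
```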

It is not an LLM, not an agent framework, and not a SaaS product. It is meant to sit underneath those systems as a low-level memory primitive.

This is not a replacement for semantic search in every case. If you genuinely need approximate similarity over unstructured text, embeddings make sense. But I have a suspicion that in many structured agent and infra workflows, deterministic storage would be simpler and cheaper.

I would really appreciate feedback from people building ML or data infrastructure. In what cases is approximate search actually required, and where has it just become the default?


If anyone is working on something similar, please drop a link below so I can check it out!


Retrieval performance is becoming a silent bottleneck for local-first AI agents. While context windows keep expanding, the latency of querying a traditional cloud vector database still sits in the 10ms to 50ms range, due to network hops and pointer-heavy graph structures.

I have spent the last few months building SYNRIX to see if we could reach sub-microsecond retrieval by being extremely opinionated about hardware. Instead of a flexible graph, the engine uses a binary lattice: a rigid structure that relies on arithmetic addressing instead of chasing pointers.

This architectural rigidity leads to several unique properties:

Query time scales with the number of results you want rather than the total size of your database. We have validated this at 50 million nodes running smoothly on a standard 8GB-RAM machine, using memory-mapped storage to scale beyond physical memory.
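A rough sketch of what arithmetic addressing over memory-mapped, fixed-size nodes looks like (illustrative Python, not SYNRIX's actual layout or API; the node size and payload are made up): a node's byte offset is just `id * NODE_SIZE`, so there is no pointer chasing, and mmap lets the OS page nodes in on demand so the file can exceed physical RAM.

```python
import mmap
import os
import tempfile

NODE_SIZE = 64   # fixed-size, cache-line-sized records (illustrative)

# Pre-size a flat file of empty nodes.
path = os.path.join(tempfile.mkdtemp(), "lattice.bin")
num_nodes = 1000
with open(path, "wb") as f:
    f.truncate(num_nodes * NODE_SIZE)

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 0)        # map the whole file

    def write_node(node_id, payload: bytes):
        off = node_id * NODE_SIZE        # O(1) address computation
        mm[off:off + NODE_SIZE] = payload.ljust(NODE_SIZE, b"\x00")

    def read_node(node_id) -> bytes:
        off = node_id * NODE_SIZE        # no index, no pointer chasing
        return mm[off:off + NODE_SIZE].rstrip(b"\x00")

    write_node(42, b"door.3f=sticks")
    print(read_node(42))                 # → b'door.3f=sticks'
    mm.flush()
    mm.close()
```

Because every record is the same size and alignment, the hot path is a multiply, an add, and a (possibly prefetched) cache-line read.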

Because it runs entirely on your own hardware, there are no per query fees or subscription costs. This makes it a viable local first alternative for high volume applications that would otherwise face six figure cloud bills at scale.

The system is built for production reliability with ACID-style guarantees. It uses a write-ahead log and deterministic recovery to survive restarts and crashes without data loss.

The engine is designed for cache-line alignment and CPU prefetching. Working with these hardware realities lets it maintain sub-microsecond hot-path retrieval even as the memory substrate grows.

We have built compatibility layers for LangChain and Qdrant, so it can act as a drop-in replacement in existing stacks. The project has seen about 40 clones since yesterday, so the need for low-latency, offline-first memory seems to be hitting a nerve. I am curious to hear from others working on high-frequency agent queries: is retrieval latency currently a bottleneck for your workflows, or are you more concerned with inference time?


It is a good point. However, a lot of IT companies span several different industries, so it is not easy to paint them with the simple brush of "IT company". What would you even define an IT company as?


Just read a CIO article saying cloud costs are now the second biggest expense for midsize IT companies, behind labor and ahead of pretty much everything else.

That matches what I keep hearing from people running real systems. Cloud spend doesn’t feel like a line item you control anymore, it feels like something you react to after the fact. Bills go up, dashboards light up, and then everyone scrambles to shave a few percent without touching the parts that are actually painful.

What stood out to me is that a lot of this cost doesn’t seem to come from raw compute or storage anymore. It comes from all the things glued around the system to make it work at scale. Remote caches, coordination layers, metadata services, control planes, cross region calls. Stuff that exists because there isn’t a good local place for certain kinds of state to live.

Once those pieces sit on the critical path, they get hit constantly, they add latency, and they quietly become some of the most expensive parts of the system. At that point cloud cost stops being an optimization problem and starts feeling like a structural one.

I’m curious how this lines up with other people’s experience. How much of your cloud bill is tied to coordination and state rather than actual business logic? Have you had to add external services just to keep latency acceptable? Have you reached the point where you’d rather rethink the architecture than keep paying the tax?

Genuinely interested in what people are seeing in practice, not vendor takes or budgeting advice.

