Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent
Memory OS, a new community project, is a six-layer open-source memory stack built on top of Hermes Agent. It extends the agent's memory with a vector database, structured facts, and an auto-curated knowledge wiki.

The Hermes Agent, an open-source agent from Nous Research, has been capable of remembering across sessions since its inception. It ships with curated memory files and a full-text session search feature. However, a new community project argues that the built-in memory is too shallow for serious work.
Enter Memory OS, a library released under an MIT license by developer ClaudioDrews, which stacks six memory layers onto Hermes. Memory OS is not a simple plugin that toggles on; rather, it's a layered system that sits beside Hermes Agent's own memory. While Hermes provides workspace files and a session database, Memory OS keeps those and adds four more layers above them.
The full stack runs locally using Docker, Qdrant, Redis, and Python 3.11+, and works with any LLM provider Hermes supports, including OpenRouter, OpenAI, Anthropic, and Ollama. The system operates by intercepting memory read and write operations. On pre_llm_call, Memory OS performs what it calls surgical recall, pulling from four sources simultaneously: Fabric, Qdrant, Sessions, and Facts.
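To make the recall step concrete, here is a minimal Python sketch, not Memory OS's actual code. It assumes each of the four sources can be modeled as a function that returns (text, score) pairs, and it folds in the relevance gate, per-session deduplication, and social-closer skip that the project describes for this hook. The names `surgical_recall`, `is_social_closer`, and `SOCIAL_CLOSERS`, and the 0.5 threshold, are all hypothetical; the article gives no threshold value.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed list of trivial messages; the real filter is not documented here.
SOCIAL_CLOSERS = {"thanks", "thank you", "ok", "okay", "bye"}


def is_social_closer(message):
    """Return True for trivial messages that should not trigger recall."""
    return message.strip().lower().rstrip("!. ") in SOCIAL_CLOSERS


def surgical_recall(message, sources, seen, threshold=0.5):
    """Query every source in parallel and keep only relevant, unseen hits.

    sources:   dict of source name -> callable(message) -> [(text, score), ...]
    seen:      set of texts already injected this session (updated in place)
    threshold: hypothetical relevance cutoff applied to every source
    """
    if is_social_closer(message):
        return []

    # Fan out to all sources at once, then collect results in source order.
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(fn, message) for name, fn in sources.items()}
        results = {name: fut.result() for name, fut in futures.items()}

    context = []
    for name, hits in results.items():
        for text, score in hits:
            # Gate on relevance, then drop anything already shown this session.
            if score >= threshold and text not in seen:
                seen.add(text)
                context.append((name, text))
    return context
```

Called from a pre_llm_call hook, a function like this would return only the snippets worth adding to the prompt, and nothing at all for a message such as "Thanks!".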
Each source is gated by a relevance threshold before anything reaches the model. Per-session deduplication prevents the same context from appearing twice, while a social-closer filter skips trivial messages, such as a plain "thanks." On post_llm_call and on_session_end, the system extracts and captures new learnings automatically, with the stated goal of token efficiency, not stuffing the context window. Layer 5's retrieval uses a four-level fallback, trying hybrid search first, then dense vectors, then lexical, then SQLite.
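The four-level fallback can be pictured as an ordered chain of search backends. The sketch below is illustrative only: `recall_with_fallback` is a hypothetical name, and it assumes a backend "fails" when it either raises an exception or returns no hits, which the article does not specify.

```python
def recall_with_fallback(query, backends):
    """Try each backend in order and return the first non-empty result.

    backends: ordered list of (name, callable(query) -> list) pairs, for
              example hybrid, dense, lexical, and SQLite search functions.
    Returns a (backend_name, hits) tuple, or (None, []) if all come up empty.
    """
    for name, search in backends:
        try:
            hits = search(query)
        except Exception:
            # Assumed behavior: an unreachable or failing backend is skipped.
            continue
        if hits:
            return name, hits
    return None, []
```

With the backends listed as hybrid, dense, lexical, then SQLite, a query still gets an answer from the later, simpler backends when the earlier ones are down or return nothing.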
This design ensures recall works even when the vector database struggles. Memory OS also runs a weekly decay scanner to age out stale entries, and a semantic dedup step merges near-identical memories when their cosine similarity exceeds 0.92. Memory OS positions itself against cloud memory services like mem0, Zep, and Letta, pitching itself as local memory infrastructure.
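The 0.92 cosine-similarity rule for semantic dedup can be illustrated with a short Python sketch. This is a simplification under stated assumptions: memories are (text, embedding) pairs, and a near-duplicate is dropped in favor of the earlier entry instead of being merged into it. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_dedup(memories, threshold=0.92):
    """Greedy dedup: keep an entry only if no kept entry is too similar.

    memories:  list of (text, embedding) pairs, in priority order
    threshold: similarity above which two memories count as near-identical
    """
    kept = []
    for text, vec in memories:
        if all(cosine(vec, kept_vec) <= threshold for _, kept_vec in kept):
            kept.append((text, vec))
    return kept
```

A production version would likely run the comparison inside the vector database rather than in a pairwise Python loop, which grows quadratically with the number of memories.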
The memory data stays local, with no memory subscription required. LLM calls still go to whichever provider you choose. For teams with data-residency rules, a local memory store can matter.
The project has been open-sourced, offering a complete hierarchical persistent memory architecture for the Hermes Agent, featuring six layers, including structured facts, hybrid vector search, and a self-curating LLM Wiki.
Source: MarkTechPost