MEMO: A Modular Framework for Training a Dedicated Memory Model on New Knowledge Without Modifying LLM Parameters
Researchers propose MEMO, a modular framework that trains a dedicated memory model to internalize new knowledge without modifying large language model (LLM) parameters.

Large language models become static after pretraining, and their knowledge does not update as the world changes. Retraining a full LLM is too expensive at modern scales, and fine-tuning risks degrading previously learned knowledge. Retrieval-augmented generation (RAG) struggles when answers require reasoning across many documents.
A team of researchers from the National University of Singapore, MIT CSAIL, A*STAR, and the Singapore-MIT Alliance for Research and Technology (SMART) proposes a new approach called MEMO (Memory as a Model). MEMO separates memory from reasoning, using a small, dedicated language model, called the MEMORY model, to internalize knowledge from a target corpus. The EXECUTIVE model, the main LLM, is frozen and queried only through its standard input-output interface.
The MEMORY model is trained via supervised fine-tuning (SFT) on a reflection QA dataset created using a five-step data synthesis pipeline guided by a GENERATOR model. At inference, the EXECUTIVE model queries the MEMORY model through a structured multi-turn protocol with three sequential stages: grounding, entity identification, and answer seeking and synthesis. In experiments, MEMO achieves state-of-the-art results on three benchmarks: BrowseComp-Plus, NarrativeQA, and MuSiQue.
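The three-stage protocol can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the function name `memo_answer`, the prompt wording, and the split of work between the two models within each stage are hypothetical, and the summary above only names the stages. Both models are treated as plain text-in, text-out callables, which matches the claim that the EXECUTIVE model is used only through its standard input-output interface.

```python
from typing import Callable, List

# Hypothetical interface: any model is a function from prompt text to reply text.
Model = Callable[[str], str]


def memo_answer(question: str, executive: Model, memory: Model) -> str:
    """Illustrative three-stage query loop (not the authors' implementation)."""
    # Stage 1, grounding: ask the MEMORY model for background on the question.
    grounding = memory(f"What background is relevant to this question? {question}")

    # Stage 2, entity identification: the frozen EXECUTIVE model names the
    # entities it still needs details about, one per line.
    raw = executive(
        "List the key entities needed to answer, one per line.\n"
        f"Question: {question}\nBackground: {grounding}"
    )
    entities: List[str] = [line.strip() for line in raw.splitlines() if line.strip()]

    # Stage 3, answer seeking and synthesis: query MEMORY once per entity,
    # then let the EXECUTIVE model combine the notes into a final answer.
    notes = [memory(f"What is known about {e} with respect to: {question}") for e in entities]
    return executive(
        "Answer the question using only these notes.\n"
        f"Question: {question}\nNotes:\n" + "\n".join(notes)
    )
```

Because the EXECUTIVE model is just a callable in this sketch, swapping it for a different LLM changes nothing on the MEMORY side.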
With Gemini-3-Flash as the EXECUTIVE model, MEMO achieves 53.58% on NarrativeQA, outperforming HippoRAG2's 23.21%. MEMO also demonstrates robustness to retrieval noise and supports incremental knowledge updates through model merging. The research team tests model merging on NarrativeQA using TIES merging (ρ=0.3). Compared to full retraining, merging cuts compute by 33% at K=2 corpora and by a factor of 5.5 at K=10 corpora.
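For readers unfamiliar with TIES merging, the sketch below shows its three steps (trim, elect sign, disjoint mean) on flat lists of floats. It is a toy illustration, not the research team's code. It assumes ρ is the density, i.e. the fraction of each task vector kept in the trim step, which the summary does not state. The helper `retrain_vs_merge` is likewise hypothetical. It encodes one simple cost model (unit cost per corpus, each full retraining revisits every corpus seen so far, merging itself is free) that happens to reproduce both reported figures, although the summary does not say how the savings were computed.

```python
from typing import List, Sequence


def trim(tau: Sequence[float], density: float) -> List[float]:
    """Keep the `density` fraction of entries with largest magnitude; zero the rest."""
    k = max(1, round(density * len(tau)))
    keep = set(sorted(range(len(tau)), key=lambda i: abs(tau[i]), reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(tau)]


def ties_merge(base: Sequence[float], finetuned: Sequence[Sequence[float]],
               density: float = 0.3, scale: float = 1.0) -> List[float]:
    """Merge several fine-tuned weight vectors that share the same base weights."""
    # Task vector = fine-tuned weights minus base weights, then trimmed.
    trimmed = [trim([f - b for f, b in zip(model, base)], density) for model in finetuned]
    merged = []
    for i, b in enumerate(base):
        column = [t[i] for t in trimmed]
        total = sum(column)
        sign = (total > 0) - (total < 0)  # elected sign: sign of the summed updates
        # Disjoint mean: average only the nonzero entries that agree with the sign.
        agree = [v for v in column if v != 0.0 and (v > 0) == (sign > 0)]
        delta = sum(agree) / len(agree) if sign != 0 and agree else 0.0
        merged.append(b + scale * delta)
    return merged


def retrain_vs_merge(k: int) -> float:
    """Assumed cost ratio after k corpora: cumulative retraining vs. train-once-and-merge."""
    retrain = k * (k + 1) // 2  # the j-th update retrains on all j corpora seen so far
    merge = k                   # each corpus is trained on exactly once
    return retrain / merge
```

Under this assumed accounting, `retrain_vs_merge(2)` is 1.5, meaning merging uses two thirds of the compute (a saving of about 33%), and `retrain_vs_merge(10)` is 5.5.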
Switching the EXECUTIVE model from Qwen2.5-32B-Instruct to Gemini-3-Flash yields gains of +12.45%, +26.73%, and +11.90% across the three benchmarks without retraining the MEMORY model. This means a trained MEMORY model can be paired with a stronger EXECUTIVE model as-is, so new knowledge is added and the reasoning model is upgraded independently, with no change to the LLM's parameters.
Source: MarkTechPost