New MRAgent Framework Uses 118K Tokens Per Query, Outperforms LangMem

AI News Desk

VentureBeat

Jun 26, 2026

MRAgent framework reduces token consumption and runtime costs for long-horizon reasoning tasks in AI agents.

Long-horizon reasoning exposes a core weakness in AI agents: context windows fill up fast, and retrieval pipelines return noise instead of signal. To address this, researchers at the National University of Singapore developed MRAgent, a framework that abandons the static "retrieve-then-reason" approach. Instead, it lets an agent dynamically reconstruct its memory as evidence accumulates, and it integrates this multi-step memory reconstruction into the reasoning process of the large language model (LLM).

Passive retrieval has significant limits in long-horizon tasks. In classic retrieval pipelines, documents are retrieved through vector search or graph traversal and then passed to an LLM for reasoning.
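To make the passive pattern concrete, here is a minimal sketch of a single-pass "retrieve-then-reason" pipeline. It is an illustration only, not code from MRAgent or LangMem: the bag-of-words `embed` function stands in for a real embedding model, and the names `retrieve_top_k`, `build_prompt`, and `DOCS` are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_top_k(query, documents, k=2):
    """Rank documents once against the original query and keep the top k."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents, k=2):
    """Assemble the context that would be handed to the LLM in one shot."""
    context = retrieve_top_k(query, documents, k)
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query

# Hypothetical memory store for illustration.
DOCS = [
    "Alice moved to Lisbon in March",
    "Alice adopted a dog named Pixel",
    "Pixel was adopted from a shelter in Porto",
    "Bob likes hiking",
]

top = retrieve_top_k("where did alice adopt her dog", DOCS, k=2)
```

In this toy run, `top` contains the two documents that mention Alice, while the document that actually holds the answer ("Porto") is never retrieved, because it is only reachable through the cue "Pixel" found in the first hit. The pipeline has no step at which it could issue that follow-up query.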

This passive approach fails because it cannot interleave reasoning with memory access, which creates three major bottlenecks. First, these systems cannot revise their retrieval strategy mid-reasoning: if an agent fetches a document and discovers a crucial missing cue, such as a specific date or person, it has no way to issue a new query based on that finding. Second, fixed similarity scores and predefined graph expansions return surface-level matches that flood the LLM's context window with irrelevant noise, degrading reasoning. Third, current systems rely heavily on pre-constructed structures such as top-k results and static relevance functions, which limits the flexibility required to scale across unpredictable, long-horizon user interactions.

The researchers argue that to overcome these limitations, developers must shift toward an "active and associative reconstruction process," a concept inspired by cognitive neuroscience. Under this paradigm, memory recall unfolds sequentially rather than operating as a passive read-out of a static database.

The system starts with small, specific triggers from the user's prompt, such as a person's name, an action, or a place. These initial hints point to connected concepts or categories instead of massive blocks of text. By following these metadata stepping stones, the agent gathers small pieces of evidence one by one, using each new piece of information to guide its next step until it pieces together the full, accurate story.

MRAgent implements active memory reconstruction by treating memory as an interactive environment. When processing a complex query, the agent uses the backbone LLM's reasoning abilities to explore multiple candidate retrieval paths across a structured memory graph. At each step, the LLM evaluates the intermediate evidence it has gathered and uses it to iteratively refine its search: it infers new search constraints, pursues the most informative paths, and prunes irrelevant branches.

The framework organizes its database using a "Cue-Tag-Content" mechanism.
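The loop described above can be sketched roughly as follows. This is a hedged illustration under stated assumptions, not MRAgent's implementation: the article does not specify the actual layout of the Cue-Tag-Content store, so the two-level mapping here (cue to tags, tag to content) is a guess at its general shape. The LLM's judgement is also replaced by simple keyword overlap in the hypothetical helpers `extract_cues` and `relevance`.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    """Assumed layout: cues point to tags, tags point to stored content."""
    cue_to_tags: dict     # cue (e.g. a name) -> set of tag names
    tag_to_content: dict  # tag (a category) -> list of text snippets

def words_of(text):
    """Lowercased word set with basic punctuation stripped."""
    return {w.strip(".,?").lower() for w in text.split()}

def extract_cues(text, known_cues):
    """Stand-in for the LLM step that infers new search constraints."""
    return words_of(text) & known_cues

def relevance(content, constraints):
    """Stand-in for the LLM's evidence evaluation: count shared terms."""
    return len(words_of(content) & constraints)

def reconstruct(query, memory, max_steps=3, min_score=1):
    """Follow cues step by step, keeping relevant evidence and pruning the rest."""
    known_cues = set(memory.cue_to_tags)
    constraints = words_of(query)               # grows as evidence accumulates
    frontier = extract_cues(query, known_cues)  # initial triggers from the prompt
    visited, evidence = set(), []
    for _ in range(max_steps):
        if not frontier:
            break
        discovered = set()
        for cue in sorted(frontier):
            visited.add(cue)
            for tag in sorted(memory.cue_to_tags[cue]):
                for content in memory.tag_to_content.get(tag, []):
                    if content in evidence:
                        continue
                    if relevance(content, constraints) < min_score:
                        continue  # prune: this branch contributes no new cues
                    evidence.append(content)
                    new_cues = extract_cues(content, known_cues)
                    constraints |= new_cues
                    discovered |= new_cues
        frontier = discovered - visited         # only unexplored cues continue
    return evidence

# Hypothetical memory contents for illustration.
MEMORY = Memory(
    cue_to_tags={"alice": {"pets"}, "pixel": {"pets", "shelter"}, "bob": {"pets"}},
    tag_to_content={
        "pets": ["Alice adopted a dog named Pixel.", "Bob has a cat named Mittens."],
        "shelter": ["Pixel was adopted from a shelter in Porto."],
    },
)

result = reconstruct("Where did Alice adopt her dog?", MEMORY)
```

In this toy run, the first step keeps the snippet about Alice and Pixel, which surfaces "pixel" as a new cue. The second step follows that cue to the shelter snippet, while the unrelated snippet about Bob is pruned at both steps. Calling `reconstruct` with `max_steps=1` returns only the first snippet, which mirrors what a single-pass pipeline would see.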

