Meet OpenJarvis: A Local-First Framework for On-Device Personal AI Agents
Researchers at Stanford University and Lambda Labs unveil OpenJarvis, an open-source framework for on-device personal AI agents with tools, memory, and learning.

Researchers at Stanford University and Lambda Labs have published a paper introducing OpenJarvis, an open-source framework designed to run inference, agents, memory, and learning entirely on-device. The framework makes local-first the default and reserves cloud API queries for cases where they are strictly necessary. The team reports that optimized OpenJarvis configurations come close to the accuracy of cloud-based models.
According to the paper, open-weight models configured through OpenJarvis perform within 3.2 percentage points of the best cloud model on average, at roughly 800× lower marginal API cost per query and approximately 4× lower latency under the paper's benchmark protocol. OpenJarvis is not a single model but a framework that composes any supported model with a configurable agent stack.
The framework has been evaluated across 11 local models from four families. It decomposes a personal AI system into five typed primitives, which are independently swappable and are composed through a single declarative configuration object called a spec. Because each primitive can be replaced without touching the others, the same spec format carries over across different models and engines.
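To make the idea of a spec concrete, here is a minimal sketch of a declarative configuration object with independently swappable parts. It is not OpenJarvis's actual schema: the class name `Spec`, every field name, and every value are hypothetical, and the sketch models only the four primitive areas this article names (Intelligence, Engine, Agents, Tools & Memory).

```python
from dataclasses import dataclass, replace


# Hypothetical schema for illustration only; OpenJarvis's real spec
# format may use different names, types, and nesting.
@dataclass(frozen=True)
class Spec:
    intelligence: str                   # which open-weight model to run
    engine: str                         # which inference runtime serves it
    agent: str                          # which agent loop drives the model
    tools_and_memory: tuple[str, ...]   # enabled tools and memory backends


# Placeholder values, not real OpenJarvis identifiers.
base = Spec(
    intelligence="example-local-model",
    engine="example-engine-a",
    agent="example-agent",
    tools_and_memory=("calculator", "notes-memory"),
)

# Swap exactly one primitive; the other fields carry over unchanged.
variant = replace(base, engine="example-engine-b")

print(base)
print(variant)
```

Making the object immutable means every variant is a new value, which keeps comparisons between configurations straightforward.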
One of the key contributions of OpenJarvis is its LLM-guided spec search, a local-cloud collaboration that leverages a frontier cloud model as a teacher at search time. This teacher reads traces, diagnoses failure clusters, and proposes edits across Intelligence, Engine, Agents, and Tools & Memory. The optimized spec then runs entirely on-device at inference time, eliminating the need for cloud calls.
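The search loop described here can be sketched roughly as follows. This is a toy under stated assumptions, not the paper's algorithm: the teacher is a local stub standing in for a frontier cloud model, "running a task" is reduced to checking whether a required tool is enabled, and all names (`Spec`, `run_locally`, `stub_teacher`, `search`) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Spec:
    # Hypothetical fields; only `tools` is exercised in this toy.
    intelligence: str
    engine: str
    agent: str
    tools: tuple[str, ...]


def run_locally(spec: Spec, tasks: list[dict]) -> list[dict]:
    """Stand-in for on-device execution: a task passes if its tool is enabled."""
    return [
        {"task": t["name"], "needs": t["needs"], "ok": t["needs"] in spec.tools}
        for t in tasks
    ]


def accuracy(traces: list[dict]) -> float:
    return sum(t["ok"] for t in traces) / len(traces)


def stub_teacher(spec: Spec, traces: list[dict]) -> Optional[Spec]:
    """Stand-in for the cloud teacher: read traces, find the largest
    failure cluster, and propose one edit to the spec."""
    failures = Counter(t["needs"] for t in traces if not t["ok"])
    if not failures:
        return None  # nothing left to diagnose
    missing_tool, _ = failures.most_common(1)[0]
    return replace(spec, tools=spec.tools + (missing_tool,))


def search(spec: Spec, tasks: list[dict],
           teacher: Callable[[Spec, list[dict]], Optional[Spec]],
           rounds: int = 5) -> tuple[Spec, float]:
    """Keep a proposed edit only if it improves the locally measured score."""
    best, best_score = spec, accuracy(run_locally(spec, tasks))
    for _ in range(rounds):
        candidate = teacher(best, run_locally(best, tasks))
        if candidate is None:
            break
        score = accuracy(run_locally(candidate, tasks))
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


tasks = [
    {"name": "t1", "needs": "search"},
    {"name": "t2", "needs": "calendar"},
    {"name": "t3", "needs": "search"},
    {"name": "t4", "needs": "calculator"},
]
start = Spec("example-model", "example-engine", "example-agent", ("calculator",))
best, best_score = search(start, tasks, stub_teacher)
print(best.tools, best_score)
```

The teacher appears only inside `search`; once a spec is chosen, `run_locally` needs nothing but the spec and the tasks, which mirrors the split between cloud-assisted search and on-device inference.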
The research team evaluated OpenJarvis across eight benchmarks spanning 508 tasks, including tool calls, agentic workflows, coding, customer service, general assistance, and deep research. According to the team, local configurations approach cloud-based performance and in some cases exceed it. On average, the best single local model, Qwen3.5-122B, reaches 80.3% accuracy compared with 83.5% for Claude Opus 4.6, a gap of 3.2 pp.
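As a quick check on the headline numbers, the gap can be recomputed from the two reported averages. The snippet uses only the figures quoted above; the "retained" ratio is an extra derived quantity added for illustration, not a number from the paper.

```python
local_acc = 80.3   # Qwen3.5-122B average accuracy (%), as reported
cloud_acc = 83.5   # Claude Opus 4.6 average accuracy (%), as reported

# Absolute gap in percentage points (rounded to avoid float noise).
gap_pp = round(cloud_acc - local_acc, 1)

# Share of the cloud model's accuracy that the local model retains.
retained = local_acc / cloud_acc

print(f"gap: {gap_pp} pp, retained: {retained:.1%}")
```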
OpenJarvis is also designed to be easy to set up. Installation takes a single command, and the framework ships with a desktop GUI for macOS, Linux, and Windows. It includes eight built-in agents across three execution modes and connects to over 25 data sources and 32+ messaging channels.
Source: MarkTechPost