Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents That Outperform OpenAI Operator and Gemini 2.5
Microsoft Research's AI Frontiers lab releases Fara1.5, a family of computer-use agent models for the browser that outperform OpenAI's Operator and Google's Gemini 2.5 Computer Use on Online-Mind2Web.

Microsoft Research's AI Frontiers lab has released Fara1.5, a family of computer-use agent (CUA) models designed for the browser. The release includes three sizes: Fara1.5-4B, Fara1.5-9B, and Fara1.5-27B, all integrated with MagenticLite, Microsoft's sandboxed browser interface for these agents. The models drive a real browser by reading screenshots and emitting mouse and keyboard actions to complete tasks.

Computer-use agents like Fara1.5 are pixel-to-action models, in the same category as recent agent products such as OpenAI's Operator and Google's Gemini 2.5 Computer Use.
On Online-Mind2Web, a benchmark that covers 300 tasks across 136 popular sites, Fara1.5-27B scores 72% task success. In comparison, OpenAI's Operator scores 58.3% and Gemini 2.5 Computer Use scores 57.3%. Other notable scores include Yutori's Navigator n1 at 64.7% and Fara1.5-9B at 63.4%; the latter nearly doubles the 34.1% scored by its predecessor, Fara-7B, on the same benchmark.

The Fara1.5 models are built on Qwen3.5 base checkpoints at the 4B, 9B, and 27B sizes and operate through an observe-think-act loop.
At each step, the model considers the prior conversation history and the three most recent browser screenshots, then emits its thoughts and a single next action. The action space includes standard mouse and keyboard inputs, web-specific actions such as web search, and meta-actions for context management, such as memorizing facts for later use and asking the user clarifying questions.

Training for Fara1.5 involved supervised fine-tuning on roughly two million samples, comprising 60% web trajectories, 12.8% synthetic environments, 12.5% form filling and user interactions, 8.8% grounding, and 4.9% VQA, among others. The training trajectories were produced by a synthetic data pipeline, FaraGen1.5, which consists of three modular components: environments, solvers, and verifiers.
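
As a quick arithmetic check on the reported training mix, the short Python sketch below converts each share into an approximate sample count. It assumes a round total of 2,000,000 samples (the article says only "roughly two million"), and the dictionary keys are illustrative labels, not official dataset names. The five named shares sum to 99.0%, which leaves about 1% for the categories covered by "among others".

```python
# Reported supervised fine-tuning mix for Fara1.5, in percent of samples.
# Keys are illustrative labels, not official dataset names.
MIX_PERCENT = {
    "web_trajectories": 60.0,
    "synthetic_environments": 12.8,
    "form_filling_and_user_interactions": 12.5,
    "grounding": 8.8,
    "vqa": 4.9,
}

# Assumption: "roughly two million" is treated as exactly 2,000,000.
TOTAL_SAMPLES = 2_000_000


def approximate_counts(mix, total):
    """Convert percentage shares into rounded sample counts."""
    return {name: round(total * pct / 100) for name, pct in mix.items()}


def unlisted_share(mix):
    """Percentage not covered by the named categories ("among others")."""
    return round(100.0 - sum(mix.values()), 1)


if __name__ == "__main__":
    for name, count in approximate_counts(MIX_PERCENT, TOTAL_SAMPLES).items():
        print(f"{name:<36} ~{count:>9,} samples")
    print(f"unlisted remainder: ~{unlisted_share(MIX_PERCENT)}%")
```

Under that assumption, web trajectories alone account for about 1.2 million samples, and the unnamed remainder for about 20,000.
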
The environments include both open-internet tasks and gated-domain tasks that require authenticated sessions or irreversible actions, such as sending an email.

The development of Fara1.5 also emphasized safety and security. The models are trained to stop and ask the user in specific situations: when a task requires personal information that was not provided, when the task description is ambiguous, or when an irreversible action is about to be performed. Safety training uses public safety datasets and internal tasks aligned with Microsoft's Responsible AI Policy.
All agent actions within MagenticLite are logged and auditable, and its sandboxed browser provides a security boundary between the agent and the user's machine.

In additional benchmarks, Fara1.5-27B scored 88.6% on WebVoyager, while the 9B and 4B variants scored 86.6% and 80.8%, respectively. On WebTailBench v1.5, which targets long-tail web tasks, Fara1.5-9B achieved 64.5% process success and 32.3% outcome success.
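
To make the observe-think-act loop described earlier more concrete, here is a minimal Python sketch of such a loop. It is not the Fara1.5 or MagenticLite implementation: every name in it (`Action`, `Step`, `AgentState`, `run_agent`, the action kinds) is hypothetical, the `policy` callable is a stub standing in for the model, and the browser and user are replaced by simple functions. It illustrates only the three points the article states: a rolling window of the three most recent screenshots, one action per step, and meta-actions for memorizing facts and asking the user.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List

MAX_SCREENSHOTS = 3  # per the article: the three most recent screenshots


@dataclass
class Action:
    kind: str           # illustrative: "click", "web_search", "memorize", "ask_user", "stop"
    argument: str = ""


@dataclass
class Step:
    thoughts: str       # free-text reasoning emitted before the action
    action: Action      # exactly one action per step


@dataclass
class AgentState:
    history: List[str] = field(default_factory=list)
    screenshots: Deque[bytes] = field(
        default_factory=lambda: deque(maxlen=MAX_SCREENSHOTS))
    memory: List[str] = field(default_factory=list)


def run_agent(task: str,
              policy: Callable[[AgentState], Step],
              take_screenshot: Callable[[], bytes],
              execute: Callable[[Action], None],
              ask_user: Callable[[str], str],
              max_steps: int = 20) -> AgentState:
    state = AgentState(history=[f"task: {task}"])
    for _ in range(max_steps):
        # Observe: capture the page; the deque drops the oldest frame itself.
        state.screenshots.append(take_screenshot())
        # Think: the policy sees the history plus the recent screenshots.
        step = policy(state)
        state.history.append(f"thought: {step.thoughts}")
        action = step.action
        if action.kind == "stop":
            break
        state.history.append(f"action: {action.kind} {action.argument}")
        # Act: dispatch the single action chosen for this step.
        if action.kind == "memorize":
            state.memory.append(action.argument)
        elif action.kind == "ask_user":
            state.history.append(f"user: {ask_user(action.argument)}")
        else:
            execute(action)  # mouse, keyboard, and web-specific actions
    return state


def demo():
    """Run a scripted four-step episode with a stubbed browser and user."""
    script = iter([
        Step("Recipient is missing, so ask.",
             Action("ask_user", "Who should receive the email?")),
        Step("Keep the address for later.",
             Action("memorize", "recipient=alice@example.com")),
        Step("Open the compose form.", Action("click", "compose button")),
        Step("Nothing more to do in this demo.", Action("stop")),
    ])
    frames = iter(range(1, 100))
    executed: List[Action] = []
    state = run_agent(
        task="email the report",
        policy=lambda _state: next(script),
        take_screenshot=lambda: f"frame-{next(frames)}".encode(),
        execute=executed.append,
        ask_user=lambda question: "alice@example.com",
    )
    return state, executed


if __name__ == "__main__":
    final_state, executed_actions = demo()
    print("\n".join(final_state.history))
    print("screenshots kept:", list(final_state.screenshots))
```

In this sketch the bounded `deque` is what enforces the three-screenshot window: the demo takes four screenshots, and only the last three remain in the state. Whether the real system trims its context this way is an assumption.
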
Source: MarkTechPost