AI Models

Corti's new Symphony for Speech-to-Text model beats OpenAI at medical terminology accuracy, highlighting the value of specialized AI

AI News Desk

VentureBeat

May 20, 2026

3 min read

Corti launches Symphony for Speech-to-Text, a clinical-grade speech recognition model that outperforms OpenAI and other generalist models in medical terminology accuracy, demonstrating the value of specialized AI in healthcare.

Corti's new Symphony for Speech-to-Text model beats OpenAI at medical terminology accuracy, highlighting the value of specialized AI

['Today, Copenhagen-based healthcare AI Corti is launching Symphony for Speech-to-Text, a new generation of clinical-grade speech recognition models engineered specifically for real-time dictation, conversational transcription, and batch audio processing — and their accuracy rate is the highest for this specific use case yet recorded. "We are focused on ensuring our AI scribes can be trusted by physicians, medical practitioners and patients...the entire healthcare system," said Andreas Cleve, co-founder and CEO of Corti, in an exclusive video call interview with VentureBeat.', 'The performance data the company is bringing to the table paints a stark picture of the current state of enterprise AI: when it comes to highly regulated, specialized industries, domain-specific models can beat out the foundation model providers. In a newly published research paper , Corti revealed that its new clinical-grade speech models reduced word error rates (WER) by up to 93% when compared against leading generalist speech models and APIs on medical terminology.

On English medical terminology, its Symphony for Speech-to-Text achieved a remarkably low 1.4% WER . By comparison, OpenAI’s speech model registered a 17.7% WER , ElevenLabs hit 18.1% , Whisper recorded 17.4% , and Parakeet scored 18.9% .', 'Corti’s announcement serves as a critical inflection point for healthcare builders. While general-purpose APIs like OpenAI’s whisper are sufficient for broad-domain transcription, they frequently stumble over medical acronyms, complex medication dosages, shorthand, and noisy emergency room environments.

Symphony for Speech-to-Text aims to solve this by providing developers with a highly specialized, production-grade API designed from the ground up for clinical workflows. The agentic era demands flawless data inputs', 'The launch of Symphony for Speech-to-Text highlights a fundamental shift in how healthcare uses voice technology. For decades, medical speech recognition was primarily about generating a static text document for human doctors to review—a digital replacement for a notepad.

But as the healthcare industry hurtles into what technologists call the "agentic era," where autonomous AI agents actively assist in clinical decision-making, EHR navigation, and real-time support, the transcript is no longer the final product. It is the foundational data layer.', 'Corti’s architecture mitigates this risk by producing structured, clinically usable output directly from the API, helping downstream AI applications reason over clean facts rather than messy, unformatted text. Nowhere is this more evident than in Corti’s entity recall benchmarks.

Symphony for Speech-to-Text reached an astonishing 98.3% recall rate on formatted clinical entities— such as dosages, measurements, and dates. In contrast, Corti reported that the strongest general-purpose baseline model maxed out at just 44.3% recall f or the same entities.', 'By providing this level of accuracy via an API endpoint, Corti is enabling third-party developers, EHR vendors, and virtual care platforms to build their own custom dictation and ambient listening tools that outperform the industry\'s legacy incumbent. "We want people to build apps atop our models," Cleve said.

Share this article

X LinkedIn Telegram

Source: VentureBeat