Osaurus brings both local and cloud AI models to your Mac
Osaurus offers an open-source, Apple-only LLM server that lets users switch between local and cloud AI models while keeping files and tools on their own hardware.
As AI models become increasingly commoditized, startups are racing to build the software layer that sits on top of them. One interesting entrant into this space is Osaurus, an open-source, Apple-only LLM server that lets users move between different AI models, whether they run locally or in the cloud, while keeping their files and tools on their own hardware.
Osaurus evolved out of the idea for a desktop AI companion called Dinoki, which Osaurus co-founder Terence Pae described as a sort of "AI-powered Clippy." Dinoki's customers had asked Pae why they should buy the app if they still had to pay for tokens, the usage units AI companies charge for processing prompts and generating responses.
That got Pae thinking more deeply about running AI locally. "That's how Osaurus started," Pae, previously a software engineer at Tesla and Netflix, told TechCrunch over a call. The idea, he explained, was to try to run an AI assistant locally.
"You can do pretty much everything on your Mac locally, like browsing your files, accessing your browser, accessing your system configurations. I figured this would be a great way to position Osaurus as a personal AI for individuals."
Pae began building the tool in public as an open-source project, adding features and fixing bugs along the way. Today, Osaurus can flexibly connect with locally hosted AI models or cloud providers like OpenAI and Anthropic.
Users can freely choose which AI models they use while keeping other aspects of the AI experience on their own hardware, such as the models' memory and the users' files and tools. Given that different AI models have different strengths, the advantage of this system is that users can switch to the model that best fits their needs.
Osaurus presents an easy-to-use interface aimed at consumers, and addresses security concerns by running things in a hardware-isolated virtual sandbox. This limits the AI to a certain scope, keeping your computer and data safe.
The practice of running AI models on your own machine is still in its early days, given that it's heavily resource-intensive and hardware-dependent. To run local models, your system will need at least 64 GB of RAM.
For running larger models, like DeepSeek V4, Pae recommends systems with about 128 GB of RAM. However, Pae believes local AI's hardware requirements will come down over time.
"I can see the potential of it, because the intelligence per wattage, which is like the metric for local AI, has been going up significantly. It's on its own curve of innovation. Last year, local AI could barely finish sentences, but today it can actually run tools, write code, access your browser, and order stuff from Amazon [...] it's just getting better and better," he said.
Osaurus today can run MiniMax M2.5, Gemma 4, Qwen3.6, GPT-OSS, Llama, DeepSeek V4, and other models.
Source: TechCrunch