Perplexity AI Introduces Hybrid Local-Server Inference Orchestrator for Personal Computer: Automatic On-Device and Cloud Task Routing
Perplexity AI announces the first hybrid local-server inference orchestrator, enabling automatic routing of AI tasks between local devices and cloud-based models.

Perplexity AI unveiled a hybrid local-server inference orchestrator for personal computers at Computex 2026. The system automatically decides whether each AI task runs on the user's local device or in the cloud, so users no longer have to make that choice manually. The technology responds to a three-way tension that AI systems face: accuracy, which demands the most capable models; privacy, which requires certain data to remain on-device; and cost and energy efficiency, which favor handling each task with the smallest suitable model.
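The three-way tension can be read as a constrained selection problem: among the models that are capable enough and allowed by the privacy constraint, pick the cheapest. The sketch below illustrates that reading only. The model labels, capability scores, and cost figures are hypothetical placeholders, not details Perplexity has published.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str          # hypothetical label, not a real product name
    on_device: bool    # True if the model runs on the user's machine
    capability: float  # assumed quality score in [0, 1]
    cost: float        # assumed relative cost/energy per task


# Illustrative candidates; every number here is made up for the example.
CANDIDATES = [
    Model("local-small", on_device=True, capability=0.40, cost=1.0),
    Model("cloud-mid", on_device=False, capability=0.70, cost=5.0),
    Model("cloud-frontier", on_device=False, capability=0.95, cost=20.0),
]


def pick_model(required_capability: float, must_stay_on_device: bool) -> Optional[Model]:
    """Return the cheapest model that is accurate enough and respects privacy."""
    eligible = [
        m for m in CANDIDATES
        if m.capability >= required_capability            # accuracy constraint
        and (m.on_device or not must_stay_on_device)      # privacy constraint
    ]
    # Cost/energy objective: smallest suitable model, or None if nothing fits.
    return min(eligible, key=lambda m: m.cost, default=None)
```

Under these assumed numbers, a private task that needs more capability than the local model offers has no eligible model, which is the case where a system would need to ask the user how to proceed.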
At the heart of Perplexity's solution is what the company calls hybrid agentic inference. A compact AI model runs locally on the user's device and evaluates each incoming task to decide whether it can be handled on-device or requires a frontier model in the cloud. The local model also assesses whether a task involves sensitive data, such as financial records, health information, or personal files, and requests user permission before sending a sensitive task to the cloud.
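The routing decision described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not Perplexity's implementation: the keyword check stands in for the local model's sensitivity judgment, and `estimated_difficulty`, `local_capacity`, and `ask_permission` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Route(Enum):
    LOCAL = "local"
    CLOUD = "cloud"


# Hypothetical markers; a real system would rely on the local model's judgment.
SENSITIVE_MARKERS = ("bank", "invoice", "diagnosis", "prescription", "passport")


@dataclass
class Task:
    prompt: str
    estimated_difficulty: float  # assumed score: 0.0 trivial, 1.0 needs a frontier model


def is_sensitive(task: Task) -> bool:
    """Crude stand-in for detecting financial, health, or personal data."""
    text = task.prompt.lower()
    return any(marker in text for marker in SENSITIVE_MARKERS)


def route(
    task: Task,
    ask_permission: Callable[[Task], bool],
    local_capacity: float = 0.5,  # assumed ceiling of what the local model can handle
) -> Route:
    # Tasks within the local model's reach never leave the device.
    if task.estimated_difficulty <= local_capacity:
        return Route.LOCAL
    # Harder tasks touching sensitive data go to the cloud only with user consent.
    if is_sensitive(task) and not ask_permission(task):
        return Route.LOCAL
    return Route.CLOUD
```

In this sketch a denied request falls back to local processing; whether the real product falls back, degrades the answer, or declines the task is not stated in the announcement.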
The hybrid orchestrator is a significant update to Perplexity's Personal Computer product, which launched on Mac in April 2026, with Windows support planned. Previously, the split between on-device processing and cloud-based computation was relatively fixed. The new orchestrator makes routing more flexible and efficient by letting the system reason about where each piece of a task should run.
Perplexity Computer, a cloud-based multi-model agentic product launched in February 2026, coordinates up to 20 AI models in a single workflow. The hybrid local-server inference orchestrator extends this capability to also optimize where computation runs.
Source: MarkTechPost