Harvard and Perplexity Study Reveals AI Agents Perform 26 Minutes of Autonomous Work per Session
New research from Harvard and Perplexity compares AI agents and search in knowledge work, finding agents perform 26 minutes of autonomous work per session.

A new working paper from Perplexity and Harvard offers field evidence on what AI agents do to knowledge work. The study draws on production data from two Perplexity products: Search and Computer. The setup is a natural comparison.
Search is a conversational answer engine, while Computer is an agent that plans and executes tasks end to end. The same users interact with both products, allowing the team to hold the task roughly constant. The study covers a 90-day window, February 27 through May 27, 2026; Computer launched two days before that window opened.
The core method matches near-identical query pairs across the two products. The research team found 10,000 session pairs with cosine similarity above 0.99, effectively the same task attempted both ways. Computer pairs are gated to sessions that invoke an execution tool, including code execution, browser actions, file writes, and connector calls.
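The matching step can be sketched in a few lines. This is a minimal illustration, not the paper's pipeline: it assumes each session is already represented by an embedding vector, and the names `cosine_similarity`, `match_pairs`, and the `used_tool` flag are hypothetical. Only the 0.99 threshold and the execution-tool gate come from the study as described above.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def match_pairs(search_sessions, computer_sessions, threshold=0.99):
    """Pair Search and Computer sessions whose embeddings nearly coincide.

    search_sessions:   {session_id: embedding}
    computer_sessions: {session_id: (embedding, used_tool)}  # hypothetical layout
    Returns a list of (search_id, computer_id, similarity).
    """
    pairs = []
    for s_id, s_vec in search_sessions.items():
        for c_id, (c_vec, used_tool) in computer_sessions.items():
            # Gate: skip Computer sessions that never invoked an execution tool.
            if not used_tool:
                continue
            sim = cosine_similarity(s_vec, c_vec)
            if sim > threshold:
                pairs.append((s_id, c_id, sim))
    return pairs
```

The nested loop compares every Search session with every Computer session, which is fine for illustration but would not scale to production volumes without an approximate nearest-neighbor index.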
This gate ensures every Computer session does real autonomous work. Adoption rose over the window, with cumulative Computer queries reaching 84 times their first-week total. A matched analysis found Computer adoption also raised users' daily Search queries by 1.05, indicating complementarity rather than substitution.
The paper grounds its analysis in a simple task-based model, where each task has a step count and longer tasks carry weakly higher value. Agents change the cost structure: they impose a higher fixed cost per task for delegation and review but a lower marginal cost per step, because the system does the executing. This produces a breakeven step count, below which the conversational mode is cheaper and above which the agent mode wins.
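The breakeven logic can be made concrete with a small sketch. It assumes, for illustration only, that each mode's cost is linear in the step count (fixed cost plus marginal cost times steps); the article does not state the paper's exact functional form, and the function names and numbers below are hypothetical.

```python
def breakeven_steps(fixed_chat, marginal_chat, fixed_agent, marginal_agent):
    """Step count at which both modes cost the same.

    Assumed linear model: cost(n) = fixed + marginal * n.
    Setting the two costs equal gives
        n* = (fixed_agent - fixed_chat) / (marginal_chat - marginal_agent)
    """
    if marginal_agent >= marginal_chat:
        raise ValueError("agent needs a lower marginal cost for a breakeven to exist")
    return (fixed_agent - fixed_chat) / (marginal_chat - marginal_agent)

def cheaper_mode(steps, fixed_chat, marginal_chat, fixed_agent, marginal_agent):
    """Return which mode is cheaper for a task with the given step count."""
    chat_cost = fixed_chat + marginal_chat * steps
    agent_cost = fixed_agent + marginal_agent * steps
    return "agent" if agent_cost < chat_cost else "conversational"

# Hypothetical costs in arbitrary units, chosen only to show the shape.
costs = dict(fixed_chat=1.0, marginal_chat=2.0, fixed_agent=10.0, marginal_agent=0.5)
print(breakeven_steps(**costs))   # 6.0
print(cheaper_mode(3, **costs))   # conversational
print(cheaper_mode(20, **costs))  # agent
```

With these illustrative numbers, a three-step task costs less in the conversational mode, while a twenty-step task costs less when delegated to the agent.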
Short lookups stay manual, while long workflows move to the agent. The first autonomy measure is execution time: Computer runs 26 minutes of machine work per session on average and Search runs 33 seconds, a 48-fold gap. Medians show the same pattern: 9 minutes versus 14 seconds.
The gap varies by domain, with local tasks showing a 75-fold gap and Science a 26-fold gap. Higher autonomy did not lower quality. The research team scored dissatisfaction from what users did in the next turn: Computer's meaningful dissatisfaction rate was 1.3 percent, against 2.9 percent for Search, a 55 percent reduction.
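The headline ratios can be checked from the rounded figures quoted in this article. The sketch below uses hypothetical helper names; the small difference between the computed mean gap (about 47.3) and the reported 48 presumably comes from the paper using unrounded values, which is an assumption rather than something the article confirms.

```python
def fold_gap(agent_seconds, search_seconds):
    """How many times longer the agent runs than Search."""
    return agent_seconds / search_seconds

def relative_reduction(baseline, treated):
    """Fractional drop from the baseline rate to the treated rate."""
    return (baseline - treated) / baseline

mean_gap = fold_gap(26 * 60, 33)          # 26 minutes vs 33 seconds
median_gap = fold_gap(9 * 60, 14)         # 9 minutes vs 14 seconds
reduction = relative_reduction(2.9, 1.3)  # dissatisfaction rates, in percent

print(f"mean gap:   {mean_gap:.1f}x")     # mean gap:   47.3x
print(f"median gap: {median_gap:.1f}x")   # median gap: 38.6x
print(f"reduction:  {reduction:.0%}")     # reduction:  55%
```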
Follow-up turns also shift toward review and extension on Computer, though the changes are small. The difference in connector usage is clearer: Computer invokes at least one connector in 7.9 percent of sessions, versus 1.8 percent for Search. Computer chains external tools that Search users would otherwise run by hand.
Source: MarkTechPost