Fable 5 just set a new AI freelance work performance record - but it can't replace humans yet
Follow ZDNET: Add us as a preferred source on Google.

After a brief hiatus, Anthropic's lauded Fable 5 model is back, and it's resetting the bar for automating work.
The US government re-authorized the model -- which Anthropic said shares capability similarities with Mythos 5, still available only to select organizations -- on June 30. But before it was pulled, the Center for AI Safety (CAIS) tested Fable 5 on its Remote Labor Index (RLI), released in October 2025. It blew Anthropic's Opus 4.8 and OpenAI's GPT-5.5, each relatively new and considered impressive, out of the water.
Also: How to beat the AI algorithm and get the job of your dreams
RLI measures "how often AI agents can complete real, economically valuable freelance projects [...] at a quality a paying client would actually accept," CAIS explained in the study. These can include computer-assisted and graphic design, data analysis, video work, and more. As in other similar human ability tests, each deliverable the models create is evaluated by humans against a professional standard deliverable. The resulting automation rate reflects the share of projects in which evaluators found what the AI produced to be as good as or better than human professional work.
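To make the metric concrete, here is a minimal sketch of how an automation rate of this kind could be computed, assuming each project ends in a single pass/fail verdict from human evaluators. The function name and the sample verdicts are hypothetical and are not taken from CAIS's methodology or data.

```python
def automation_rate(verdicts):
    """Return the percentage of projects whose AI deliverable was accepted.

    `verdicts` is a list of booleans: True means evaluators judged the AI
    deliverable as good as or better than the human professional's work.
    This pass/fail simplification is an assumption for illustration.
    """
    if not verdicts:
        raise ValueError("need at least one evaluated project")
    return 100.0 * sum(verdicts) / len(verdicts)


# Made-up verdicts for ten projects, for illustration only.
sample_verdicts = [True, False, False, False, True,
                   False, False, False, False, False]
sample_rate = automation_rate(sample_verdicts)
print(f"{sample_rate:.1f}%")  # prints "20.0%"
```

Under this reading, a score of 16.1% would mean evaluators accepted roughly one in six AI deliverables.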
CAIS asked Fable 5, GPT-5.5, and Opus 4.8 to design a 3D mockup of an engagement ring, create a video ad, and map a floor plan, among other tests. Researchers gave each model human-generated input files to get started, similarly to how you'd prep a human freelancer with relevant documents and information for a job.
Also: Anthropic's Mythos is evolving faster than expected, reports AI safety agency
Fable 5 hit an automation rate of 16.1%, a record for the benchmark -- and nearly double Opus 4.8, which scored 8.3%. GPT-5.5 came in third at 6.3%, but CAIS noted that all three models scored higher than every model it's evaluated thus far.
"For context, the previous published leader sat at 4.17% (Opus 4.6 with the Claude Cowork scaffold), and the field topped out at 2.5% when RLI was released," CAIS said. "The frontier has more than quadrupled in under eight months, a concrete signal of how quickly economically capable AI agents are advancing."
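The reported figures can be compared with simple arithmetic. The sketch below uses only the percentages quoted in this article; the variable names are illustrative, and the article does not spell out which baseline CAIS used for its growth comparison.

```python
# Automation rates (in percent) as reported in the article.
scores = {"Fable 5": 16.1, "Opus 4.8": 8.3, "GPT-5.5": 6.3}
previous_leader = 4.17  # Opus 4.6 with the Claude Cowork scaffold
launch_top = 2.5        # best score when RLI was released

top = scores["Fable 5"]

# Ratio of Fable 5's score to each comparison point.
ratio_vs_opus = top / scores["Opus 4.8"]
ratio_vs_previous = top / previous_leader
ratio_vs_launch = top / launch_top

print(f"vs. Opus 4.8:        {ratio_vs_opus:.2f}x")      # 1.94x
print(f"vs. previous leader: {ratio_vs_previous:.2f}x")  # 3.86x
print(f"vs. launch-era top:  {ratio_vs_launch:.2f}x")    # 6.44x
```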
Automation rates measured by CAIS against its RLI benchmark.
CAIS noted that its testing was cut short by the government shutting down Fable 5 in mid-June, but that even these partial results set the model apart.
Source: ZDNet