Z.ai's GLM-5.2 Open-Weights Model Beats GPT-5.5 on Coding Benchmarks at Fraction of Cost
Chinese AI startup Z.ai releases GLM-5.2, a 753-billion-parameter open-weights LLM that outperforms GPT-5.5 on multiple long-horizon coding benchmarks at one-sixth of the cost.

Chinese AI startup Z.ai (formerly Zhipu AI) has announced the immediate release of GLM-5.2, a 753-billion-parameter open-weights large language model (LLM) designed for 'long-horizon' autonomous coding and engineering tasks. The model is available on Hugging Face, through the Z.ai API, and in over 20 third-party coding environments, with subscription tiers starting at $12.60 per month. GLM-5.2 features a highly stable 1-million-token context window and is released under the permissive MIT open-source license, so enterprises can download, customize, and run the model while paying only for compute and electricity.
This approach is increasingly appealing to cost- and security-conscious businesses, especially as state-of-the-art American proprietary models face uncertain regulatory futures. The model's architecture includes a major optimization called 'IndexShare,' which reduces compute needs by reusing an identical indexer across every four sparse attention layers. According to Z.ai, this cuts per-token compute (FLOPs) by a factor of 2.9 at the maximum 1-million-token context length.
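The sharing idea behind IndexShare can be pictured with a small sketch. It assumes that consecutive sparse attention layers are grouped in fours and that each group runs its indexer once; the article does not describe the actual grouping, and the layer count and function names below are hypothetical. The sketch counts indexer runs only and does not attempt to reproduce the 2.9x figure, which covers total per-token compute.

```python
import math

# From the article: one indexer is reused across every four sparse attention layers.
GROUP_SIZE = 4


def indexer_group(layer_idx, group_size=GROUP_SIZE):
    """Return the id of the shared indexer a layer would use.

    Assumption: consecutive layers are grouped together (hypothetical mapping).
    """
    return layer_idx // group_size


def indexer_runs(num_layers, group_size=GROUP_SIZE):
    """Count indexer computations per token when each group shares one indexer."""
    return math.ceil(num_layers / group_size)


# Hypothetical layer count; the article does not state one.
num_layers = 32

print([indexer_group(i) for i in range(8)])  # layers 0-3 share one indexer, 4-7 the next
print(indexer_runs(num_layers), "indexer runs instead of", num_layers)
```

Under these assumptions the indexer work drops to a quarter, which is only one component of the model's total per-token compute.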
GLM-5.2 also features an upgraded Multi-Token Prediction (MTP) layer for speculative decoding and selectable 'Thinking Modes' that let users switch the model's reasoning effort between 'Max' and 'High.' On industry-standard benchmarks, GLM-5.2 outperforms most open-source flagship models and matches or beats proprietary leaders such as OpenAI's GPT-5.5 and Anthropic's Claude Opus 4.8. The model excels at agentic tool use and long-horizon software engineering, achieving top scores on several benchmarks, including SWE-bench Pro, FrontierSWE, and MCP-Atlas. The effect of the thinking modes is evident in the data: the 'Max' effort level yields the model's peak scores but uses nearly 85,000 output tokens per task.
Switching to the 'High' effort setting costs only a few points of performance while cutting output tokens roughly in half. Z.ai has also launched the GLM Coding Plan, which offers out-of-the-box support for third-party coding harnesses and tools. Its pricing tiers are highly competitive: the Lite plan starts at $12.60 per month, the Pro plan at $50.40 per month, and the Max plan at $112.00 per month.
For enterprise developers integrating the raw model into their applications, Z.ai's API pricing is significantly lower than that of Western rivals, at $1.40 per million input tokens and $4.40 per million output tokens. The company also offers a cached input rate of $0.26 per million tokens and, for a limited time, free cached input storage. The release of GLM-5.2 has been warmly received by the developer community, with many praising the model's performance, licensing, and cost-effectiveness.
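To make the API rates concrete, here is a rough cost estimator built from the per-million-token prices quoted above. It is a sketch, not Z.ai's billing logic: the function name is hypothetical, it assumes cached input tokens are billed at the cached rate in place of the full input rate, and the 200,000-token input size in the usage example is invented for illustration. The 'High' figure assumes exactly half of the roughly 85,000 output tokens reported for 'Max.'

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_RATE = 1.40
OUTPUT_RATE = 4.40
CACHED_INPUT_RATE = 0.26


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the API cost in USD for one request.

    cached_input_tokens is the part of input_tokens served from cache.
    Assumption: those tokens are billed at the cached rate instead of
    the full input rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_tokens * INPUT_RATE
        + cached_input_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / 1_000_000


# Hypothetical task: 200,000 input tokens, with output sizes taken from
# the 'Max' figure in the article and half of it for 'High'.
max_cost = estimate_cost(200_000, 85_000)
high_cost = estimate_cost(200_000, 42_500)
print(f"Max effort:  ${max_cost:.3f}")   # $0.654
print(f"High effort: ${high_cost:.3f}")  # $0.467
```

Because output tokens cost about three times as much as input tokens at these rates, the effort setting has a visible effect on per-task cost even when the input size stays the same.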
Source: VentureBeat