AI Models

Z.ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Thinking-Effort Levels, and No Benchmarks at Launch

AI News Desk

MarkTechPost

Jun 15, 2026

3 min read

GLM-5.2 is the latest large language model from Z.ai, becoming the third major release in the GLM-5 line.

ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Thinking-Effort Levels, and No Benchmarks at Launch">

GLM-5.2 is the latest large language model from Z.ai, becoming the third major release in the GLM-5 line. It follows GLM-5 (February 11), GLM-5-Turbo (March 15), and GLM-5.1 (April 7). That makes four flagship-tier coding releases in roughly four months.

GLM-5.2’s standout spec is a 1,000,000-token context window. Z.ai labels the variant glm-5.2[1m] in its own configuration. Each response can return up to 131,072 output tokens. That is roughly a 5x jump from GLM-5.1’s 200,000-token window.

A 1M-token window changes how a coding agent works in practice. The agent can hold an entire mid-sized repository in working memory. That includes source files, tests, configuration, and conversation history. It avoids the constant summarization that smaller windows force.

The release also adds two thinking-effort levels: High and Max. Z.ai recommends Max effort for complex, multi-step coding work. In Claude Code, the /effort command controls this setting. The xhigh, max, and ultracode options all map to GLM-5.2’s Max effort.

Z.ai did not specify GLM-5.2’s architecture in its launch materials. But based on community notes, the GLM-5 base is a 744-billion-parameter Mixture-of-Experts model. It activates 40 billion parameters per token. GLM-5.1 kept that same backbone with retargeted post-training.

Pick your agent and effort mode. Copy the exact config. See what 1M tokens buys you.

Here is the important caveat. Z.ai published no benchmark scores for GLM-5.2 at launch. There is no SWE-bench, Terminal-Bench, or Code Arena number yet. The announcement focused on availability, context, and the open-source roadmap.

For Claude Code, edit ~/.claude/settings.json . Point the Sonnet and Opus slots at the 1M variant. Raise the auto-compact window so the agent uses the full context.

Alternatively, set the endpoint through environment variables. The Anthropic-compatible endpoint accepts a base-URL swap.

Then run /effort in a session and select max . Run /status to confirm GLM-5.2 is active. For Cline, choose the OpenAI Compatible provider. Set the base URL to https://api.z.ai/api/coding/paas/v4 . Enter the custom model glm-5.2 and set context to 1,000,000.

GLM-5.2 is compatible with eight agentic coding tools from day one. The list includes Claude Code, Cline, OpenCode, and OpenClaw.

Share this article

X LinkedIn Telegram

Source: MarkTechPost