
Z.ai released GLM-5.2 on June 16, 2026, and Featherless is a Day Zero launch partner: the model has been available to run on Featherless since the day it shipped. Across three long-horizon coding benchmarks (FrontierSWE, PostTrainBench, and SWE-Marathon), GLM-5.2 is the highest-ranked open-weight model, and the only one that ranks alongside Claude Opus 4.8 and GPT-5.5 on that class of work. It is released under an MIT license and is available through the standard OpenAI-compatible API.
What changed from GLM-5.1
GLM-5.2 is roughly the same size as GLM-5.1, about 753B parameters in a Mixture-of-Experts design that activates around 39B per token. The gains come from a revised long-context architecture and training focused on coding agents, not from added scale.
Benchmark results improved across the board. Terminal-Bench 2.1 rose from 63.5 to 81.0, SWE-bench Pro from 58.4 to 62.1, FrontierSWE from 30.5 to 74.4, and SWE-Marathon from 1.0 to 13.0. Reasoning scores also improved: AIME 2026 moved from 95.3 to 99.2 and GPQA-Diamond from 86.2 to 91.2.
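To make the size of each jump easier to compare, the short sketch below computes the absolute and relative gains from the scores quoted above. The scores come from this article; the dictionary layout and the `gains` helper are illustrative names, not part of any Z.ai or Featherless tooling.

```python
# Scores quoted in this article, as (GLM-5.1, GLM-5.2).
SCORES = {
    "Terminal-Bench 2.1": (63.5, 81.0),
    "SWE-bench Pro": (58.4, 62.1),
    "FrontierSWE": (30.5, 74.4),
    "SWE-Marathon": (1.0, 13.0),
    "AIME 2026": (95.3, 99.2),
    "GPQA-Diamond": (86.2, 91.2),
}


def gains(scores):
    """Return {benchmark: (absolute_gain_points, relative_gain_percent)}."""
    return {
        name: (round(new - old, 1), round((new - old) / old * 100, 1))
        for name, (old, new) in scores.items()
    }


if __name__ == "__main__":
    for name, (delta, pct) in gains(SCORES).items():
        print(f"{name:<20} {delta:+6.1f} points  ({pct:+.1f}%)")
```

The largest absolute gain is on FrontierSWE (43.9 points). SWE-Marathon shows the largest relative gain, but it starts from a very low baseline of 1.0.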
Architecture and long context
GLM-5.2 is designed to maintain quality across long, complex coding-agent sessions rather than simply to accept more tokens. The architecture introduces IndexShare, which reuses a single lightweight indexer across every four sparse-attention layers and reduces per-token compute by 2.9x at long context lengths. An improved multi-token-prediction layer raises speculative-decoding acceptance by about 20%. The model also provides a thinking-effort control, with High and Max levels, to balance reasoning depth against latency and compute.
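As a rough illustration of the sharing pattern, the toy sketch below shows one possible reading of IndexShare: the indexer runs once at the first layer of each group of four, and the other layers in the group reuse its output. This is an assumption made for illustration only. The article does not describe the actual implementation, the function names and layer counts are hypothetical, and the sketch does not attempt to reproduce the 2.9x figure.

```python
GROUP_SIZE = 4  # "every four sparse-attention layers", per the article


def indexer_schedule(num_layers, group_size=GROUP_SIZE):
    """Map each layer to the layer whose indexer output it would reuse.

    Assumption: the first layer of each group runs the indexer and the
    remaining layers in that group share its result.
    """
    return {
        layer: (layer // group_size) * group_size
        for layer in range(num_layers)
    }


def indexer_calls(num_layers, group_size=GROUP_SIZE):
    """Count indexer passes: one per group instead of one per layer."""
    return len(set(indexer_schedule(num_layers, group_size).values()))


if __name__ == "__main__":
    layers = 8  # hypothetical layer count, chosen only for the demo
    print(indexer_schedule(layers))
    print(f"{indexer_calls(layers)} indexer passes for {layers} layers")
```

Under this reading, the number of indexer passes drops to roughly a quarter of the layer count. The overall compute saving depends on how much of the per-token cost the indexer accounts for.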
How it compares to the closed frontier
On FrontierSWE, GLM-5.2 (74.4) trails Claude Opus 4.8 (75.1) by about a point and edges out GPT-5.5 (72.6). On PostTrainBench it outperforms GPT-5.5 and ranks second only to Opus 4.8. On SWE-bench Pro (62.1) it surpasses both GPT-5.5 (58.6) and Gemini 3.1 Pro (54.2). On Terminal-Bench 2.1, its 81.0 is within a few points of Opus 4.8 (85.0) and ahead of Gemini 3.1 Pro (74.0). Across long-horizon coding work, it is the highest-ranked open-weight model.
Availability on Featherless
As a Day Zero launch partner, Featherless has served GLM-5.2 since its June 16 release. The model is available through the OpenAI-compatible API, in FP8 with up to a 256K context window, on the Premium and Agent Pro plans. Existing Featherless integrations need only the new model identifier; a minimal request looks like this:
import os

from openai import OpenAI

# Point the standard OpenAI client at the Featherless endpoint.
client = OpenAI(
    base_url="https://api.featherless.ai/v1",
    # Read the key from the environment rather than hard-coding it.
    api_key=os.environ["FEATHERLESS_API_KEY"],
)

response = client.chat.completions.create(
    model="zai-org/GLM-5.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
Featherless hosts the model directly, so no GPU provisioning, quantization, or inference-server setup is required. Prompts and completions are not logged.
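For long coding-agent replies, streaming the response can reduce the wait for the first token. The sketch below uses the OpenAI Python client's standard `stream=True` flag. It assumes the Featherless endpoint supports streaming for this model, which this article does not state. `collect_stream` and `stream_reply` are hypothetical helper names.

```python
import os


def collect_stream(chunks):
    """Concatenate the text deltas from an iterable of streamed chat chunks."""
    parts = []
    for chunk in chunks:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


def stream_reply(prompt, model="zai-org/GLM-5.2"):
    """Send one user message and return the full streamed reply as a string."""
    # Imported lazily so collect_stream can be used without the openai package.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.featherless.ai/v1",
        api_key=os.environ["FEATHERLESS_API_KEY"],
    )
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # standard OpenAI chat-completions flag
    )
    return collect_stream(stream)
```

With `FEATHERLESS_API_KEY` set, calling `print(stream_reply("Hello!"))` would print the assembled reply. To show tokens as they arrive, print each delta inside the loop instead of joining them at the end.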
License
GLM-5.2 is released under the MIT license, with no regional limits, usage caps, or monthly-active-user thresholds. It can be fine-tuned, integrated into commercial products, and redistributed; the only condition the MIT license imposes is that the copyright and license notice be retained.
Limitations
GLM-5.2 is text-only and does not accept image or audio input. On SWE-Marathon, the most demanding long-horizon benchmark, it trails Opus 4.8 by 13 points, so the closed-source frontier remains stronger on the hardest multi-hour tasks. Among open-weight models, it is currently the strongest option for coding and agentic workloads.
Get started
GLM-5.2 is available now on Featherless. Sign up or sign in at featherless.ai to start building, and open the GLM-5.2 model page for details.



