Z.ai: GLM 4.6
z-ai/glm-4.6
Upcoming deprecation
This model will be deprecated on 2026-05-14. Please plan to migrate to an alternative model before this date.
Description
Compared with GLM-4.5, this generation brings several key improvements:
- Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks.
- Superior coding performance: The model achieves higher scores on code benchmarks and performs better in real-world use with tools such as Claude Code, Cline, Roo Code, and Kilo Code, including generating more visually polished front-end pages.
- Advanced reasoning: GLM-4.6 shows a clear improvement in reasoning performance and supports tool use during inference, leading to stronger overall capability.
- More capable agents: GLM-4.6 performs better in tool-use and search-based agents and integrates more effectively with agent frameworks.
- Refined writing: Output aligns better with human preferences in style and readability and reads more naturally in role-playing scenarios.
Specifications
Pricing
| Type | Price / 1M tokens |
|---|---|
| Input | $0.60 |
| Output | $2.20 |
| Cache Read | $0.11 |
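
To make the per-token prices concrete, here is a minimal sketch of a cost estimate. The rates are copied from the table above; the function name `estimate_cost` is hypothetical, and the billing model is an assumption: the sketch treats cached tokens as billed at the Cache Read rate and counts `input_tokens` as uncached input only. Actual invoices may be computed differently.

```python
# Prices in USD per 1M tokens, copied from the pricing table above.
PRICE_PER_MILLION = {
    "input": 0.60,
    "output": 2.20,
    "cache_read": 0.11,
}


def estimate_cost(input_tokens, output_tokens, cache_read_tokens=0):
    """Return an estimated request cost in USD.

    Assumption (not confirmed by the pricing table): cached tokens are
    billed at the cache-read rate and are not included in input_tokens.
    """
    total = (
        input_tokens * PRICE_PER_MILLION["input"]
        + output_tokens * PRICE_PER_MILLION["output"]
        + cache_read_tokens * PRICE_PER_MILLION["cache_read"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 150K-token prompt with a 2K-token reply, nothing cached.
    print(f"${estimate_cost(150_000, 2_000):.4f}")
```

Under these assumptions, output tokens cost a little under four times as much as input tokens, so long replies dominate the bill faster than long prompts.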
Quick Start
curl https://api.ominigate.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-omg-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "z-ai/glm-4.6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
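
The same request can be sketched in Python using only the standard library. The endpoint, headers, and JSON body mirror the curl command above. The helper names `build_request` and `send` and the `OMG_API_KEY` environment variable are hypothetical, and the response is printed as raw JSON because its shape is not documented here.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.ominigate.ai/v1/chat/completions"


def build_request(prompt, api_key, model="z-ai/glm-4.6"):
    """Build (but do not send) a POST request matching the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # OMG_API_KEY is a hypothetical variable name; use whatever holds your key.
    api_key = os.environ.get("OMG_API_KEY")
    if api_key:
        print(send(build_request("Hello!", api_key)))
    else:
        print("Set OMG_API_KEY to send a live request.")
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.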