Z.ai

GLM-5 (glm-5)

High-end GLM lane for production reasoning and long-context workflows

Public model detail · Limited preview · MoE Transformer

Params: 744B / 40B active
Context: 198K
Max Output: 64K
License: GLM
TTFT: 520ms
Throughput: 42 tok/s
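As a back-of-envelope check, the TTFT and throughput figures above translate into an end-to-end latency estimate. This is a sketch only; it assumes a constant 42 tok/s decode rate after the 520 ms first token, while real latency varies with load and prompt length.

```python
# Rough wall-clock estimate from the listed TTFT and throughput figures.
TTFT_S = 0.520        # time to first token, in seconds
THROUGHPUT_TPS = 42   # steady-state decode speed, tokens per second

def estimated_latency(output_tokens: int) -> float:
    """Approximate seconds to stream a complete response."""
    return TTFT_S + output_tokens / THROUGHPUT_TPS

# A 1,000-token answer lands in roughly 24 seconds at these rates
print(round(estimated_latency(1000), 1))  # → 24.3
```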

Why pick it

  • Standard $2.25/M rate with a $1.80/M batch lane
  • Useful for GLM-heavy enterprise builds

Pricing

This model does not currently expose public self-serve pricing. Public rates appear only after backend verification.
Tier      Public      Cached      Price source      Note
Realtime  Not public  Not public  SiliconFlow lane  Public price reflects the runtime catalog without claimed savings comparisons
Batch     Not public  Not public  SiliconFlow lane  Batch public pricing follows the same runtime source
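Given the $2.25/M standard and $1.80/M batch rates listed under "Why pick it", per-run cost is simple arithmetic. A sketch, assuming a flat per-token rate; this page does not publish separate input/output pricing, and the table above notes the public rates are pending verification.

```python
# Token cost at the listed per-million rates.
STANDARD_PER_M = 2.25  # $ per 1M tokens, standard lane
BATCH_PER_M = 1.80     # $ per 1M tokens, batch lane

def run_cost(tokens: int, rate_per_m: float) -> float:
    """Dollar cost for a token count at a $/1M-token rate."""
    return tokens / 1_000_000 * rate_per_m

# 10M tokens: $22.50 standard vs $18.00 batch, a 20% saving
print(run_cost(10_000_000, STANDARD_PER_M))  # → 22.5
print(run_cost(10_000_000, BATCH_PER_M))     # → 18.0
```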

Quick start

OpenAI-compatible surface: swap the base URL and ship.

Python
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.luminapath.tech/v1",
    api_key=os.environ["BATCHIN_API_KEY"],  # read the key from the environment
)

resp = client.chat.completions.create(
    model="glm-5",
    messages=[{"role": "user", "content": "Summarize why this model is a fit for my workload"}]
)

print(resp.choices[0].message.content)
JavaScript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.luminapath.tech/v1",
  apiKey: process.env.BATCHIN_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "glm-5",
  messages: [{ role: "user", content: "Summarize why this model is a fit for my workload" }],
});

console.log(resp.choices[0]?.message?.content);
cURL
curl https://api.luminapath.tech/v1/chat/completions \
  -H "Authorization: Bearer ***" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5",
    "messages": [{"role":"user","content":"Summarize why this model is a fit for my workload"}]
  }'
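For production use against any OpenAI-compatible endpoint, transient 429/5xx responses are worth retrying with jittered exponential backoff. A minimal sketch; the retry policy here is an illustration, not a documented behavior of this service, and a real client would retry only on retryable status codes.

```python
import random
import time

def backoff_delays(retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Exponential backoff schedule with full jitter, capped at `cap` seconds."""
    return [random.uniform(0, min(cap, base * 2 ** i)) for i in range(retries)]

def with_retries(call, retries: int = 4):
    """Run `call`, sleeping between failed attempts per the schedule above."""
    for delay in backoff_delays(retries):
        try:
            return call()
        except Exception:
            time.sleep(delay)
    return call()  # final attempt lets its error propagate

# The ceiling of each delay doubles per attempt: 0.5, 1.0, 2.0, 4.0 seconds
print([min(8.0, 0.5 * 2 ** i) for i in range(4)])  # → [0.5, 1.0, 2.0, 4.0]
```

Wrap the `client.chat.completions.create(...)` call from the quick start in `with_retries` to smooth over brief capacity hiccups.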

Specs

Architecture: MoE Transformer
Vendor group: Z.ai
Context window: 198K
Max output: 64K
Best for: GLM workloads
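The 198K context window and 64K max-output cap above imply a simple budget check before each request. A sketch only; real token counts come from the model's tokenizer, and the exact accounting of prompt vs. output tokens against the window is an assumption here.

```python
CONTEXT_WINDOW = 198_000  # total tokens the model can attend to
MAX_OUTPUT = 64_000       # hard cap on generated tokens

def fits(prompt_tokens: int, requested_output: int) -> bool:
    """True if a request respects both the output cap and the context window."""
    return (requested_output <= MAX_OUTPUT
            and prompt_tokens + requested_output <= CONTEXT_WINDOW)

print(fits(150_000, 48_000))  # → True  (198,000 total: exactly at the window)
print(fits(150_000, 64_000))  # → False (214,000 total: over the window)
```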

Related models

  • GLM-5.1 (Z.ai, glm-5-1): Open flagship route for coding, reasoning, and long-horizon agent execution
  • GLM-4.7 (Z.ai, glm-4-7): Mid-tier GLM route for cost-aware engineering teams
  • Qwen3.5-397B (Qwen, qwen3-5-397b): Large Qwen route for premium general reasoning with a public batch discount
  • DeepSeek V4 Flash (DeepSeek, deepseek-v4-flash): Fast production DeepSeek route with standard, Asia, and batch pricing lanes