Inception: Mercury 2

inception/mercury-2

Mar 4, 2026 · 128K context · 50K max output · $0.25/M in · $0.75/M out · Reasoning

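Mercury 2 is an extremely fast reasoning LLM and the first reasoning diffusion LLM (dLLM). Rather than generating tokens sequentially, Mercury 2 generates and refines multiple tokens in parallel, reaching >1,000 tokens/sec on standard GPUs. It is more than 5x faster than leading speed-optimized LLMs such as Claude 4.5 Haiku and GPT 5 Mini, at a fraction of their cost. Mercury 2 supports tunable reasoning levels, 128K context, native tool use, and schema-aligned JSON output. It is built for latency-sensitive coding workflows, real-time voice and search, and agent loops, and it is compatible with the OpenAI API. See the blog for details.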

Provider

inception

Context length

128K

Max output

50K

Modalities

Input: text

Output: text

Pricing

Type	Price / million tokens
Input	$0.25
Output	$0.75
Cache read	$0.03
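
To make the per-million-token rates concrete, here is a minimal cost-estimation sketch. The three prices come from the table above; the function name `estimate_cost` and the billing rule (cached prompt tokens charged at the cache-read rate instead of the input rate) are assumptions for illustration, not confirmed billing behavior.

```python
from decimal import Decimal

# USD per million tokens, copied from the pricing table above.
PRICE_PER_MILLION = {
    "input": Decimal("0.25"),
    "output": Decimal("0.75"),
    "cache_read": Decimal("0.03"),
}


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the part of input_tokens served from
    cache, and it is billed at the cache-read rate instead of the
    regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cache_read"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / Decimal(1_000_000)
```

Under these assumptions, a request with 100,000 prompt tokens (80,000 of them cached) and 2,000 output tokens would be estimated as `estimate_cost(100_000, 2_000, cached_tokens=80_000)`. `Decimal` is used so that small per-request amounts are not distorted by floating-point rounding.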

Quick start

curl https://api.ominigate.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-omg-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "inception/mercury-2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
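
The same request can be sketched in Python using only the standard library. The endpoint, model ID, and headers mirror the curl command above; the helper names `build_request` and `send` are hypothetical, and the response parsing assumes an OpenAI-style body (`choices[0].message.content`), which follows from the stated OpenAI API compatibility but is not confirmed here.

```python
import json
import urllib.request

# Endpoint and model ID taken from the curl example above.
API_URL = "https://api.ominigate.ai/v1/chat/completions"
MODEL = "inception/mercury-2"


def build_request(api_key, prompt):
    """Build (but do not send) a chat completion request (hypothetical helper)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the assistant's reply text.

    Assumption: the response uses the OpenAI chat completions shape,
    i.e. the text is at choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a real key in place of the placeholder, `send(build_request("sk-omg-your-api-key", "Hello!"))` would issue the call. Keeping request construction separate from sending makes the payload easy to inspect without touching the network.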
