Xiaomi: MiMo-V2-Omni
xiaomi/mimo-v2-omni · Mar 18, 2026 · 262.1K context · 65.5K max output · $0.40/M in · $2.00/M out · Reasoning
Description
MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capabilities (visual grounding, multi-step planning, tool use, and code execution), making it well suited to complex real-world tasks that span modalities. It offers a 256K context window (262,144 tokens, shown as 262.1K in the specifications).
Specifications
Provider
xiaomi
Context Length
262.1K
Max Output
65.5K
Modality
In: text, audio, image, video
Out: text
Pricing
| Type | Price / 1M tokens |
|---|---|
| Input | $0.40 |
| Output | $2.00 |
| Cache Read | $0.08 |
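As a rough illustration of how the per-million-token rates in the table translate into a per-request cost, the sketch below defines a hypothetical `estimate_cost` helper (not part of any official tooling). It assumes that cached input tokens are billed at the Cache Read rate instead of the Input rate, which the table implies but does not state outright.

```shell
# Hypothetical helper: estimate the USD cost of one request from token counts.
# Rates are the per-1M-token prices from the pricing table.
# Assumption: cached input tokens are billed at the Cache Read rate only.
estimate_cost() {
  # $1 = uncached input tokens, $2 = output tokens, $3 = cached input tokens (optional)
  awk -v in_tok="$1" -v out_tok="$2" -v cache_tok="${3:-0}" 'BEGIN {
    cost = (in_tok * 0.40 + out_tok * 2.00 + cache_tok * 0.08) / 1000000
    printf "%.4f\n", cost
  }'
}

# 100,000 input tokens and 2,000 output tokens:
estimate_cost 100000 2000   # prints 0.0440
```

At these rates, output tokens cost five times as much as input tokens, so long generations tend to dominate the bill for short prompts.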
Quick Start
curl https://api.ominigate.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-omg-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "xiaomi/mimo-v2-omni",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
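For scripted use, it may be preferable to keep the API key out of the command itself. The sketch below wraps the same request in two hypothetical shell functions, `build_payload` and `send_chat`, and reads the key from an environment variable. The variable name `OMINIGATE_API_KEY` is an assumption for illustration; the endpoint, headers, and request body are the ones from the Quick Start.

```shell
# Hypothetical wrapper around the Quick Start request.
# OMINIGATE_API_KEY is an assumed variable name; export it before calling send_chat.

build_payload() {
  # $1 = user message. No JSON escaping is done here, so the message
  # must not contain double quotes or backslashes.
  printf '{"model": "xiaomi/mimo-v2-omni", "messages": [{"role": "user", "content": "%s"}]}' "$1"
}

send_chat() {
  # --fail makes curl return a non-zero exit status on HTTP errors;
  # ${VAR:?msg} aborts with a message if the key is unset or empty.
  curl --fail --silent --show-error https://api.ominigate.ai/v1/chat/completions \
    -H "Authorization: Bearer ${OMINIGATE_API_KEY:?set OMINIGATE_API_KEY first}" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1")"
}

# Example (performs a real, billable request):
# send_chat "Hello!"
```

Separating payload construction from the network call makes it possible to inspect the JSON body before sending anything.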