ai/kimi-k2

Verified Publisher

By Docker

Updated 2 months ago

Kimi K2 Thinking: open-source agent with deep reasoning, stable tool use, fast INT4, 256k context.


ai/kimi-k2 repository overview

Kimi K2

GGUF version by Unsloth


Description

Kimi K2 Thinking is the latest and most capable version of the open-source thinking model. Building on Kimi K2, it is trained as a thinking agent that reasons step by step while dynamically invoking tools. It sets a new state of the art on Humanity's Last Exam (HLE), BrowseComp, and other benchmarks by dramatically scaling multi-step reasoning depth and maintaining stable tool use across 200–300 sequential calls. K2 Thinking is also a natively INT4-quantized model with a 256K context window, achieving lossless reductions in inference latency and GPU memory usage.

Key Features

  • Deep Thinking & Tool Orchestration: End-to-end trained to interleave chain-of-thought reasoning with function calls, enabling autonomous research, coding, and writing workflows that last hundreds of steps without drift.
  • Native INT4 Quantization: Quantization-Aware Training (QAT) is employed in the post-training stage to achieve a lossless 2x speed-up in low-latency mode.
  • Stable Long-Horizon Agency: Maintains coherent goal-directed behavior across up to 200–300 consecutive tool invocations, surpassing prior models that degrade after 30–50 steps.
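The interleave-reasoning-with-tool-calls loop described above can be sketched as a simple agent loop. Everything here is a hypothetical stand-in: the tool, the stubbed model client, and all names are illustrative only; in practice the messages would go to an OpenAI-compatible chat endpoint with tools declared.

```python
# Minimal sketch of a think -> tool call -> think agent loop.
# `stub_model` stands in for the model; a real client would send
# `messages` to a chat-completions endpoint with `tools` declared.

import json

def search_web(query: str) -> str:
    """Hypothetical tool; a real agent would call an actual search API."""
    return json.dumps({"query": query, "results": ["..."]})

TOOLS = {"search_web": search_web}

def stub_model(messages):
    """Stand-in for the model: emits one tool call, then a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "search_web",
                              "arguments": {"query": "INT4 QAT"}}}
    return {"content": "final answer"}

def agent_loop(user_prompt, max_steps=300):
    """Run model/tool turns until the model stops requesting tools."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        reply = stub_model(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]  # model finished reasoning
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("step budget exhausted")

print(agent_loop("Summarize recent work on INT4 quantization."))
```

The `max_steps` budget mirrors the 200–300 sequential calls the model is trained to sustain; a real loop would also append the assistant's tool-call message before the tool result.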
| Field | Value |
|---|---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 1T |
| Activated Parameters | 32B |
| Number of Layers (Dense layer included) | 61 |
| Number of Dense Layers | 1 |
| Attention Hidden Dimension | 7168 |
| MoE Hidden Dimension (per Expert) | 2048 |
| Number of Attention Heads | 64 |
| Number of Experts | 384 |
| Selected Experts per Token | 8 |
| Number of Shared Experts | 1 |
| Vocabulary Size | 160K |
| Context Length | 256K |
| Attention Mechanism | MLA |
| Activation Function | SwiGLU |
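As a back-of-envelope check, the 1T-total / 32B-active split follows largely from the MoE expert configuration above. This sketch assumes SwiGLU's usual three weight matrices per expert (gate, up, down) and ignores attention, embedding, and dense-layer parameters, so the totals land somewhat below the headline numbers.

```python
# Rough parameter count implied by the spec table (experts only).

HIDDEN = 7168            # attention hidden dimension
EXPERT_HIDDEN = 2048     # MoE hidden dimension per expert
N_EXPERTS = 384
ACTIVE_EXPERTS = 8 + 1   # selected experts per token + shared expert
MOE_LAYERS = 61 - 1      # one dense layer, the rest MoE

per_expert = 3 * HIDDEN * EXPERT_HIDDEN          # ~44M params per expert
total_expert = MOE_LAYERS * N_EXPERTS * per_expert
active_expert = MOE_LAYERS * ACTIVE_EXPERTS * per_expert

print(f"expert params, total:  {total_expert / 1e12:.2f}T")   # ~1.01T
print(f"expert params, active: {active_expert / 1e9:.1f}B")   # ~23.8B
```

The remaining gap to 32B activated is plausibly attention (MLA), embeddings (160K vocabulary x 7168), and the dense layer, none of which are counted here.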

Use this AI model with Docker Model Runner

docker model run ai/kimi-k2
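Docker Model Runner also exposes an OpenAI-compatible HTTP API for models it serves. A minimal sketch of calling it from Python follows; the base URL below is the commonly documented host-side default, but both the port and whether TCP access is enabled depend on your Docker configuration, so treat the endpoint as an assumption.

```python
# Sketch: query the model through Docker Model Runner's
# OpenAI-compatible chat-completions endpoint.

import json
import urllib.request

BASE_URL = "http://localhost:12434/engines/v1"  # assumed default endpoint

payload = {
    "model": "ai/kimi-k2",
    "messages": [
        {"role": "user", "content": "Explain INT4 quantization briefly."}
    ],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the model is running locally:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```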

Benchmarks

Reasoning Tasks
| Benchmark | Setting | K2 Thinking | GPT-5 (High) | Claude Sonnet 4.5 | K2 0905 (Thinking) | DeepSeek-V3.2 | Grok-4 |
|---|---|---|---|---|---|---|---|
| HLE | no tools | 23.9 | 26.3 | 19.8* | 7.9 | 19.8 | 25.4 |
| HLE | w/ tools | 44.9 | 41.7* | 32.0* | 21.7 | 20.3* | 41.0 |
| HLE | heavy | 51.0 | 42.0 | - | - | - | 50.7 |
| AIME25 | no tools | 94.5 | 94.6 | 87.0 | 51.0 | 89.3 | 91.7 |
| AIME25 | w/ python | 99.1 | 99.6 | 100.0 | 75.2 | 58.1* | 98.8 |
| AIME25 | heavy | 100.0 | 100.0 | - | - | - | 100.0 |
| HMMT25 | no tools | 89.4 | 93.3 | 74.6* | 38.8 | 83.6 | 90.0 |
| HMMT25 | w/ python | 95.1 | 96.7 | 88.8* | 70.4 | 49.5* | 93.9 |
| HMMT25 | heavy | 97.5 | 100.0 | - | - | - | 96.7 |
| IMO-AnswerBench | no tools | 78.6 | 76.0* | 65.9* | 45.8 | 76.0* | 73.1 |
| GPQA | no tools | 84.5 | 85.7 | 83.4 | 74.2 | 79.9 | 87.5 |
General Tasks
| Benchmark | Setting | K2 Thinking | GPT-5 (High) | Claude Sonnet 4.5 | K2 0905 (Thinking) | DeepSeek-V3.2 |
|---|---|---|---|---|---|---|
| MMLU-Pro | no tools | 84.6 | 87.1 | 87.5 | 81.9 | 85.0 |
| MMLU-Redux | no tools | 94.4 | 95.3 | 95.6 | 92.7 | 93.7 |
| Longform Writing | no tools | 73.8 | 71.4 | 79.8 | 62.8 | 72.5 |
| HealthBench | no tools | 58.0 | 67.2 | 44.2 | 43.8 | 46.9 |
Agentic Search Tasks
| Benchmark | Setting | K2 Thinking | GPT-5 (High) | Claude Sonnet 4.5 | K2 0905 (Thinking) | DeepSeek-V3.2 |
|---|---|---|---|---|---|---|
| BrowseComp | w/ tools | 60.2 | 54.9 | 24.1 | 7.4 | 40.1 |
| BrowseComp-ZH | w/ tools | 62.3 | 63.0* | 42.4* | 22.2 | 47.9 |
| Seal-0 | w/ tools | 56.3 | 51.4* | 53.4* | 25.2 | 38.5* |
| FinSearchComp-T3 | w/ tools | 47.4 | 48.5* | 44.0* | 10.4 | 27.0* |
| Frames | w/ tools | 87.0 | 86.0* | 85.0* | 58.1 | 80.2* |
Coding Tasks
| Benchmark | Setting | K2 Thinking | GPT-5 (High) | Claude Sonnet 4.5 | K2 0905 (Thinking) | DeepSeek-V3.2 |
|---|---|---|---|---|---|---|
| SWE-bench Verified | w/ tools | 71.3 | 74.9 | 77.2 | 69.2 | 67.8 |
| SWE-bench Multilingual | w/ tools | 61.1 | 55.3* | 68.0 | 55.9 | 57.9 |
| Multi-SWE-bench | w/ tools | 41.9 | 39.3* | 44.3 | 33.5 | 30.6 |
| SciCode | no tools | 44.8 | 42.9 | 44.7 | 30.7 | 37.7 |
| LiveCodeBenchV6 | no tools | 83.1 | 87.0* | 64.0* | 56.1* | 74.1 |
| OJ-Bench (cpp) | no tools | 48.7 | 56.2* | 30.4* | 25.5* | 38.2* |
| Terminal-Bench | w/ simulated tools (JSON) | 47.1 | 43.8 | 51.0 | 44.5 | 37.7 |

Tag summary

Content type: Model

Digest: sha256:3c92c21eb

Size: 601.9 GB

Last updated: 2 months ago

docker model pull ai/kimi-k2:1T-thinking

Pulls last week: 2,685