DeepSeek

Into the unknown

扫码下载 DeepSeek APP

获取手机 App

DeepSeek 官方推出的免费 AI 助手
搜索写作阅读解题翻译工具

DeepSeek-V3 的综合能力

Generate text, image, code, chat and even more with

Access to valuable user insight, analytics and activity.

Securely process credit card, debit card, or other methods.

Ability to understand and generate content in different languages

Add unlimited number of custom prompts for your customers.

Access and manage your support tickets from your dashboard.

	Benchmark (Metric)	DeepSeek V3	DeepSeek V2.5	Qwen2.5	Llama3.1	Claude-3.5	GPT-4o
	Benchmark (Metric)		0905	72B-Inst	405B-Inst	Sonnet-1022	0513

	Architecture	MoE	MoE	Dense	Dense	-	-

	# Activated Params	37B	21B	72B	405B	-	-

	# Total Params	671B	236B	72B	405B	-	-
English	MMLU (EM)	88.5	80.6	85.3	88.6	88.3	87.2
	MMLU-Redux (EM)	89.1	80.3	85.6	86.2	88.9	88.0
	MMLU-Pro (EM)	75.9	66.2	71.6	73.3	78.0	72.6
	DROP (3-shot F1)	91.6	87.8	76.7	88.7	88.3	83.7
	IF-Eval (Prompt Strict)	86.1	80.6	84.1	86.0	86.5	84.3
	GPQA-Diamond (Pass@1)	59.1	41.3	49.0	51.1	65.0	49.9
	SimpleQA (Correct)	24.9	10.2	9.1	17.1	28.4	38.2
	FRAMES (Acc.)	73.3	65.4	69.8	70.0	72.5	80.5
	LongBench v2 (Acc.)	48.7	35.4	39.4	36.1	41.0	48.1
Code	HumanEval-Mul (Pass@1)	82.6	77.4	77.3	77.2	81.7	80.5
	LiveCodeBench (Pass@1-COT)	40.5	29.2	31.1	28.4	36.3	33.4
	LiveCodeBench (Pass@1)	37.6	28.4	28.7	30.1	32.8	34.2
	Codeforces (Percentile)	51.6	35.6	24.8	25.3	20.3	23.6
	SWE Verified (Resolved)	42.0	22.6	23.8	24.5	50.8	38.8
	Aider-Edit (Acc.)	79.7	71.6	65.4	63.9	84.2	72.9
	Aider-Polyglot (Acc.)	49.6	18.2	7.6	5.8	45.3	16.0
Math	AIME 2024 (Pass@1)	39.2	16.7	23.3	23.3	16.0	9.3
	MATH-500 (EM)	90.2	74.7	80.0	73.8	78.3	74.6
	CNMO 2024 (Pass@1)	43.2	10.8	15.9	6.8	13.1	10.8
Chinese	CLUEWSC (EM)	90.9	90.4	91.4	84.7	85.4	87.9
	C-Eval (EM)	86.5	79.5	86.1	61.5	76.7	76.0
	C-SimpleQA (Correct)	64.1	54.1	48.4	50.4	51.3	59.3

10 Feb

Admin