
Semper Unitas
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
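The names listed above appear to follow the pattern DeepSeek-R1-Distill-<base family>-<size>B. As a minimal sketch, assuming that pattern holds, the following Python splits each name into its base family and parameter count. The helper names `parse_distilled_name` and `group_by_family` are hypothetical and are not part of any DeepSeek tooling.

```python
import re
from typing import Dict, List, NamedTuple


class DistilledModel(NamedTuple):
    base_family: str        # e.g. "Qwen" or "Llama"
    params_billions: float  # e.g. 1.5 for "1.5B"


# Assumed naming pattern, inferred only from the six names in the list.
_NAME_RE = re.compile(
    r"DeepSeek-R1-Distill-(?P<family>[A-Za-z]+)-(?P<size>\d+(?:\.\d+)?)B"
)

MODEL_NAMES = [
    "DeepSeek-R1-Distill-Qwen-1.5B",
    "DeepSeek-R1-Distill-Qwen-7B",
    "DeepSeek-R1-Distill-Llama-8B",
    "DeepSeek-R1-Distill-Qwen-14B",
    "DeepSeek-R1-Distill-Qwen-32B",
    "DeepSeek-R1-Distill-Llama-70B",
]


def parse_distilled_name(name: str) -> DistilledModel:
    """Split a distilled-model name into base family and size in billions."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return DistilledModel(match["family"], float(match["size"]))


def group_by_family(names: List[str]) -> Dict[str, List[float]]:
    """Map each base family to the sorted list of available sizes."""
    groups: Dict[str, List[float]] = {}
    for name in names:
        model = parse_distilled_name(name)
        groups.setdefault(model.base_family, []).append(model.params_billions)
    return {family: sorted(sizes) for family, sizes in groups.items()}


if __name__ == "__main__":
    for family, sizes in group_by_family(MODEL_NAMES).items():
        print(family, sizes)
```

Grouping by family makes it easy to see that four of the distilled models are built on Qwen and two on Llama.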
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and permits any modifications and derivative works, including, but not limited to, distillation for training other LLMs.