
Chhaylong
Company Description
DeepSeek’s First-Generation Reasoning Models
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
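The names above appear to follow a common pattern: the `DeepSeek-R1-Distill` prefix, then the base model family, then the parameter count in billions. The following is a minimal Python sketch that splits the names on that assumed pattern and groups the sizes by base family. The regular expression and the helper names `parse_model_name` and `group_by_family` are illustrative assumptions, not part of any DeepSeek tooling.

```python
import re

# Model names taken from the list above.
DISTILLED_MODELS = [
    "DeepSeek-R1-Distill-Qwen-1.5B",
    "DeepSeek-R1-Distill-Qwen-7B",
    "DeepSeek-R1-Distill-Llama-8B",
    "DeepSeek-R1-Distill-Qwen-14B",
    "DeepSeek-R1-Distill-Qwen-32B",
    "DeepSeek-R1-Distill-Llama-70B",
]

# Assumed naming pattern: DeepSeek-R1-Distill-<base family>-<size>B
NAME_PATTERN = re.compile(
    r"^DeepSeek-R1-Distill-(?P<family>[A-Za-z]+)-(?P<size>\d+(?:\.\d+)?)B$"
)


def parse_model_name(name):
    """Return (base_family, size_in_billions) for one distilled model name."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    return match.group("family"), float(match.group("size"))


def group_by_family(names):
    """Map each base family to its sorted list of sizes (in billions)."""
    groups = {}
    for name in names:
        family, size = parse_model_name(name)
        groups.setdefault(family, []).append(size)
    return {family: sorted(sizes) for family, sizes in groups.items()}


if __name__ == "__main__":
    for family, sizes in group_by_family(DISTILLED_MODELS).items():
        print(family, sizes)
```

Grouping this way makes it easy to see that four of the distilled models are built on Qwen and two on Llama.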
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.