Overview

  • Sectors Recreational Services

Company Description

DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Models

DeepSeek-R1

Distilled models

The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform well on benchmarks.
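As a rough illustration of the idea described above, distillation of this kind can be thought of as supervised fine-tuning on outputs produced by a larger "teacher" model. The sketch below only shows the data-collection step, in plain Python. The names `Example`, `build_distillation_set`, and `toy_teacher`, and the non-empty-output filter, are assumptions made for illustration; they are not DeepSeek's actual pipeline or any library's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    prompt: str
    target: str  # teacher's reasoning followed by its final answer


def build_distillation_set(
    prompts: List[str],
    teacher: Callable[[str], str],
    keep: Callable[[str, str], bool] = lambda prompt, output: bool(output.strip()),
) -> List[Example]:
    """Collect teacher outputs as fine-tuning targets for a smaller model.

    `keep` is a hypothetical quality filter; by default it only drops
    empty outputs.
    """
    examples = []
    for prompt in prompts:
        output = teacher(prompt)
        if keep(prompt, output):
            examples.append(Example(prompt=prompt, target=output))
    return examples


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a large reasoning model such as DeepSeek-R1.
    if "2 + 2" in prompt:
        return "Reasoning: 2 plus 2 equals 4. Answer: 4"
    return ""  # simulate a prompt the teacher could not answer


dataset = build_distillation_set(
    ["What is 2 + 2?", "Unknown question"], toy_teacher
)
for example in dataset:
    print(example.prompt, "->", example.target)
```

In a real setup, the resulting prompt and target pairs would be passed to an ordinary fine-tuning loop for the smaller dense model; that step is omitted here.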

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B

DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.