
Yellowheronpress
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
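To make the distillation idea concrete, here is a minimal sketch of how reasoning data from a larger model might be turned into supervised fine-tuning examples for a smaller one. It is an illustration under stated assumptions, not DeepSeek's actual pipeline: the `TeacherSample` type, the `build_sft_example` helper, the whitespace tokenizer, the `<think>` delimiters, and the choice to mask prompt tokens out of the loss are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TeacherSample:
    """One record of reasoning data (hypothetical schema)."""
    prompt: str
    reasoning: str  # reasoning trace written by the larger (teacher) model
    answer: str


def build_sft_example(
    sample: TeacherSample,
    tokenize: Callable[[str], List[str]],
) -> Tuple[List[str], List[int]]:
    """Return (tokens, loss_mask) for one fine-tuning example.

    The smaller model is trained to reproduce the teacher's reasoning
    and answer. Masking the prompt (mask value 0) is a common practice
    assumed here, so only the teacher-written part contributes to the loss.
    """
    prompt_tokens = tokenize(sample.prompt)
    # The <think> delimiters are an assumed formatting convention.
    target_text = f"<think> {sample.reasoning} </think> {sample.answer}"
    target_tokens = tokenize(target_text)

    tokens = prompt_tokens + target_tokens
    loss_mask = [0] * len(prompt_tokens) + [1] * len(target_tokens)
    return tokens, loss_mask


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer for the sketch; a real setup would use the model's own."""
    return text.split()


sample = TeacherSample(
    prompt="What is 2 + 3 ?",
    reasoning="2 plus 3 equals 5 .",
    answer="5",
)
tokens, loss_mask = build_sft_example(sample, whitespace_tokenize)
```

In a real setup the token strings would be replaced by token IDs from the smaller model's tokenizer, and the mask would be applied to a standard next-token cross-entropy loss.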
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.