Yellowheronpress

Overview

  • Founded Date 27/02/1911
  • Sectors Health Care
  • Posted Jobs 0
  • Viewed 5

Company Description

DeepSeek’s First-generation Reasoning Models

DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Models

DeepSeek-R1

Distilled models

The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-7B

DeepSeek-R1-Distill-Llama-8B

DeepSeek-R1-Distill-Qwen-14B

DeepSeek-R1-Distill-Qwen-32B

DeepSeek-R1-Distill-Llama-70B
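
The distillation described above is, in outline, supervised fine-tuning of a smaller model on text generated by a larger one. The following is a minimal sketch of the data-preparation step only, not DeepSeek's actual pipeline. The `TeacherSample` type, the function names, the prompt/completion record layout, and the `<think>` wrapper are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TeacherSample:
    """One hypothetical sample generated by the larger (teacher) model."""
    question: str
    reasoning: str
    answer: str


def to_sft_record(sample: TeacherSample) -> dict:
    """Turn one teacher sample into a prompt/completion pair.

    The <think> wrapper is an assumed convention for separating the
    reasoning trace from the final answer.
    """
    completion = f"<think>\n{sample.reasoning}\n</think>\n{sample.answer}"
    return {"prompt": sample.question, "completion": completion}


def build_sft_dataset(samples: list[TeacherSample]) -> list[dict]:
    """Keep only samples that have both a reasoning trace and an answer."""
    return [
        to_sft_record(s)
        for s in samples
        if s.reasoning.strip() and s.answer.strip()
    ]


samples = [
    TeacherSample("What is 2 + 3?", "Add 2 and 3 to get 5.", "5"),
    TeacherSample("What is 7 * 6?", "", "42"),  # no reasoning: filtered out
]
dataset = build_sft_dataset(samples)
```

The resulting records could then be passed to any standard supervised fine-tuning loop for the smaller model; that training step is outside the scope of this sketch.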

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.
