On January 27, 2025, the technology stock market experienced an unexpected crash: the NVIDIA share price plummeted by over 17%, temporarily wiping out nearly $600 billion in market value, the largest single-day loss of market capitalization in stock market history. Many other US tech companies also faced massive losses that day; in total, about $1 trillion in market capitalization evaporated within hours [1]. The trigger? The release of DeepSeek-R1, an open-source language model by the Chinese AI startup DeepSeek, sent the entire industry into a frenzy. According to DeepSeek's claims, this language model not only rivals OpenAI-o1, OpenAI's flagship model, but was also trained cost-effectively with comparatively little hardware and is, in contrast to proprietary models like those from OpenAI, a true "open AI".
In this article, we will take a closer look at what makes DeepSeek-R1 so special, how it was trained, and why it is equally interesting for developers and companies. We will explain the technology behind the model and its impact on the fiercely competitive AI market, describe our practical experiences with it, and take a look at future implications.
Disclaimer: This article was originally written in German. Only LLMs (mainly DeepSeek-R1 and a bit of GPT-4o) were used for the translation 😉
DeepSeek Logo
What is DeepSeek-R1?
DeepSeek-R1 is an open-source Large Language Model (LLM), released under the MIT license, which allows free use for both private and commercial purposes. Like OpenAI-o1, DeepSeek-R1 is a "reasoning" model: it is trained to arrive at the correct answer through a Chain-of-Thought (CoT) process. Specifically, it not only generates the answer to the user's request but also describes its path to that answer. This significantly improves response quality, especially for mathematical problem solving or programming tasks, compared to traditional LLMs. However, this ability comes with drawbacks: training and inference become more expensive and resource-intensive. In the case of o1, this means up to 30 times longer generation times compared to GPT-4o and significantly higher token costs. DeepSeek-R1 challenges this paradigm by delivering comparable performance at far lower reported costs.
Brief Technical Overview
Overview of the training pipeline of DeepSeek-R1-Zero and DeepSeek-R1
The DeepSeek team describes the training process of DeepSeek-R1 in their paper [2]. Initially, a model called DeepSeek-R1-Zero was trained through Reinforcement Learning (RL) directly on the pre-trained base model DeepSeek-V3-Base, without any Supervised Fine-Tuning (SFT). The training process began with a simple template prompting the base model to generate a thought process followed by a final answer. This template specified only a structural format, without any content-specific guidelines, so that the model's natural development during the RL process could be observed. During training, the model showed continuous performance improvement: it learned to spend more time on complex thought processes and to reconsider its approach, resulting in increasingly longer responses. However, several issues were identified during the training of DeepSeek-R1-Zero, ultimately leading to the development of DeepSeek-R1. Two major problems were the poor readability of the generated content and the frequent mixing of different languages.
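To make the structural template concrete, here is a minimal Python sketch of how such a prompt might be assembled. The wording is paraphrased from the paper [2]; the exact template text, tag names, and spacing shown here are assumptions for illustration, not the verbatim training template.

```python
# Sketch of the structural training template described in the paper [2]
# (paraphrased; see the paper for the exact wording). It prescribes only
# the format -- think first, then answer -- not the content.
TEMPLATE = (
    "A conversation between User and Assistant. The User asks a question, "
    "and the Assistant solves it. The Assistant first thinks about the "
    "reasoning process and then provides the answer. The reasoning process "
    "and answer are enclosed within <think></think> and <answer></answer> "
    "tags, respectively. User: {prompt}. Assistant:"
)

def build_prompt(user_question: str) -> str:
    """Fill the template with a user question; the content itself is unconstrained."""
    return TEMPLATE.format(prompt=user_question)

print(build_prompt("What is 7 * 8?"))
```

During RL training, the model's completion of this prompt is rewarded both for producing the correct answer and for adhering to the tag structure, which is what lets the reasoning behaviour emerge without content-specific supervision.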
The researchers then developed a new model to address these issues. They began training with a "Cold Start" phase, using a small amount of high-quality Chain-of-Thought data to fine-tune the model. This approach aimed to bypass the initial unstable phase of RL training, drastically improving the readability of responses compared to the DeepSeek-R1-Zero model. After fine-tuning, the model was trained through the same large-scale RL process as DeepSeek-R1-Zero. With each subsequent SFT and RL step, the reasoning capabilities and response quality were further improved. Additionally, smaller Llama and Qwen models were fine-tuned using DeepSeek-R1 to teach them the reasoning abilities of DeepSeek-R1 and thus improve those models.
Performance and Training Cost
DeepSeek-R1 demonstrated very impressive performance in benchmark tests. In text comprehension and education-oriented knowledge benchmarks, it significantly outperformed earlier versions like DeepSeek-V3 and was only slightly behind, or even on par with, OpenAI-o1. In mathematical tasks, DeepSeek-R1 achieved scores surpassing those of o1, and it showed similarly strong results on programming benchmarks.
Benchmark results of DeepSeek-R1 compared to OpenAI-o1 and others [2]
Particularly noteworthy is the development efficiency: According to DeepSeek, the model was developed within only two months and with a budget of under $10 million, a fraction of the billions invested by competitors like OpenAI. If these numbers hold true, they challenge the assumption that AI advancement requires exorbitant investments, potentially disrupting the revenue models of chip giants like NVIDIA.
Self-Hosting and API
The open-source DeepSeek-R1 model can be downloaded and used free of charge. Both the large 671B model and various distilled Llama and Qwen versions, ranging from 1.5B to 70B parameters, are available. These models can be run on suitable hardware with tools like Ollama or vLLM. Anyone with sufficient computing power can run the 671B model on their own or rented hardware and thereby operate a personal state-of-the-art LLM with full control over it. Alternatively, DeepSeek-R1 can be used via the API provided by DeepSeek, and DeepSeek also offers a chat on their company website. In both cases, whether using the API or the web chat, all interactions run through the AI startup's servers in China, which should be considered with regard to privacy, among other things. The remarkable price difference compared to OpenAI-o1's API should be noted, however: for one million generated output tokens, OpenAI charges nearly 30 times more than DeepSeek.
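When self-hosting, the raw model output contains the reasoning wrapped in `<think>` tags, as in the example at the end of this article. A minimal Python sketch for separating the chain of thought from the final answer might look like this; the tag format is what we observed in the model's output, not a guaranteed contract, so treat it as an assumption:

```python
import re

def split_reasoning(raw: str) -> tuple[str, str]:
    """Separate the chain of thought from the final answer in R1 output.

    DeepSeek-R1 wraps its reasoning in <think>...</think>; everything
    after the closing tag is the answer intended for the user.
    """
    match = re.search(r"<think>(.*?)</think>", raw, re.DOTALL)
    if match is None:
        # No reasoning block found: treat the whole output as the answer.
        return "", raw.strip()
    reasoning = match.group(1).strip()
    answer = raw[match.end():].strip()
    return reasoning, answer

thought, answer = split_reasoning(
    "<think>A lion cannot survive 3 years without food.</think> Exit C."
)
print(answer)  # Exit C.
```

Whether you hide, summarize, or display the reasoning is then a UI decision; for chat applications it is common to show only the answer and make the thought process expandable.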
Pricing Comparison of DeepSeek-R1 and OpenAI-o1 via their respective APIs [3]
Our Experience with DeepSeek-R1
At codecentric, we have already experimented with DeepSeek-R1 and achieved good results overall. For instance, the model successfully solved several complex programming tasks on which even OpenAI-o1 failed, and it consistently handled more complex inquiries without issues. The model's output clearly shows how the reasoning process works, enabling it to answer complex queries correctly. We also noted that although the smaller distilled models had some problems with the German language, the large 671B model was very good at understanding and generating German texts. Additionally, we are integrating a self-hosted DeepSeek-R1 model into our own CompanyGPT solution for use within our company.
The Future of AI Models
This development is not entirely unexpected. Some experts have long anticipated that open-source models would eventually match or surpass the performance and quality of proprietary models like those from OpenAI. There is also the view that pay-as-you-go pricing models, like the paid APIs from OpenAI or the hyperscalers, will not survive in the long term. The release of DeepSeek-R1 and its implications show that not only could these predictions become reality, but also that the US no longer dominates the AI world unchallenged. Of course, OpenAI, Google, and others will not stand still: with Project Stargate and further investments, the United States will do everything it can to stay at the top of the AI market.
Example request with DeepSeek-R1 (671B)
User Input
>>> You're escaping a labyrinth, and there are three exits. Exit A leads to an inferno. Exit B leads to an assassin. Exit C leads to a lion that hasn't eaten in 3 years. Which exit do you pick?
Model Output
<think> Okay, so I need to figure out which exit to choose to escape the labyrinth. There are three exits: A, B, and C. Each one has a different danger. Let me break down each option.

Exit A leads to an inferno. An inferno is like a huge fire, right? So if I go through there, I'd probably be burned alive. That sounds immediately dangerous. Fire spreads quickly, and there's not much you can do once you're in it without proper protection. So maybe Exit A is not the best choice unless there's some hidden trick.

Exit B leads to an assassin. Assassins are trained killers. If I go through here, I might get attacked by someone who's skilled in killing. But maybe there's a chance to negotiate or fight back? It depends on the situation. Assassins might be waiting to ambush, but they are human, so maybe there's a way to outsmart them or find a weakness. Still, facing a professional killer is super risky.

Exit C leads to a lion that hasn't eaten in 3 years. Wait a second, a lion that hasn't eaten in three years? That seems impossible. Lions can't survive that long without food. A lion would die of starvation in a matter of weeks, not years. So if the lion hasn't eaten in three years, it's probably dead. Therefore, Exit C might actually be safe because the lion is no longer alive. That could be a trick question where the danger is already gone.

So putting it all together: A is fire (instant death), B is an assassin (high risk but maybe manageable), and C is a lion that's likely dead. The correct choice would be Exit C because the lion couldn't have survived three years without food. Therefore, the path is safe. </think>

The correct exit to choose is Exit C. The lion is dead, so it’s the safest path. 🔍🦁⚰️
References
[2] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. DeepSeek-AI - https://arxiv.org/pdf/2501.12948
Blog author
Daniel Ladischenski
AI Consultant & Engineer
Do you still have questions? Just send me a message.