Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
DeepSeek has gone viral.
Chinese AI lab DeepSeek broke into mainstream awareness this week after its chatbot app rose to the top of the Apple App Store charts (and Google Play, too). DeepSeek's AI models, which were trained with compute-efficient techniques, have led Wall Street analysts and technologists to ask whether the United States can maintain its lead in the AI race and whether demand for AI chips will hold up.
But where did DeepSeek come from, and how did it rise to international fame so quickly?
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.
AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who reportedly began dabbling in trading while a student at Zhejiang University, launched a hedge fund in 2019 that focused on developing and deploying AI algorithms.
In 2023, High-Flyer started DeepSeek as a lab for researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab was spun off into its own company, also called DeepSeek.
From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by US export bans on hardware. To train one of its newer models, the company had to use Nvidia H800 chips, a less powerful version of the H100 chip that is available to US companies.
DeepSeek's technical team is said to skew young. The company reportedly recruits doctoral AI researchers aggressively from top Chinese universities. DeepSeek also hires people without a computer science background to help its technology better understand a wide range of topics, according to The New York Times.
In November 2023, DeepSeek unveiled its first models: DeepSeek Coder, DeepSeek LLM and DeepSeek Chat.
DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on various AI benchmarks, and at the time it was much cheaper to run than comparable models. It forced DeepSeek's domestic competitors, including ByteDance and Alibaba, to cut usage prices for some of their models and to make others completely free.
DeepSeek-V3, launched in December 2024, only added to DeepSeek's fame.
According to DeepSeek's internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models such as Meta's Llama and "closed" models that can only be accessed through an API, such as OpenAI's GPT-4o.
DeepSeek's R1 reasoning model is also impressive. Released in January, R1 performs as well as OpenAI's o1 model on important benchmarks, DeepSeek claims.
As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer, usually seconds to minutes longer, to arrive at solutions than a typical non-reasoning model. The advantage is that they tend to be more reliable in areas such as physics, the natural sciences and mathematics.
However, there is a downside to R1, DeepSeek V3 and DeepSeek's other models. As Chinese-developed AI, they are subject to benchmarking by China's internet regulator to ensure that their answers "embody core socialist values." In DeepSeek's chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan's autonomy.
If DeepSeek has a business model, it is not clear what that model is. The company prices its products and services far below market value and gives others away for free. It is also not taking investor money, despite a lot of VC interest.
The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts, however, dispute the figures the company has supplied.
Whatever the case may be, developers have taken to DeepSeek's models, which are not open source as the term is generally understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms that hosts DeepSeek's models, developers on Hugging Face have created over 500 "derivative" models from R1 that have racked up 2.5 million downloads combined.
DeepSeek's success against larger and more established competitors has been described as "upending AI" and "over-hyped." The company's success was at least partly responsible for Nvidia's share price dropping by 18% in January and for prompting a public response from OpenAI CEO Sam Altman.
Microsoft announced that DeepSeek is available on its Azure AI Foundry service, Microsoft's platform that brings together AI services for enterprises under a single banner. When asked about DeepSeek's impact on Meta's AI spending during the company's first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a "strategic advantage" for Meta. In March, OpenAI called DeepSeek "state-subsidized" and "state-controlled," and recommended that the US government ban models from DeepSeek.
During Nvidia's fourth-quarter earnings call, CEO Jensen Huang emphasized DeepSeek's "excellent innovation," saying that it and other "reasoning" models are great for Nvidia because they need so much more compute.
At the same time, some companies are banning DeepSeek, and so are entire countries and governments, including South Korea. New York state has also banned DeepSeek from use on government devices.
As for DeepSeek's future, it is not clear. Improved models are a given. But the US government appears to be growing wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the United States will probably ban DeepSeek on government devices.
This story was originally published on January 28, 2025, and is updated regularly.