DeepSeek is going to raise funds!
Date: 2026-04-26


According to The Information, DeepSeek, the Chinese large-model company, is seeking its first round of external financing.


The company had long insisted on taking no outside investment, but now intends to break with that practice: it plans to raise at least $300 million at a valuation of no less than $10 billion.


The reason is simple: developing large AI models is extremely expensive, and even DeepSeek needs to replenish its war chest for the next round of competition.


01


Initiating its first external financing

DeepSeek used to be a hardcore holdout that needed neither money nor financing. It is owned by High-Flyer, the Chinese quantitative hedge fund giant, and after launching the DeepSeek-R1 model in 2025 it became an overnight sensation, shaking both Silicon Valley and Wall Street.


Although many top venture firms and technology giants offered to invest at the time, DeepSeek rejected them all. Founder Liang Wenfeng has always been a technology idealist, hoping the company could remain independent, free from capital's interference, and focused on what it really wants to do.


But the situation has changed: a year and a half has passed since R1's release, the industry is moving forward at a rapid pace, and DeepSeek's new model, V4, has still not launched.


Meanwhile, global AI competition is fierce: OpenAI, Google, and Meta in the United States; Baidu, Alibaba, ByteDance, and Zhipu in China. These giants rely on deep pockets to pour money into computing power, talent, and product iteration, continually seizing the high ground.


Facing this pressure, Liang Wenfeng finally relented and launched the company's first external financing. If successful, DeepSeek will be able to buy more computing power (the most expensive part of training large models) and offer higher salaries to retain top AI talent. As a Chinese AI startup, however, it may find some American investors hesitant over geopolitical risks.


02


V4 is not just an "upgrade" but a comprehensive leap

According to multiple media reports, DeepSeek's new-generation large model V4 is likely to be officially launched at the end of April, although it has been delayed repeatedly (it was originally scheduled for February 2026).


1. Bigger and smarter, but not more expensive

The total parameter count reaches 1 trillion, using an MoE (Mixture of Experts) architecture. Only about 37 billion parameters are activated per inference pass, so operating costs are similar to those of the previous-generation V3. This continues DeepSeek's consistent "efficiency first" philosophy: strong performance without wasted money.
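The arithmetic behind "1 trillion parameters at V3-like cost" can be sketched in a few lines. This is an illustrative back-of-envelope calculation using the figures reported in the article, not an official specification:

```python
# Back-of-envelope cost comparison for a Mixture-of-Experts (MoE) model.
# In a dense model, every parameter participates in each forward pass, so
# per-token compute scales with the total parameter count. In an MoE model,
# a router selects a few experts per token, so compute scales only with the
# activated parameters.

TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion total parameters (reported)
ACTIVE_PARAMS = 37_000_000_000     # ~37 billion activated per token (reported)

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Fraction of parameters used per token: {active_fraction:.1%}")

# A common rough estimate for decoder inference is ~2 FLOPs per parameter
# per token, so the compute ratio equals the active-parameter ratio.
dense_flops_per_token = 2 * TOTAL_PARAMS
moe_flops_per_token = 2 * ACTIVE_PARAMS
print(f"MoE compute vs. an equally sized dense model: "
      f"{moe_flops_per_token / dense_flops_per_token:.1%}")
```

Under these assumptions, only about 3.7% of the model participates in any given token, which is why a 1-trillion-parameter MoE model can cost roughly as much to serve as a much smaller dense one.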

2. Can retain very long context: up to 1 million tokens

It introduces a new memory architecture, Engram, which can quickly and accurately retrieve key information from millions of tokens. Internal tests show an information recall rate as high as 97% at a length of 1 million tokens (V3 falls far short of this level even at 128,000 tokens).
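A recall figure like "97% at 1 million tokens" typically comes from a "needle in a haystack" style test: known facts are planted at various depths in a long document, the model is asked to retrieve each one, and the hit rate is reported. A minimal sketch of the metric itself, with a simulated run standing in for real model calls:

```python
# Sketch of the recall metric used in long-context "needle in a haystack"
# evaluations. The simulated answers below stand in for a model's actual
# retrievals; this illustrates the metric, not DeepSeek's test harness.
import random

def recall_rate(needles, answers):
    """Fraction of planted facts the model retrieved correctly."""
    hits = sum(1 for needle, answer in zip(needles, answers) if needle == answer)
    return hits / len(needles)

# Simulated run: 100 planted facts, of which the "model" misses 3.
needles = [f"fact-{i}" for i in range(100)]
answers = needles.copy()
for i in random.sample(range(100), 3):
    answers[i] = "wrong"

print(f"recall: {recall_rate(needles, answers):.0%}")  # → recall: 97%
```

In a real evaluation, the answers would come from querying the model over a million-token context, with needles placed at different depths to check whether recall degrades toward the middle of the window.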


3. Supports multimodality for the first time: text, images, and video

In the past, DeepSeek built text-only models; V4 is the first version to natively support image and video generation. This step finally lets it keep up with international mainstream models such as GPT-4 and Claude.


4. Greatly upgraded coding ability, excelling at complex bugs

V4 has aimed to be the "strongest code model" from the project's start. Internal test results are impressive: over 80% on SWE-bench (real-world programming tasks) and 90% on HumanEval (algorithm problems). It can reportedly understand an entire code repository and automatically fix complex bugs, and is expected to surpass GPT and Claude in long-context code reasoning.


Two versions to suit different needs

Full version: more than one trillion parameters, designed for difficult reasoning and code tasks, adapted to Huawei's Ascend chips.
Lightweight version (V4 Lite): about 200 billion parameters, used for daily conversation and API services; can run on other domestic chips.


Interestingly, V4 Lite was briefly released on March 9 and then withdrawn; in early April, developers discovered another beta version in the API, with inference speed up 30% and 128K-context recall jumping from 45% to 94%.


V4 will remain open source, with model weights released under the Apache 2.0 license. DeepSeek also recently recruited server operations and delivery managers in Ulanqab, Inner Mongolia: the first time the company has hired on-site computing-infrastructure staff, a sign that V4 has left the lab and entered the countdown to large-scale deployment.


Based on multiple sources, DeepSeek V4 is likely to be officially released at the end of April 2026. If launched on schedule, it will become the first flagship product in China's large-model camp to combine ultra-long context, multimodality, top-tier coding ability, and efficient inference.


03


Break away from the NVIDIA ecosystem?

The V4 delay is rooted not in the model itself but in a change of engines: DeepSeek is migrating its entire technology stack from NVIDIA chips to Huawei Ascend chips.


In the past, all DeepSeek models ran on NVIDIA GPUs and relied on CUDA's mature ecosystem. V4 is different: it is fully adapted to Huawei's CANN software stack.


This means engineers must rewrite a great deal of low-level code, the equivalent of swapping a sports car's imported engine for a domestically produced one while ensuring performance does not degrade.


This is not only a technical challenge but a strategic choice: DeepSeek did not give NVIDIA or AMD early access to V4 for optimization; instead, it granted early testing rights exclusively to domestic chip makers.


If V4 can ultimately run on Huawei chips at performance comparable to, or at least close to, NVIDIA's, it will become the world's first top-tier large model that does not rely on NVIDIA.


Even NVIDIA CEO Jensen Huang could not sit still. He said bluntly in a recent interview: "DeepSeek's new model based on the Huawei platform will be a bad result for the United States."


Why?


Because once the most cutting-edge AI models run better, cheaper, and more efficiently on Chinese chips, the technology moat NVIDIA has long enjoyed may be breached.

What is the $300 million for?


DeepSeek has always been known for doing more with less, but transfusions from its parent, the quant fund High-Flyer, are no longer enough. According to Stanford University's 2026 AI Index Report, as of March 2026 the performance gap between the top AI models in China and the United States was only 2.7 percentage points.


The closer the gap, the more each incremental gain costs. For comparison: OpenAI just raised $40 billion at a valuation of up to $300 billion. DeepSeek's $300 million may seem small, but the goal is clear: not to burn more money, but to completely break free of dependence on NVIDIA.


04


Jingtai Views | Three areas to watch

A year and a half ago, DeepSeek R1 showed the world that a Chinese team could build an astonishing model without huge amounts of money. A year and a half later, DeepSeek V4 wants to prove something harder: that world-class AI can run without NVIDIA.


1. AI Model Layer: If successful, DeepSeek will reshape the global landscape

If V4 performs close to GPT-5/Claude 4 on Ascend, the domestic AI ecosystem gains a key fulcrum; the open-source strategy could attract global developers and form a "Chinese version of Hugging Face".


2. Computing power infrastructure: Huawei's Ascend industry chain benefits

Huawei Ascend: the core beneficiary, proving its ability to support cutting-edge large models.
Domestic computing-power providers: firms such as Cambricon, Hygon, and Biren can expect opportunities if the ecosystem opens up.
Data centers and liquid cooling: demand for computing infrastructure in Ulanqab and elsewhere will surge.


3. AI application layer: Code and multimodality are breakthroughs

V4's strength in code generation and long-context understanding benefits AI programming tools, intelligent customer service, research assistance, and similar scenarios; once its multimodal capabilities are complete, it may also be used in education, content creation, industrial design, and other fields.


Risk warning: geopolitical restrictions, technology implementation falling short of expectations, and brain drain.


TEL:
18117862238
Email:yumiao@jt-capital.com.cn
Address:20th floor, Taihe · international financial center, high tech Zone, Chengdu

Copyright © 2021 jt-capital.com.cn All Rights Reserved 

Copyright: JamThame capital 粤ICP备2022003949号-1  
