There has been a lot of noise in Silicon Valley lately, thanks to a Chinese AI startup called DeepSeek. The Zhejiang-based AI lab has achieved what many thought impossible: building a state-of-the-art AI model at a fraction of the cost of its Western competitors. Even more surprisingly, the company behind this breakthrough isn’t a tech giant but a quantitative hedge fund.
DeepSeek began as Fire-Flyer, a deep-learning research division of High-Flyer, one of China’s most successful quant hedge funds. Unlike hedge funds that rely on traditional human analysis, quant hedge funds use mathematical models and automated algorithms to make trading decisions.
In 2023, under the leadership of Liang Wenfeng, the division became an independent company with the ambitious goal of developing artificial general intelligence, regardless of commercial viability. What sets DeepSeek apart is its unorthodox approach to both talent and technology.
Instead of hiring seasoned engineers, Liang recruited fresh PhD graduates from top Chinese universities such as Peking University and Tsinghua University. These young researchers, free from industry conventions and driven by scientific curiosity, were given ample computing resources to pursue unconventional research projects.
The company’s pivotal moment came in the wake of US export controls that restricted Chinese firms’ access to advanced AI chips like Nvidia’s H100. Rather than letting this limitation become a roadblock, DeepSeek turned it into an opportunity for innovation.
The team developed highly efficient training methods that allowed it to build its latest model, DeepSeek-V3, using only about 2,000 specialized chips, in stark contrast to the 16,000 or more chips typically used by leading AI companies.
The results have been remarkable. DeepSeek’s model matches the capabilities of advanced chatbots from OpenAI and Google, performing well on tasks like answering questions, solving logic problems, and writing computer programs. More impressive still, the company achieved this with only about $6 million in computing costs, roughly one-tenth of what Meta spent on its comparable model.
The company’s dedication to open-source development has earned it considerable respect within the global AI research community. By freely sharing its code and innovations, DeepSeek has placed itself at the forefront of a growing movement that challenges the notion that advanced AI development requires massive resources and corporate backing.
This success story points to an unintended consequence of the tech cold war between the US and China. While export controls were meant to maintain America’s AI advantage, they have instead spurred innovative approaches to model development that could reshape the industry.
In fact, DeepSeek’s achievements suggest that the future of AI might not belong exclusively to tech giants with vast resources, but also to nimble teams that can do more with less.
As one San Francisco-based engineer put it, DeepSeek’s rise represents a shift in the center of gravity of open-source AI development toward China, a trend that could have far-reaching implications for the global tech scene.