James A. Dorn
When Chinese AI firm DeepSeek released its innovative R1 model in late January—an open-source model that performs on par with models developed at much higher cost by leading tech firms—the AI world was taken by surprise.
US venture capitalist Marc Andreessen called R1 “one of the most amazing and impressive breakthroughs I’ve ever seen.” R1 is built on DeepSeek’s V3 model, released in 2024, which was trained on lower-cost chips and optimized to run predictive large language models (LLMs) with reasoning capabilities. R1 roiled tech stocks on January 27, with superstar chip maker Nvidia’s market value falling by nearly $600 billion on the expectation that demand for its top-rated chips would fall in light of DeepSeek’s ability to economize on Nvidia chips while performing at high levels.
The Rise of DeepSeek
The emergence of DeepSeek as an important player in the tech sector is due to the efforts of 40-year-old Chinese billionaire Liang Wenfeng, who used capital from his quantitative hedge fund, High-Flyer, to launch DeepSeek in May 2023. As a nonstate/private enterprise, DeepSeek made progress on its own by hiring talent from some of China’s leading universities and paying highly competitive salaries. The firm was set up as a research operation to advance AI models and eventually to build models that match human learning processes (known as Artificial General Intelligence, or AGI).
Early work led to LLMs V1 and V2, but the real breakthrough that brought worldwide attention to DeepSeek was the release of V3 in December 2024 and R1 a month later. The training costs for V3 were less than $6 million, far below the costs of training LLMs at major AI firms. AI models consist of algorithms and the data used to train them. Training consists of “the process of feeding an AI model curated data sets to evolve the accuracy of its output” (Chen 2023).
https://www.cato.org/blog/chinas-march-imitation-innovation-case-deepseek