A Chinese artificial intelligence (AI) startup, DeepSeek, has created a powerful AI system that rivals American competitors by using programming shortcuts and taking advantage of a loophole in U.S. export controls, according to a Wall Street Journal report.
DeepSeek's leader, Liang Wenfeng, based in Hangzhou, took an unusual approach by hiring mostly recent graduates or people with minimal experience.
He believed that inexperience could lead to more innovative solutions.
A detailed report from The Guardian noted that DeepSeek erased $1 trillion from the value of major U.S. tech firms.
The emergence of this cheaper Chinese rival particularly affected Nvidia, which lost $600 billion in market value in the biggest one-day fall in U.S. stock market history.
According to DeepSeek’s repository on GitHub, the model has reportedly outperformed OpenAI's o1-mini model on a range of benchmarks.
DeepSeek's base model contains roughly 670 billion parameters, making it the largest open-source large language model available, according to Scientific American.
Liang previously led a hedge fund called High-Flyer, which uses AI for quantitative trading.
His background combines technical expertise and financial acumen, setting him apart from typical Silicon Valley entrepreneurs.
The DeepSeek assistant quickly gained popularity, surpassing ChatGPT in downloads from Apple's App Store.
The Wall Street Journal story reported that DeepSeek's team developed a different method for training AI models that required less data processing.
Instead of having their AI process massive amounts of data in advance, they designed it to search for relevant information only when asked a question.
They also used a technique called "mixture of experts," which splits tasks among specialized components rather than having one system handle everything.
This approach significantly reduced the computing power needed.
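To illustrate the general idea behind "mixture of experts," here is a minimal Python sketch. It is a toy example and not DeepSeek's code: the names `route` and `moe_layer`, the random gate weights, and the simple scaling "experts" are all assumptions made for illustration. A small gating step scores every expert for a given input, and only the top-scoring few are actually run, which is why the technique can cut the amount of computation.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(token, gate_weights, top_k=2):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    # One gate score per expert: dot product of that expert's gate row with the token.
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize so the weights of the chosen experts sum to 1.
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, gate_weights, experts, top_k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    routing = route(token, gate_weights, top_k)
    output = [0.0] * len(token)
    for idx, weight in routing:
        expert_out = experts[idx](token)  # the other experts are never called
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, routing


# Hypothetical toy setup: 8 experts, 4-dimensional input, random gate weights.
random.seed(0)
DIM, NUM_EXPERTS = 4, 8
gate_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
# Each toy expert simply scales the input by a different factor (1 through 8).
experts = [lambda v, s=i + 1: [s * x for x in v] for i in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, routing = moe_layer(token, gate_weights, experts, top_k=2)
print("experts used:", [i for i, _ in routing], "of", NUM_EXPERTS)
```

In this sketch only two of the eight experts run for each input, so roughly a quarter of the expert computation is performed. Real systems apply the same routing idea inside large neural networks, with learned gate weights rather than random ones.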
The company managed to acquire less-advanced Nvidia H800 chips during a one-year window before stricter U.S. export controls took effect.
These chips, while modified to comply with initial U.S. restrictions, were still nearly as capable as Nvidia's best chips at the time.
DeepSeek’s success has drawn the attention of major industry figures. OpenAI CEO Sam Altman has acknowledged DeepSeek's impressive capabilities, particularly its ability to deliver high performance at a lower cost.
The company's approach differs from that of its American competitors: it focuses on research rather than commercial products and makes its assistant and underlying code freely available.
Meta has assembled four teams to analyze how DeepSeek developed an AI model that may rival its own, The Information reported.
Meta AI infrastructure director Mathew Oldham has reportedly warned colleagues that DeepSeek’s latest model could outperform the next version of Meta’s Llama AI, set for release in early 2025.