All of a sudden, DeepSeek is everywhere.
Its R1 model is open source, was allegedly trained for a fraction of the cost of other AI models, and is just as good as, if not better than, ChatGPT.
This lethal combination hit Wall Street hard, causing tech stocks to tumble and making investors question how much money is needed to develop good AI models. DeepSeek engineers claim R1 was trained on 2,788 GPUs, which cost around $6 million, compared to OpenAI's GPT-4, which reportedly cost $100 million to train.
DeepSeek's cost efficiency also challenges the idea that larger models and more data lead to better performance. Amid the frenzied conversation about DeepSeek's capabilities, its threat to AI companies like OpenAI, and spooked investors, it can be hard to make sense of what's going on. But veteran AI experts have weighed in with valuable perspectives.
Hampered by trade restrictions and limited access to Nvidia GPUs, China-based DeepSeek had to get creative in developing and training R1. That it was able to accomplish this feat for only $6 million (which isn't a lot of money in AI terms) was a revelation to investors.
But AI experts weren't surprised. "At Google, I asked why they were fixated on building THE LARGEST model. Why are you going for size? What function are you trying to achieve? Why is the thing you were upset about that you didn't have THE LARGEST model? They responded by firing me," posted Timnit Gebru, who was famously terminated from Google for calling out AI bias, on X.
Hugging Face's climate and AI lead Sasha Luccioni pointed out how AI investment is precariously built on marketing and hype. "It's wild that hinting that a single (high-performing) LLM is able to achieve that performance without brute-forcing the shit out of thousands of GPUs is enough to cause this," said Luccioni.
DeepSeek R1 performed comparably to OpenAI's o1 model on key benchmarks. It marginally surpassed, equaled, or fell just below o1 on math, coding, and general knowledge tests. That's to say, there are other models out there, like Anthropic's Claude, Google's Gemini, and Meta's open source model Llama, that are just as capable to the average user.
But R1 is causing such a frenzy because of how little it cost to make. "It's not smarter than earlier models, just trained more cheaply," said AI research scientist Gary Marcus.
The fact that DeepSeek was able to build a model that competes with OpenAI's models is pretty remarkable. Andrej Karpathy, who co-founded OpenAI, posted on X, "Does this mean you don't need large GPU clusters for frontier LLMs? No, but you have to ensure that you're not wasteful with what you have, and this looks like a nice demonstration that there's still a lot to get through with both data and algorithms."
Wharton AI professor Ethan Mollick said the frenzy is less about R1's capabilities than about the models people currently have access to. "DeepSeek is a really good model, but it is not generally a better model than o1 or Claude," he said. "But since it is both free and getting a ton of attention, I think a lot of people who were using free 'mini' models are being exposed to what a early 2025 reasoner AI can do and are surprised."
DeepSeek R1's breakout is a huge win for open source proponents, who argue that democratizing access to powerful AI models ensures transparency, innovation, and healthy competition. "To people who think 'China is surpassing the U.S. in AI,' the correct thought is 'open source models are surpassing closed ones,'" said Yann LeCun, chief AI scientist at Meta, which has supported open sourcing with its own Llama models.
Computer scientist and AI expert Andrew Ng didn't explicitly mention the significance of R1 being an open source model, but he highlighted how the DeepSeek disruption is a boon for developers, since it gives them access that is otherwise gatekept by Big Tech.
"Today's 'DeepSeek selloff' in the stock market -- attributed to DeepSeek V3/R1 disrupting the tech ecosystem -- is another sign that the application layer is a great place to be," said Ng. "The foundation model layer being hyper-competitive is great for people building applications."