Soundtrack: The Hives — Hate To Say I Told You So
In the last week or so, but especially over the weekend, the entire generative AI industry has been thrown into chaos.
This won’t be a lengthy, technical write-up — although there will be some inevitable technical complexity, just because the nature of the subject demands it. Rather, I will address the elephant in the room, namely why the Western tech giants have been caught so flat-footed.
In short, the recent AI bubble (and, in particular, the hundreds of billions of dollars of spending behind it) hinged on the idea that we need bigger models, which are both trained and run on ever-bigger, ever-faster GPUs sold almost entirely by NVIDIA, and housed in ever-larger data centers owned by companies like Microsoft and Google. There was an expectation that this would always be the case, and that generative AI would always be energy- and compute-hungry, and thus incredibly expensive.
But then, a Chinese artificial intelligence company that few had heard of called DeepSeek came along with multiple models that aren’t merely competitive with OpenAI's, but undercut them in several meaningful ways. DeepSeek’s models are both open source and significantly more efficient — 30 times cheaper to run — and can even be run locally on relatively modest hardware.
As a result, the markets are panicking, because the entire narrative of the AI bubble has been that these models have to be expensive because they're the future, and that's why hyperscalers had to burn $200 billion in capital expenditures for infrastructure to support generative AI companies like OpenAI and Anthropic. The idea that there was another way to do this — that, in fact, we didn't need to spend all that money, had any of the hyperscalers considered a different approach beyond "throw as much money at the problem as possible" — simply wasn’t considered.
And then came an outsider to upend the conventional understanding and, perhaps, dethrone a member of America’s tech royalty — a man who has crafted, if not a cult of personality, then a public image of an unassailable visionary who will lead the vanguard of the biggest technological change since the Internet. I am, of course, talking about Sam Altman.
DeepSeek isn't just an outsider, but a company that emerged as a side project from a tiny, tiny hedge fund — at least, by the standards of hedge funds — and whose founding team have nowhere near the level of fame and celebrity as Altman. Humiliating.
On top of all of that, DeepSeek's biggest, ugliest insult is that its model, DeepSeek R1, is competitive with OpenAI's incredibly expensive o1 "reasoning" model, yet significantly (roughly 96%) cheaper to run, and can even be run locally. Of the few developers I've spoken to, one was able to run DeepSeek's R1 model on their 2021 MacBook Pro with an M1 chip. Worse still, DeepSeek’s models are made freely available to use, with the source code published under the MIT license, along with the research on how they were made (although not the training data), which means they can be adapted and used commercially without the need for royalties or fees.
By contrast, OpenAI is anything but open, and its last LLM to be released under the MIT license was 2019’s GPT-2.
No, wait. Let me correct that. DeepSeek’s biggest, ugliest insult is that it’s obviously taking aim at every element of OpenAI’s portfolio. As the company was already dominating headlines, it quietly dropped its Janus-Pro-7B image generation and analysis model, which the company says outperforms both Stable Diffusion and OpenAI’s DALL-E 3. And, as with its other models, it’s freely available to both commercial and personal users alike, whereas OpenAI has largely paywalled DALL-E 3.
It’s a cynical, vulgar version of David and Goliath, where a tech startup backed by a shadowy Chinese hedge fund with $5.5 billion under management is somehow the plucky upstart against the lumbering, lossy, oafish $150 billion startup backed by a public tech company with a market capitalization of $3.2 trillion.
DeepSeek's V3 model — which is comparable to (and competitive with) both OpenAI's GPT-4o and Anthropic's Claude Sonnet 3.5 (which has some reasoning features) — is 53 times cheaper to run when using the company’s own cloud services. And, as noted above, said model is effectively free for anyone to use — locally or in their own cloud instances — and can be taken by any commercial enterprise and turned into a product of their own, should they so desire.
In essence, DeepSeek — and I'll get into its background and the concerns people might have about its Chinese origins — released two models that compete with (and even beat) models from both OpenAI and Anthropic, undercut them in price, and made them open, undermining not just the economics of the biggest generative AI companies, but laying bare exactly how they work. That last point is particularly important when it comes to OpenAI's reasoning model, which specifically hid its chain of thought for fear of "unsafe thoughts" that might "manipulate the customer," then muttered under its breath that the actual reason was that it was a "competitive advantage."
And let's be completely clear: OpenAI's only real competitive advantage over Meta and Anthropic was its "reasoning" models (o1 and o3, the latter of which is currently in research preview). Although I mentioned that Anthropic’s Claude Sonnet 3.5 model has some reasoning features, they’re far more rudimentary than those in o1 and o3.
In an AI context, reasoning works by breaking down a prompt into a series of different steps with "considerations" of different approaches — effectively a Large Language Model checking its work as it goes, with no thinking involved, because these models do not "think" or "know" stuff. OpenAI rushed to launch its o1 reasoning model last year because, and I quote Fortune, Sam Altman was "eager to prove to potential investors in the company's latest funding round that OpenAI remains at the forefront of AI development." And, as I noted at the time, it was not particularly reliable, failing to accurately count the number of times the letter ‘r’ appeared in the word “strawberry.”
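That strawberry failure is a neat illustration of the gap between token prediction and actual computation: counting letters is a trivial, deterministic task that a few lines of code get right every time, while a probabilistic text generator can confidently get it wrong. A minimal sketch:

```python
# Counting letters is deterministic computation, not "reasoning" —
# exactly the kind of task an LLM, which predicts likely tokens
# rather than counting anything, can confidently get wrong.
def count_letter(word: str, letter: str) -> int:
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # prints 3, every single time
```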
At this point, it's fairly obvious that OpenAI wasn’t anywhere near the “forefront of AI development,” and now that its competitive advantage is effectively gone, there are genuine doubts about what comes next for the company.
As I'll go into, there are many questionable parts of DeepSeek's story — its funding, what GPUs it has, and how much it actually spent training these models — but what we definitively understand to be true is bad news for OpenAI, and, I would argue, every other large US tech firm that jumped on the generative AI bandwagon in the past few years.
DeepSeek’s models actually exist, they work (at least, by the standards of hallucination-prone LLMs that don’t, at the risk of repeating myself, know anything in the true meaning of the word), they've been independently verified to be competitive in performance, and they are orders of magnitude cheaper than those from both the hyperscalers (e.g. Google's Gemini, Meta's Llama, Amazon Q) and from OpenAI and Anthropic.
DeepSeek's models don't require massive new data centers (they run on the GPUs currently used to run services like ChatGPT, and can even work on more austere hardware), nor do they require an endless supply of bigger, faster NVIDIA GPUs every year to progress. The entire AI bubble was inflated based on the premise that these models were simply impossible to build without burning massive amounts of cash, straining the power grid, and blowing past emissions goals, and that these were necessary costs to create "powerful AI."
Obviously, that wasn’t true. Now the markets are asking a very reasonable question: “did we just waste $200 billion?”
What Is DeepSeek?
First of all, if you want a super deep dive into DeepSeek, I can't recommend VentureBeat's writeup enough. I'll be quoting it liberally, because it deserves the credit for giving a very succinct and well-explained background.
First, some background on how DeepSeek got to where it did. DeepSeek, a 2023 spin-off from Chinese hedge-fund High-Flyer Quant, began by developing AI models for its proprietary chatbot before releasing them for public use. Little is known about the company’s exact approach, but it quickly open sourced its models, and it’s extremely likely that the company built upon open projects produced by Meta, such as the Llama models and the ML library PyTorch.
To train its models, High-Flyer Quant secured over 10,000 NVIDIA GPUs before U.S. export restrictions, and reportedly expanded to 50,000 GPUs through alternative supply routes, despite trade barriers. This pales compared to leading AI labs like OpenAI, Google, and Anthropic, which operate with more than 500,000 GPUs each.
Now, you've likely seen or heard that DeepSeek "trained its latest model for $5.6 million," and I want to be clear that any and all mentions of this number are estimates. In fact, the provenance of the "$5.58 million" number appears to be a citation of a post made by NVIDIA engineer Jim Fan in an article from the South China Morning Post, which links to another article from the South China Morning Post, which simply states that "DeepSeek V3 comes with 671 billion parameters and was trained in around two months at a cost of US$5.58 million" with no additional citations of any kind. As such, take them with a pinch of salt.
While there are some that have estimated the cost (DeepSeek's V3 model was allegedly trained using 2,048 NVIDIA H800 GPUs, according to its paper), as Ben Thompson of Stratechery made clear, the "$5.5 million" number only covers the literal training costs of the official training run (and this is made fairly clear in the paper!) of V3, meaning that any costs related to prior research or experiments on how to build the model were left out.
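For what it's worth, the arithmetic behind that headline figure is simple. The V3 paper reports roughly 2.788 million H800 GPU-hours for the official training run, priced at an assumed rental rate of $2 per GPU-hour (both figures are the paper's, not mine):

```python
# Reproducing the V3 paper's headline training-cost arithmetic.
# Everything the paper excludes (prior research, failed runs,
# salaries, hardware purchases) is not in this number.
gpu_hours = 2_788_000       # total H800 GPU-hours, per the V3 paper
rate_per_gpu_hour = 2.00    # assumed H800 rental price, per the paper

total_cost = gpu_hours * rate_per_gpu_hour
print(f"${total_cost / 1e6:.2f}M")  # prints $5.58M
```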
While it's safe to say that DeepSeek's models are cheaper to train, the actual costs — especially as DeepSeek doesn't share its training data, which some might argue means its models are not really open source — are a little harder to guess at. Nevertheless, Thompson (who I, and a great deal of people in the tech industry, deeply respect) lays out in detail how the specific way that DeepSeek describes training its models suggests that it was working around the constrained memory of the NVIDIA GPUs sold to China (where NVIDIA is prevented by US export controls from selling its most capable hardware over fears they’ll help advance the country’s military development):
Here’s the thing: a huge number of the innovations I explained above are about overcoming the lack of memory bandwidth implied in using H800s instead of H100s. Moreover, if you actually did the math on the previous question, you would realize that DeepSeek actually had an excess of computing; that’s because DeepSeek actually programmed 20 of the 132 processing units on each H800 specifically to manage cross-chip communications. This is actually impossible to do in CUDA. DeepSeek engineers had to drop down to PTX, a low-level instruction set for Nvidia GPUs that is basically like assembly language. This is an insane level of optimization that only makes sense using H800s.
DeepSeek's models — V3 and R1 — are more efficient (and as a result cheaper to run), and can be accessed via its API at prices that are astronomically cheaper than OpenAI's. DeepSeek-Chat — running DeepSeek's GPT-4o competitive V3 model — costs $0.07 per 1 million input tokens (as in commands given to the model) and $1.10 per 1 million output tokens (as in the resulting output from the model), a dramatic price drop from the $2.50 per 1 million input tokens and $10 per 1 million output tokens that OpenAI charges for GPT-4o. DeepSeek-Reasoner — its "reasoning" model — costs $0.55 per 1 million input tokens, and $2.19 per 1 million output tokens compared to OpenAI's o1 model, which costs $15 per 1 million input tokens and $60 per 1 million output tokens.
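To make that gap concrete, here's a quick sketch of what a hypothetical workload would cost at the per-million-token prices quoted above (the token volumes are made up purely for illustration):

```python
# Per-million-token prices quoted above: (input $, output $).
PRICES = {
    "deepseek-chat": (0.07, 1.10),
    "gpt-4o": (2.50, 10.00),
    "deepseek-reasoner": (0.55, 2.19),
    "o1": (15.00, 60.00),
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a workload at a given model's listed prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens / 1e6) * input_price + (output_tokens / 1e6) * output_price

# A made-up month of usage: 10M input tokens, 2M output tokens.
for model in PRICES:
    print(f"{model}: ${cost(model, 10_000_000, 2_000_000):,.2f}")
```

At those list prices, the hypothetical month comes out at $45.00 on GPT-4o versus $2.90 on DeepSeek-Chat, and $270.00 on o1 versus $9.88 on DeepSeek-Reasoner.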
Now, there's a very obvious "but" here. We do not know where DeepSeek is hosting its models, who has access to that data, or where that data is coming from or going. We don't even know who funds DeepSeek, other than that it’s connected to High-Flyer, the hedge fund that it split from in 2023. There are concerns that DeepSeek could be state-funded, and that DeepSeek's low prices are a kind of geopolitical weapon, breaking the back of the generative AI industry in America.
I have no idea whether that’s the case. It’s certainly true that China has long treated AI as a strategic part of its national industrial policy, and is reported to help companies in sectors where it wants to catch up with the Western world. The Made In China 2025 initiative saw a reported hundreds of billions of dollars provided to Chinese firms working in industries like chipmaking, aviation, and yes, artificial intelligence. The extent of the support isn’t exactly transparent, and so it’s not entirely out of the realm of possibility that DeepSeek is the recipient of state aid. The good news is that we're gonna find out fairly quickly. American AI infrastructure company Groq is already bringing DeepSeek's models online, meaning that we'll at the very least get a confirmation of whether these prices are realistic, or heavily-subsidized by whomever it is that backs DeepSeek.
It’s also true that DeepSeek is owned by a hedge fund, which likely isn’t short of cash to pump into the venture.
Aside: Given that OpenAI is the beneficiary of millions in cloud compute credits, and gets reduced pricing for Microsoft’s Azure cloud services, it’s a bit tough for them to complain about a rival being subsidized by a larger entity with the ability to absorb the costs of doing business, should that be the case. And yes, I know Microsoft isn’t a state, but with a market cap of $3.2 trillion and quarterly revenues larger than the combined GDPs of some EU and NATO nations, it’s the next best thing.
Whatever concerns there may be about malign Chinese influence are bordering on irrelevant, outside of the low prices offered by DeepSeek itself, and even that is speculative at this point. Once these models are hosted elsewhere, and once DeepSeek's methods (which I'll get to shortly) are recreated (which won't take long), I believe we'll see that these prices are indicative of how cheap these models are to run.
How The Hell Is This So Much Cheaper?
That's a bloody good question, and because I'm me, I have a hypothesis: I do not believe that the companies making foundation models (such as OpenAI and Anthropic) have been incentivized to do more with less, and because their chummy relationships with hyperscalers were focused almost entirely on "make the biggest, most hugest models possible, using the biggest, most hugest chips," and because the absence of profitability didn’t stop them from raising more money, efficiency was never a major problem for them.
Let me put it in simpler terms: imagine living on $1,500 a month, and then imagine how you'd live on $150,000 a month, and you have to, Brewster's Millions style, spend as much of it as you can to complete the mission of "live your life." In the former example, your concern is survival — you have a limited amount of money and must make it go as far as possible, with real sacrifices to be made with every dollar you spend. In the latter, you're incentivized to splurge, to lean into excess, to pursue a vague remit of "living" your life. Your actions are dictated not by any existential threats — or indeed future planning — but by whatever you perceive to be an opportunity to "live."
OpenAI and Anthropic are emblematic of what happens when survival takes a backseat to “living.” They have been incentivized by frothy venture capital and public markets desperate for the next big growth market to build bigger models and sell even bigger dreams, like Dario Amodei of Anthropic saying that your AI "could surpass almost all humans at almost everything" "shortly after 2027." Both OpenAI and Anthropic have effectively lived their existence with the infinite money cheat from The Sims, with both companies bleeding billions of dollars a year after revenue and still operating as if the money will never run out. If they were worried about it, they would have certainly tried to do what DeepSeek has done, except they didn't have to, because both of them had endless cash and access to GPUs from either Microsoft, Amazon or Google.
OpenAI and Anthropic have never been made to sweat, receiving endless amounts of free marketing from a tech and business media happy to print whatever vapid bullshit they spout, raising money at will (Anthropic is currently raising another $2 billion, valuing the company at $60 billion), all off of a narrative of "we need more money than any company has ever needed before because the things we're doing have to cost this much."
Do I think they were aware that there were methods to make their models more efficient? Sure. OpenAI tried (and failed) in 2023 to deliver a more efficient model to Microsoft. I'm sure there are teams at both Anthropic and OpenAI that are specifically dedicated to making things "more efficient." But they didn't have to do it, and so they didn’t.
As I've written before, OpenAI simply burns money, has been allowed to burn money, and up until recently likely would've been allowed to burn even more money, because everybody — all of the American model developers — appeared to agree that the only way to develop Large Language Models was to make the models as big as humanly possible, and work out troublesome stuff like "making them profitable" later, which I presume is when "AGI happens," a thing they are still in the process of defining.
DeepSeek, on the other hand, had to work out a way to make its own Large Language Models within the constraints of the hamstrung NVIDIA chips that can be legally sold to China. While there is a whole cottage industry of selling chips in China using resellers and other parties to get restricted silicon into the country, as Thompson over at Stratechery explains, the entire way in which DeepSeek went about developing its models suggests that it was working around very specific memory bandwidth constraints (meaning the amount of data that can be fed to and from chips). In essence, doing more with less wasn’t something it chose, but something it had to do.
While it's certainly possible that DeepSeek had unrestrained access to American silicon, the actual work it’s done (which is well-documented in the research paper accompanying the V3 model) heavily suggests it was working within the constraints of lower memory bandwidth. Basically, it wasn’t able to move as much data around the chips, which is a problem because the reason why GPUs are so useful in AI is because they can move a lot of data at the same time and then process it in parallel (running multiple tasks simultaneously). Lower bandwidth means less data moving, which means things like training and inference take longer.
And so, it had to get creative. DeepSeek combined numerous different ways to reduce the amount of the model it loaded into memory at any given time. This included using Mixture of Experts architecture (where models are split into different "experts" that handle different kinds of inputs and outputs — a similar technique to what OpenAI's GPT-4o does) and multi-head latent attention, where DeepSeek compresses the key-value cache (think of it as a place where a Large Language Model writes down everything it's processed so far from an input as it generates) into something called a "latent vector." Essentially, instead of writing down all the information, it just caches what it believes is the most important information.
In simpler terms, DeepSeek's approach breaks the Large Language Model into a series of different experts — specialist parts of the model — to handle specific inputs and outputs, and it’s found a way to take shortcuts with the amount of information it caches without sacrificing performance. Yes, there is a more complex explanation here, but this is so you have a frame of reference.
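As a toy sketch of the caching idea (and only the idea — the dimensions and the single down/up projection scheme below are illustrative simplifications, not DeepSeek's actual multi-head latent attention): instead of storing full keys and values for every token, you store a much smaller latent vector per token and reconstruct the keys and values from it when attention is computed.

```python
import numpy as np

# Toy illustration of caching a compressed "latent vector" per token
# instead of full keys and values. Sizes are arbitrary.
rng = np.random.default_rng(0)
d_model, d_latent, seq_len = 512, 64, 1000

W_down = rng.standard_normal((d_model, d_latent))   # compress hidden states
W_up_k = rng.standard_normal((d_latent, d_model))   # reconstruct keys
W_up_v = rng.standard_normal((d_latent, d_model))   # reconstruct values

hidden = rng.standard_normal((seq_len, d_model))    # per-token states

# Naive KV cache: full keys and values for every token.
naive_floats = 2 * seq_len * d_model                # 1,024,000 floats

# Latent cache: one small compressed vector per token.
latent_cache = hidden @ W_down                      # shape (1000, 64)
latent_floats = latent_cache.size                   # 64,000 floats

# At attention time, keys and values are recovered on the fly.
K = latent_cache @ W_up_k                           # shape (1000, 512)
V = latent_cache @ W_up_v

print(f"KV cache shrunk by {naive_floats / latent_floats:.0f}x")  # 16x here
```

The trade-off is extra compute (the up-projections) in exchange for a much smaller cache, which is exactly the kind of bargain you make when memory bandwidth, not raw compute, is the bottleneck.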
There's also the training data situation — and another mea culpa. I've previously discussed the concept of model collapse, and how feeding synthetic data (training data created by an AI, rather than a human) to an AI model can end up teaching it bad habits, but it seems that DeepSeek succeeded in training its models using synthetic data, but specifically for subjects (to quote GeekWire's Jon Turow) "...like mathematics where correctness is unambiguous," and using "...highly efficient reward functions that could identify which new training examples would actually improve the model, avoiding wasted compute on redundant data."
It seems to have worked. Though model collapse may still be a possibility, this approach — extremely precise use of synthetic data — is in line with some of the defenses against model collapse I've heard from LLM developers I've talked to. That said, we don't know DeepSeek's exact training data, and this doesn’t negate any of the previous points made about model collapse. Synthetic data might work where the output is something you could figure out on a TI-83 calculator, but when you get into anything a bit fuzzier (like written text, or anything with an element of analysis) you’ll likely start to encounter unhappy side effects.
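A crude sketch of what "correctness is unambiguous" buys you: in a domain like arithmetic, a reward function can mechanically verify a generated example before it's allowed into the training set. The checker below is a deliberately simplified stand-in, not DeepSeek's actual reward machinery:

```python
# Stand-in for a reward function in a domain where correctness is
# unambiguous: generated arithmetic examples are verified before
# being admitted to the training set.
def verify_arithmetic(question: str, claimed_answer: str) -> bool:
    """Reward = True only if the claimed answer matches ground truth."""
    expression, _, _ = question.partition("=")
    try:
        truth = eval(expression, {"__builtins__": {}})  # arithmetic only
        return float(claimed_answer) == float(truth)
    except Exception:
        return False

# Hypothetical model-generated candidates; only verified ones survive.
candidates = [("2 + 2 =", "4"), ("7 * 6 =", "41"), ("10 / 4 =", "2.5")]
training_set = [pair for pair in candidates if verify_arithmetic(*pair)]
print(training_set)  # the incorrect "7 * 6 = 41" example is filtered out
```

No such mechanical check exists for fuzzy outputs like prose or analysis, which is exactly why this trick doesn't generalize to everything.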
There's also some scuttlebutt about where DeepSeek got this data. Ben Thompson at Stratechery suggests that DeepSeek's models are potentially "distilling" other models' outputs — by which I mean having another model (say, Meta's Llama, or OpenAI's GPT-4o, which is why DeepSeek identified itself as ChatGPT at one point) spit out outputs specifically to train parts of DeepSeek.
Distillation is a means of extracting understanding from another model; you can send inputs to the teacher model and record the outputs, and use that to train the student model. This is how you get models like GPT-4 Turbo from GPT-4. Distillation is easier for a company to do on its own models, because they have full access, but you can still do distillation in a somewhat more unwieldy way via API, or even, if you get creative, via chat clients.
Distillation obviously violates the terms of service of various models, but the only way to stop it is to actually cut off access, via IP banning, rate limiting, etc. It’s assumed to be widespread in terms of model training, and is why there are an ever-increasing number of models converging on GPT-4o quality. This doesn’t mean that we know for a fact that DeepSeek distilled 4o or Claude, but frankly, it would be odd if they didn’t.
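Mechanically, API-based distillation is not much more complicated than this sketch: record the teacher model's outputs for a set of prompts, then use the pairs as supervised fine-tuning data for the student. The `query_teacher` function here is a canned placeholder standing in for a real chat-completion API call, not any specific vendor's API:

```python
# Minimal sketch of API-based distillation: collect (prompt, teacher
# output) pairs, then fine-tune a smaller student model on them.
def query_teacher(prompt: str) -> str:
    # Placeholder: in practice this would call the teacher model's API.
    canned = {"What is the capital of France?": "Paris."}
    return canned.get(prompt, "")

prompts = ["What is the capital of France?"]
distilled_pairs = [(p, query_teacher(p)) for p in prompts]

# Each pair becomes one fine-tuning example for the student model.
for prompt, completion in distilled_pairs:
    print({"prompt": prompt, "completion": completion})
```

The simplicity is the point: if the teacher is reachable over an API, the only real defenses are rate limits and bans.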
OpenAI has reportedly found “evidence” that DeepSeek used OpenAI’s models to train its own, according to the Financial Times, though it has stopped short of making any formal allegation; it did say that using ChatGPT to train a competing model violates its terms of service. David Sacks, the investor and Trump Administration AI and Crypto czar, says “it’s possible” that this occurred, although he failed to provide evidence.
Personally, I genuinely want OpenAI to point a finger at DeepSeek and accuse it of IP theft, purely for the hypocrisy factor. This is a company that exists purely from the wholesale industrial larceny of content produced by individual creators and internet users, and now it’s worried about a rival pilfering its own goods?
Cry more, Altman, you nasty little worm.
So, Why's Everybody Freaking Out?
As I've written about many, many, many times, the Large Language Models run by companies like OpenAI, Anthropic, Google and Meta are unprofitable and unsustainable, and the transformer-based architecture they run on has peaked. They're running out of training data, and the actual capabilities of these models were peaking as far back as March 2024.
Nevertheless, I had assumed — incorrectly — that there would be no way to make them more efficient, because I had assumed — also incorrectly — that the hyperscalers (along with OpenAI and Anthropic) would be constantly looking for ways to bring down the ruinous costs of their services. After all, OpenAI lost $5 billion (after $3.7 billion in revenue, too!), and Anthropic just under $3 billion in 2024.
What I didn't wager was that, potentially, nobody was trying. My mistake was — if you can believe this — being too generous to the AI companies, assuming that they didn’t pursue efficiency because they couldn’t, and not because they couldn’t be bothered.
You see, the pre-DeepSeek status quo was one where several truths allowed the party to keep going:
- These models were incredibly expensive to train — $100 million in the middle of 2024, and as high as $1 billion for future models — meaning that training future models would necessitate spending billions on both data centers and the GPUs necessary to keep training even bigger models.
- These models had to be large, because making them large — pumping them full of training data and throwing masses of compute at them — would unlock new features, such as "[an] AI that helps us accomplish much more than we ever could without AI," such as having "a personal AI team, full of virtual experts in different areas, working together to create almost anything we can imagine."
- These models were incredibly expensive to run, but it was worth it, because making these models powerful was more important than making them efficient, because "once the price of silicon came down" (a refrain I've heard from multiple different people as a defense of the ruinous cost of generative AI) we would have these powerful models that were, uh, cheaper, because of silicon.
- As a result of this need to make bigger, huger models, the most powerful ones, big, beautiful models, we would of course need to keep buying bigger, more powerful GPUs, which would continue American excellence™.
- By following this roadmap, "everybody" wins — the hyperscalers get the justification they needed to create more sprawling data centers and spend massive amounts of money, OpenAI and their ilk continue to do the work to "build powerful models," and NVIDIA continues to make money selling GPUs. It’s a kind of capitalist death cult that ran on plagiarism and hubris, the assumption being that at some point all of this would make sense.
Now, I've argued for a while that the latter plan was insane — that there was no path to profitability for these Large Language Models, as I believed there simply wasn't a way to make these models more efficient.
In a way, I was right. The current models developed by both the hyperscalers (Gemini, Llama, et al.) and multi-billion-dollar "startups" like OpenAI and Anthropic are horribly inefficient. I had just made the mistake of assuming that they'd actually tried to make them more efficient.
What we're witnessing is the American tech industry's greatest act of hubris — a monument to the barely-conscious stewards of so-called "innovation," incapable of breaking the kayfabe of "competition" where everybody makes the same products, charges about the same amount, and mostly "innovates" in the same direction.
Somehow nobody — not Google, not Microsoft, not OpenAI, not Meta, not Amazon, not Oracle — thought to try, or was capable of creating, something like DeepSeek. That doesn't mean DeepSeek's team is particularly remarkable, or that it found anything new; it means that for all the talent, trillions of dollars of market capitalization and supposed expertise of America's tech oligarchs, not one bright spark thought to try the things that DeepSeek tried, which appear to amount to "what if we didn't use as much memory, and what if we tried synthetic data?"
And because the cost of model development and inference was so astronomical, they never assumed that anyone would try to usurp their position. This is especially bad, considering that China’s focus on AI as a strategic industrial priority was no secret — even if the ways it supported domestic companies were not. In the same way that the automotive industry was blindsided by China’s EV manufacturers, the AI industry is now being blindsided in turn.
Fat, happy and lazy, and most of all, oblivious, America's most powerful tech companies sat back and built bigger, messier models powered by sprawling data centers and billions of dollars of NVIDIA GPUs, a bacchanalia of spending that strains our energy grid and depletes our water reserves without, it appears, much consideration of whether an alternative was possible. I refuse to believe that none of these companies could've done this — which means they either chose not to, or were so utterly myopic, so excited to burn so much money and so many parts of the Earth in pursuit of further growth, that they didn't think to try.
This isn't about China — it's so much fucking easier if we let it be about China — it's about how the American tech industry is incurious, lazy, entitled, directionless and irresponsible. OpenAI and Anthropic are the antithesis of Silicon Valley. They are incumbents, public companies wearing startup suits, unwilling to take on real challenges, more focused on optics and marketing than they are on solving problems, even the problems that they themselves created with their large language models.
By making this "about China" we ignore the root of the problem — that the American tech industry is no longer interested in making good software that helps people.
DeepSeek shouldn't be scary to them, because they should've come up with it first. It uses less memory, fewer resources, and several quirky workarounds to adapt to the limited compute resources available — all things that you'd previously associate with Silicon Valley, except Silicon Valley's only interest, like the rest of the American tech industry, is The Rot Economy. It cares about growth at all costs, even if said costs were readily mitigable, or ultimately self-defeating.
To be clear, if the alternative is that all of these companies simply didn't come up with this idea, that in and of itself is a damning indictment of the valley. Was nobody thinking about this stuff? If they were, why didn't Sam Altman, or Dario Amodei, or Satya Nadella, or anyone else put serious resources into efficiency? Was it because there was no reason to? Was it because there was, if we're honest, no real competition between any of these companies? Did anybody try anything other than throwing as much compute and training data at the model as possible?
It's all so cynical and antithetical to innovation itself. Surely if any of this shit mattered — if generative AI truly was valid and viable in the eyes of these companies — they would have actively worked to do something like DeepSeek.
Don't get me wrong, it appears DeepSeek employed all sorts of weird tricks to make this work, including taking advantage of distinct parts of both CPU and GPU to create a virtual Data Processing Unit, essentially redefining how data is communicated within the servers running training and inference. It had to do things that a company with unrestrained access to capital and equipment wouldn’t have to do.
Nevertheless, OpenAI and Anthropic both have enough money and hiring power to have tried — and succeeded — in creating a model this efficient and capable of running on older GPUs, except what they actually wanted was more rapacious growth and the chance to build even bigger data centers with even more compute. OpenAI has pledged $19 billion to fund the "Stargate" data center — an amount it is somehow going to raise through further debt and equity raises, despite the fact that it’s likely already in the process of raising another round as we speak just to keep the company afloat.
OpenAI is as much a lazy, cumbersome incumbent as Google or Microsoft, and it’s just as innovative too. The launch of its "Operator" "agent" was a joke — a barely-functional product that is allegedly meant to control your computer and take distinct actions, but doesn't seem to work. Casey Newton, a man so gratingly credulous that it makes me want to scream, of course wrote that it was a "compelling demonstration" that "represented an extraordinary technological achievement" that also somehow was "significantly slower, more frustrating, and more expensive than simply doing any of these tasks yourself."
Casey, of course, had some thoughts about DeepSeek — that there were reasons to be worried, but that "American AI labs [were] still in the lead," saying that DeepSeek was "only optimizing technology that OpenAI and others invented first," before saying that it was "only last week that OpenAI made available to Pro plan users a computer that can use itself," a statement bordering on factually incorrect.
Let's be frank: these companies aren't building shit. OpenAI and Anthropic are both limply throwing around the idea that "agents are possible" in an attempt to raise more money to burn, and after the launch of DeepSeek, I have to wonder what any investor thinks they're investing in.
OpenAI can't simply "add on" DeepSeek to its models, if only for the optics. It would be a concession. An admission that it slipped and needs to catch up, and not to its main rival, or to another huge tech firm, but to a company that few, before last weekend, had even heard of. And this, in turn, will make any investor think twice about writing the company a blank check — which, as I’ve said ad nauseam, is potentially fatal, as OpenAI needs to continually raise more money than any startup ever has in history, and it has no path to breaking even.
If OpenAI wants to do its own cheaper, more-efficient model, it’ll likely have to create it from scratch, and while it could do distillation to make it "more OpenAI-like" using OpenAI's own models, that's effectively what DeepSeek already did. Even with OpenAI's much larger team and more powerful hardware, it's hard to see how creating a smaller, more-efficient, and almost-as-powerful version of o1 benefits the company, because said version has, well, already been beaten to market by DeepSeek, and thanks to DeepSeek will almost certainly have a great deal of competition for a product that, to this day, lacks any real killer apps.
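For the unfamiliar: distillation means training a smaller "student" model to imitate the output distribution of a larger "teacher," rather than learning everything from raw data alone. Here's a minimal sketch of the core loss, using toy logits and made-up numbers, not anything resembling OpenAI's or DeepSeek's actual pipelines:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution; a higher
    temperature 'softens' it, exposing more of the teacher's uncertainty."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened
    distribution. Training minimizes this, pushing the student to mimic
    the teacher's full output distribution, not just its top answer."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))

# A student that tracks the teacher's distribution incurs a lower loss
# than one that disagrees with it.
teacher = [4.0, 1.0, 0.5]
good_student = [3.9, 1.1, 0.4]
bad_student = [0.5, 1.0, 4.0]
assert distillation_loss(teacher, good_student) < distillation_loss(teacher, bad_student)
```

The softened targets are the whole point: they carry far more signal per example than a single right answer, which is part of why a student can approach a teacher's quality at a fraction of the training cost.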
And, again, anyone can build on top of what DeepSeek has already built. Where is OpenAI's moat? Where is Anthropic's moat? What are the things that truly make these companies worth $60 or $150 billion? What is the technology they own, or the talent they have that justifies these valuations, because it's hard to argue that their models are particularly valuable anymore.
Celebrity, perhaps? Altman, as discussed previously, is an artful bullshitter, having built a career out of being in the right places, having the right connections, and knowing exactly what to say — especially to a credulous tech media without the spine or inclination to push back on his more fanciful claims. And already, Altman has tried to shrug off DeepSeek’s rise, admitting that while “deepseek's r1 is an impressive model,” particularly when it comes to its efficiency, “[OpenAI] will obviously deliver much better models and also it's legit invigorating to have a new competitor!”
He ended with “look forward to bringing you all AGI and beyond” — something which, I add, has always been close on the horizon in Altman’s world, although curiously has yet to materialize, or even come close to materializing.
Altman is, in essence, the Muhammad Saeed al-Sahhaf of tech — the Saddam-era Iraqi Minister of Information who, as Abrams tanks entered Baghdad and gunfire could be heard in the background, proclaimed an entirely counterfactual world where the coalition forces weren’t merely losing, but American troops were “committing suicide by the hundreds on the gates of Baghdad.” It’s adorable, and yes, it’s also understandable, but nobody should — or could — believe that OpenAI hasn’t just suffered some form of existential wound.
DeepSeek has commoditized the Large Language Model, publishing both the source code and the guide to building your own. Whether or not someone chooses to pay DeepSeek is largely irrelevant — someone else will take what it’s created and build their own, or people will start running their own DeepSeek instances, renting GPUs from one of the various cloud computing firms.
While NVIDIA will find other ways to make money — Jensen Huang always does — it's going to be a hard sell for any hyperscaler to justify spending billions more on GPUs to markets that now know that near-identical models can be built for a fraction of the cost with older hardware. Why do you need Blackwell? The narrative of "this is the only way to build powerful models" no longer holds water, and the only other selling point it has is "what if the Chinese do something?"
Well, the Chinese did something, and they've now proven that they can not only compete with American AI companies, but do so in such an effective way that they can effectively crash the market.
It still isn't clear if these models are going to be profitable — as discussed, it's unclear who funds DeepSeek and whether its current pricing is sustainable — but they are likely going to be a damn sight more profitable than anything OpenAI is flogging. After all, OpenAI loses money on every transaction — even its $200-a-month "ChatGPT Pro" subscription. And if OpenAI cuts its prices to compete with DeepSeek, its losses will only deepen.
And as I’ve said above, this is all so deeply cynical, because it’s obvious that none of this was ever about the proliferation of generative AI, or making sure that generative AI was “accessible.”
Putting aside my personal beliefs for a second, it’s fairly obvious why these companies wouldn’t want to create something like DeepSeek — because creating an open source model that uses fewer resources means that OpenAI, Anthropic and their associated hyperscalers would lose their soft monopoly on Large Language Models.
I’ll explain.
Before DeepSeek, making a competitive Large Language Model — as in, one that you can commercialize — required exceedingly large amounts of capital, and making larger ones effectively required you to kiss the ring of Microsoft, Google or Amazon. While it isn’t clear what it cost to train OpenAI’s o1 reasoning model, we know that GPT-4o cost around $100 million, and o1, as a more complex model, likely cost even more.
We also know that OpenAI’s training and inference costs in 2024 were around $7 billion, meaning that either refining current models or building new ones is quite costly.
The mythology of both OpenAI and Anthropic is that these large amounts of capital weren’t just necessary, but the only way to do this. While these companies ostensibly “compete,” neither of them seemed concerned about doing so as actual businesses that made products that were, say, cheaper and more efficient to run, because in doing so they would break the illusion that the only way to create “powerful artificial intelligence” was to hand billions of dollars to one of two companies, and build giant data centers to build even larger language models.
This is artificial intelligence’s Rot Economy — two lumbering companies claiming they’re startups creating a narrative that the only way to “build the future” is to keep growing, to build more data centers, to build larger language models, to consume more training data, with each infusion of capital, GPU purchase and data center buildout creating an infrastructural moat that always leads back to one of a few tech hyperscalers.
OpenAI and Anthropic need the narrative to say “buy more GPUs and build more data centers,” because doing so creates the conditions of an infrastructural monopoly. The terms — forget about “building software” that “does stuff” for a second — were implicitly that smaller players can’t enter the market, because “the market” is defined as “large language models that cost hundreds of millions of dollars and require access to more compute than any startup could muster without the infrastructure of a public tech company.”
Remember, neither of these companies has ever marketed themselves based on the products they actually build. Large Language Models are, in and of themselves, a fairly bland software product, which is why we’ve yet to see any killer apps. This isn’t a particularly exciting pitch to investors or the public markets, because there’s no product, innovation or business model to point to, and if they actually tried to productize it and turn it into a business, it’s quite obvious at this point that there really isn’t a multi-trillion-dollar industry for generative AI.
Indeed, look at the response to Microsoft’s strong-arming of Copilot onto Office 365 users, both personal and commercial. Nobody said “wow, this is great.” Lots of people asked “why am I being charged significantly more for a product that I don’t care about?”
OpenAI only makes 27% of its revenue from selling access to its models — around a billion dollars in annual recurring revenue — with the rest ($2.7 billion or so) coming from subscriptions to ChatGPT. If you ignore the hype, OpenAI and Anthropic are deeply boring software businesses with unprofitable, unreliable products prone to hallucinations, and their new products — such as OpenAI’s Sora — cost way too much money to both run and train to get results that, well, suck. Even OpenAI’s push into the federal government, with the release of ChatGPT Gov, is unlikely to reverse its dismal fortunes.
The only thing that OpenAI and Anthropic could do is sell the market a story about a thing it’s yet to build (such as AI that will somehow double human lifespans), and heavily intimate (or outright say) that the only way to build these made-up things was to keep funnelling billions to their companies and, by extension, that hyperscalers would have to keep funnelling billions of dollars to NVIDIA and into building data centers to crunch numbers in the hope that this wonderful, beautiful and entirely fictional world would materialize.
To make this *more* than a deeply boring software business, OpenAI and Anthropic needed models to get larger, and for the story to always be that there was only one way to build the future, that it cost hundreds of billions of dollars, and that only the biggest geniuses (who all happen to work at the same two or three places) were capable of doing it.
Post-DeepSeek, there isn’t really a compelling argument for investing hundreds of billions of capital expenditures in data centers, buying new GPUs, or even pursuing Large Language Models as they currently stand. It’s possible — and DeepSeek, through its research papers, explained in detail how — to build models competitive with both of OpenAI’s leading models, and that’s assuming you don’t simply build on top of the ones DeepSeek released.
It also seriously calls into question what it is you’re paying OpenAI for in its various subscriptions — most of which (other than the $200-a-month “Pro” subscription) have hard limits on how much you can use OpenAI’s most advanced reasoning models.
One thing we do know is that OpenAI and Anthropic will now have to drop the price of accessing their models, and potentially even the cost of their subscriptions. I’d argue that despite the significant price difference between o1 and DeepSeek’s r1 reasoning model, the real danger to both OpenAI and Anthropic is DeepSeek v3, which competes with GPT-4o.
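To put rough numbers on that price difference, here's a back-of-the-envelope comparison. The per-million-token rates below are approximate published list prices at the time of writing and are assumptions for illustration; they change often:

```python
# Back-of-the-envelope API cost comparison (USD per million tokens).
# Prices are approximate, frequently updated, and assumed here purely
# for illustration -- check the providers' current pricing pages.
PRICING = {
    "o1":          {"input": 15.00, "output": 60.00},
    "deepseek-r1": {"input": 0.55,  "output": 2.19},
}

def request_cost(model, input_tokens, output_tokens):
    """Dollar cost of a single API request, given its token counts."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# A hypothetical workload: 10,000 requests, each with 2,000 input tokens
# and 1,000 output tokens.
for model in PRICING:
    total = 10_000 * request_cost(model, 2_000, 1_000)
    print(f"{model}: ${total:,.2f}")
```

Under these assumed prices the workload costs roughly $900 on o1 versus roughly $33 on r1, a gap in the neighborhood of the "30 times cheaper" figure, which is the kind of arithmetic any CFO paying for API access is now doing.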
DeepSeek’s narrative shift isn’t just commoditizing LLMs at large, but commoditizing the most expensive ones run by two monopolists backed by three other monopolists.
Fundamentally, the magic has died. There’s no halo around Sam Altman or Dario Amodei’s head anymore, as their only real argument was “we’re the only ones that can do this,” something that nobody should’ve believed in the first place.
Up until this point, people believed that the reason these models were so expensive was because they had to be, and that we had to build more data centers and buy more silicon because that was just how things were. They believed that “reasoning models” were the future, even if members of the media didn’t really seem to understand what they did or why they mattered, and that as a result they had to be expensive, because OpenAI and their ilk were just so smart, even though it wasn’t obvious what it was that “reasoning” allowed you to do.
Now we’re going to find out, because reasoning is commoditized, along with Large Language Models in general. Funnily enough, the way that DeepSeek may have been trained — using, at least in part, synthetic data — also pushes against the paradigm that these companies even need to use other people’s training data, though their argument, of course, will be that they “need more.”
We also don’t know the environmental effects, because even if they’re cheaper, these models still require expensive, energy-guzzling GPUs to run at full tilt.
In any case, if I had to guess, the result will be the markets accepting that generative AI isn’t the future. OpenAI and Anthropic no longer have moats to raise capital with. Sure, they could con another couple of billion dollars out of Masayoshi Son and other gormless billionaires, but what’re they offering, exactly? The chance to continue an industry-wide con? The chance to participate in the capitalist death cult? The chance to burn money at a faster rate than WeWork ever did?
Or will this be the time that Microsoft, Amazon and Google drop OpenAI and Anthropic, making their own models based on DeepSeek’s work? What incentive is there for them to keep funding these companies? The hyperscalers hold all the cards — the GPUs and the infrastructure, and in the case of Microsoft, non-revocable licenses that permit it unfettered use of and access to OpenAI’s tech — and there’s little stopping them from building their own models and dumping GPT and Claude.
As I’ve said before, I believe we’re at peak AI, and now that generative AI has been commoditized, the only thing that OpenAI and Anthropic have left is their ability to innovate, which I’m not sure they’re capable of doing.
And because we sit in the ruins of Silicon Valley, with our biggest “startups” all doing the same thing in the least-efficient way, living at the beck and call of public companies with multi-trillion-dollar market caps, everyone is trying to do the same thing in the same way based on the fantastical marketing nonsense of a succession of directionless rich guys that all want to create America’s Next Top Monopoly.
It’s time to wake up and accept that there was never an “AI arms race,” and that the only reason hyperscalers built so many data centers and bought so many GPUs is that they’re run by people who don’t experience real problems and thus don’t know what problems real people face. Generative AI doesn’t solve any trillion-dollar problems, nor does it create outcomes that are profitable for any particular business.
DeepSeek’s models are cheaper to run, but the real magic trick they pulled is that they showed how utterly replaceable a company like OpenAI (and by extension any Large Language Model company) really is. There really isn’t anything special about any of these companies anymore — they have no moat, their infrastructural advantage is moot, and their hordes of talent irrelevant.
What DeepSeek has proven isn’t just technological, but philosophical. It shows that the scrappy spirit of Silicon Valley builders is dead, replaced by a series of management consultants who lead teams of engineers to do things based on vibes.
You may ask if all of this means generative AI suddenly gets more prevalent — after all, Satya Nadella of Microsoft cited Jevons paradox, which posits that as a resource gets more efficient (and thus cheaper) to use, its overall consumption increases.
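Jevons paradox is worth unpacking with a toy model. Under constant-elasticity demand (all numbers here are invented for illustration), a 10x efficiency gain only increases total resource consumption when demand is elastic enough, i.e. elasticity above 1:

```python
def total_resource_use(price, elasticity, baseline_demand=100.0, baseline_price=1.0):
    """Constant-elasticity demand curve: quantity demanded scales with
    (price / baseline_price) ** -elasticity. Total resource use is
    quantity * price, treating price as a proxy for the resource
    consumed per unit (a 10x efficiency gain = a 10x price drop)."""
    quantity = baseline_demand * (price / baseline_price) ** -elasticity
    return quantity * price

# A 10x efficiency gain cuts the per-unit price from 1.0 to 0.1.
# Below elasticity 1, total resource use FALLS despite more usage;
# above it, Jevons kicks in and total use rises.
for elasticity in (0.5, 1.0, 2.0):
    before = total_resource_use(1.0, elasticity)
    after = total_resource_use(0.1, elasticity)
    print(f"elasticity {elasticity}: {before:.0f} -> {after:.0f}")
```

So Nadella's invocation of Jevons quietly assumes there's a mass of pent-up LLM demand that pricing alone was suppressing, which is precisely the assumption I'm about to dispute.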
Sadly, I hypothesize that something else happens. Right now, I do not believe that there are companies stymied by the pricing that OpenAI and its ilk offer, nor do I think there are many companies or use cases that don’t exist because Large Language Models are too expensive. AI companies took up a third of all venture capital funding last year, and on top of that, it’s fairly easy to try reasoning models like o1 and make a proof of concept without having to build an entire operational company. I don’t think anyone has been “on the sidelines” of generative AI due to costs (and remember, few seemed able to come up with a great use case for o1 or other reasoning models), and DeepSeek’s models, while cheaper, don’t have any new functionality.
Chaos Hypothetical! One way in which this entire facade could fall is if Mark Zuckerberg decides that he wants to simply destroy the entire market for Large Language Models. Meta has already formed four separate war rooms to break down how DeepSeek did it, and apparently, to quote The Information, “In pursuing Llama, CEO Mark Zuckerberg wants to commoditize AI models so that the applications that use such models, including Meta’s, generate more money than the sales of the AI models themselves. That could hurt Meta’s AI rivals such as OpenAI and Anthropic, which are on pace to generate billions of dollars in revenue from such sales.”
I could absolutely see Meta releasing its own version of DeepSeek’s models — it has the GPUs and Zuckerberg can never be fired, meaning that if he decided to simply throw billions of dollars into specifically creating his own deep-discounted LLMs to wipe out OpenAI he absolutely could. After all, last Friday Zuckerberg said that Meta would spend between $60 billion and $65 billion in capital expenditures this year — before the DeepSeek situation hit fever pitch — and I imagine the markets would love a more modest proposal that involves Meta offering a ChatGPT-beater simply to fuck over Sam Altman.
As a result, I don’t really see anything changing, beyond the eventual collapse of the API market for companies like Anthropic and OpenAI. Large Language Models (and reasoning models) are niche. The only reason that ChatGPT became such a big deal is because the tech industry has no other growth ideas, and despite the entire tech industry and public markets screaming about it, I can’t think of any major mass-market product that really matters.
ChatGPT is big because “everybody is talking about AI,” and ChatGPT is the big brand in AI. It’s not essential, and it’s only been treated as such because the media (and the markets) ran away with a narrative they barely understood. DeepSeek pierced that narrative because believing it also required you to believe that Sam Altman is a magician, versus an extremely shitty CEO that burned a bunch of money.
Sure, you can argue that “DeepSeek just built on top of software that already existed thanks to OpenAI,” which raises a fairly obvious question: why didn’t OpenAI? And another fairly obvious question: why does it matter?
In any case, the massive expense of running generative models hasn’t been the limiter on their deployment or success — you can blame that on the fact that they, as a piece of technology, are neither artificial intelligence nor capable of providing the kind of meaningful outcomes that would make them the next Smartphone.
It’s all been a con, a painfully-obvious one, one I’ve been screaming about since February 2024, trying to explain that beneath the hype was an industry that provided modest-at-best outcomes rather than resembling any kind of “next big thing.”
Without “reasoning” as its magical new creation, OpenAI has nothing left. “Agents” aren’t coming. “AGI” isn’t coming. It was all flimflam to cover up how mediocre and unreliable the fundament of the supposed “AI revolution” really was.
All of this money, time, energy and talent was wasted thanks to a media industry that fails to hold the powerful to account, and markets run by executives that don’t know much of anything, and it looks like it got broken in two the moment that a few hundred Chinese engineers decided to compete.
It’s utterly sickening.