Buy an LLM subscription, pay an API bill, rent a high-spec server — that’s probably what comes to mind when you think about the cost of AI. But the truth is, the next time you go to buy a laptop, the price tag already quietly includes an AI bill you never noticed.
On June 25, 2026, Apple raised prices across its hardware lineup by 15% to 25%, meaning a single device could now cost up to $300 more. Faced with a memory shortage, Tim Cook publicly described the situation as a “once-in-a-century flood” (NYT, Bloomberg). Dell, HP, and Lenovo soon followed with hikes of their own. You might assume tariffs or logistics are to blame, but the real force behind this is the production schedule of three giants: Samsung, SK Hynix, and Micron. These three control more than 90% of the world’s high-end memory capacity, and in the quarter just past, their profit margins climbed past TSMC’s, with Micron’s 84.9% gross margin putting even the foundry king to shame.
In this fight for capacity, cash-flush AI data centers grabbed the highest-priority supply slots, while ordinary electronics production got pushed to the back of the queue. The extra $300 you pay for a 16GB unified-memory MacBook traces back to one fact: the memory makers are reserving their wafers for the higher-bidding AI clients, and the premium flows upstream to them rather than staying with Apple as margin. This chain reaction won’t show up directly on your LLM subscription bill, but it gets quietly amortized into every electronic product you buy.
The reason Apple and Dell have to raise prices comes down to one thing: there simply aren’t enough memory chips available on the market to buy.
Set two pieces of hardware side by side and the problem becomes clear. A thin-and-light laptop needs 16GB of unified memory, while a single H200 accelerator card demands up to 141GB of high-bandwidth memory (HBM). Both are made from the same DRAM wafers, but HBM uses a multi-layer vertical chip-stacking process. Producing 1GB of HBM consumes enough wafer area to make 3 to 4GB of ordinary memory (Wikipedia). So when the big players send their wafers off to be packaged into HBM, there’s less material left over for the LPDDR5 memory that goes into laptops. Every chunk of raw material the AI clients eat up is a chunk the consumer hardware supply loses.
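The wafer trade-off above can be made concrete with a little arithmetic. The Python sketch below uses only the figures quoted in this article (141GB of HBM per H200, 16GB per laptop, a 3-to-4x wafer-area ratio). The function name and the assumption that wafer area converts linearly into laptop memory are illustrative simplifications, not an industry model.

```python
# Rough estimate of how many 16GB laptops' worth of ordinary DRAM
# the HBM on one accelerator displaces. All figures come from the
# article; the linear conversion is a simplifying assumption.

HBM_PER_ACCELERATOR_GB = 141   # H200, per the article
LAPTOP_MEMORY_GB = 16          # thin-and-light laptop, per the article
WAFER_AREA_RATIOS = (3, 4)     # 1GB of HBM ~ 3 to 4GB of ordinary DRAM


def laptops_displaced(hbm_gb: float, ratio: float, laptop_gb: float) -> float:
    """Ordinary-DRAM equivalent of `hbm_gb` of HBM, expressed in laptops."""
    ordinary_equivalent_gb = hbm_gb * ratio
    return ordinary_equivalent_gb / laptop_gb


low, high = (
    laptops_displaced(HBM_PER_ACCELERATOR_GB, r, LAPTOP_MEMORY_GB)
    for r in WAFER_AREA_RATIOS
)
print(f"One accelerator ~ {low:.1f} to {high:.1f} laptops of DRAM wafer area")
```

Under these assumptions, the memory on a single accelerator card ties up wafer area that could otherwise have supplied a couple of dozen laptops or more, which is why even modest accelerator volumes weigh on consumer supply.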
It all comes down to profit. HBM’s gross margin has long since cleared 50%, while ordinary memory’s long-term profit hovers around 10% to 30%. Memory fabs can technically switch between producing the two (SemiWiki), but no company is going to voluntarily pass up a high-margin business to make a low-return product. Total capacity is what it is, and selling scarce chips to the best-funded clients is just sound business logic.
In barely twenty months, cloud servers went from being an ordinary buyer to the single largest customer. In 2022, servers accounted for only 20% to 30% of memory production; by 2026 that figure has shot up to around 70% (Tech Insider). The memory demand from ordinary computers and phones hasn’t vanished; it has simply lost out on supply priority to the deep-pocketed AI buyers.
You might think a 15% price hike on phones and computers is just the result of material shortages in the supply chain. By normal commercial logic, tight supply means the whole chain shares the cost: suppliers earn thin margins, buyers pay a small premium. But that’s not how the memory market works anymore.
Micron reported in its latest earnings that its gross margin has soared to 84.9%, up from just 39% in the same period a year earlier (Investing.com). Samsung’s memory division posted a 66% profit margin, and SK Hynix came in at 72%. All three are now out-earning TSMC, whose long-term gross margin sits around 60% (Tom’s Hardware). Memory used to be treated as a cyclical commodity, and now its money-making efficiency has overtaken the most sophisticated chip foundry on the planet.
A gross margin near 85% means that for every dollar of product sold, the direct cost is only about fifteen cents. If the price hikes were merely covering rising raw material and equipment costs, margins would have stayed roughly flat; instead, Micron’s more than doubled, from 39% to 84.9%. The margin explosion shows a massive gap has opened between what these products sell for and what they actually cost to make. Margins this high point to one conclusion: these giants are exercising their pricing power to set prices far above production cost, not just passively absorbing cost increases.
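The margin reasoning can be checked with the standard definition of gross margin, (price − cost) / price. The Python sketch below takes the two Micron margins quoted above and asks what price increase would produce that jump if unit cost had stayed flat. The flat-cost scenario is an assumption made purely for illustration, and the function names are hypothetical.

```python
# Gross margin = (price - cost) / price, so cost per revenue dollar = 1 - margin.
# The "flat unit cost" scenario below is an illustrative assumption.

MARGIN_YEAR_AGO = 0.39   # Micron, same period a year earlier (per the article)
MARGIN_LATEST = 0.849    # Micron, latest quarter (per the article)


def cost_per_revenue_dollar(gross_margin: float) -> float:
    """Direct cost embedded in each dollar of sales."""
    return 1.0 - gross_margin


def implied_price_multiple(margin_before: float, margin_after: float) -> float:
    """Price multiple that moves the margin from before to after,
    assuming the unit cost does not change at all."""
    return (1.0 - margin_before) / (1.0 - margin_after)


cost_before = cost_per_revenue_dollar(MARGIN_YEAR_AGO)
cost_after = cost_per_revenue_dollar(MARGIN_LATEST)
multiple = implied_price_multiple(MARGIN_YEAR_AGO, MARGIN_LATEST)

print(f"Cost per $1 of sales: {cost_before:.3f} -> {cost_after:.3f}")
print(f"Implied price multiple at flat unit cost: {multiple:.2f}x")
```

If costs had risen in step with prices, the ratio of cost to revenue would not have moved. The fact that it fell from roughly 61 cents to roughly 15 cents per dollar is what separates a pricing-power story from a cost pass-through story.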
This pricing power comes from extreme industry concentration. Samsung, SK Hynix, and Micron together control over 90% of global DRAM share, a moat built over decades of accumulated capital and technology. For AI clients, HBM is a strategic necessity for building compute capacity, so they have to buy no matter how high prices climb, and their purchase volumes don’t shrink with the price tag. The three oligopolistic memory makers naturally follow the profit motive and tilt the capacity balance toward the AI clients paying more.
Micron disclosed that it has signed 16 multi-year long-term agreements with key customers, 14 of which lock in roughly $100 billion in guaranteed revenue. To secure their allocation in a tight market, these buyers have put up nearly $22 billion in prepayments and credit guarantees: $18 billion in cash prepayments and $4 billion in letters of credit (Securities Times). Locking up factory capacity years in advance and putting down this much deposit money is almost unheard of in the memory industry’s history. Since AI clients are willing to absorb purchase costs far above the average to guarantee on-time delivery, the highly concentrated memory giants naturally prioritize these mega-buyers. The result: makers of ordinary computers and phones get shuffled to the back of the waiting list.
The price hikes you face when buying a computer or phone are, at root, the direct result of the three giants reserving large chunks of their production-line allocation for AI clients when memory runs short. The extra money ordinary consumers hand over at checkout is, in effect, the price of this allocation system.
With memory prices climbing steadily, ordinary consumers have already started walking away.
In 2026, global smartphone shipments contracted 13.9% year over year, and PC shipments fell 11.3%, both hitting historic lows (IDC). Faced with ever-rising prices, people have responded in their own ways: keep their old computer or phone going for another couple of years, settle for a base-model device, or just skip the upgrade entirely. Even though the average smartphone selling price hit a record $550, total shipments are shrinking sharply, leaving many brands focused on the mid- to low-end market struggling (Tech Insider).
This supply-demand imbalance exposes a massive gap in risk tolerance between the two sides. Ordinary consumers are highly price-sensitive when it comes to electronics — cross their psychological line and they simply stop buying. Tech giants buying HBM, by contrast, hardly look at short-term premiums. Compute centers are an absolute necessity; once a project breaks ground, it can’t be paused midway just because memory got more expensive. For the wafer-holding memory makers, taking the same material, processing it deeper, and selling it to the AI whales means both higher margins and remarkably stable orders. Feeding material into the consumer-electronics market, by contrast, means accepting worse prices and shouldering the risk of cancellations at any time.
The asymmetry between the two sides’ demand has carved out a clear dividing line. Ordinary consumers choosing to wait and see can’t shake the upstream chipmakers’ production plans. The big AI developers’ procurement contracts are already signed through 2027 and 2028, and the massive deposits have arrived upfront. So even if the consumer end-market sells a few hundred thousand fewer computers and phones, that’s a rounding error that can’t budge the factory planning of giants like Samsung. This is the brutal reality: the clients who can pay the highest prices directly determine where resources flow, and budget-constrained ordinary buyers either pick over what’s left or get pushed out entirely. Demand fluctuations from ordinary consumers get filtered out automatically at the source — the production-planning stage.
When will this across-the-board price hike finally hit an inflection point? Right now, two diametrically opposed yet equally plausible analyses coexist in the market.
The camp arguing “the shortage won’t ease anytime soon” points out that HBM’s share of DRAM wafer consumption has climbed from 19% in 2025 to 23% in 2026, and analysts predict it could push as high as 35% going forward (EnkiAI). The three memory giants’ HBM capacity for 2026 is already sold out, with many large orders running straight into 2027 and 2028. Micron CEO Sanjay Mehrotra said on the latest earnings call that the AI wave is only just beginning, and offered a reference data point: a single intelligent humanoid robot requires ten times the memory of a mainstream L2+ smart car (Investing.com). Once this hardware starts mass production, society’s demand for memory will surge even more dramatically. Industry analyst IDC has gone so far as to define the current situation as a “systemic reallocation of capacity” — not some transient supply-demand wobble (IDC). Meanwhile, Samsung and SK Hynix have struck a similar note, expecting the memory shortage to last at least through 2027, with some pressure not easing until 2030 (TechPowerUp).
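To see what those wafer-share figures imply for everything that is not HBM, the Python sketch below indexes non-HBM wafer supply to 2025. It uses the 19%, 23%, and 35% shares quoted above; the default of zero growth in total wafer output is an assumption for illustration (the `total_growth` parameter lets you relax it), and both function names are hypothetical.

```python
# Non-HBM wafer supply relative to 2025, given HBM's share of DRAM wafers.
# Shares come from the article; zero total growth is an assumed default.

HBM_SHARE = {"2025": 0.19, "2026": 0.23, "forecast": 0.35}
BASE_SHARE = HBM_SHARE["2025"]


def non_hbm_supply_index(hbm_share: float, total_growth: float = 0.0) -> float:
    """Wafers left for non-HBM DRAM, with 2025 set to 1.0."""
    return (1.0 - hbm_share) * (1.0 + total_growth) / (1.0 - BASE_SHARE)


def growth_to_hold_flat(hbm_share: float) -> float:
    """Total wafer growth needed to keep non-HBM supply at its 2025 level."""
    return (1.0 - BASE_SHARE) / (1.0 - hbm_share) - 1.0


for label, share in HBM_SHARE.items():
    index = non_hbm_supply_index(share)
    needed = growth_to_hold_flat(share)
    print(f"{label}: non-HBM index {index:.3f}, "
          f"growth needed to hold flat {needed:.1%}")
```

Read this way, a 35% HBM share would require total wafer output to grow by roughly a quarter just to keep conventional memory supply where it was in 2025, before any growth in consumer demand is counted.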
But the argument that “the cycle always wins, prices always come down” is just as well-grounded. Over 70 years of development, the semiconductor industry has weathered more than 15 downturns, and a typical DRAM price cycle runs 3 to 5 years — a pattern no one has broken yet (Fabricated Knowledge, UncoverAlpha). Right now, all three memory giants are racing to expand capacity. Samsung plans to boost output 50% in 2026, SK Hynix has committed over $30 billion in total investment, and Micron has locked its full-year capital spending at around $20 billion. These new production lines now under construction are expected to deliver a concentrated capacity surge in 2027 and 2028. This closely mirrors the global battery-grade lithium shortage of 2021 to 2022: the EV boom drove raw material prices up fivefold, triggering widespread shortages of battery materials. But within three years, as upstream mining and processing caught up, prices embarked on a clear downward trajectory (ICCT). On top of that, the downstream is pushing hard on algorithm optimization and memory-footprint reduction. Google’s TurboQuant compression algorithm, for instance, is said to cut a large model’s memory footprint to a sixth of its original size (Traderverse). The one-two punch of aggressive upstream expansion and desperate downstream belt-tightening is exactly how every previous shortage cycle has ended.
Weighing both sides, memory prices will most likely begin to ease in 2027 to 2028. That doesn’t mean they’ll drop back to the 2024 trough, though. AI’s rigid consumption of HBM has effectively put a new, higher step underneath chip prices. As a result, the “entry-level” configurations on store shelves — an 8GB unified-memory computer, say, or a 128GB phone — are likely to linger there much longer than they used to. As new factories come online and model-compression techniques spread, memory costs will come down some. It’s just that the floor they land on will be far higher than before.
Those of us who write code and fine-tune models don’t need a macro-analysis report to feel the memory and GPU price hikes. We just have to look at our own checkout receipts.
DDR5 memory prices have now risen to four times their previous level. The spot price of an RTX 5090 GPU is carrying a 65% premium. On the cloud side, memory-intensive cloud server instances are projected to rise 10% to 15% (Bloomberg). If you’re planning to build an x86 dev machine right now and spec it reasonably well, the memory might end up costing more than the GPU.
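Whether memory really overtakes the GPU in a build depends on what each part cost before the hikes. The Python sketch below applies the two multipliers quoted above (DDR5 at 4x, a 65% GPU premium) and computes the break-even ratio of baseline prices. The baseline dollar amounts in the usage example are hypothetical placeholders, not market data, and the function names are illustrative.

```python
# When does the memory line item exceed the GPU line item?
# Multipliers are from the article; baseline prices are placeholders.

DDR5_MULTIPLIER = 4.0    # "four times their previous level"
GPU_MULTIPLIER = 1.65    # 65% premium over the baseline price


def current_prices(ram_base: float, gpu_base: float) -> tuple[float, float]:
    """Apply the article's multipliers to pre-hike baseline prices."""
    return ram_base * DDR5_MULTIPLIER, gpu_base * GPU_MULTIPLIER


def break_even_ratio() -> float:
    """Baseline RAM/GPU price ratio above which memory costs more."""
    return GPU_MULTIPLIER / DDR5_MULTIPLIER


# Hypothetical baseline prices for a high-memory workstation build.
ram_now, gpu_now = current_prices(ram_base=900.0, gpu_base=2000.0)
print(f"Break-even baseline ratio: {break_even_ratio():.4f}")
print(f"Memory ${ram_now:,.0f} vs GPU ${gpu_now:,.0f}")
print("Memory costs more" if ram_now > gpu_now else "GPU costs more")
```

The useful output is the ratio rather than the sample prices: under these multipliers, memory becomes the larger line item whenever its pre-hike cost was more than about 41% of the GPU’s.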
In the current hardware market, something counterintuitive has emerged. Apple’s M-series unified-memory architecture has shown remarkable resilience in this wave of price hikes. Because M-series processors let the CPU and GPU share the same physical memory, they sidestep the steep cost of dedicated premium VRAM and external DDR5. While traditional x86 dev machines see their costs skyrocket as memory quadruples in price, the cost of upgrading a Mac’s unified memory now looks comparatively gentle. This has shifted the logic of configuring a local dev machine in a subtle way: Apple’s edge is no longer just raw chip benchmarks but better value per gigabyte of unified memory.
A less visible transmission chain is quietly at work inside cloud servers. The surge in HBM prices directly raises GPU procurement costs, which are then passed on into cloud instance rental rates. This inevitably pushes up the per-compute cost for LLM providers. Faced with this, providers have two options: either raise API prices directly, or quietly degrade service quality. In practice, the latter shows up as more aggressive batching, shortened context windows, or tighter rate limits. So even if your local work doesn’t require buying a single physical chip, the memory shortage ultimately turns into more expensive, less stable APIs that hit your projects directly.
Looking at the next two to three years, when designing your technical roadmap and planning hardware, it’s best to treat “persistently high memory costs” as a default baseline assumption. During this price peak in 2026 to 2027, avoid signing long-term fixed high-price supply contracts. Even if, over the long cycle, supply and demand eventually pull prices back toward normal, don’t naively expect them to drop all the way back to 2024 lows. AI compute has already reshaped the entire memory production line, pushing underlying manufacturing and procurement costs onto a new plateau. No matter how the market fluctuates from here, the dirt-cheap memory of the past is, definitively, history.