AI chips give 24x more per dollar, if you can afford the sticker

The economics of AI hardware reward patience and punish small budgets at the same time. Epoch AI finds that AI chip performance per dollar has improved by about 37 percent per year across more than twenty accelerators released between 2012 and 2025. The newest flagship, the GB300, delivers roughly 24 times the performance per dollar of the 2016 P100, while costing nearly 9 times as much to buy.

Both facts are true and they pull in opposite directions. Value per dollar keeps climbing, so the long-run cost of a given workload falls. The entry ticket also keeps climbing, so the upfront capital needed to play at the top rises with every generation. The result favours buyers who can amortise a high sticker price over heavy use.
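The two headline figures can be cross-checked with a little compounding arithmetic. The sketch below is only illustrative: the 37 percent rate and the 24x ratio come from the text, while the nine-year gap is an assumption taken from the release years attached to the two chip names.

```python
import math

ANNUAL_GAIN = 0.37          # trend rate quoted in the text
FLAGSHIP_RATIO = 24.0       # GB300 vs P100 performance per dollar, from the text
YEARS_APART = 2025 - 2016   # assumption: release years from the chip labels

# Years for performance per dollar to double at the trend rate.
doubling_years = math.log(2) / math.log(1 + ANNUAL_GAIN)

# Gain the trend rate alone would predict over the same span.
trend_ratio = (1 + ANNUAL_GAIN) ** YEARS_APART

# Annual rate implied by the two flagship endpoints.
implied_rate = FLAGSHIP_RATIO ** (1 / YEARS_APART) - 1

print(f"doubling time at 37%/yr: {doubling_years:.1f} years")
print(f"trend-predicted gain over {YEARS_APART} years: {trend_ratio:.1f}x")
print(f"rate implied by 24x in {YEARS_APART} years: {implied_rate:.1%}")
```

The fitted trend and the two-chip comparison need not agree exactly, since one is an average across many accelerators and the other compares two specific parts.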

Sticker price versus lifetime value

chips = {
    "P100 (2016)":  {"price": 1.0, "value_per_dollar": 1.0},
    "GB300 (2025)": {"price": 9.0, "value_per_dollar": 24.0},
}

for name, c in chips.items():
    # price x perf/$ = total performance of one chip, relative to one P100
    total_throughput = c["price"] * c["value_per_dollar"]
    print(f"{name:14s}: {total_throughput:5.1f}x the throughput of one P100")

The throughput column shows why hyperscalers buy the expensive part. At nine times the price and twenty-four times the value per dollar, one GB300 does roughly 216 times the work of one P100, or 24 times as much per unit of capital, but only if you keep it busy. Idle, it is just an expensive depreciation line.

The split it creates

High-utilisation buyers: chase the newest chip, since perf-per-dollar wins.
Spiky or small workloads: rent, or buy a generation behind.
Everyone: the rising entry price reinforces where compute concentrates.

Cheaper-per-dollar and more-expensive-to-own are the same trend. Which one you feel depends on how full your machines stay. The hardware price-performance data is published by Epoch AI.

Utilization decides who benefits

Performance per dollar is a lifetime claim. It assumes the buyer can feed the chip enough work to earn back the sticker price. A hyperscaler with constant training, fine-tuning, and inference demand can keep a flagship accelerator busy. A small company with bursty workloads may pay for capability that sits idle most of the week.

That is why the same chip can be cheap for one buyer and expensive for another. If utilization is high, the faster part lowers the cost of each completed job. If utilization is low, depreciation dominates. The buyer owns a powerful machine but still pays for the unused hours, power, support, and opportunity cost of capital.
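A minimal sketch of that effect, assuming straight-line depreciation and ignoring power and support: the same purchase price is spread over fewer busy hours as utilization falls. The price and lifetime below are hypothetical, chosen only to make the ratio visible.

```python
HOURS_PER_YEAR = 24 * 365

def cost_per_busy_hour(price, lifetime_years, utilization):
    """Straight-line depreciation spread over the hours the chip actually works."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = lifetime_years * HOURS_PER_YEAR * utilization
    return price / busy_hours

# Hypothetical numbers, for illustration only.
PRICE = 40_000.0      # purchase price in dollars (assumed)
LIFETIME_YEARS = 4    # depreciation period (assumed)

for u in (0.9, 0.5, 0.1):
    hourly = cost_per_busy_hour(PRICE, LIFETIME_YEARS, u)
    print(f"utilization {u:4.0%}: ${hourly:.2f} per busy hour")
```

Under these assumptions, a buyer at 10 percent utilization pays nine times as much per busy hour as a buyer at 90 percent, for identical hardware.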

Cloud pricing is partly a way to sell utilization. Customers with spiky demand rent the expensive chip only when they need it. Cloud providers aggregate many customers, smooth the demand curve, and keep the fleet busier. The trade-off is that renters pay a margin and may lose access when scarce capacity is reserved for larger accounts.
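The rent-or-own choice reduces to a break-even utilization. Owning costs a fixed amount per calendar hour whether or not the chip is busy, while renting costs the hourly rate only for hours used, so owning wins once the busy fraction exceeds the ratio of the two. The rates below are hypothetical, and the sketch ignores reserved-capacity discounts and availability risk.

```python
def break_even_utilization(owned_cost_per_hour, rental_rate_per_hour):
    """Fraction of hours a chip must be busy before owning beats renting.

    owned_cost_per_hour: fully loaded cost of owning, per calendar hour,
                         paid whether or not the chip is busy.
    rental_rate_per_hour: price paid per hour actually rented.
    """
    return owned_cost_per_hour / rental_rate_per_hour

# Hypothetical rates, for illustration only.
OWNED = 2.00    # dollars per calendar hour (assumed)
RENTED = 5.00   # dollars per rented hour (assumed)

threshold = break_even_utilization(OWNED, RENTED)
print(f"owning wins above {threshold:.0%} utilization")
```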

Older chips can be the rational choice

The newest accelerator is not always the best economic fit. Many inference jobs are memory-bound, latency-bound, or quality-bound before they are raw-compute bound. A previous-generation chip can deliver the same user-visible result at a lower rental rate or acquisition cost. The important comparison is cost per successful task, not benchmark throughput alone.
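To make the comparison concrete, here is a small sketch that ranks hardware tiers by cost per successful task. All figures are hypothetical and describe an imagined memory-bound inference job, where the newer chip is faster but not proportionally so.

```python
def cost_per_successful_task(hourly_cost, tasks_per_hour, success_rate):
    """Dollars spent per task that meets the quality and latency bar."""
    return hourly_cost / (tasks_per_hour * success_rate)

# Hypothetical figures, for illustration only.
tiers = {
    "previous generation": {"hourly_cost": 2.0, "tasks_per_hour": 1000, "success_rate": 0.98},
    "newest flagship":     {"hourly_cost": 6.0, "tasks_per_hour": 1500, "success_rate": 0.98},
}

costs = {name: cost_per_successful_task(**t) for name, t in tiers.items()}
for name, cost in costs.items():
    print(f"{name:20s}: ${cost * 1000:.2f} per 1,000 successful tasks")
print("cheapest tier:", min(costs, key=costs.get))
```

With these assumed inputs the older part wins, because the flagship costs three times as much per hour while completing only 1.5 times as many tasks.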

The case for older chips strengthens as models get smaller and more efficient. Quantization, distillation, and better serving stacks can push useful workloads onto hardware that no longer sits at the frontier. That extends the economic life of older fleets and helps explain why total installed compute matters, not only the newest shipment.

Procurement teams should therefore segment workloads before buying. Training a frontier model, serving a high-volume assistant, running nightly batch summaries, and powering internal search may each deserve a different hardware tier. One blended hardware strategy usually hides waste.

The accounting view

Sticker price is only the first line. Buyers need to include power, cooling, networking, rack space, maintenance, financing, spare capacity, and staff. They also need to include the cost of waiting. If the newest chip finishes a training run days earlier, that speed can be valuable even if the hardware is expensive. If the workload is not time-sensitive, the premium may be vanity.

The metric that matters is fully loaded cost per useful unit of work. That unit may be a million tokens served, a fine-tuning job completed, a batch of videos processed, or a training experiment finished. Once the unit is clear, the hardware decision becomes less emotional.
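One way to sketch that metric is to fold depreciation and recurring overhead into a single annual cost and divide by the useful units produced in a year. The `Deployment` class and every number below are hypothetical; a real model would break the overhead into the individual line items listed above.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365

@dataclass
class Deployment:
    hardware_price: float       # sticker price, dollars
    lifetime_years: float       # depreciation period
    annual_overhead: float      # power, cooling, networking, rack, maintenance, financing, staff
    utilization: float          # fraction of hours doing useful work
    units_per_busy_hour: float  # e.g. millions of tokens served per busy hour

    def cost_per_unit(self) -> float:
        """Fully loaded annual cost divided by useful units produced per year."""
        annual_cost = self.hardware_price / self.lifetime_years + self.annual_overhead
        annual_units = HOURS_PER_YEAR * self.utilization * self.units_per_busy_hour
        return annual_cost / annual_units

# Hypothetical deployment, for illustration only.
d = Deployment(hardware_price=40_000, lifetime_years=4, annual_overhead=7_520,
               utilization=0.5, units_per_busy_hour=2.0)
print(f"${d.cost_per_unit():.2f} per million tokens")
```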

The broader market trend is still positive. More performance per dollar lowers the long-run cost of AI. The distribution of that benefit is uneven because capital access, utilization, and power contracts differ widely. The chip keeps getting better. The buyer still has to be big enough, busy enough, or careful enough to capture the gain.

The decision rule

Buy the newest chip only when three conditions hold: the workload needs it, the team can keep it busy, and the fully loaded cost beats rental or API access. If one condition is missing, the better answer may be an older accelerator, a cloud reservation, or a managed model. That rule sounds conservative, but it prevents hardware strategy from becoming a status purchase.
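The rule can be written down as a three-way check. This is a sketch rather than a procurement tool: the function name, its parameters, and especially the `min_utilization` threshold are assumptions, since the text does not say how busy is busy enough.

```python
def should_buy_newest(workload_needs_it, expected_utilization,
                      owned_cost_per_unit, best_alternative_cost_per_unit,
                      min_utilization=0.6):
    """Return True only when all three conditions from the rule hold.

    min_utilization is an assumed threshold for "can keep it busy".
    best_alternative_cost_per_unit is the cheapest of rental, API access,
    or an older accelerator, measured in the same unit of useful work.
    """
    can_keep_busy = expected_utilization >= min_utilization
    beats_alternatives = owned_cost_per_unit < best_alternative_cost_per_unit
    return workload_needs_it and can_keep_busy and beats_alternatives

# Hypothetical inputs, for illustration only.
print(should_buy_newest(True, 0.85, 1.80, 2.40))   # all three conditions hold
print(should_buy_newest(True, 0.20, 1.80, 2.40))   # fails the utilization condition
```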

The same rule helps readers interpret the price-performance curve. The market is improving quickly, yet the improvement is mediated by capital, operations, and demand. Performance per dollar is the starting metric. Useful work per dollar is the metric that decides who benefits.

That distinction is easy to miss in procurement decks. A chip can be the best part on the market and still be the wrong purchase for a team with low volume, weak operations, or uncertain demand. The curve says the industry is becoming more efficient. It does not say every buyer captures the same efficiency on day one.
