trends, pricing, analysis

GPU Prices Are Falling Off a Cliff — Here Is When to Lock In

H100 prices dropped 63% in 12 months. We predict another 30-40% drop by end of 2026. Do not sign long-term contracts at current prices.

February 15, 2026 · 8 min read

GPU cloud prices have collapsed roughly 40-60% in the last 12 months. The H100 went from $3.50/hr to $1.29/hr on-demand. On spot, the A100 80GB went from $0.80/hr to $0.34/hr and the RTX 4090 went from $0.44/hr to $0.19/hr. Based on the supply-side data we track, we expect prices to fall another 30-40% by the end of 2026. Here is why, and what it means for the timing of your GPU budget.

What Caused the 2025 Price Crash

Three forces converged simultaneously:

  • Massive H100 supply came online. NVIDIA shipped over 3.5 million H100 equivalents in 2024-2025. Every hyperscaler, every GPU cloud, every startup with a data center budget ordered them. The result: supply caught up with demand for the first time since GPT-4 launched.
  • B200 cannibalized H100 demand. Enterprise buyers waiting for Blackwell delayed H100 orders, creating a demand gap. Existing H100 inventory had to be sold at lower prices to maintain utilization.
  • Marketplace competition intensified. The number of GPU cloud providers we track grew from 12 to 18+ in 12 months. More competition = lower margins = lower prices. Vast.ai and TensorDock drove consumer GPU prices to commodity levels.

The 12-Month Price Trajectory

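| GPU | Feb 2025 ($/hr) | Feb 2026 ($/hr) | Change | Predicted Dec 2026 ($/hr) |
|---|---|---|---|---|
| H100 SXM (on-demand) | $3.50 | $1.29 | -63% | $0.80-1.00 |
| H100 SXM (spot) | $1.87 | $0.73 | -61% | $0.40-0.60 |
| A100 80GB (on-demand) | $1.80 | $1.10 | -39% | $0.70-0.90 |
| A100 80GB (spot) | $0.80 | $0.34 | -58% | $0.15-0.25 |
| RTX 4090 (on-demand) | $0.74 | $0.39 | -47% | $0.25-0.35 |
| RTX 4090 (spot) | $0.44 | $0.19 | -57% | $0.10-0.15 |
| H200 (on-demand) | $4.00+ | $1.49 | -63% | $0.90-1.20 |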
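
To check the table's arithmetic yourself, here is a minimal Python sketch that recomputes the 12-month change and the further drop implied by each predicted range. The prices are copied from the table above; the names `PRICES`, `pct_change`, and `summarize` are illustrative and not part of any tool of ours. The H200 row is left out because its Feb 2025 price is only given as a lower bound ($4.00+).

```python
# (feb_2025, feb_2026, predicted_low, predicted_high) in $/hr, copied from the table.
# H200 is omitted: its Feb 2025 price is listed only as "$4.00+".
PRICES = {
    "H100 SXM (on-demand)": (3.50, 1.29, 0.80, 1.00),
    "H100 SXM (spot)": (1.87, 0.73, 0.40, 0.60),
    "A100 80GB (on-demand)": (1.80, 1.10, 0.70, 0.90),
    "A100 80GB (spot)": (0.80, 0.34, 0.15, 0.25),
    "RTX 4090 (on-demand)": (0.74, 0.39, 0.25, 0.35),
    "RTX 4090 (spot)": (0.44, 0.19, 0.10, 0.15),
}


def pct_change(old: float, new: float) -> float:
    """Signed percentage change from old to new (negative means a drop)."""
    return (new - old) / old * 100.0


def summarize(prices: dict) -> dict:
    """Past change plus the smallest and largest further drop implied by the forecast."""
    rows = {}
    for name, (feb_2025, feb_2026, pred_low, pred_high) in prices.items():
        rows[name] = {
            "past": pct_change(feb_2025, feb_2026),
            "implied_min": pct_change(feb_2026, pred_high),  # smallest further drop
            "implied_max": pct_change(feb_2026, pred_low),   # largest further drop
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(PRICES).items():
        print(f"{name}: past {row['past']:.0f}%, "
              f"further {row['implied_min']:.0f}% to {row['implied_max']:.0f}%")
```

For the H100 SXM on-demand row, this gives a past change of about -63% and an implied further drop of roughly 22% to 38% by December 2026.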

Why Prices Will Keep Falling

The forces pushing prices down have not stopped:

  • B200/B300 deployment accelerates H100 oversupply. As enterprises upgrade to Blackwell, their existing H100 clusters get redeployed to cloud providers or marketplace resellers. This creates a cascade of used H100 supply at discount prices.
  • RTX 5090 enters the cloud market. At 1,800 FP8 TFLOPS for $2,000, the 5090 will put massive downward pressure on H100 pricing for inference workloads. Why rent an H100 at $1.29/hr when a 5090 at $0.70/hr does the same inference at 91% of the speed?
  • Inference efficiency keeps improving. Speculative decoding, KV cache quantization, and framework optimizations mean you get more tokens per GPU-hour every quarter. Same hardware, more output = lower effective price.
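
The RTX 5090 comparison and the efficiency argument both come down to price per unit of work rather than price per hour. The sketch below works through that arithmetic using the figures quoted above ($1.29/hr for the H100, $0.70/hr and 91% relative speed for the 5090). The function names are illustrative, and the 10% throughput gain in the last example is a hypothetical value, not a measurement.

```python
def cost_per_h100_equivalent_hour(price_per_hr: float, relative_speed: float) -> float:
    """Hourly price divided by throughput relative to an H100 (H100 = 1.0)."""
    if relative_speed <= 0:
        raise ValueError("relative_speed must be positive")
    return price_per_hr / relative_speed


def effective_price(price_per_hr: float, throughput_gain: float) -> float:
    """Price per unit of output after a throughput gain (0.10 = 10% more tokens per GPU-hour)."""
    return price_per_hr / (1.0 + throughput_gain)


# Figures quoted in the article.
h100 = cost_per_h100_equivalent_hour(1.29, 1.00)
rtx5090 = cost_per_h100_equivalent_hour(0.70, 0.91)
savings = 1.0 - rtx5090 / h100  # fraction saved per unit of inference work

# Hypothetical: a 10% software-side throughput gain on the same H100.
h100_after_gain = effective_price(1.29, 0.10)

if __name__ == "__main__":
    print(f"H100: ${h100:.2f} per H100-equivalent hour")
    print(f"RTX 5090: ${rtx5090:.2f} per H100-equivalent hour ({savings:.0%} cheaper)")
    print(f"H100 after a 10% throughput gain: ${h100_after_gain:.2f} effective")
```

Under these numbers the 5090 works out to about $0.77 per H100-equivalent hour, roughly 40% cheaper per unit of inference work. This only holds for workloads where the 91% relative-speed figure applies.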

The Timing Decision: When to Lock In

Buy now (use on-demand/spot): If you need GPUs today, use spot instances and on-demand with auto-shutdown. Do not sign long-term commitments. We expect prices to be 30-40% lower by the end of 2026.

Wait 3-6 months: If you can wait, RTX 5090 cloud instances will create a new pricing floor for inference workloads. H100 spot prices should drop below $0.60/hr.

Do not: Sign 1-year reserved instance contracts at current prices. You will be overpaying by Q4 2026. The only exception is if your provider offers price-match guarantees.
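
One way to sanity-check a reserved-instance quote is to ask how large a discount it would need just to match on-demand pricing that keeps falling. The sketch below assumes a constant monthly rate of decline that reaches the forecast drop after about 10 months (February to December 2026) and assumes steady usage over a 12-month term. Both are simplifying assumptions, and the function names are illustrative.

```python
def average_on_demand_price(start_price: float, total_drop: float,
                            horizon_months: int, term_months: int = 12) -> float:
    """Mean monthly on-demand price over the term.

    Assumes a constant monthly decline rate such that the price has fallen
    by `total_drop` (e.g. 0.30 for 30%) after `horizon_months`.
    """
    if not 0.0 <= total_drop < 1.0:
        raise ValueError("total_drop must be in [0, 1)")
    monthly_factor = (1.0 - total_drop) ** (1.0 / horizon_months)
    monthly_prices = [start_price * monthly_factor ** m for m in range(term_months)]
    return sum(monthly_prices) / term_months


def break_even_discount(start_price: float, total_drop: float,
                        horizon_months: int, term_months: int = 12) -> float:
    """Discount off today's on-demand price at which a fixed-price contract
    costs the same as staying on-demand for the whole term."""
    avg = average_on_demand_price(start_price, total_drop, horizon_months, term_months)
    return 1.0 - avg / start_price


if __name__ == "__main__":
    # H100 on-demand at $1.29/hr, with the forecast 30% and 40% drops over ~10 months.
    for drop in (0.30, 0.40):
        discount = break_even_discount(1.29, drop, horizon_months=10)
        print(f"{drop:.0%} drop: a 1-year contract needs at least a {discount:.0%} discount")
```

Under these assumptions, a 1-year contract signed today would need a discount of roughly 17% to 23% off the current on-demand price just to break even, before counting the flexibility you give up.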

Track the price drops: Our GPU price tracker updates every 6 hours. Bookmark it and check back to watch prices fall with each update.
