GPU cloud prices have collapsed 40-60% in the last 12 months. The H100 went from $3.50/hr to $1.29/hr. The A100 went from $1.80/hr to $0.34/hr on spot. The RTX 4090 went from $0.74/hr to $0.19/hr on spot. And based on the supply-side data we track, prices are going to fall another 30-40% by the end of 2026. Here is why, and here is what it means for your GPU budget timing.
## What Caused the 2025 Price Crash
Three forces converged:
- Massive H100 supply came online. NVIDIA shipped over 3.5 million H100 equivalents in 2024-2025. Every hyperscaler, every GPU cloud, every startup with a data center budget ordered them. The result: supply caught up with demand for the first time since GPT-4 launched.
- B200 cannibalized H100 demand. Enterprise buyers waiting for Blackwell delayed H100 orders, creating a demand gap. Existing H100 inventory had to be sold at lower prices to maintain utilization.
- Marketplace competition intensified. The number of GPU cloud providers we track grew from 12 to 18+ in 12 months. More competition = lower margins = lower prices. Vast.ai and TensorDock drove consumer GPU prices to commodity levels.
## The 12-Month Price Trajectory
| GPU | Feb 2025 ($/hr) | Feb 2026 ($/hr) | Change | Predicted Dec 2026 ($/hr) |
|---|---|---|---|---|
| H100 SXM (on-demand) | $3.50 | $1.29 | -63% | $0.80-1.00 |
| H100 SXM (spot) | $1.87 | $0.73 | -61% | $0.40-0.60 |
| A100 80GB (on-demand) | $1.80 | $1.10 | -39% | $0.70-0.90 |
| A100 80GB (spot) | $0.80 | $0.34 | -58% | $0.15-0.25 |
| RTX 4090 (on-demand) | $0.74 | $0.39 | -47% | $0.25-0.35 |
| RTX 4090 (spot) | $0.44 | $0.19 | -57% | $0.10-0.15 |
| H200 (on-demand) | $4.00+ | $1.49 | -63% | $0.90-1.20 |
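The "Change" column follows directly from the two price columns. A quick sketch to reproduce it (prices copied from the table; note the table rounds to whole percentages):

```python
# Reproduce the table's "Change" column from its Feb 2025 / Feb 2026 prices.
prices = {
    "H100 SXM (on-demand)":  (3.50, 1.29),
    "H100 SXM (spot)":       (1.87, 0.73),
    "A100 80GB (on-demand)": (1.80, 1.10),
    "A100 80GB (spot)":      (0.80, 0.34),
    "RTX 4090 (on-demand)":  (0.74, 0.39),
    "RTX 4090 (spot)":       (0.44, 0.19),
    "H200 (on-demand)":      (4.00, 1.49),
}

def pct_change(old, new):
    """Percent change from old to new, to one decimal place."""
    return round((new - old) / old * 100, 1)

for gpu, (old, new) in prices.items():
    print(f"{gpu}: {pct_change(old, new)}%")
```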
## Why Prices Will Keep Falling
The forces pushing prices down have not stopped:
- B200/B300 deployment accelerates H100 oversupply. As enterprises upgrade to Blackwell, their existing H100 clusters get redeployed to cloud providers or marketplace resellers. This creates a cascade of used H100 supply at discount prices.
- RTX 5090 enters the cloud market. At 1,800 FP8 TFLOPS for $2,000, the 5090 will put massive downward pressure on H100 pricing for inference workloads. Why rent an H100 at $1.29/hr when a 5090 at $0.70/hr does the same inference at 91% of the speed?
- Inference efficiency keeps improving. Speculative decoding, KV cache quantization, and framework optimizations mean you get more tokens per GPU-hour every quarter. Same hardware, more output = lower effective price.
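Both cost claims above reduce to the same arithmetic: divide the hourly price by throughput. A sketch using the article's figures; the $0.70/hr 5090 rate is the article's projection, not a quoted price, and the tokens-per-second numbers are illustrative assumptions:

```python
# 1) Cost per H100-equivalent hour: hourly price divided by relative speed.
h100_price, h100_speed = 1.29, 1.00        # $/hr, relative inference throughput
rtx5090_price, rtx5090_speed = 0.70, 0.91  # projected rate, 91% of H100 speed

h100_cost = h100_price / h100_speed
rtx5090_cost = rtx5090_price / rtx5090_speed
savings = 1 - rtx5090_cost / h100_cost
print(f"5090: ${rtx5090_cost:.2f} per H100-equivalent hour "
      f"({savings:.0%} cheaper than renting an H100)")

# 2) Effective $/token: same hardware, more tokens per hour = lower price.
def price_per_mtok(hourly_price, tokens_per_second):
    """Dollars per million tokens at a given hourly price and throughput."""
    return hourly_price / (tokens_per_second * 3600) * 1e6

# Illustrative: 2,500 tok/s today vs 3,000 tok/s after a quarter of
# speculative-decoding and KV-cache gains, at the same $1.29/hr.
print(f"${price_per_mtok(1.29, 2500):.3f}/Mtok -> "
      f"${price_per_mtok(1.29, 3000):.3f}/Mtok")
```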
## The Timing Decision: When to Lock In
**If you need GPUs today:** run on spot instances, or on-demand with auto-shutdown, and do not sign long-term commitments. Prices will be 30-40% lower in six months.
**Wait 3-6 months:** If you can wait, RTX 5090 cloud instances will create a new pricing floor for inference workloads, and H100 spot prices should drop below $0.60/hr.
**Do not:** Sign 1-year reserved-instance contracts at current prices. You will be overpaying by Q4 2026. The only exception is a provider that offers price-match guarantees.
**Track the price drops:** Our GPU price tracker updates every 6 hours. Bookmark it and check back; you will see the prices falling in real time.