Spot GPU instances are the most underused cost-saving tool in AI infrastructure. We analyzed every live instance across 54 cloud providers — 5,124 total, of which 2,047 (39.9%) are spot instances — to measure what you actually save, which GPUs have the best spot discounts, and when spot is worth the interruption risk.
The short answer: spot saves an average of 65% vs on-demand. On individual GPU models, the gap ranges from 56% (RTX 5070 Ti) to 82% (A10). But not every workload tolerates interruption. Here is the full data.
## The Dataset: 5,124 Live Instances
This analysis covers 5,124 GPU cloud instances tracked in real time across 54 providers, 78 GPU models, and 141 regions. Data is updated every 6 hours. The snapshot used for this article was taken on April 1, 2026.
Of the 3,077 on-demand instances, the average price is $7.84/hr. For the 2,047 spot instances, the average is $2.70/hr. The blended average across all 5,124 instances is $5.79/hr.
## Spot Savings by GPU Model
Spot discounts are not uniform. They reflect market supply and demand for each GPU: older or more abundant GPUs tend to have deeper spot discounts because cloud providers are more willing to offer them at cut rates when demand drops.
| GPU Model | Spot Avg | On-Demand Avg | Savings | Spot Count |
|---|---|---|---|---|
| A10 | $0.69/hr | $3.88/hr | 82% | 102 |
| L40S | $2.17/hr | $11.08/hr | 80% | 73 |
| L4 | $1.05/hr | $3.66/hr | 71% | 160 |
| T4 | $0.70/hr | $2.33/hr | 70% | 711 |
| P100 | $0.60/hr | $1.91/hr | 68% | 66 |
| A10G | $2.10/hr | $5.80/hr | 64% | 56 |
| V100 | $5.79/hr | $15.21/hr | 62% | 245 |
| A100 | $6.46/hr | $16.64/hr | 61% | 136 |
| H100 | $2.20/hr | $5.65/hr | 61% | 83 |
| RTX 4090 | $0.54/hr | $1.38/hr | 61% | 44 |
| H200 | $3.82/hr | $9.30/hr | 59% | 37 |
| RTX 5090 | $0.89/hr | $2.12/hr | 58% | 28 |
| RTXPRO 6000 | $4.29/hr | $10.20/hr | 58% | 70 |
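The savings column is just the ratio of the two price columns. A minimal sketch that recomputes it for a few rows (prices are copied from the table above; the helper function name is ours):

```python
# Reproduce the savings column: savings = 1 - spot / on_demand,
# rounded to the nearest percent. Prices are the table's averages.
PRICES = {
    "A10":  (0.69, 3.88),
    "L40S": (2.17, 11.08),
    "L4":   (1.05, 3.66),
    "T4":   (0.70, 2.33),
}

def spot_savings_pct(spot: float, on_demand: float) -> int:
    """Percentage saved by choosing spot over on-demand."""
    return round(100 * (1 - spot / on_demand))

for model, (spot, od) in PRICES.items():
    print(f"{model}: {spot_savings_pct(spot, od)}% savings")
```

Running this reproduces the table's 82/80/71/70 figures, which is a useful sanity check if you pull fresh prices yourself.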
## Which Providers Offer Spot Instances?
Not all 54 providers offer spot pricing. Marketplace platforms dominate the spot market. Vast.ai has 128 spot instances (100% of their catalog is spot/interruptible). GCP has 711 T4 spot instances alone, making hyperscaler spot a real option for commodity GPU workloads.
Notable spot providers: Vast.ai (entire catalog), RunPod (community cloud = spot), GCP (T4 preemptible), AWS (EC2 spot), Azure (spot VMs). Lambda Labs, CoreWeave, and most specialized providers are on-demand only.
## When Is Spot Worth the Risk?
| Workload | Use Spot? | Why |
|---|---|---|
| Batch training (checkpointed) | ✅ Yes | Interruptions just resume from last checkpoint |
| Data preprocessing | ✅ Yes | Idempotent, re-runnable without cost |
| Hyperparameter search | ✅ Yes | Individual runs are disposable |
| Image/video generation batches | ✅ Yes | Each generation is independent |
| API inference (user-facing) | ❌ No | Interruption = downtime for users |
| Fine-tuning (<4 hours) | ⚠️ Maybe | High savings if interruption rate is low |
| Long training (no checkpoints) | ❌ No | One interruption = lost hours of compute |
| Notebooks / interactive | ⚠️ Maybe | Annoying but manageable if you save often |
## The Bottom Line on Spot Savings
The 65% average spot savings figure is real, but averages can be misleading. If you're running T4s on GCP for batch jobs, you can realistically save 70%. If you need an A100 or H100 and interruption is acceptable, spot can save you 61%, but keep hourly checkpoints.
The practical rule: any workload that can be checkpointed or restarted should use spot. The only exception is latency-sensitive production inference where SLA matters.
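A quick way to reason about the "maybe" rows: spot stays cheaper as long as the work redone after interruptions doesn't eat the discount. A rough back-of-the-envelope model, assuming each interruption costs at most one checkpoint interval of rework (the job length and interruption counts below are illustrative, not measured):

```python
def effective_spot_cost(spot_hr: float, od_hr: float, hours: float,
                        interruptions: int, lost_hours_each: float):
    """Expected spot cost including redone work, vs the on-demand cost.

    Model: each interruption throws away at most `lost_hours_each`
    hours of work, which must be re-run at the spot rate.
    """
    spot_cost = spot_hr * (hours + interruptions * lost_hours_each)
    od_cost = od_hr * hours
    return spot_cost, od_cost

# H100 averages from the table: $2.20/hr spot vs $5.65/hr on-demand.
# Even a 10-hour job interrupted 3 times, losing a 1-hour checkpoint
# interval each time, stays far cheaper on spot.
spot, od = effective_spot_cost(2.20, 5.65, hours=10,
                               interruptions=3, lost_hours_each=1.0)
print(f"spot ~ ${spot:.2f} vs on-demand ${od:.2f}")
```

At these prices the discount is so deep that spot breaks even only after the interruptions roughly double the total runtime, which is why the rule of thumb above favors spot for anything restartable.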
## Find Spot Instances Live