
Spot GPU Instances: Why You're Wasting Money on On-Demand

Spot H100s cost $0.73/hr vs $1.87/hr on-demand — a 61% saving. We analyzed 2,131 spot instances to show when they're safe to use.

February 17, 2025 · 7 min read

If your workload can checkpoint and restart, and you're still paying on-demand prices, you're throwing away 50–73% of your GPU budget. That's not an exaggeration — it's what the data says. We track thousands of GPU instances across the providers we cover, and a large share of them are spot instances. The savings are enormous, the availability is better than you think, and the risks are manageable if you know what you're doing.

The Real Savings Numbers

We pulled average pricing for the most popular GPU models across all providers that offer both spot and on-demand. Here's what we found:

GPU         On-Demand Avg   Spot Avg    Savings
H100 80GB   $21.39/hr       $10.26/hr   52%
L40S        $9.17/hr        $2.45/hr    73%
RTX 4090    $1.53/hr        $0.78/hr    49%

A note on the H100 numbers: the on-demand average of $21.39/hr is skewed high because it includes 8-GPU instances from hyperscalers that run $25–32/hr. The cheapest single-GPU H100 on-demand is $1.87/hr. But the spot number tells the real story — the cheapest H100 spot instance is $0.73/hr on Vast.ai. That's cheaper than most A100 on-demand prices. Read that again: you can rent an H100 for less than an A100 if you use spot.

Key stat: 2,131 spot instances are available across all providers versus 2,894 on-demand instances. Spot makes up 42% of the total GPU cloud market. This isn't a niche option anymore — it's almost half the supply.
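The savings column in the table is plain arithmetic, and it's worth having the formula handy when a provider quotes you prices. A minimal sketch using the table's averages:

```python
# Average hourly prices from the table above (USD/hr).
prices = {
    "H100 80GB": {"on_demand": 21.39, "spot": 10.26},
    "L40S":      {"on_demand": 9.17,  "spot": 2.45},
    "RTX 4090":  {"on_demand": 1.53,  "spot": 0.78},
}

def savings_pct(on_demand: float, spot: float) -> int:
    """Percent saved by choosing spot over on-demand, rounded to a whole percent."""
    return round((1 - spot / on_demand) * 100)

for gpu, p in prices.items():
    print(f"{gpu}: {savings_pct(p['on_demand'], p['spot'])}% cheaper on spot")
```

Run it against any pair of quotes to see whether a spot discount is in the normal 49–73% band or an outlier worth investigating.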

When Spot Works Perfectly

Spot instances can be reclaimed by the provider at any time with little or no warning. That sounds scary, but for many workloads it's a non-issue. Here are the cases where spot is not just acceptable but the objectively correct choice:

  • Inference serving (stateless): Each request is independent. If the instance gets pulled, your load balancer routes to another one. You lose zero work. Running inference on on-demand when you could use spot is just burning money.
  • Training with checkpoints: If your training framework saves checkpoints every 30 minutes (which it should), the worst case when a spot instance gets reclaimed is losing 30 minutes of training. At the 52–73% savings rates we see, that math works out overwhelmingly in spot's favor. A 24-hour training run on H100 spot at $10.26/hr costs $246. On-demand at $21.39/hr costs $513. You'd need to lose more than half your training time to interruptions before on-demand breaks even.
  • Batch processing and data pipelines: Embedding generation, dataset preprocessing, evaluation runs — anything where you can split the work into chunks and resume after interruption. Spot was literally built for this.
  • Development and experimentation: If you're iterating on model architectures, debugging training code, or running experiments, the cost of an interruption is trivially low. Use spot. Always.

When Spot Doesn't Work

To be fair, spot isn't for everything. Here are the cases where on-demand is worth the premium:

  • Long fine-tuning without checkpoint support: Some fine-tuning frameworks or custom training loops don't support saving and resuming cleanly from checkpoints. If losing your instance means restarting the entire fine-tuning run from scratch, the math changes. Fix your checkpoint code — but until you do, use on-demand.
  • Guaranteed uptime SLA requirements: If you're serving inference for a production application with strict uptime SLAs (99.9%+), relying solely on spot is risky. The best practice is to run a baseline on on-demand and burst with spot, but your always-on capacity should be reserved or on-demand.
  • Multi-node distributed training: If you're training across 8 or more GPUs on multiple nodes, losing one node usually kills the entire job. The cost of restarting a large distributed training run can outweigh the spot savings, especially if interruptions are frequent.

Pro Tips for Spot Users

If you're going to use spot — and you should — here are some practical tips that make the difference between a smooth experience and a frustrating one:

  • Checkpoint aggressively: Every 15–30 minutes for training runs. Storage is cheap compared to lost compute.
  • Use multiple providers: Don't put all your spot capacity on one provider. Spread across Vast.ai, RunPod, and one hyperscaler so interruptions on one don't take down everything.
  • Monitor spot prices: Spot prices fluctuate. Check our trends page to understand pricing patterns and pick providers with the most stable spot pricing.
  • Automate restart logic: Write scripts that detect preemption, save state, and automatically re-provision on another spot instance. This turns a manual headache into a seamless process.
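The last two tips can be sketched in a few lines. Many providers send the instance a SIGTERM shortly before reclaiming it (the exact warning mechanism varies by provider), so a handler that flushes state on SIGTERM, plus a periodic save, bounds your lost work. The file name and step counter below are hypothetical stand-ins for your real training state:

```python
import json
import os
import signal
import time

CKPT = "checkpoint.json"  # hypothetical checkpoint path

state = {"step": 0}
if os.path.exists(CKPT):              # resume where the last instance left off
    with open(CKPT) as f:
        state = json.load(f)

def save_checkpoint(*_):
    """Write current progress to disk; safe to call from a signal handler."""
    with open(CKPT, "w") as f:
        json.dump(state, f)

# Save state the moment the provider signals an impending preemption.
signal.signal(signal.SIGTERM, save_checkpoint)

last_save = time.monotonic()
while state["step"] < 1000:           # stand-in for a real training loop
    state["step"] += 1                # ... one unit of work ...
    if time.monotonic() - last_save > 15 * 60:  # periodic safety net
        save_checkpoint()
        last_save = time.monotonic()
save_checkpoint()
```

Pair this with a small supervisor (a shell loop or your orchestrator's restart policy) that re-provisions a spot instance and relaunches the script; because the script resumes from the checkpoint on startup, each preemption costs you at most one save interval.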

The bottom line is simple: if your job can checkpoint, use spot. Period. The savings are too significant to ignore. Check our price comparison tool and filter by spot instances to see every option available right now.
