
Cloud GPU Market Report: March 2026 — What the Data Says About Where Prices Are Heading

GPU cloud prices have shifted dramatically. The H200 median is now below the H100 median. RTX 5090 instances have appeared across 4 providers. Spot savings average 62%. Here's what's actually happening in the market.

March 19, 2026 · 11 min read

GPU cloud pricing is not static. Providers add inventory, cut spot prices, and new GPU generations enter the market — all of which shift the competitive landscape. This report covers what GPU Tracker's live feed of 4,969 instances across 18 providers shows as of March 2026, and what it implies for where prices are heading.

The H200 Is Now Cheaper Than the H100 (Median)

The most striking data point in the current market: the H200's median market price ($2.29/hr) is below the H100's median ($2.59/hr). This is not a data error — it reflects supply dynamics. H100 demand remains high because it's the established choice and has the most tooling support, while H200 providers are pricing aggressively to gain market share.

The H200's advantages over the H100 are significant: 76% more VRAM (141GB vs 80GB), 43% more memory bandwidth (4.8 vs 3.35 TB/s), and better energy efficiency. For inference workloads that are memory-bandwidth limited — which is most inference workloads — the H200 is a better GPU at a lower median price. The market is mispriced, and it likely won't stay that way.

| GPU | Providers | Spot From | Median Price | VRAM |
|---|---|---|---|---|
| H100 | 13 | $0.80/hr | $2.59/hr | 80GB |
| H200 (better value) | 7 | $0.33/hr | $2.29/hr | 141GB |
| B200 | 4 | $1.67/hr | $4.99/hr | 180GB |

The RTX 5090 Has Entered the Market

The RTX 5090 now appears across 4 providers with 34 single-GPU instances tracked. Spot pricing starts at $0.13/hr with a median of $0.65/hr. On-demand instances start at $0.33/hr.

The RTX 5090's 32GB GDDR7 memory and Blackwell architecture give it approximately 3,200 tokens/sec on 8B models — competitive with the L40S but at significantly lower pricing. For 13B model inference at FP16 (which needs roughly 26GB for weights alone), the RTX 5090 is one of the only consumer GPUs that fits the workload. At $0.13/hr spot, it's dramatically underpriced for that use case.
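The VRAM arithmetic is straightforward (note: 26GB corresponds to FP16 weights at 2 bytes per parameter; Q8 quantization halves that). A weights-only sketch, ignoring KV cache and activation overhead, which add several GB on top:

```python
def weight_mem_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate model weight footprint in GB (weights only;
    KV cache and activations are not included)."""
    return params_billions * bytes_per_param  # 1e9 params * bytes / 1e9 B/GB

print(weight_mem_gb(13, 2.0))  # 13B at FP16: 26.0 GB -> needs a 32GB-class card
print(weight_mem_gb(13, 1.0))  # 13B at Q8 (int8): 13.0 GB
print(weight_mem_gb(8, 2.0))   # 8B at FP16: 16.0 GB
```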

Market Concentration: GCP Dominates by Volume

GCP accounts for 2,026 of the 4,969 instances (41%) tracked — primarily T4, L4, and A100 instances spread across global regions. AWS has 619 instances (12%), Azure 448 (9%), and RunPod 881 (18%).

The high GCP count is partly explained by the T4: GCP offers T4 instances in over 30 regional zones, each counted separately. When filtering for datacenter-tier GPUs (H100, H200, A100, B200), the specialized providers (RunPod, Lambda, Vast.ai, Nebius) become much more prominent.

| Provider Type | Share of Market | Pricing Position |
|---|---|---|
| Hyperscalers (AWS, GCP, Azure, OCI) | 72% of instances | 2–8x market median |
| Specialized GPU clouds (RunPod, Lambda, Nebius, etc.) | 22% of instances | 0.8–1.5x market median |
| Marketplaces (Vast.ai) | 3% of instances | 0.2–0.8x market median |
| New entrants (Verda, CloudRift, Crusoe) | 3% of instances | 0.5–1.0x market median |

Spot Market: 62% Savings, 42% of All Instances

Spot and interruptible instances now account for 42% of all tracked instances, with an average price of $2.83/hr versus $7.39/hr for on-demand — a 62% discount. This is the widest spot-to-on-demand gap we've tracked, driven by competitive pressure from new providers undercutting on spot markets.

The spot market is particularly attractive for H100 and H200 workloads. An H100 spot at $0.80/hr versus $2.99/hr on-demand is a 73% saving. With modern training frameworks supporting fault-tolerant checkpointing (PyTorch FSDP, Megatron-LM), spot H100/H200 instances are viable for most training workloads.
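The discount math above falls out directly, and it's worth adjusting the headline rate for work lost to interruptions. A sketch — the interruption rate and 10-minute recovery figure below are illustrative assumptions, not tracker data:

```python
def spot_savings(spot_hr: float, on_demand_hr: float) -> float:
    """Headline discount of spot vs on-demand."""
    return 1 - spot_hr / on_demand_hr

def effective_spot_hr(spot_hr: float, interruptions_per_day: float = 2,
                      lost_min_per_interruption: float = 10) -> float:
    """Spot rate adjusted for compute redone after each interruption.

    Assumes checkpointing limits the loss to roughly the interval since
    the last checkpoint; the defaults here are illustrative guesses.
    """
    lost_fraction = interruptions_per_day * lost_min_per_interruption / 60 / 24
    return spot_hr / (1 - lost_fraction)

print(f"Market-wide:    {spot_savings(2.83, 7.39):.0%}")     # ~62%
print(f"H100 headline:  {spot_savings(0.80, 2.99):.0%}")     # ~73%
print(f"H100 effective: ${effective_spot_hr(0.80):.2f}/hr")  # ~$0.81/hr
```

Even with the recompute penalty, spot H100s stay well under a third of the on-demand rate under these assumptions.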

What to Watch in Q2 2026

  • H200 prices will likely rise as demand catches up with supply. The current pricing anomaly (H200 cheaper than H100 at median) won't persist. Lock in H200 contracts now if you have sustained workloads.
  • RTX 5090 inventory will expand. Currently at 34 single-GPU instances across 4 providers, this will grow significantly through Q2 as more consumer GPU hosts come online. Spot prices may compress further.
  • B200 remains premium. At $1.67/hr spot and $3.60/hr on-demand, the B200 is priced for teams that specifically need its 180GB VRAM or need Blackwell FP4 compute. Don't pay B200 prices unless you specifically need 180GB+ VRAM.
  • A100 spot prices are unlikely to fall further. At $0.08/hr, A100 spot is already at near-marginal cost for many providers. This is the floor.
  • New provider entries will continue. Verda, CloudRift, and Crusoe are each growing their inventories and competing aggressively on price. This benefits buyers.

GPU Tracker monitors all 18 providers every 6 hours. See live prices, set alerts, and filter by model, region, and commitment type at gputracker.dev. Monthly market reports are published in the blog.
