For Stable Diffusion, the RTX 4090 is the best-value GPU for individual use. At $0.74/hr on RunPod it generates roughly 150 SDXL images per hour, which works out to about half a cent per image. The T4 is the cheapest per hour but also the slowest, making it poor value for batch generation.
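The per-image cost follows directly from the hourly rate and throughput. A quick sketch using the figures quoted above (both are snapshots that will drift with provider pricing and your generation settings):

```python
def cost_per_image(hourly_rate_usd: float, images_per_hour: float) -> float:
    """Cost in USD per generated image."""
    return hourly_rate_usd / images_per_hour

# RTX 4090 on RunPod, figures from above
print(round(cost_per_image(0.74, 150), 4))  # 0.0049 -> about half a cent
```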
VRAM Requirements by Model
| Model | Min VRAM | Optimal VRAM | Notes |
|---|---|---|---|
| SD 1.5 | 4 GB | 8 GB | Legacy, still widely used |
| SDXL | 8 GB | 16 GB | Higher res, 2-stage pipeline |
| SD 3.5 Medium | 8 GB | 16 GB | Better text rendering |
| SD 3.5 Large | 16 GB | 24 GB | Best quality |
| FLUX.1 Dev | 16 GB | 24 GB | State-of-art photorealism |
| FLUX.1 Schnell | 12 GB | 16 GB | Fast, 4-step generation |
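The table doubles as a quick compatibility check: given a card's VRAM, filter for the models whose minimum requirement fits. A minimal sketch (model names and thresholds mirror the minimum-VRAM column above; nothing else is assumed):

```python
# Minimum VRAM in GB, taken from the table above
MIN_VRAM_GB = {
    "SD 1.5": 4,
    "SDXL": 8,
    "SD 3.5 Medium": 8,
    "SD 3.5 Large": 16,
    "FLUX.1 Dev": 16,
    "FLUX.1 Schnell": 12,
}

def models_that_fit(vram_gb: float) -> list[str]:
    """Models whose minimum VRAM requirement fits the given card."""
    return [m for m, need in MIN_VRAM_GB.items() if need <= vram_gb]

print(models_that_fit(12))  # e.g. a 12 GB RTX 3060
```

Note these are floors, not comfortable targets: at the minimum you will likely need offloading or reduced batch sizes, so aim for the "Optimal VRAM" column when choosing an instance.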
Setting Up ComfyUI on a Cloud GPU
ComfyUI is the most flexible Stable Diffusion interface. Here's how to run it on a RunPod RTX 4090:
```bash
# On your RunPod instance (use the PyTorch template)
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
pip install -r requirements.txt

# Download SDXL model
mkdir -p models/checkpoints
wget -O models/checkpoints/sdxl_base.safetensors \
  "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors"

# Download FLUX.1 Schnell (4-step, fast). This file is the diffusion model
# only, so it goes in models/unet/, not models/checkpoints/ -- you also need
# the text encoders (clip_l, t5xxl) in models/clip/ and the FLUX VAE
# (ae.safetensors) in models/vae/ before the workflow will run.
mkdir -p models/unet
wget -O models/unet/flux1-schnell.safetensors \
  "https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors"

# Start ComfyUI with the web UI accessible externally
python main.py --listen 0.0.0.0 --port 8188
# Access via: http://YOUR_INSTANCE_IP:8188
```

Setting Up Automatic1111 WebUI
```bash
# Install system dependencies
apt-get install -y libgl1 libglib2.0-0 wget git python3-pip

# Clone and install
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
pip install -r requirements.txt

# Download a model into models/Stable-diffusion/
# Then start with public access
python launch.py --listen --xformers --api
# --xformers enables memory-efficient attention
# --api exposes a REST API under /sdapi/v1/
# Access at http://YOUR_IP:7860
```

Batch Generation via API
```python
import base64

import requests

API_URL = "http://YOUR_IP:7860"  # your A1111 instance

def generate_image(prompt, steps=20, width=1024, height=1024):
    """Generate one image via the A1111 txt2img API; returns raw PNG bytes."""
    payload = {
        "prompt": prompt,
        "steps": steps,
        "width": width,
        "height": height,
        "sampler_name": "DPM++ 2M Karras",
    }
    r = requests.post(f"{API_URL}/sdapi/v1/txt2img", json=payload, timeout=300)
    r.raise_for_status()
    img_b64 = r.json()["images"][0]  # base64-encoded PNG
    return base64.b64decode(img_b64)

# Batch-generate 102 images (3 prompts x 34 repeats)
prompts = ["cyberpunk city", "mountain lake", "abstract art"] * 34
for i, prompt in enumerate(prompts):
    img = generate_image(prompt)
    with open(f"output_{i:04d}.png", "wb") as f:
        f.write(img)
```
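Before kicking off a large batch, it is worth estimating runtime and cost so you know how long to keep the instance alive. A quick sketch using the RTX 4090 figures from the top of this section (throughput and price are assumptions that vary with resolution, step count, and provider):

```python
def batch_estimate(n_images: int, images_per_hour: float,
                   hourly_rate_usd: float) -> tuple[float, float]:
    """Return (hours, cost in USD) for a batch generation run."""
    hours = n_images / images_per_hour
    return hours, hours * hourly_rate_usd

# 102 images at ~150 img/hr on a $0.74/hr RTX 4090
hours, cost = batch_estimate(102, 150, 0.74)
print(f"{hours * 60:.0f} min, ${cost:.2f}")  # 41 min, $0.50
```

At this scale an on-demand instance is fine; for runs measured in days, the monthly-commitment discounts most providers offer start to matter.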