
How to Run FLUX Image Generation Locally

Run FLUX.1 Schnell and Dev locally with ComfyUI. FP8 quantization, GGUF models, and VRAM optimization.

April 10, 2026 · 8 min read
FLUX.1 Variants Compared

FLUX.1 [schnell]: 12 GB VRAM, 4 steps, Apache 2.0 license. Fastest, free.
FLUX.1 [dev]: 16 GB VRAM, 20-50 steps, non-commercial license. Best free quality.
FLUX.1 [pro]: API only, commercial license. Best quality, paid.

FLUX.1 by Black Forest Labs (a company founded by the original Stable Diffusion researchers) is the current state of the art in open image generation. It renders text inside images far better than SDXL, produces photorealistic results, and the Schnell variant needs only 4 inference steps per image. With FP8 quantization you can run it locally on 12 GB of VRAM.

Requirements

Component   Schnell                       Dev
VRAM        12 GB (FP8) / 16 GB (BF16)    16 GB (FP8) / 24 GB (BF16)
RAM         24 GB                         32 GB
GPU         RTX 3080 12GB, RTX 4070       RTX 3090, RTX 4080/4090
Storage     24 GB (model files)           24 GB
Python      3.10+                         3.10+
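Before downloading 24 GB of weights, it is worth confirming how much VRAM your GPU actually reports. A minimal sketch using nvidia-smi; the helper names and the fit thresholds (taken from the Schnell row of the table above) are illustrative, not part of any FLUX tooling:

```python
import subprocess

def query_vram_mib():
    """Return total VRAM per GPU in MiB, as reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [int(line) for line in out.strip().splitlines()]

def fits_flux_schnell(vram_mib, fp8=True):
    """Rough fit check against the requirements table (Schnell column)."""
    needed = 12 * 1024 if fp8 else 16 * 1024
    return vram_mib >= needed

if __name__ == "__main__":
    for i, mib in enumerate(query_vram_mib()):
        print(f"GPU {i}: {mib} MiB, Schnell FP8 fit: {fits_flux_schnell(mib)}")
```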

Method 1: Diffusers (Python)

pip install diffusers transformers accelerate sentencepiece protobuf
# For FP8 quantization (saves VRAM):
pip install optimum-quanto

from diffusers import FluxPipeline
import torch

# Load FLUX.1 Schnell (4-step, fast, Apache 2.0)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",
    torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # Optional: offload to CPU when not in use

image = pipe(
    prompt="A photorealistic cat astronaut on the moon, 8K, detailed",
    num_inference_steps=4,     # Schnell only needs 4 steps
    height=1024,
    width=1024,
    guidance_scale=0.0,        # Schnell uses 0 guidance
    generator=torch.Generator("cpu").manual_seed(42)
).images[0]
image.save("flux_output.png")
print("Saved flux_output.png")
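To render several variations of one prompt, a small driver that pins a distinct seed per image keeps runs reproducible. The job-list helper below is pure Python and the filename pattern is an arbitrary choice; the commented generation loop assumes the `pipe` object loaded in the example above:

```python
def make_jobs(prompt, count, base_seed=42):
    """Build a reproducible batch: one (seed, filename) job per image."""
    return [
        {"prompt": prompt,
         "seed": base_seed + i,
         "filename": f"flux_{base_seed + i}.png"}
        for i in range(count)
    ]

# Generation loop -- assumes `pipe` is the FluxPipeline loaded above:
# import torch
# for job in make_jobs("A photorealistic cat astronaut on the moon", 4):
#     image = pipe(
#         prompt=job["prompt"],
#         num_inference_steps=4,
#         guidance_scale=0.0,
#         generator=torch.Generator("cpu").manual_seed(job["seed"]),
#     ).images[0]
#     image.save(job["filename"])
```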

FP8 Quantization (Fits in 12 GB VRAM)

from diffusers import FluxPipeline
from optimum.quanto import freeze, qfloat8, quantize
import torch

# Load model (FLUX.1 Dev is gated: accept the license on Hugging Face
# and run `huggingface-cli login` first)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16
)

# Quantize both the 12B transformer and the large T5 text encoder to FP8;
# together this roughly halves VRAM use versus BF16
quantize(pipe.transformer, weights=qfloat8)
freeze(pipe.transformer)
quantize(pipe.text_encoder_2, weights=qfloat8)
freeze(pipe.text_encoder_2)

# Offload idle components to CPU so peak VRAM stays near 12 GB
pipe.enable_model_cpu_offload()

image = pipe(
    prompt="A neon-lit Tokyo street at night, cinematic",
    num_inference_steps=28,
    guidance_scale=3.5,
    height=1024,
    width=1024,
).images[0]
image.save("flux_dev_output.png")

Method 2: ComfyUI (No-Code GUI)

# Install ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI && pip install -r requirements.txt

# GGUF checkpoints need the ComfyUI-GGUF custom node
git clone https://github.com/city96/ComfyUI-GGUF custom_nodes/ComfyUI-GGUF
pip install -r custom_nodes/ComfyUI-GGUF/requirements.txt

# Download FLUX.1 Schnell GGUF (Q8_0 is ~12 GB; pick a Q4 variant for 8 GB cards)
mkdir -p models/unet
wget -O models/unet/flux1-schnell-Q8_0.gguf \
  "https://huggingface.co/city96/FLUX.1-schnell-gguf/resolve/main/flux1-schnell-Q8_0.gguf"

# Download text encoders
mkdir -p models/clip
wget -O models/clip/t5xxl_fp16.safetensors \
  "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors"
wget -O models/clip/clip_l.safetensors \
  "https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors"

# Download VAE
mkdir -p models/vae
wget -O models/vae/ae.safetensors \
  "https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/ae.safetensors"

# Start ComfyUI
python main.py --listen
# Load a FLUX workflow JSON from comfyanonymous/ComfyUI_examples
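Once the server is running, ComfyUI also accepts jobs over its HTTP API (default port 8188), so batches can be queued without the browser. This is a sketch: the `POST /prompt` endpoint and the `{"prompt": <workflow>}` payload shape match stock ComfyUI, but the `workflow_api.json` filename assumes you exported a workflow with "Save (API Format)":

```python
import json
import urllib.request

def build_request(workflow, host="127.0.0.1", port=8188):
    """Build the POST /prompt request ComfyUI expects: {"prompt": <workflow>}."""
    url = f"http://{host}:{port}/prompt"
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    return url, data

def queue_prompt(workflow, host="127.0.0.1", port=8188):
    url, data = build_request(workflow, host, port)
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())  # response includes the queued prompt_id

if __name__ == "__main__":
    # Hypothetical filename: a workflow exported via "Save (API Format)"
    with open("workflow_api.json") as f:
        print(queue_prompt(json.load(f)))
```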

Cloud Option: When 12 GB VRAM Is Not Enough

For FLUX.1 Dev at full BF16 precision, you need 24 GB VRAM. An RTX 4090 at $0.74/hr on RunPod handles this well. At roughly 5 seconds per image (4 steps), FLUX Schnell turns out around 700 images per hour, which makes cloud batch generation practical for production workloads.
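The per-image economics are easy to sanity-check. A quick calculation assuming the $0.74/hr RunPod rate quoted above and about 5 seconds per Schnell image (actual throughput varies with resolution and batching):

```python
def cost_per_image(hourly_rate, seconds_per_image):
    """Cloud cost per generated image, given GPU $/hr and seconds/image."""
    images_per_hour = 3600 / seconds_per_image
    return hourly_rate / images_per_hour

rate = 0.74   # RTX 4090 on RunPod, $/hr (figure from the text above)
secs = 5.0    # rough Schnell latency at 4 steps, 1024x1024

print(f"{3600 / secs:.0f} images/hr")                      # 720 images/hr
print(f"${cost_per_image(rate, secs) * 1000:.2f} per 1000 images")
```

At these assumptions the marginal cost works out to about a dollar per thousand images, which is why Schnell rather than Dev is the usual choice for bulk generation.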
