CPU-only transcoding works for testing and basic validation. Production video nodes need GPU acceleration for reliable throughput.
VRAM determines which AI pipelines the orchestrator can serve. 8 GB is enough for quantised LLMs only; 24 GB or more unlocks diffusion, audio, vision, and Cascade AI. See the per-model VRAM requirements for details.
NVENC/NVDEC (video) use dedicated silicon separate from the CUDA cores (AI), but both workloads share VRAM. A 24 GB GPU supports video transcoding alongside one warm AI model. See the VRAM coexistence details for more.
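Because transcoding and a warm AI model draw from the same VRAM pool, it helps to budget explicitly. The sketch below estimates how many concurrent transcode sessions fit next to one warm model; the per-model footprint, per-session working set, and headroom figures are illustrative assumptions, not measured values.

```shell
# Hypothetical VRAM budget on a 24 GB card (all figures are assumptions):
TOTAL_VRAM_GB=24
MODEL_VRAM_GB=16      # assumed footprint of one warm diffusion model
HEADROOM_GB=2         # assumed safety margin for driver/fragmentation
PER_SESSION_MB=300    # assumed NVENC/NVDEC working set per transcode session

# VRAM left over for transcoding, in MB
avail_mb=$(( (TOTAL_VRAM_GB - MODEL_VRAM_GB - HEADROOM_GB) * 1024 ))

# Integer division: how many sessions fit in the remainder
sessions=$(( avail_mb / PER_SESSION_MB ))
echo "Estimated concurrent transcode sessions: ${sessions}"
```

Swap in the real footprint of your chosen model and a measured per-session figure from `nvidia-smi` before relying on the estimate.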
Run these commands to confirm the prerequisites are met:
```bash
# GPU visible and driver installed
nvidia-smi

# CUDA version (skip if using Docker - it bundles CUDA)
nvcc --version

# Docker GPU passthrough works
docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi

# Port 8935 reachable (run from a DIFFERENT machine)
curl -k https://YOUR_PUBLIC_IP:8935/status
```
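The local checks above can be wrapped in a small script so a failure in one does not stop the rest. This is a sketch, not official tooling; it runs each command, prints PASS or FAIL, and counts failures. The port-reachability check is omitted because it must run from a different machine.

```shell
#!/bin/sh
# Hypothetical pre-flight wrapper for the local prerequisite checks.
failures=0

# check DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND silently and reports PASS/FAIL for DESCRIPTION.
check() {
  desc="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $desc"
  else
    echo "FAIL: $desc"
    failures=$((failures + 1))
  fi
}

check "GPU visible and driver installed" nvidia-smi
check "CUDA toolkit present (skip if using Docker)" nvcc --version
check "Docker GPU passthrough" \
  docker run --rm --gpus all nvidia/cuda:12.0-base nvidia-smi

echo "Checks failed: $failures"
```

On a correctly provisioned node every line should read PASS; a FAIL line tells you which prerequisite to revisit before continuing.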