Learn how to optimize DeepSeek-R1 distilled models for CPU-only inference on an AMD EPYC VPS. This guide covers quantization, llama.cpp configuration, NUMA tuning, and thread optimization to maximize tokens per second without expensive GPUs.