System

Apple Mac Mini M4 (16GB)

Mac Mini M4 with 16GB unified memory — the most affordable entry point for local AI inference on Apple Silicon.

Apple Mac Mini M4 (24GB)

Mac Mini M4 with 24GB unified memory — run 14B parameter models locally at Q4 quantization.

Apple Mac Mini M4 (32GB)

Mac Mini M4 with 32GB unified memory — the sweet spot for running 20B+ parameter models on the base M4 chip.

Apple Mac Mini M4 Pro (24GB)

Mac Mini M4 Pro with 24GB unified memory — 16-core GPU with 273 GB/s bandwidth for faster local inference.

Apple Mac Mini M4 Pro (48GB)

Mac Mini M4 Pro with 48GB unified memory — a compact local inference powerhouse. Run Llama 3.1 70B Q4 locally.

Apple Mac Mini M4 Pro (64GB)

Mac Mini M4 Pro with 64GB unified memory — run 45B+ parameter models locally with 273 GB/s bandwidth.

Apple Mac Studio M3 Ultra (256GB)

Mac Studio M3 Ultra with 256GB unified memory — a high-capacity Apple Silicon machine for running 180B+ parameter models locally.

Apple Mac Studio M3 Ultra (96GB)

Mac Studio M3 Ultra with 96GB unified memory — 60-core GPU with 819 GB/s bandwidth for high-throughput local inference.

Apple Mac Studio M4 Max (128GB)

Mac Studio M4 Max with 128GB unified memory and 40-core GPU — run 90B+ parameter models at 546 GB/s bandwidth.

Apple Mac Studio M4 Max (36GB)

Mac Studio M4 Max with 36GB unified memory — 32-core GPU with 410 GB/s bandwidth for high-speed local inference.

Apple Mac Studio M4 Max (48GB)

Mac Studio M4 Max with 48GB unified memory — run 33B parameter models at high speed with 546 GB/s bandwidth.

Apple Mac Studio M4 Max (64GB)

Mac Studio M4 Max with 64GB unified memory — run 45B+ parameter models locally with 546 GB/s bandwidth.

Apple MacBook Pro M4 Max (128GB)

MacBook Pro M4 Max with 128GB unified memory — run 70B+ parameter models at 8-bit quantization on a laptop.
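The model-size claims above all follow from the same back-of-envelope arithmetic: a model's weight footprint is roughly parameter count × bytes per weight, plus runtime overhead, and only part of unified memory is available to the GPU. A minimal sketch of that check — the 20% overhead factor, the ~75% usable-memory default, and 4.5 effective bits/weight for Q4 are assumptions, not measured values:

```python
def model_footprint_gb(params_b: float, bits_per_weight: float,
                       overhead: float = 1.2) -> float:
    """Approximate memory need in GB: params (billions) x bytes per weight,
    plus ~20% for KV cache and runtime buffers (assumed overhead)."""
    return params_b * (bits_per_weight / 8) * overhead

def fits(params_b: float, bits_per_weight: float, ram_gb: float,
         usable_fraction: float = 0.75) -> bool:
    """macOS limits GPU-addressable unified memory to roughly 75% of total
    by default (assumption; the limit can be raised via sysctl)."""
    return model_footprint_gb(params_b, bits_per_weight) <= ram_gb * usable_fraction

# 14B model at Q4 (~4.5 bits/weight) on a 24GB Mac Mini M4: ~9.5 GB needed
print(fits(14, 4.5, 24))   # True

# 70B model at FP16 (16 bits/weight) on a 128GB machine: ~168 GB needed
print(fits(70, 16, 128))   # False
```

This is why the 128GB configurations are described with 8-bit rather than full-precision 70B models: at FP16 the weights alone exceed total memory, while Q8 (~70 GB) leaves headroom for context.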