Chat

CodeLlama-13b-Instruct-hf

CodeLlama-13b-Instruct-hf specs, VRAM requirements, and which GPUs can run it.

CodeLlama-7b-Instruct-hf

CodeLlama-7b-Instruct-hf specs, VRAM requirements, and which GPUs can run it.

deepseek-coder-33b-instruct

deepseek-coder-33b-instruct specs, VRAM requirements, and which GPUs can run it.

deepseek-coder-6.7b-instruct

deepseek-coder-6.7b-instruct specs, VRAM requirements, and which GPUs can run it.

deepseek-coder-7b-instruct-v1.5

deepseek-coder-7b-instruct-v1.5 specs, VRAM requirements, and which GPUs can run it.

DeepSeek-Coder-V2-Instruct

DeepSeek-Coder-V2-Instruct specs, VRAM requirements, and which GPUs can run it.

DeepSeek-Coder-V2-Instruct-0724

DeepSeek-Coder-V2-Instruct-0724 specs, VRAM requirements, and which GPUs can run it.

deepseek-moe-16b-chat

deepseek-moe-16b-chat specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2 Pro

DeepSeek-V2 Pro specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2-Chat

DeepSeek-V2-Chat specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2-Chat-0628

DeepSeek-V2-Chat-0628 specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2-Lite-Chat

DeepSeek-V2-Lite-Chat specs, VRAM requirements, and which GPUs can run it.

falcon-7b-instruct

falcon-7b-instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-0.5B-Instruct

Falcon-H1-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-1.5B-Instruct

Falcon-H1-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-34B-Instruct

Falcon-H1-34B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-3B-Instruct

Falcon-H1-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-7B-Instruct

Falcon-H1-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-Tiny-90M-Instruct

Falcon-H1-Tiny-90M-Instruct specs, VRAM requirements, and which GPUs can run it.

falcon-mamba-7b-instruct

falcon-mamba-7b-instruct specs, VRAM requirements, and which GPUs can run it.

Falcon3-1B-Instruct

Falcon3-1B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon3-3B-Instruct

Falcon3-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon3-7B-Instruct

Falcon3-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

internlm2-chat-1_8b

internlm2-chat-1_8b specs, VRAM requirements, and which GPUs can run it.

internlm2-chat-20b

internlm2-chat-20b specs, VRAM requirements, and which GPUs can run it.

internlm2-chat-7b-sft

internlm2-chat-7b-sft specs, VRAM requirements, and which GPUs can run it.

Jan-v3-4B-base-instruct

Jan-v3-4B-base-instruct specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct

LFM2.5-1.2B-Instruct specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct-MLX-4bit

LFM2.5-1.2B-Instruct-MLX-4bit specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct-MLX-6bit

LFM2.5-1.2B-Instruct-MLX-6bit specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct-MLX-8bit

LFM2.5-1.2B-Instruct-MLX-8bit specs, VRAM requirements, and which GPUs can run it.

Llama 3.1 70B

Llama 3.1 70B specs, VRAM requirements, and which GPUs can run it. The sweet spot for local reasoning.

Llama 3.1 8B

Llama 3.1 8B specs, VRAM requirements, and which GPUs can run it. The go-to small model for local inference.

Llama-3.1-405B-Instruct

Llama-3.1-405B-Instruct specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-405B-Instruct-FP8

Llama-3.1-405B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-70B-Instruct

Llama-3.1-70B-Instruct specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-8B-Instruct-FP8

Llama-3.1-8B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Llama-3.2-1B-Instruct-FP8

Llama-3.2-1B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Llama-3.2-1B-Instruct-FP8-dynamic

Llama-3.2-1B-Instruct-FP8-dynamic specs, VRAM requirements, and which GPUs can run it.

llama-3.3-70b-instruct-awq

llama-3.3-70b-instruct-awq specs, VRAM requirements, and which GPUs can run it.

llm-jp-3-3.7b-instruct

llm-jp-3-3.7b-instruct specs, VRAM requirements, and which GPUs can run it.

MediPhi-Instruct

MediPhi-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3-70B-Instruct

Meta-Llama-3-70B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3-8B-Instruct

Meta-Llama-3-8B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-70B-Instruct

Meta-Llama-3.1-70B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-8B-Instruct

Meta-Llama-3.1-8B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-8B-Instruct-bnb-4bit

Meta-Llama-3.1-8B-Instruct-bnb-4bit specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-8B-Instruct-FP8

Meta-Llama-3.1-8B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Mistral 7B

Mistral 7B specs, VRAM requirements, and which GPUs can run it. Efficient and fast for everyday tasks.

Mistral-7B-Instruct-v0.2

Mistral-7B-Instruct-v0.2 specs, VRAM requirements, and which GPUs can run it.

Mistral-NeMo-Minitron-8B-Instruct

Mistral-NeMo-Minitron-8B-Instruct specs, VRAM requirements, and which GPUs can run it.

Mistral-Small-24B-Instruct-2501-AWQ

Mistral-Small-24B-Instruct-2501-AWQ specs, VRAM requirements, and which GPUs can run it.

Mixtral-8x7B-Instruct-v0.1-GPTQ

Mixtral-8x7B-Instruct-v0.1-GPTQ specs, VRAM requirements, and which GPUs can run it.

Nemotron-H-4B-Instruct-128K

Nemotron-H-4B-Instruct-128K specs, VRAM requirements, and which GPUs can run it.

OLMo-2-0325-32B-Instruct

OLMo-2-0325-32B-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMo-2-0425-1B-Instruct

OLMo-2-0425-1B-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMo-2-1124-13B-Instruct

OLMo-2-1124-13B-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMo-2-1124-7B-Instruct

OLMo-2-1124-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Olmo-3-7B-Instruct

Olmo-3-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Olmo-3-7B-Instruct-DPO

Olmo-3-7B-Instruct-DPO specs, VRAM requirements, and which GPUs can run it.

Olmo-3-7B-Instruct-SFT

Olmo-3-7B-Instruct-SFT specs, VRAM requirements, and which GPUs can run it.

Olmo-Hybrid-Instruct-DPO-7B

Olmo-Hybrid-Instruct-DPO-7B specs, VRAM requirements, and which GPUs can run it.

OLMoE-1B-7B-0125-Instruct

OLMoE-1B-7B-0125-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMoE-1B-7B-0924-Instruct

OLMoE-1B-7B-0924-Instruct specs, VRAM requirements, and which GPUs can run it.

Phi-3-medium-4k-instruct

Phi-3-medium-4k-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-3-mini-4k-instruct-gptq-4bit

Phi-3-mini-4k-instruct-gptq-4bit specs, VRAM requirements, and which GPUs can run it.

Phi-3-small-8k-instruct

Phi-3-small-8k-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-mini-MoE-instruct

Phi-mini-MoE-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-tiny-MoE-instruct

Phi-tiny-MoE-instruct specs, VRAM requirements, and which GPUs can run it.

Qwen 2.5 72B

Qwen 2.5 72B specs, VRAM requirements, and which GPUs can run it. Strong on benchmarks, competitive with Llama 70B.

Qwen 2.5 72B Instruct

Qwen 2.5 72B Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen1.5-110B-Chat-AWQ

Qwen1.5-110B-Chat-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2 72B

Qwen2 72B specs, VRAM requirements, and which GPUs can run it.

Qwen2-0.5B-Instruct

Qwen2-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2-1.5B-Instruct

Qwen2-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2-7B-Instruct

Qwen2-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-0.5B-Instruct

Qwen2.5-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-1.5B-Instruct

Qwen2.5-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-1.5B-Instruct-AWQ

Qwen2.5-1.5B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-14B-Instruct-AWQ

Qwen2.5-14B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-32B-Instruct-AWQ

Qwen2.5-32B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-3B-Instruct

Qwen2.5-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-72B-Instruct

Qwen2.5-72B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-72B-Instruct-AWQ

Qwen2.5-72B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-7B-Instruct

Qwen2.5-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-0.5B-Instruct

Qwen2.5-Coder-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-1.5B-Instruct

Qwen2.5-Coder-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-14B-Instruct

Qwen2.5-Coder-14B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-32B-Instruct

Qwen2.5-Coder-32B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-32B-Instruct-AWQ

Qwen2.5-Coder-32B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-7B-Instruct

Qwen2.5-Coder-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-7B-Instruct-AWQ

Qwen2.5-Coder-7B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-7B-Instruct-GPTQ-Int4

Qwen2.5-Coder-7B-Instruct-GPTQ-Int4 specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-VL-7B-Instruct-NVFP4

Qwen2.5-VL-7B-Instruct-NVFP4 specs, VRAM requirements, and which GPUs can run it.

Qwen3-14B-Instruct

Qwen3-14B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen3-235B-A22B-Instruct-2507-FP8

Qwen3-235B-A22B-Instruct-2507-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-30B-A3B-Instruct-2507-FP8

Qwen3-30B-A3B-Instruct-2507-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-4B-Instruct-2507-FP8

Qwen3-4B-Instruct-2507-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-Coder-30B-A3B-Instruct-FP8

Qwen3-Coder-30B-A3B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-Next-80B-A3B-Instruct

Qwen3-Next-80B-A3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen3-Next-80B-A3B-Instruct-FP8

Qwen3-Next-80B-A3B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-VL-30B-A3B-Instruct-AWQ

Qwen3-VL-30B-A3B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

SmolLM-135M-Instruct

SmolLM-135M-Instruct specs, VRAM requirements, and which GPUs can run it.

SmolLM2-135M-Instruct

SmolLM2-135M-Instruct specs, VRAM requirements, and which GPUs can run it.

TinyLlama-1.1B-Chat-v0.3-GPTQ

TinyLlama-1.1B-Chat-v0.3-GPTQ specs, VRAM requirements, and which GPUs can run it.

TinyLlama-1.1B-Chat-v1.0

TinyLlama-1.1B-Chat-v1.0 specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-34B-Chat

Yi-1.5-34B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-34B-Chat-16K

Yi-1.5-34B-Chat-16K specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-6B-Chat

Yi-1.5-6B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-9B-Chat

Yi-1.5-9B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-9B-Chat-16K

Yi-1.5-9B-Chat-16K specs, VRAM requirements, and which GPUs can run it.

Yi-6B-Chat

Yi-6B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-Coder-9B-Chat

Yi-Coder-9B-Chat specs, VRAM requirements, and which GPUs can run it.