Chat

AI21-Jamba-Reasoning-3B

AI21-Jamba-Reasoning-3B specs, VRAM requirements, and which GPUs can run it.

AI21-Jamba2-Mini

AI21-Jamba2-Mini specs, VRAM requirements, and which GPUs can run it.

AI21-Jamba2-Mini-FP8

AI21-Jamba2-Mini-FP8 specs, VRAM requirements, and which GPUs can run it.

Athene-V2-Chat

Athene-V2-Chat specs, VRAM requirements, and which GPUs can run it.

Bielik-11B-v3.0-Instruct

Bielik-11B-v3.0-Instruct specs, VRAM requirements, and which GPUs can run it.

bitnet-b1.58-2B-4T

bitnet-b1.58-2B-4T specs, VRAM requirements, and which GPUs can run it.

Bonsai-8B-mlx-1bit

Bonsai-8B-mlx-1bit specs, VRAM requirements, and which GPUs can run it.

CD-ROM ChatGPT Thinking-Q4_0

CD-ROM ChatGPT Thinking-Q4_0 specs, VRAM requirements, and which GPUs can run it.

CodeLlama-13b-Instruct-hf

CodeLlama-13b-Instruct-hf specs, VRAM requirements, and which GPUs can run it.

CodeLlama-34b-Instruct-hf

CodeLlama-34b-Instruct-hf specs, VRAM requirements, and which GPUs can run it.

CodeLlama-7b-Instruct-hf

CodeLlama-7b-Instruct-hf specs, VRAM requirements, and which GPUs can run it.

DeepHermes-3-Llama-3-8B-Preview

DeepHermes-3-Llama-3-8B-Preview specs, VRAM requirements, and which GPUs can run it.

deepseek-coder-33b-instruct

deepseek-coder-33b-instruct specs, VRAM requirements, and which GPUs can run it.

deepseek-coder-6.7b-instruct

deepseek-coder-6.7b-instruct specs, VRAM requirements, and which GPUs can run it.

deepseek-coder-7b-instruct-v1.5

deepseek-coder-7b-instruct-v1.5 specs, VRAM requirements, and which GPUs can run it.

DeepSeek-Coder-V2-Base

DeepSeek-Coder-V2-Base specs, VRAM requirements, and which GPUs can run it.

DeepSeek-Coder-V2-Instruct

DeepSeek-Coder-V2-Instruct specs, VRAM requirements, and which GPUs can run it.

DeepSeek-Coder-V2-Instruct-0724

DeepSeek-Coder-V2-Instruct-0724 specs, VRAM requirements, and which GPUs can run it.

DeepSeek-Coder-V2-Lite-Instruct

DeepSeek-Coder-V2-Lite-Instruct specs, VRAM requirements, and which GPUs can run it.

deepseek-math-7b-rl

deepseek-math-7b-rl specs, VRAM requirements, and which GPUs can run it.

deepseek-moe-16b-chat

deepseek-moe-16b-chat specs, VRAM requirements, and which GPUs can run it.

DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen-1.5B specs, VRAM requirements, and which GPUs can run it.

Deepseek-V2 Pro

Deepseek-V2 Pro specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2-Chat

DeepSeek-V2-Chat specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2-Chat-0628

DeepSeek-V2-Chat-0628 specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2-Lite-Chat

DeepSeek-V2-Lite-Chat specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V2.5-1210

DeepSeek-V2.5-1210 specs, VRAM requirements, and which GPUs can run it.

DeepSeek-V3.2-Exp-Base

DeepSeek-V3.2-Exp-Base specs, VRAM requirements, and which GPUs can run it.

ESFT-vanilla-lite

ESFT-vanilla-lite specs, VRAM requirements, and which GPUs can run it.

falcon-7b-instruct

falcon-7b-instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-E-1B-Instruct

Falcon-E-1B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-E-3B-Instruct

Falcon-E-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-0.5B-Instruct

Falcon-H1-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-1.5B-Deep-Instruct

Falcon-H1-1.5B-Deep-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-1.5B-Instruct

Falcon-H1-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-34B-Instruct

Falcon-H1-34B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-34B-Instruct-GPTQ-Int8

Falcon-H1-34B-Instruct-GPTQ-Int8 specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-3B-Instruct

Falcon-H1-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-7B-Instruct

Falcon-H1-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-Tiny-90M-Instruct

Falcon-H1-Tiny-90M-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-Tiny-90M-Instruct-pre-DPO

Falcon-H1-Tiny-90M-Instruct-pre-DPO specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-Tiny-Multilingual-100M-Instruct

Falcon-H1-Tiny-Multilingual-100M-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-Tiny-R-0.6B

Falcon-H1-Tiny-R-0.6B specs, VRAM requirements, and which GPUs can run it.

Falcon-H1-Tiny-R-90M

Falcon-H1-Tiny-R-90M specs, VRAM requirements, and which GPUs can run it.

Falcon-H1R-7B-FP8

Falcon-H1R-7B-FP8 specs, VRAM requirements, and which GPUs can run it.

falcon-mamba-7b-instruct

falcon-mamba-7b-instruct specs, VRAM requirements, and which GPUs can run it.

Falcon3-10B-Instruct-1.58bit

Falcon3-10B-Instruct-1.58bit specs, VRAM requirements, and which GPUs can run it.

Falcon3-1B-Instruct

Falcon3-1B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon3-1B-Instruct-1.58bit

Falcon3-1B-Instruct-1.58bit specs, VRAM requirements, and which GPUs can run it.

Falcon3-3B-Instruct

Falcon3-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon3-7B-Instruct

Falcon3-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Falcon3-7B-Instruct-1.58bit

Falcon3-7B-Instruct-1.58bit specs, VRAM requirements, and which GPUs can run it.

Falcon3-Mamba-7B-Instruct

Falcon3-Mamba-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Faust-1

Faust-1 specs, VRAM requirements, and which GPUs can run it.

FlexOlmo-7x7B-1T-RT

FlexOlmo-7x7B-1T-RT specs, VRAM requirements, and which GPUs can run it.

Gemma 7B

Gemma 7B specs, VRAM requirements, and which GPUs can run it.

gemma-2-2b-it

gemma-2-2b-it specs, VRAM requirements, and which GPUs can run it.

gemma-2-2b-jpn-it

gemma-2-2b-jpn-it specs, VRAM requirements, and which GPUs can run it.

gemma-3-1b-it

gemma-3-1b-it specs, VRAM requirements, and which GPUs can run it.

gemma-3-1b-it-qat-int4-unquantized

gemma-3-1b-it-qat-int4-unquantized specs, VRAM requirements, and which GPUs can run it.

gemma-3-1b-it-qat-q4_0-unquantized

gemma-3-1b-it-qat-q4_0-unquantized specs, VRAM requirements, and which GPUs can run it.

gemma-3-270m-it-qat-q4_0-unquantized

gemma-3-270m-it-qat-q4_0-unquantized specs, VRAM requirements, and which GPUs can run it.

Gemma-4-31B-IT-NVFP4

Gemma-4-31B-IT-NVFP4 specs, VRAM requirements, and which GPUs can run it.

gemma-4-E4B-it-OBLITERATED

gemma-4-E4B-it-OBLITERATED specs, VRAM requirements, and which GPUs can run it.

GLM-4.5-Air

GLM-4.5-Air specs, VRAM requirements, and which GPUs can run it.

GLM-4.7-Flash

GLM-4.7-Flash specs, VRAM requirements, and which GPUs can run it.

GLM-4.7-Flash-FP8-Dynamic

GLM-4.7-Flash-FP8-Dynamic specs, VRAM requirements, and which GPUs can run it.

GLM-5-FP8

GLM-5-FP8 specs, VRAM requirements, and which GPUs can run it.

GLM-5-NVFP4

GLM-5-NVFP4 specs, VRAM requirements, and which GPUs can run it.

GLM-5.1-FP8

GLM-5.1-FP8 specs, VRAM requirements, and which GPUs can run it.

GLM-5.1-MLX-4.8bit

GLM-5.1-MLX-4.8bit specs, VRAM requirements, and which GPUs can run it.

gpt-oss-20b-MXFP4-Q8

gpt-oss-20b-MXFP4-Q8 specs, VRAM requirements, and which GPUs can run it.

gpt-oss-puzzle-88B

gpt-oss-puzzle-88B specs, VRAM requirements, and which GPUs can run it.

granite-3.3-8b-instruct

granite-3.3-8b-instruct specs, VRAM requirements, and which GPUs can run it.

Hermes-2-Theta-Llama-3-70B

Hermes-2-Theta-Llama-3-70B specs, VRAM requirements, and which GPUs can run it.

Hermes-3-Llama-3.1-70B

Hermes-3-Llama-3.1-70B specs, VRAM requirements, and which GPUs can run it.

Hermes-3-Llama-3.2-3B

Hermes-3-Llama-3.2-3B specs, VRAM requirements, and which GPUs can run it.

Hermes-4-14B-FP8

Hermes-4-14B-FP8 specs, VRAM requirements, and which GPUs can run it.

Hermes-4-405B

Hermes-4-405B specs, VRAM requirements, and which GPUs can run it.

Hermes-4-70B-FP8

Hermes-4-70B-FP8 specs, VRAM requirements, and which GPUs can run it.

internlm2_5-1_8b-chat

internlm2_5-1_8b-chat specs, VRAM requirements, and which GPUs can run it.

internlm2_5-20b-chat

internlm2_5-20b-chat specs, VRAM requirements, and which GPUs can run it.

internlm2_5-7b-chat

internlm2_5-7b-chat specs, VRAM requirements, and which GPUs can run it.

internlm2_5-7b-chat-1m

internlm2_5-7b-chat-1m specs, VRAM requirements, and which GPUs can run it.

internlm2-chat-1_8b

internlm2-chat-1_8b specs, VRAM requirements, and which GPUs can run it.

internlm2-chat-20b

internlm2-chat-20b specs, VRAM requirements, and which GPUs can run it.

internlm2-chat-7b-sft

internlm2-chat-7b-sft specs, VRAM requirements, and which GPUs can run it.

internlm2-math-7b

internlm2-math-7b specs, VRAM requirements, and which GPUs can run it.

internlm2-math-plus-1_8b

internlm2-math-plus-1_8b specs, VRAM requirements, and which GPUs can run it.

internlm2-math-plus-7b

internlm2-math-plus-7b specs, VRAM requirements, and which GPUs can run it.

Jan-v3-4B-base-instruct

Jan-v3-4B-base-instruct specs, VRAM requirements, and which GPUs can run it.

japanese-stablelm-2-instruct-1_6b

japanese-stablelm-2-instruct-1_6b specs, VRAM requirements, and which GPUs can run it.

japanese-stablelm-3b-4e1t-instruct

japanese-stablelm-3b-4e1t-instruct specs, VRAM requirements, and which GPUs can run it.

japanese-stablelm-instruct-beta-70b

japanese-stablelm-instruct-beta-70b specs, VRAM requirements, and which GPUs can run it.

japanese-stablelm-instruct-beta-7b

japanese-stablelm-instruct-beta-7b specs, VRAM requirements, and which GPUs can run it.

japanese-stablelm-instruct-gamma-7b

japanese-stablelm-instruct-gamma-7b specs, VRAM requirements, and which GPUs can run it.

karma-electric-llama31-8b

karma-electric-llama31-8b specs, VRAM requirements, and which GPUs can run it.

Karnak

Karnak specs, VRAM requirements, and which GPUs can run it.

KD-Tinker

KD-Tinker specs, VRAM requirements, and which GPUs can run it.

Kimi-K2-Instruct-0905

Kimi-K2-Instruct-0905 specs, VRAM requirements, and which GPUs can run it.

L3.3-GeneticLemonade-Final-v2-70B

L3.3-GeneticLemonade-Final-v2-70B specs, VRAM requirements, and which GPUs can run it.

LFM2-1.2B

LFM2-1.2B specs, VRAM requirements, and which GPUs can run it.

LFM2-24B-A2B-MLX-4bit

LFM2-24B-A2B-MLX-4bit specs, VRAM requirements, and which GPUs can run it.

LFM2-24B-A2B-MLX-5bit

LFM2-24B-A2B-MLX-5bit specs, VRAM requirements, and which GPUs can run it.

LFM2-24B-A2B-MLX-6bit

LFM2-24B-A2B-MLX-6bit specs, VRAM requirements, and which GPUs can run it.

LFM2-24B-A2B-MLX-8bit

LFM2-24B-A2B-MLX-8bit specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct

LFM2.5-1.2B-Instruct specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct-MLX-4bit

LFM2.5-1.2B-Instruct-MLX-4bit specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct-MLX-6bit

LFM2.5-1.2B-Instruct-MLX-6bit specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Instruct-MLX-8bit

LFM2.5-1.2B-Instruct-MLX-8bit specs, VRAM requirements, and which GPUs can run it.

LFM2.5-1.2B-Thinking

LFM2.5-1.2B-Thinking specs, VRAM requirements, and which GPUs can run it.

Llama 3.1 70B

Llama 3.1 70B specs, VRAM requirements, and which GPUs can run it. The sweet spot for local reasoning.

Llama 3.1 8B

Llama 3.1 8B specs, VRAM requirements, and which GPUs can run it. The go-to small model for local inference.

Llama-3.1-405B-Instruct

Llama-3.1-405B-Instruct specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-405B-Instruct-FP8

Llama-3.1-405B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-70B-Instruct

Llama-3.1-70B-Instruct specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-8B-Instruct

Llama-3.1-8B-Instruct specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-8B-Instruct-FP8

Llama-3.1-8B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-Nemotron-70B-Instruct-HF

Llama-3.1-Nemotron-70B-Instruct-HF specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-Nemotron-Nano-4B-v1.1

Llama-3.1-Nemotron-Nano-4B-v1.1 specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-Nemotron-Nano-8B-v1

Llama-3.1-Nemotron-Nano-8B-v1 specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-Nemotron-Safety-Guard-8B-v3

Llama-3.1-Nemotron-Safety-Guard-8B-v3 specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-Tulu-3-70B-DPO

Llama-3.1-Tulu-3-70B-DPO specs, VRAM requirements, and which GPUs can run it.

Llama-3.1-Tulu-3.1-8B

Llama-3.1-Tulu-3.1-8B specs, VRAM requirements, and which GPUs can run it.

Llama-3.2-1B-Instruct

Llama-3.2-1B-Instruct specs, VRAM requirements, and which GPUs can run it.

Llama-3.2-1B-Instruct-FP8

Llama-3.2-1B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Llama-3.2-1B-Instruct-FP8-dynamic

Llama-3.2-1B-Instruct-FP8-dynamic specs, VRAM requirements, and which GPUs can run it.

Llama-3.2-3B-Instruct

Llama-3.2-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Llama-3.3-70B-Instruct

Llama-3.3-70B-Instruct specs, VRAM requirements, and which GPUs can run it.

llama-3.3-70b-instruct-awq

llama-3.3-70b-instruct-awq specs, VRAM requirements, and which GPUs can run it.

llama-7b

llama-7b specs, VRAM requirements, and which GPUs can run it.

Llama-Guard-3-1B

Llama-Guard-3-1B specs, VRAM requirements, and which GPUs can run it.

llm-jp-3-3.7b-instruct

llm-jp-3-3.7b-instruct specs, VRAM requirements, and which GPUs can run it.

MediPhi

MediPhi specs, VRAM requirements, and which GPUs can run it.

MediPhi-Instruct

MediPhi-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3-70B-Instruct

Meta-Llama-3-70B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3-8B-Instruct

Meta-Llama-3-8B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-70B-Instruct

Meta-Llama-3.1-70B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-8B-Instruct

Meta-Llama-3.1-8B-Instruct specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-8B-Instruct-bnb-4bit

Meta-Llama-3.1-8B-Instruct-bnb-4bit specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-8B-Instruct-FP8

Meta-Llama-3.1-8B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Meta-Llama-3.1-8B-Instruct-FP8-dynamic

Meta-Llama-3.1-8B-Instruct-FP8-dynamic specs, VRAM requirements, and which GPUs can run it.

MiniMax-M2.5-NVFP4

MiniMax-M2.5-NVFP4 specs, VRAM requirements, and which GPUs can run it.

Mistral 7B

Mistral 7B specs, VRAM requirements, and which GPUs can run it. Efficient and fast for everyday tasks.

Mistral-7B-Instruct-v0.2

Mistral-7B-Instruct-v0.2 specs, VRAM requirements, and which GPUs can run it.

Mistral-NeMo-Minitron-8B-Instruct

Mistral-NeMo-Minitron-8B-Instruct specs, VRAM requirements, and which GPUs can run it.

Mistral-Small-24B-Instruct-2501-AWQ

Mistral-Small-24B-Instruct-2501-AWQ specs, VRAM requirements, and which GPUs can run it.

Mixtral-8x7B-Instruct-v0.1-GPTQ

Mixtral-8x7B-Instruct-v0.1-GPTQ specs, VRAM requirements, and which GPUs can run it.

Nemotron-3-Nano-30B-A3B

Nemotron-3-Nano-30B-A3B specs, VRAM requirements, and which GPUs can run it.

Nemotron-Cascade-2-30B-A3B

Nemotron-Cascade-2-30B-A3B specs, VRAM requirements, and which GPUs can run it.

Nemotron-H-4B-Instruct-128K

Nemotron-H-4B-Instruct-128K specs, VRAM requirements, and which GPUs can run it.

Nemotron-H-8B-Base-8K

Nemotron-H-8B-Base-8K specs, VRAM requirements, and which GPUs can run it.

NextCoder-14B

NextCoder-14B specs, VRAM requirements, and which GPUs can run it.

NextCoder-32B

NextCoder-32B specs, VRAM requirements, and which GPUs can run it.

NextCoder-7B

NextCoder-7B specs, VRAM requirements, and which GPUs can run it.

nmt_21

nmt_21 specs, VRAM requirements, and which GPUs can run it.

NousCoder-14B

NousCoder-14B specs, VRAM requirements, and which GPUs can run it.

NVIDIA-Nemotron-3-Nano-4B-FP8

NVIDIA-Nemotron-3-Nano-4B-FP8 specs, VRAM requirements, and which GPUs can run it.

NVIDIA-Nemotron-3-Super-120B-A12B-BF16

NVIDIA-Nemotron-3-Super-120B-A12B-BF16 specs, VRAM requirements, and which GPUs can run it.

NVIDIA-Nemotron-3-Super-120B-A12B-FP8

NVIDIA-Nemotron-3-Super-120B-A12B-FP8 specs, VRAM requirements, and which GPUs can run it.

NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4

NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 specs, VRAM requirements, and which GPUs can run it.

OLMo-2-0325-32B-Instruct

OLMo-2-0325-32B-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMo-2-0425-1B-Instruct

OLMo-2-0425-1B-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMo-2-1124-13B-DPO

OLMo-2-1124-13B-DPO specs, VRAM requirements, and which GPUs can run it.

OLMo-2-1124-13B-Instruct

OLMo-2-1124-13B-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMo-2-1124-7B-Instruct

OLMo-2-1124-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Olmo-3-32B-Think-DPO

Olmo-3-32B-Think-DPO specs, VRAM requirements, and which GPUs can run it.

Olmo-3-7B-Instruct

Olmo-3-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Olmo-3-7B-Instruct-DPO

Olmo-3-7B-Instruct-DPO specs, VRAM requirements, and which GPUs can run it.

Olmo-3-7B-Instruct-SFT

Olmo-3-7B-Instruct-SFT specs, VRAM requirements, and which GPUs can run it.

Olmo-3-7B-RL-Zero-Math

Olmo-3-7B-RL-Zero-Math specs, VRAM requirements, and which GPUs can run it.

OLMo-7B-0724-Instruct-hf

OLMo-7B-0724-Instruct-hf specs, VRAM requirements, and which GPUs can run it.

OLMo-7B-Instruct

OLMo-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMo-7B-SFT-hf

OLMo-7B-SFT-hf specs, VRAM requirements, and which GPUs can run it.

Olmo-Hybrid-Instruct-DPO-7B

Olmo-Hybrid-Instruct-DPO-7B specs, VRAM requirements, and which GPUs can run it.

OLMoE-1B-7B-0125-Instruct

OLMoE-1B-7B-0125-Instruct specs, VRAM requirements, and which GPUs can run it.

OLMoE-1B-7B-0924-Instruct

OLMoE-1B-7B-0924-Instruct specs, VRAM requirements, and which GPUs can run it.

OpenReasoning-Nemotron-1.5B

OpenReasoning-Nemotron-1.5B specs, VRAM requirements, and which GPUs can run it.

OpenReasoning-Nemotron-32B

OpenReasoning-Nemotron-32B specs, VRAM requirements, and which GPUs can run it.

OptiMind-SFT

OptiMind-SFT specs, VRAM requirements, and which GPUs can run it.

OTel-LLM-1B-IT

OTel-LLM-1B-IT specs, VRAM requirements, and which GPUs can run it.

OTel-LLM-270M-IT

OTel-LLM-270M-IT specs, VRAM requirements, and which GPUs can run it.

Phi-3-medium-4k-instruct

Phi-3-medium-4k-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-3-mini-4k-instruct

Phi-3-mini-4k-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-3-mini-4k-instruct-gptq-4bit

Phi-3-mini-4k-instruct-gptq-4bit specs, VRAM requirements, and which GPUs can run it.

Phi-3-small-128k-instruct

Phi-3-small-128k-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-3-small-8k-instruct

Phi-3-small-8k-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-3.5-mini-instruct

Phi-3.5-mini-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-mini-MoE-instruct

Phi-mini-MoE-instruct specs, VRAM requirements, and which GPUs can run it.

Phi-tiny-MoE-instruct

Phi-tiny-MoE-instruct specs, VRAM requirements, and which GPUs can run it.

Qwen 2.5 72B

Qwen 2.5 72B specs, VRAM requirements, and which GPUs can run it. Strong on benchmarks, competitive with Llama 70B.

Qwen 2.5 72B Instruct

Qwen 2.5 72B Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen1.5-110B-Chat-AWQ

Qwen1.5-110B-Chat-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2 72B

Qwen2 72B specs, VRAM requirements, and which GPUs can run it.

Qwen2 72B Instruct

Qwen2 72B Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2-0.5B-Instruct

Qwen2-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2-1.5B-Instruct

Qwen2-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2-7B-Instruct

Qwen2-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-0.5B-Instruct

Qwen2.5-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-1.5B-Instruct

Qwen2.5-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-1.5B-Instruct-AWQ

Qwen2.5-1.5B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-14B-Instruct-AWQ

Qwen2.5-14B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-32B-Instruct-AWQ

Qwen2.5-32B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-32B-Instruct-GPTQ-Int4

Qwen2.5-32B-Instruct-GPTQ-Int4 specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-3B-Instruct

Qwen2.5-3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-72B-Instruct

Qwen2.5-72B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-72B-Instruct-abliterated

Qwen2.5-72B-Instruct-abliterated specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-72B-Instruct-AWQ

Qwen2.5-72B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-7B-Instruct

Qwen2.5-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-0.5B-Instruct

Qwen2.5-Coder-0.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-1.5B-Instruct

Qwen2.5-Coder-1.5B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-14B-Instruct

Qwen2.5-Coder-14B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-32B-Instruct

Qwen2.5-Coder-32B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-32B-Instruct-AWQ

Qwen2.5-Coder-32B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-7B

Qwen2.5-Coder-7B specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-7B-Instruct

Qwen2.5-Coder-7B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-7B-Instruct-AWQ

Qwen2.5-Coder-7B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-Coder-7B-Instruct-GPTQ-Int4

Qwen2.5-Coder-7B-Instruct-GPTQ-Int4 specs, VRAM requirements, and which GPUs can run it.

Qwen2.5-VL-7B-Instruct-NVFP4

Qwen2.5-VL-7B-Instruct-NVFP4 specs, VRAM requirements, and which GPUs can run it.

Qwen3-1.7B

Qwen3-1.7B specs, VRAM requirements, and which GPUs can run it.

Qwen3-1.7B-GPTQ-Int8

Qwen3-1.7B-GPTQ-Int8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-14B-AWQ

Qwen3-14B-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen3-14B-FP8

Qwen3-14B-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-14B-Instruct

Qwen3-14B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen3-235B-A22B-Instruct-2507-FP8

Qwen3-235B-A22B-Instruct-2507-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-30B-A3B-Instruct-2507-FP8

Qwen3-30B-A3B-Instruct-2507-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-30B-A3B-Thinking-2507

Qwen3-30B-A3B-Thinking-2507 specs, VRAM requirements, and which GPUs can run it.

Qwen3-4B-Base

Qwen3-4B-Base specs, VRAM requirements, and which GPUs can run it.

Qwen3-4B-Instruct-2507

Qwen3-4B-Instruct-2507 specs, VRAM requirements, and which GPUs can run it.

Qwen3-4B-Instruct-2507-FP8

Qwen3-4B-Instruct-2507-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-4B-Thinking-2507

Qwen3-4B-Thinking-2507 specs, VRAM requirements, and which GPUs can run it.

Qwen3-8B

Qwen3-8B specs, VRAM requirements, and which GPUs can run it.

Qwen3-Coder-30B-A3B-Instruct

Qwen3-Coder-30B-A3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen3-Coder-30B-A3B-Instruct-FP8

Qwen3-Coder-30B-A3B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-Coder-30B-A3B-Instruct-gptq-8bit

Qwen3-Coder-30B-A3B-Instruct-gptq-8bit specs, VRAM requirements, and which GPUs can run it.

Qwen3-Next-80B-A3B-Instruct

Qwen3-Next-80B-A3B-Instruct specs, VRAM requirements, and which GPUs can run it.

Qwen3-Next-80B-A3B-Instruct-FP8

Qwen3-Next-80B-A3B-Instruct-FP8 specs, VRAM requirements, and which GPUs can run it.

Qwen3-VL-30B-A3B-Instruct-AWQ

Qwen3-VL-30B-A3B-Instruct-AWQ specs, VRAM requirements, and which GPUs can run it.

Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2

Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2 specs, VRAM requirements, and which GPUs can run it.

Qwen3.5-35B-A3B-Text-qx64-hi-mlx

Qwen3.5-35B-A3B-Text-qx64-hi-mlx specs, VRAM requirements, and which GPUs can run it.

recurrentgemma-9b-it

recurrentgemma-9b-it specs, VRAM requirements, and which GPUs can run it.

Ring-2.5-1T

Ring-2.5-1T specs, VRAM requirements, and which GPUs can run it.

sarvam-105b-uncensored

sarvam-105b-uncensored specs, VRAM requirements, and which GPUs can run it.

shieldgemma-27b

shieldgemma-27b specs, VRAM requirements, and which GPUs can run it.

SmolLM-135M-Instruct

SmolLM-135M-Instruct specs, VRAM requirements, and which GPUs can run it.

SmolLM2-135M-Instruct

SmolLM2-135M-Instruct specs, VRAM requirements, and which GPUs can run it.

SOLAR-10.7B-Instruct-v1.0

SOLAR-10.7B-Instruct-v1.0 specs, VRAM requirements, and which GPUs can run it.

stablelm-2-1_6b-chat

stablelm-2-1_6b-chat specs, VRAM requirements, and which GPUs can run it.

Step-3.5-Flash-FP8

Step-3.5-Flash-FP8 specs, VRAM requirements, and which GPUs can run it.

TinyLlama-1.1B-Chat-v0.3-GPTQ

TinyLlama-1.1B-Chat-v0.3-GPTQ specs, VRAM requirements, and which GPUs can run it.

TinyLlama-1.1B-Chat-v1.0

TinyLlama-1.1B-Chat-v1.0 specs, VRAM requirements, and which GPUs can run it.

Turkish-Gemma-9b-T1

Turkish-Gemma-9b-T1 specs, VRAM requirements, and which GPUs can run it.

txgemma-27b-chat

txgemma-27b-chat specs, VRAM requirements, and which GPUs can run it.

txgemma-9b-chat

txgemma-9b-chat specs, VRAM requirements, and which GPUs can run it.

Unsloth Llama 3 8B Instruct

Unsloth Llama 3 8B Instruct specs, VRAM requirements, and which GPUs can run it.

UserLM-8b

UserLM-8b specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-34B-Chat

Yi-1.5-34B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-34B-Chat-16K

Yi-1.5-34B-Chat-16K specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-6B-Chat

Yi-1.5-6B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-9B-Chat

Yi-1.5-9B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-1.5-9B-Chat-16K

Yi-1.5-9B-Chat-16K specs, VRAM requirements, and which GPUs can run it.

Yi-34B-Chat

Yi-34B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-34B-Chat-8bits

Yi-34B-Chat-8bits specs, VRAM requirements, and which GPUs can run it.

Yi-6B-Chat

Yi-6B-Chat specs, VRAM requirements, and which GPUs can run it.

Yi-6B-Chat-4bits

Yi-6B-Chat-4bits specs, VRAM requirements, and which GPUs can run it.

Yi-Coder-9B-Chat

Yi-Coder-9B-Chat specs, VRAM requirements, and which GPUs can run it.

zephyr-7b-alpha

zephyr-7b-alpha specs, VRAM requirements, and which GPUs can run it.

zephyr-7b-gemma-sft-v0.1

zephyr-7b-gemma-sft-v0.1 specs, VRAM requirements, and which GPUs can run it.

zephyr-orpo-141b-A35b-v0.1

zephyr-orpo-141b-A35b-v0.1 specs, VRAM requirements, and which GPUs can run it.