Model Cards - Qwen3-VL

Available Models

Qwen3-VL is available in multiple sizes and architectures to suit different deployment scenarios, from edge devices to cloud infrastructure.

Model Architectures

Dense Models: Traditional transformer architecture
MoE (Mixture of Experts): Sparse architecture that activates only a subset of parameters per token, for efficient scaling
Instruct Editions: Fine-tuned for instruction following
Thinking Editions: Enhanced with reasoning capabilities

Dense Models

Qwen3-VL-2B

Instruct Edition

HuggingFace: Qwen/Qwen3-VL-2B-Instruct
ModelScope: Available in Qwen3-VL Collection
Released: October 21, 2025

Thinking Edition

HuggingFace: Qwen/Qwen3-VL-2B-Thinking
ModelScope: Available in Qwen3-VL Collection
Released: October 21, 2025

Qwen3-VL-4B

Instruct Edition

HuggingFace: Qwen/Qwen3-VL-4B-Instruct
ModelScope: Available in Qwen3-VL Collection
Released: October 15, 2025

Thinking Edition

HuggingFace: Qwen/Qwen3-VL-4B-Thinking
ModelScope: Available in Qwen3-VL Collection
Released: October 15, 2025

Qwen3-VL-8B

Instruct Edition

HuggingFace: Qwen/Qwen3-VL-8B-Instruct
ModelScope: Available in Qwen3-VL Collection
Released: October 15, 2025

Thinking Edition

HuggingFace: Qwen/Qwen3-VL-8B-Thinking
ModelScope: Available in Qwen3-VL Collection
Released: October 15, 2025

Qwen3-VL-32B

Instruct Edition

HuggingFace: Qwen/Qwen3-VL-32B-Instruct
ModelScope: Available in Qwen3-VL Collection
Released: October 21, 2025

Thinking Edition

HuggingFace: Qwen/Qwen3-VL-32B-Thinking
ModelScope: Available in Qwen3-VL Collection
Released: October 21, 2025
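The dense repository names listed above follow a regular pattern, Qwen/Qwen3-VL-<size>-<edition>. As a minimal sketch, the helper below builds those IDs and rejects any size or edition this page does not list. The name dense_repo_id is hypothetical and not part of any Qwen tooling.

```python
# Sizes and editions exactly as listed in the Dense Models section.
DENSE_SIZES = ("2B", "4B", "8B", "32B")
EDITIONS = ("Instruct", "Thinking")


def dense_repo_id(size: str, edition: str) -> str:
    """Build a HuggingFace repo ID for a dense Qwen3-VL model (hypothetical helper)."""
    if size not in DENSE_SIZES:
        raise ValueError(f"unknown dense size: {size!r}")
    if edition not in EDITIONS:
        raise ValueError(f"unknown edition: {edition!r}")
    return f"Qwen/Qwen3-VL-{size}-{edition}"


# Enumerate every dense repo ID named on this page.
ALL_DENSE = [dense_repo_id(s, e) for s in DENSE_SIZES for e in EDITIONS]

if __name__ == "__main__":
    for repo in ALL_DENSE:
        print(repo)
```

Validating against a fixed list keeps a typo such as "7B" from silently producing a repository name that does not exist.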

MoE Models

Qwen3-VL-30B-A3B

Architecture: 30B total parameters, 3B active per token

Instruct Edition

HuggingFace: Qwen/Qwen3-VL-30B-A3B-Instruct
ModelScope: Available in Qwen3-VL Collection
Released: October 4, 2025

Thinking Edition

HuggingFace: Qwen/Qwen3-VL-30B-A3B-Thinking
ModelScope: Available in Qwen3-VL Collection
Released: October 4, 2025

Qwen3-VL-235B-A22B

Architecture: 235B total parameters, 22B active per token

Instruct Edition

HuggingFace: Qwen/Qwen3-VL-235B-A22B-Instruct
ModelScope: Available in Qwen3-VL Collection
Released: September 23, 2025

Thinking Edition

HuggingFace: Qwen/Qwen3-VL-235B-A22B-Thinking
ModelScope: Available in Qwen3-VL Collection
Released: September 23, 2025

FP8 Quantized Version

HuggingFace: Qwen/Qwen3-VL-235B-A22B-Instruct-FP8
For efficient deployment on H100/H200 GPUs
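The "A3B" and "A22B" suffixes in the MoE model names match the active parameter counts given above. A quick sketch of the arithmetic, using only the nominal totals stated on this page (the function name active_fraction is illustrative):

```python
# (total parameters, active parameters per token), in billions, as stated above.
MOE_MODELS = {
    "Qwen3-VL-30B-A3B": (30, 3),
    "Qwen3-VL-235B-A22B": (235, 22),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of parameters used per token (hypothetical helper)."""
    return active_b / total_b


if __name__ == "__main__":
    for name, (total, active) in MOE_MODELS.items():
        frac = active_fraction(total, active)
        print(f"{name}: {active}B of {total}B active ({frac:.1%})")
```

Both models activate roughly a tenth of their weights per token: 10.0% for the 30B model and about 9.4% for the 235B model. Per-token compute therefore scales with the active count, while memory still has to hold the full parameter set.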

Collections

HuggingFace Collection

All Qwen3-VL models including FP8 quantized versions:

Qwen3-VL Collection

ModelScope Collection

All Qwen3-VL models for users in mainland China:

Qwen3-VL Collection

Quantized Models

FP8 Versions

FP8 quantized models are available for all major model sizes, optimized for deployment on NVIDIA H100 or newer GPUs with CUDA 12 or later. Find all FP8 versions in the HuggingFace Qwen3-VL Collection.
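A rough weights-only estimate shows why FP8 helps with deployment: FP8 stores one byte per parameter, while a 16-bit format such as BF16 stores two. The sketch below assumes the nominal parameter counts from the model names and ignores activations, KV cache, and any layers a given checkpoint keeps in higher precision. The helper weight_gb is hypothetical.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_gb(params_billion: float, dtype: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billion * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    for dtype in ("bf16", "fp8"):
        print(f"Qwen3-VL-235B-A22B weights in {dtype}: ~{weight_gb(235, dtype)} GB")
```

Under these assumptions the 235B model needs about 470 GB for 16-bit weights and about 235 GB in FP8, which halves the number of GPUs required just to hold the weights.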

Legacy Models

Qwen2.5-VL Series

Qwen2.5-VL-32B-Instruct

HuggingFace: Qwen/Qwen2.5-VL-32B-Instruct
Released: March 25, 2025

Other Qwen2.5-VL models: 3B, 7B, 72B

Collection: Qwen2.5-VL
Released: January 28, 2025

AWQ Quantized Versions: Available for 3B, 7B, and 72B models

Qwen2-VL Series

Qwen2-VL-72B-Instruct

HuggingFace: Qwen/Qwen2-VL-72B-Instruct
Quantized: AWQ, GPTQ-Int4, GPTQ-Int8
Released: September 19, 2024

Other sizes: 2B, 7B

Released: August 30, 2024

QVQ-72B-Preview

Experimental research model focusing on visual reasoning:

HuggingFace: Qwen/QVQ-72B-Preview
Released: December 25, 2024

Model Selection Guide

By Use Case

Edge Deployment: Qwen3-VL-2B (Instruct/Thinking)

Smallest footprint, suitable for mobile and edge devices

Balanced Performance: Qwen3-VL-4B or Qwen3-VL-8B

Good balance between performance and resource requirements

High Performance: Qwen3-VL-32B or Qwen3-VL-30B-A3B

Strong performance for demanding applications

Maximum Capability: Qwen3-VL-235B-A22B

State-of-the-art vision-language understanding
Best for research and high-end applications
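The tiers above can be captured in a small lookup table. This is only a sketch of the guide as written: the tier keys and the candidates helper are hypothetical names, and the repo IDs follow the Qwen/<model>-<edition> pattern used throughout this page.

```python
# Use-case tiers mapped to the models recommended in the guide above.
USE_CASE_MODELS = {
    "edge": ["Qwen3-VL-2B"],
    "balanced": ["Qwen3-VL-4B", "Qwen3-VL-8B"],
    "high_performance": ["Qwen3-VL-32B", "Qwen3-VL-30B-A3B"],
    "maximum": ["Qwen3-VL-235B-A22B"],
}

EDITIONS = ("Instruct", "Thinking")


def candidates(use_case: str, edition: str = "Instruct") -> list[str]:
    """Return HuggingFace repo IDs suggested for a use-case tier."""
    if edition not in EDITIONS:
        raise ValueError(f"unknown edition: {edition!r}")
    if use_case not in USE_CASE_MODELS:
        raise ValueError(f"unknown use case: {use_case!r}")
    return [f"Qwen/{model}-{edition}" for model in USE_CASE_MODELS[use_case]]


if __name__ == "__main__":
    print(candidates("balanced"))
    print(candidates("maximum", edition="Thinking"))
```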

Instruct vs Thinking

Instruct Editions:

Optimized for following user instructions
Better for general-purpose applications
More aligned with human preferences

Thinking Editions:

Enhanced reasoning capabilities
Better for complex problem-solving
Excels in STEM and mathematical tasks