Available Models
Qwen3-VL is available in multiple sizes and architectures to suit different deployment scenarios, from edge devices to cloud infrastructure.
Model Architectures
- Dense Models: Traditional transformer architecture
- MoE (Mixture of Experts): Sparse architecture for efficient scaling
- Instruct Editions: Fine-tuned for instruction following
- Thinking Editions: Enhanced with reasoning capabilities
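The model identifiers listed on this page follow a regular pattern: the `Qwen/Qwen3-VL-` prefix, a size tag, and an edition suffix. The sketch below builds a HuggingFace repository ID from those two parts. The `repo_id` helper and the `KNOWN_SIZES` table are hypothetical names introduced for illustration, and the table covers only the sizes named on this page.

```python
# Sizes and architectures as listed on this page; not an exhaustive registry.
KNOWN_SIZES = {
    "2B": "dense",
    "4B": "dense",
    "8B": "dense",
    "32B": "dense",
    "30B-A3B": "moe",
    "235B-A22B": "moe",
}
EDITIONS = ("Instruct", "Thinking")


def repo_id(size: str, edition: str) -> str:
    """Build a HuggingFace repo ID such as 'Qwen/Qwen3-VL-8B-Instruct'."""
    if size not in KNOWN_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(KNOWN_SIZES)}")
    if edition not in EDITIONS:
        raise ValueError(f"unknown edition {edition!r}; expected one of {EDITIONS}")
    return f"Qwen/Qwen3-VL-{size}-{edition}"


print(repo_id("8B", "Instruct"))        # Qwen/Qwen3-VL-8B-Instruct
print(KNOWN_SIZES["30B-A3B"])           # moe
```

Validating against a fixed table keeps a typo in the size or edition from turning into a request for a repository that does not exist.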
Dense Models
Qwen3-VL-2B
Instruct Edition
- HuggingFace: Qwen/Qwen3-VL-2B-Instruct
- ModelScope: Available in Qwen3-VL Collection
- Released: October 21, 2025
Thinking Edition
- HuggingFace: Qwen/Qwen3-VL-2B-Thinking
- ModelScope: Available in Qwen3-VL Collection
- Released: October 21, 2025
Qwen3-VL-4B
Instruct Edition
- HuggingFace: Qwen/Qwen3-VL-4B-Instruct
- ModelScope: Available in Qwen3-VL Collection
- Released: October 15, 2025
Thinking Edition
- HuggingFace: Qwen/Qwen3-VL-4B-Thinking
- ModelScope: Available in Qwen3-VL Collection
- Released: October 15, 2025
Qwen3-VL-8B
Instruct Edition
- HuggingFace: Qwen/Qwen3-VL-8B-Instruct
- ModelScope: Available in Qwen3-VL Collection
- Released: October 15, 2025
Thinking Edition
- HuggingFace: Qwen/Qwen3-VL-8B-Thinking
- ModelScope: Available in Qwen3-VL Collection
- Released: October 15, 2025
Qwen3-VL-32B
Instruct Edition
- HuggingFace: Qwen/Qwen3-VL-32B-Instruct
- ModelScope: Available in Qwen3-VL Collection
- Released: October 21, 2025
Thinking Edition
- HuggingFace: Qwen/Qwen3-VL-32B-Thinking
- ModelScope: Available in Qwen3-VL Collection
- Released: October 21, 2025
MoE Models
Qwen3-VL-30B-A3B
Architecture: 30B total parameters, 3B active per token
Instruct Edition
- HuggingFace: Qwen/Qwen3-VL-30B-A3B-Instruct
- ModelScope: Available in Qwen3-VL Collection
- Released: October 4, 2025
Thinking Edition
- HuggingFace: Qwen/Qwen3-VL-30B-A3B-Thinking
- ModelScope: Available in Qwen3-VL Collection
- Released: October 4, 2025
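To make the "total versus active" figures concrete, the sketch below computes the share of parameters used per token for the two MoE models on this page. It uses the nominal, rounded parameter counts quoted here, so the results are approximate. The `active_fraction` helper is a hypothetical name introduced for illustration.

```python
# Nominal (rounded) parameter counts as quoted on this page: (total, active per token).
MOE_MODELS = {
    "Qwen3-VL-30B-A3B": (30e9, 3e9),
    "Qwen3-VL-235B-A22B": (235e9, 22e9),
}


def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters that participate in each token's forward pass."""
    return active_params / total_params


for name, (total, active) in MOE_MODELS.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active per token")
# Qwen3-VL-30B-A3B: 10.0% of parameters active per token
# Qwen3-VL-235B-A22B: 9.4% of parameters active per token
```

The active fraction mainly affects compute per token. The full set of weights typically still has to be stored, so memory requirements follow the total parameter count.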
Qwen3-VL-235B-A22B
Architecture: 235B total parameters, 22B active per token
Instruct Edition
- HuggingFace: Qwen/Qwen3-VL-235B-A22B-Instruct
- ModelScope: Available in Qwen3-VL Collection
- Released: September 23, 2025
Thinking Edition
- HuggingFace: Qwen/Qwen3-VL-235B-A22B-Thinking
- ModelScope: Available in Qwen3-VL Collection
- Released: September 23, 2025
FP8 Version
- HuggingFace: Qwen/Qwen3-VL-235B-A22B-Instruct-FP8
- For efficient deployment on H100/H200 GPUs
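As a rough illustration of why an FP8 checkpoint eases deployment, the sketch below estimates the memory needed for the weights alone at one byte per parameter (FP8) versus two bytes per parameter. The two-byte BF16 baseline is an assumption, since this page does not state the precision of the unquantized checkpoints. The estimate ignores the KV cache, activations, and any quantization scale factors, and the `weight_memory_gb` helper is a hypothetical name.

```python
# Bytes per parameter for each storage format.
# The BF16 baseline is an assumption; this page does not state the unquantized precision.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


total_params = 235e9  # nominal total for Qwen3-VL-235B-A22B

print(weight_memory_gb(total_params, "bf16"))  # 470.0
print(weight_memory_gb(total_params, "fp8"))   # 235.0
```

Under these assumptions the weight footprint halves, which reduces the number of GPUs needed just to hold the model.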
Collections
HuggingFace Collection
All Qwen3-VL models, including FP8 quantized versions.
ModelScope Collection
All Qwen3-VL models, for users in mainland China.
Quantized Models
FP8 Versions
FP8 quantized models are available for all major model sizes, optimized for deployment on NVIDIA H100 or newer GPUs with CUDA 12 or later. Find all FP8 versions in the HuggingFace collection.
Legacy Models
Qwen2.5-VL Series
Qwen2.5-VL-32B-Instruct
- HuggingFace: Qwen/Qwen2.5-VL-32B-Instruct
- Released: March 25, 2025
- Collection: Qwen2.5-VL
- Released: January 28, 2025
Qwen2-VL Series
Qwen2-VL-72B-Instruct
- HuggingFace: Qwen/Qwen2-VL-72B-Instruct
- Quantized: AWQ, GPTQ-Int4, GPTQ-Int8
- Released: September 19, 2024
- Released: August 30, 2024
QVQ-72B-Preview
Experimental research model focused on visual reasoning:
- HuggingFace: Qwen/QVQ-72B-Preview
- Released: December 25, 2024
Model Selection Guide
By Use Case
Edge Deployment: Qwen3-VL-2B (Instruct/Thinking)
- Smallest footprint, suitable for mobile and edge devices
- Good balance between performance and resource requirements
- Strong performance for demanding applications
- State-of-the-art vision-language understanding
- Best for research and high-end applications
Instruct vs Thinking
Instruct Editions:
- Optimized for following user instructions
- Better for general-purpose applications
- More aligned with human preferences
Thinking Editions:
- Enhanced reasoning capabilities
- Better for complex problem-solving
- Excels in STEM and mathematical tasks
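The guidance above can be expressed as a simple rule of thumb: reasoning-heavy tasks map to the Thinking edition, everything else to Instruct. The sketch below is only an illustration. The task labels and the `pick_edition` helper are hypothetical, and a real choice should also weigh latency and cost.

```python
# Hypothetical task labels for work where the Thinking edition is described as stronger.
REASONING_TASKS = {"math", "stem", "complex_problem_solving"}


def pick_edition(task: str) -> str:
    """Return 'Thinking' for reasoning-heavy tasks and 'Instruct' otherwise."""
    return "Thinking" if task.lower() in REASONING_TASKS else "Instruct"


print(pick_edition("math"))        # Thinking
print(pick_edition("captioning"))  # Instruct
```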