2025

November 27, 2025 - Technical Paper Release

Qwen3-VL Technical Paper Published

We released the Qwen3-VL technical paper, providing comprehensive technical details about the model architecture, training methodology, and evaluation results.

Key Topics Covered:
  • Interleaved-MRoPE architecture
  • DeepStack multi-level feature fusion
  • Text-Timestamp alignment for videos
  • Training data and methodology
  • Comprehensive benchmark evaluations
  • Ablation studies

October 21, 2025 - Qwen3-VL 2B & 32B Models

New Model Releases

Released four new models expanding the Qwen3-VL family: the 2B and 32B sizes, each in Instruct and Thinking editions.

Highlights:
  • 2B models suitable for edge deployment and consumer GPUs
  • 32B models provide high performance with reasonable resource requirements
  • Both Instruct and Thinking editions available
  • Available on both Hugging Face and ModelScope

October 15, 2025 - Qwen3-VL 4B & 8B Models

New Model Releases

Released four new models in the mid-size range, covering the 4B and 8B sizes.

Highlights:
  • Balanced performance-to-resource ratio
  • 4B suitable for RTX 3090/4090 GPUs
  • 8B provides strong capabilities for professional use
  • Both editions support full feature set

October 4, 2025 - MoE Models & FP8 Quantization

Qwen3-VL-30B-A3B Release

Released the Qwen3-VL-30B-A3B Mixture-of-Experts (MoE) models.

FP8 Quantized Models

Released FP8 quantized versions of all Qwen3-VL models for efficient deployment on H100/H200 GPUs.

September 23, 2025 - Qwen3-VL 235B Launch

Qwen3-VL-235B-A22B Release

Released the flagship Qwen3-VL models.

Major Features:
  • State-of-the-art vision-language performance
  • 235B parameters, 22B active (MoE)
  • Native 256K context, expandable to 1M
  • Advanced capabilities:
    • Visual agent (PC/mobile GUI interaction)
    • Visual coding (Draw.io, HTML/CSS/JS)
    • 3D grounding and spatial reasoning
    • Enhanced video understanding
    • 32-language OCR
    • Superior multimodal reasoning
Blog: Official announcement
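
As a rough illustration of the "235B parameters, 22B active (MoE)" figure above, the share of parameters used per token can be worked out directly. This is only a back-of-the-envelope sketch: `active_fraction` is a hypothetical helper written for this example, and the sizes are the rounded values quoted in these notes, not exact parameter counts.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters active per token.

    Both arguments are parameter counts in billions (rounded figures).
    """
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


# Rounded figures from the release notes: 235B total, 22B active.
flagship = active_fraction(235, 22)
print(f"{flagship:.1%}")  # prints 9.4%
```

Under these rounded numbers, each token activates a little under a tenth of the model's total parameters.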

April 8, 2025 - Fine-tuning Code Release

Fine-tuning Support

Released fine-tuning code for Qwen2-VL and Qwen2.5-VL, compatible with Qwen3-VL.

March 25, 2025 - Qwen2.5-VL-32B

Qwen2.5-VL-32B-Instruct Release

Released Qwen2.5-VL-32B-Instruct.

Improvements:
  • Smarter responses
  • Better human preference alignment
  • Enhanced reasoning capabilities
Blog: Announcement

February 20, 2025 - Qwen2.5-VL Technical Report

Technical Report & AWQ Models

Released the Qwen2.5-VL Technical Report along with AWQ-quantized models.

January 28, 2025 - Qwen2.5-VL Series

Qwen2.5-VL Family Release

Released the complete Qwen2.5-VL series on Hugging Face.

Models:
  • Qwen2.5-VL-3B-Instruct
  • Qwen2.5-VL-7B-Instruct
  • Qwen2.5-VL-72B-Instruct
Blog: Announcement

2024

December 25, 2024 - QvQ-72B-Preview

Visual Reasoning Research Model

Released QvQ-72B-Preview, an experimental model focusing on enhanced visual reasoning.

Features:
  • 72B parameters
  • Advanced visual reasoning capabilities
  • Research preview for community feedback
Blog: QvQ announcement

September 19, 2024 - Qwen2-VL-72B

Large Model & Quantization

Released Qwen2-VL-72B-Instruct with multiple quantized versions.

Qwen2-VL Paper

Published the Qwen2-VL technical paper.

August 30, 2024 - Qwen2-VL Series Launch

Qwen2-VL Family Release

Released the Qwen2-VL series:
  • Qwen2-VL-2B-Instruct
  • Qwen2-VL-7B-Instruct
  • (72B announced, released later)
Features:
  • Multi-resolution image understanding
  • Video support
  • Enhanced OCR
  • Improved multimodal reasoning
Collection: Qwen2-VL
Blog: Qwen2-VL announcement

Release Timeline Summary

| Date | Release | Highlights |
|---|---|---|
| 2025-11-27 | Technical Paper | Architectural details and evaluations |
| 2025-10-21 | 2B & 32B Models | Edge to high-performance range |
| 2025-10-15 | 4B & 8B Models | Mid-range balanced performance |
| 2025-10-04 | 30B-A3B MoE & FP8 | Efficient MoE, quantized models |
| 2025-09-23 | 235B-A22B Flagship | State-of-the-art VLM |
| 2025-04-08 | Fine-tuning Code | Custom training support |
| 2025-03-25 | Qwen2.5-VL-32B | Improved alignment |
| 2025-02-20 | Qwen2.5-VL Report & AWQ | Technical documentation |
| 2025-01-28 | Qwen2.5-VL Series | Complete model family |
| 2024-12-25 | QvQ-72B Preview | Visual reasoning research |
| 2024-09-19 | Qwen2-VL-72B & Paper | Large model & documentation |
| 2024-08-30 | Qwen2-VL Launch | Series introduction |

Migration Guides

From Qwen2.5-VL to Qwen3-VL

Key Changes:
  1. Patch size: 14 → 16
    # Update qwen-vl-utils calls to pass the new patch size
    from qwen_vl_utils import process_vision_info
    process_vision_info(messages, image_patch_size=16)  # was 14
    
  2. Video metadata: New return format
    images, videos, video_kwargs = process_vision_info(
        messages,
        image_patch_size=16,
        return_video_kwargs=True,
        return_video_metadata=True  # New for Qwen3-VL
    )
    
  3. Architecture improvements: Interleaved-MRoPE, DeepStack, Text-Timestamp alignment
  4. New capabilities:
    • 3D grounding
    • Enhanced visual coding
    • Better spatial reasoning
    • Extended OCR (32 languages)
Compatibility: Most Qwen2.5-VL code works with minimal changes.
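
For code that has to serve both model families during a migration, the two changes above can be gathered into one place. The following is a minimal sketch, not part of qwen-vl-utils: `vision_info_kwargs` is a hypothetical helper, the substring check on the checkpoint name is an assumption about how model ids are written, and the keyword names are the ones shown in the snippets above.

```python
def vision_info_kwargs(model_id: str, with_video: bool = False) -> dict:
    """Build keyword arguments for process_vision_info (hypothetical helper).

    Patch sizes follow the migration notes: 14 for Qwen2.5-VL, 16 for Qwen3-VL.
    """
    name = model_id.lower()
    if "qwen3-vl" in name:
        is_qwen3 = True
    elif "qwen2.5-vl" in name:
        is_qwen3 = False
    else:
        # Other families are out of scope for this sketch.
        raise ValueError(f"unrecognized model family: {model_id}")

    kwargs = {"image_patch_size": 16 if is_qwen3 else 14}
    if with_video:
        kwargs["return_video_kwargs"] = True
        if is_qwen3:
            # The notes above describe video metadata as new for Qwen3-VL.
            kwargs["return_video_metadata"] = True
    return kwargs


# Example checkpoint names used for illustration only.
print(vision_info_kwargs("Qwen2.5-VL-7B-Instruct"))
print(vision_info_kwargs("Qwen3-VL-8B-Instruct", with_video=True))
```

The result would then be passed along as `process_vision_info(messages, **vision_info_kwargs(model_id, with_video=True))`, so the patch size and video flags change together when the checkpoint changes.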

From Qwen2-VL to Qwen3-VL

Major Updates:
  • Significantly improved performance across all tasks
  • Native long context (256K vs 32K)
  • Video understanding enhancements
  • New architectures (MoE, Thinking editions)
  • Broader model size range (2B-235B)

Future Roadmap

The following items are planned but subject to change:
  • Additional model sizes and variants
  • Enhanced fine-tuning tools and examples
  • More cookbook examples and tutorials
  • Extended language support for OCR
  • Performance optimizations
  • Community-contributed extensions

Citation

If you use Qwen3-VL in your research, please cite:
@article{Qwen3-VL,
  title={Qwen3-VL Technical Report},
  author={Qwen Team},
  journal={arXiv preprint arXiv:2511.21631},
  year={2025}
}