Qwen

Recommended Models:

Qwen 2.5

The recent Qwen2.5 release has pushed open source large language models (LLMs) to new heights, beating the previous open source leader Llama 3.1 across a number of benchmarks.

The Qwen-2.5-7B-Q4 model is now available by default (alongside the Llama-3.1-8B-Q4 model) on most public Compute Assets, e.g. model.aunsw.88.io

For Compute Assets whose GPUs have 16GB+ of VRAM, running the Qwen2.5-14B-Q8 model is recommended.

Licenses

Be careful: the 3B and 72B variants of Qwen 2.5 have some restrictions on commercial use. The other variants are licensed under Apache 2.0, which permits most types of use.

Limited Resources

For those with low-end GPUs:

QwQ

Notes:

  • Context windows larger than 8K tokens require a special YaRN setup; see the ModelScope (魔搭社区) documentation.
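The Qwen model cards describe enabling YaRN by adding a rope_scaling block to the model's config.json. A minimal sketch of that patch (the values shown are illustrative; scale the factor to your target context length):

```python
# Illustrative YaRN rope_scaling patch for a Qwen config.json.
# The factor and original_max_position_embeddings values are examples only;
# consult the model card for the correct numbers for your model.
import json

config_patch = {
    "rope_scaling": {
        "type": "yarn",
        "factor": 4.0,  # e.g. extend to 4x the native context length
        "original_max_position_embeddings": 32768,  # model's native window
    }
}
print(json.dumps(config_patch, indent=2))
```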

Recommended:

  • num_ctx of 8192 - context length
  • top_k of 30 - samples from the 30 most likely tokens
  • temperature of 0.6
  • top_p of 0.95 - nucleus sampling over tokens covering the top 95% of cumulative probability (higher is more diverse, lower more focused)
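The settings above can be passed in the "options" field of an Ollama API request. A minimal sketch (the model name and prompt are placeholders, and the request is only constructed here, not sent):

```python
# Sketch: the recommended sampling settings as an Ollama API payload.
# Sending it requires a local Ollama server; here we only build and print it.
import json

options = {
    "num_ctx": 8192,      # context length
    "top_k": 30,          # sample from the 30 most likely tokens
    "temperature": 0.6,
    "top_p": 0.95,        # nucleus sampling: top 95% cumulative probability
}

payload = {"model": "qwq", "prompt": "Why is the sky blue?", "options": options}
print(json.dumps(payload, indent=2))
```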

Qwen 3

Qwen 3 is a hybrid reasoning model, so it can operate in either /think or /no_think mode. Qwen 3 also ships both dense models and MoE (mixture-of-experts) models.
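The mode switch works by appending the tag to the end of a user message. A minimal sketch of that convention (the helper name is ours, not part of any Qwen API):

```python
# Sketch of Qwen 3's soft switch: appending /think or /no_think to a user
# message toggles reasoning mode. The helper below is illustrative only.
def with_mode(prompt: str, think: bool) -> str:
    """Append the mode tag Qwen 3 recognises to the end of a user message."""
    return f"{prompt} {'/think' if think else '/no_think'}"

print(with_mode("Summarise this report in three bullet points.", think=False))
```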

qwen3:32b-q4_K_M

32b is the largest dense model available.
~41GB; runs 79% on GPU, 21% on CPU.

qwen3:14b-q4_K_M

14b is the recommended dense model for most use cases.
~15GB; 100% GPU.

qwen3:30b-a3b-q4_K_M

30b is a small MoE model.
~27GB; 100% GPU.