Qwen
Recommended Models:
- qwq-32b (20GB) - this is the default reasoning model for the 25.04 release
- qwen2.5-14b (9GB)
The recent Qwen2.5 release has pushed open source large language models (LLMs) to new heights, beating the previous open source leader Llama 3.1 across a number of benchmarks.
The Qwen2.5-7B-Q4 model is now available by default (alongside the Llama-3.1-8B-Q4 model) on most public Compute Assets, e.g. model.aunsw.88.io
For Compute Assets whose GPUs have 16GB+ of VRAM, running the Qwen2.5-14B-Q8 model is recommended.
Be aware that the 3B and 72B variants of Qwen2.5 have restrictions on commercial use; the other variants are licensed under Apache 2.0, which permits most types of use.
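The quantization suffix (Q4, Q8) largely determines how much VRAM a model needs: a rough rule of thumb (an approximation only, not an exact formula) is parameters × bits-per-weight ÷ 8 bytes for the weights, with the KV cache and runtime overhead on top. A minimal sketch of that estimate, with a hypothetical helper name:

```python
# Rough rule of thumb for quantized LLM weight size (an approximation; real
# files also include higher-precision layers, the KV cache, and overhead).
def approx_weight_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate size of the weights alone, in GB."""
    return params_billions * bits_per_weight / 8

print(approx_weight_gb(7, 4))   # 3.5  -> Qwen2.5-7B-Q4 weights
print(approx_weight_gb(14, 8))  # 14.0 -> why 14B-Q8 wants a 16GB+ GPU
```

This is why the Q8 variant of a 14B model needs roughly double the memory of a Q4 variant of the same model.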
For those with low-end GPUs:
Notes:
Recommended:
Qwen 3 is a hybrid model: it can operate in either /think or /no_think mode. Qwen 3 also comes in both dense and MoE (mixture of experts) variants.
- 32b is the top dense model available. 41GB; 79% GPU / 21% CPU.
- 14b is the recommended dense model for most use cases. 15GB; 100% GPU.
- 30b is a small MoE model. 27GB; 100% GPU.
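The /think and /no_think modes above are toggled per message by appending the soft switch to the prompt. A minimal sketch of building such a prompt (the chat call itself is omitted, and `build_prompt` is a hypothetical helper, not part of any Qwen API):

```python
# Sketch: toggle Qwen 3's hybrid reasoning mode per turn by appending the
# /think or /no_think soft switch to the user's message. Only the prompt
# string is constructed here; sending it to a model server is not shown.
def build_prompt(text: str, thinking: bool) -> str:
    """Append Qwen 3's soft switch so the model reasons (or not) this turn."""
    switch = "/think" if thinking else "/no_think"
    return f"{text} {switch}"

print(build_prompt("Summarise this log file.", thinking=False))
# Summarise this log file. /no_think
```

Disabling thinking trades reasoning depth for faster, cheaper responses, which suits simple queries on the smaller dense models.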