2025
Team Compute Model Set provides a recommended setup for entities developing new AI deployments.
16GB VRAM
As of early 2025, the Nvidia 4060 Ti 16GB (Compute Capability 8.9) offers the best VRAM size per dollar, so we use it as the reference when creating Model Sets. Any other device with AI-accelerated instructions and at least 16GB of fast RAM (e.g. from AMD or Apple) can also be used.
16GB Set 1
Set 1 consists of stable, proven models.
Due to Entity Agent's current RAG focus, Model Set 1 includes 2 embedding models.
root@ollama0-aunsw0:~# ollama ps
NAME                                   ID              SIZE      PROCESSOR    UNTIL
bge-m3:567m                            790764642607    1.7 GB    100% GPU     Forever
nomic-embed-text:137m-v1.5-fp16        0a109f422b47    849 MB    100% GPU     Forever
llama3.2-vision:11b-instruct-q4_K_M    085a1fdae525    12 GB     100% GPU     Forever
root@ollama0-aunsw0:~# nvidia-smi
Wed May  7 17:24:36 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:01:00.0 Off |                  N/A |
|  0%   47C    P2             24W /  165W |   13282MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A    342738      C   /usr/bin/ollama                                 0MiB |
|    0   N/A  N/A    342777      C   /usr/bin/ollama                                 0MiB |
|    0   N/A  N/A    342833      C   /usr/bin/ollama                                 0MiB |
+-----------------------------------------------------------------------------------------+
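The listings above imply a simple budget check: the sizes reported by `ollama ps` should sum to less than the card's VRAM, with some room left over. The following is a minimal sketch of that arithmetic, not part of the Model Set tooling. The helper names `parse_ollama_ps` and `fits`, the 1 GB headroom, and the decimal unit conversion are all assumptions made for illustration.

```python
import re

# Assumption: `ollama ps` prints decimal units, so 1 GB = 1000 MB.
_UNIT_TO_GB = {"MB": 0.001, "GB": 1.0}


def parse_ollama_ps(text):
    """Return {model_name: size_in_GB} parsed from `ollama ps` output."""
    sizes = {}
    for line in text.splitlines():
        # Expect NAME, a hex ID, then SIZE written as "<number> <unit>".
        # The header row and shell prompt do not match and are skipped.
        m = re.match(r"^(\S+)\s+[0-9a-f]+\s+([\d.]+)\s+(MB|GB)\b", line.strip())
        if m:
            name, value, unit = m.groups()
            sizes[name] = float(value) * _UNIT_TO_GB[unit]
    return sizes


def fits(sizes, vram_gb, headroom_gb=1.0):
    """Return (total_gb, ok); ok is True if the models plus headroom fit."""
    total = sum(sizes.values())
    return total, total + headroom_gb <= vram_gb


# Model Set 1 as listed by `ollama ps` on the reference host.
SET_1 = """\
NAME ID SIZE PROCESSOR UNTIL
bge-m3:567m 790764642607 1.7 GB 100% GPU Forever
nomic-embed-text:137m-v1.5-fp16 0a109f422b47 849 MB 100% GPU Forever
llama3.2-vision:11b-instruct-q4_K_M 085a1fdae525 12 GB 100% GPU Forever
"""

sizes = parse_ollama_ps(SET_1)
total_gb, ok = fits(sizes, vram_gb=16.0)
print(f"{total_gb:.2f} GB of 16 GB, fits with headroom: {ok}")
```

The sizes in `ollama ps` are rounded, so this total is only an estimate. The `nvidia-smi` output above reports 13282MiB actually in use out of 16380MiB, which is the more reliable figure when the models are already loaded.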
16GB Set 2
Set 2 adds a reasoning model (qwen3:14b-q4_K_M) for cases where Entity Agent can take advantage of reasoning. It keeps the same 2 embedding models as Model Set 1.
root@ollama1-aunsw0:/# ollama ps
NAME                                   ID              SIZE      PROCESSOR    UNTIL
nomic-embed-text:137m-v1.5-fp16        0a109f422b47    849 MB    100% GPU     Forever
bge-m3:567m                            790764642607    1.7 GB    100% GPU     Forever
qwen3:14b-q4_K_M                       7d7da67570e2    14 GB     100% GPU     Forever
llama3.2-vision:11b-instruct-q4_K_M    085a1fdae525    12 GB     100% GPU     Forever
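In both listings the UNTIL column reads Forever, meaning the models stay resident instead of being unloaded after an idle timeout. This page does not say how that was configured. One way to get the same effect is to send each model a request with `keep_alive` set to `-1` through the Ollama HTTP API, sketched below for Model Set 1. The base URL, the helper names `preload_requests` and `send`, and the warm-up input are assumptions for illustration; setting the `OLLAMA_KEEP_ALIVE` environment variable on the server is an alternative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default listen address

# Model Set 1, split by how each model is called.
EMBEDDING_MODELS = ["bge-m3:567m", "nomic-embed-text:137m-v1.5-fp16"]
CHAT_MODELS = ["llama3.2-vision:11b-instruct-q4_K_M"]


def preload_requests(embedding_models, chat_models):
    """Build (endpoint, payload) pairs that load each model and keep it resident."""
    requests = []
    for model in embedding_models:
        # Embedding models are warmed up with a throwaway input.
        requests.append(
            ("/api/embed", {"model": model, "input": "warm-up", "keep_alive": -1})
        )
    for model in chat_models:
        # A generate request without a prompt only loads the model.
        requests.append(("/api/generate", {"model": model, "keep_alive": -1}))
    return requests


def send(endpoint, payload, base_url=OLLAMA_URL):
    """POST one payload to a running Ollama server and return the HTTP status."""
    req = urllib.request.Request(
        base_url + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# Print the requests only; nothing is sent until send() is called.
for endpoint, payload in preload_requests(EMBEDDING_MODELS, CHAT_MODELS):
    print(endpoint, json.dumps(payload))
```

To apply it against a live server, call `send(endpoint, payload)` for each pair and then confirm with `ollama ps` that UNTIL shows Forever. Keeping several models loaded at once may also depend on server settings such as `OLLAMA_MAX_LOADED_MODELS`.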