Model Set

2025

Team Compute Model Set provides a recommended setup for entities developing new AI deployments.

16GB VRAM

As of early 2025, the best VRAM size per dollar comes from the Nvidia 4060 Ti 16GB (Compute Capability 8.9), so we use it as the reference when creating Model Sets. Any device with AI-accelerated instructions and at least 16GB of fast RAM (e.g. from AMD or Apple) can also be used.

16GB Set 1

Set 1 consists of stable, proven models.

Due to Entity Agent's current RAG focus, Model Set 1 includes 2 embedding models.

root@ollama0-aunsw0:~# ollama ps

NAME                                   ID              SIZE      PROCESSOR    UNTIL   
bge-m3:567m                            790764642607    1.7 GB    100% GPU     Forever    
nomic-embed-text:137m-v1.5-fp16        0a109f422b47    849 MB    100% GPU     Forever    
llama3.2-vision:11b-instruct-q4_K_M    085a1fdae525    12 GB     100% GPU     Forever
root@ollama0-aunsw0:~# nvidia-smi
 
Wed May  7 17:24:36 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     Off |   00000000:01:00.0 Off |                  N/A |
|  0%   47C    P2             24W /  165W |   13282MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A    342738      C   /usr/bin/ollama                                 0MiB |
|    0   N/A  N/A    342777      C   /usr/bin/ollama                                 0MiB |
|    0   N/A  N/A    342833      C   /usr/bin/ollama                                 0MiB |
+-----------------------------------------------------------------------------------------+
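
The Set 1 listing can be cross-checked against the 16GB budget with a little arithmetic. The following is a minimal sketch, not part of the Model Set tooling: the helper names (size_to_gb, total_loaded_gb) and the nominal 16.0 GB budget are assumptions made for illustration, and it assumes the SIZE column uses decimal MB/GB. The sizes are the ones Ollama reports, so they may differ from the Memory-Usage figure shown by nvidia-smi.

```python
import re

# Assumption: nominal budget of the reference card, in decimal gigabytes.
VRAM_BUDGET_GB = 16.0

# Assumption: `ollama ps` reports sizes in decimal MB or GB.
_UNIT_TO_GB = {"MB": 1 / 1000, "GB": 1.0}

# `ollama ps` output for Set 1, copied from the listing above.
SET_1_PS = """\
NAME                                   ID              SIZE      PROCESSOR    UNTIL
bge-m3:567m                            790764642607    1.7 GB    100% GPU     Forever
nomic-embed-text:137m-v1.5-fp16        0a109f422b47    849 MB    100% GPU     Forever
llama3.2-vision:11b-instruct-q4_K_M    085a1fdae525    12 GB     100% GPU     Forever
"""


def size_to_gb(value: str, unit: str) -> float:
    """Convert a SIZE cell such as ("849", "MB") to gigabytes."""
    return float(value) * _UNIT_TO_GB[unit]


def total_loaded_gb(ps_output: str) -> float:
    """Sum the SIZE column of an `ollama ps` listing (hypothetical helper)."""
    total = 0.0
    for line in ps_output.splitlines():
        # The SIZE cell is the only "<number> MB|GB" token on each row.
        match = re.search(r"(\d+(?:\.\d+)?)\s+(MB|GB)\b", line)
        if match:
            total += size_to_gb(match.group(1), match.group(2))
    return total


total = total_loaded_gb(SET_1_PS)
headroom = VRAM_BUDGET_GB - total
print(f"loaded: {total:.1f} GB, headroom: {headroom:.1f} GB")
```

With the Set 1 figures this sums 1.7 GB + 0.849 GB + 12 GB, which leaves roughly 1.5 GB of the nominal 16GB for context and runtime overhead.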

16GB Set 2

Set 2 adds a reasoning model (qwen3:14b) so that Entity Agent can take advantage of reasoning. It keeps the same 2 embedding models as Set 1.

root@ollama1-aunsw0:/# ollama ps
NAME                                   ID              SIZE      PROCESSOR    UNTIL   
nomic-embed-text:137m-v1.5-fp16        0a109f422b47    849 MB    100% GPU     Forever    
bge-m3:567m                            790764642607    1.7 GB    100% GPU     Forever    
qwen3:14b-q4_K_M                       7d7da67570e2    14 GB     100% GPU     Forever    
llama3.2-vision:11b-instruct-q4_K_M    085a1fdae525    12 GB     100% GPU     Forever
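
The UNTIL column shows Forever for every model, which is what Ollama reports when a model is loaded with a negative keep_alive. The following is a minimal sketch of how a Model Set could be preloaded that way over Ollama's HTTP API; it is not necessarily how these hosts were configured. The base URL, the function names and the "warm-up" input string are assumptions for illustration. Embedding models are warmed through /api/embed, the others through /api/generate with no prompt.

```python
import json
import urllib.request

# Assumption: Ollama listening on its default local address.
OLLAMA_URL = "http://localhost:11434"

# Model names taken from the `ollama ps` listing above.
EMBEDDING_MODELS = ["nomic-embed-text:137m-v1.5-fp16", "bge-m3:567m"]
GENERATION_MODELS = ["qwen3:14b-q4_K_M", "llama3.2-vision:11b-instruct-q4_K_M"]


def build_preload_request(model, is_embedding, base_url=OLLAMA_URL):
    """Build (but do not send) a request that loads `model` and keeps it resident.

    keep_alive=-1 asks Ollama to keep the model loaded indefinitely.
    """
    if is_embedding:
        path = "/api/embed"
        body = {"model": model, "input": "warm-up", "keep_alive": -1}
    else:
        # A generate request without a prompt only loads the model.
        path = "/api/generate"
        body = {"model": model, "keep_alive": -1}
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def preload_model_set(base_url=OLLAMA_URL):
    """Send one preload request per model (hypothetical helper, needs a running server)."""
    requests = [build_preload_request(m, True, base_url) for m in EMBEDDING_MODELS]
    requests += [build_preload_request(m, False, base_url) for m in GENERATION_MODELS]
    for request in requests:
        with urllib.request.urlopen(request) as response:
            response.read()  # Drain the response; the content is not needed here.
```

Calling preload_model_set() against a running server and then running ollama ps should list the models with UNTIL set to Forever.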