Embedding Models

Embedding

In the AI context, "embeddings" are numerical representations of real-world objects that machine learning systems use to represent complex data and relationships.

Unlike traditional database "records", which represent only the "value" of the data, embeddings can also carry a lot of information about the "relationships" between data.

An embedding is just a series of numbers that represents a certain meaning, so if two "meanings" are different, their series of numbers will be different too.

For example, there can be TWO embeddings for the single word "play":

  1. one embedding for "play" as in "play with a toy"
  2. one embedding for "play" as in "going to a play"

The more numbers there are in the series (its dimension), the more information the embedding can hold about that "thing".

By comparing the embeddings of two things (their two series of numbers), we can tell how "similar" those things are to each other; that is, we can discover the relationship between them.
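
One common way to compare two series of numbers is cosine similarity, which measures the angle between them. The sketch below is only an illustration: the 4-dimensional vectors are hand-written toy values (not output from any real model), and the third phrase "play a game" is an extra example added here for contrast.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


# Toy, hand-written vectors (illustrative only, not real model output).
play_with_toy = [0.9, 0.1, 0.0, 0.2]    # "play" as in "play with a toy"
play_a_game = [0.8, 0.2, 0.1, 0.3]      # a related meaning (added example)
going_to_a_play = [0.1, 0.9, 0.7, 0.0]  # "play" as in "going to a play"

sim_close = cosine_similarity(play_with_toy, play_a_game)
sim_far = cosine_similarity(play_with_toy, going_to_a_play)
print(f"related meanings:   {sim_close:.3f}")
print(f"different meanings: {sim_far:.3f}")
```

With these toy values the related meanings score higher than the different meanings, which is the behaviour a real embedding model is trained to produce.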

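For vector databases to work together as a team, they must use the SAME embedding model, because embeddings produced by different models cannot be meaningfully compared with each other. It is therefore VERY important for Compute Teams that work together to agree on the embedding models they use.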
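
One way to enforce this rule in code is to store the model name alongside every vector and refuse to compare vectors that came from different models. The sketch below is hypothetical: the TaggedEmbedding type, the assert_comparable helper and the name "some-older-model" are made up for illustration and are not part of any existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedEmbedding:
    """An embedding plus the name of the model that produced it (hypothetical type)."""
    model: str
    vector: tuple[float, ...]


def assert_comparable(a: TaggedEmbedding, b: TaggedEmbedding) -> None:
    """Raise ValueError if two embeddings cannot be meaningfully compared."""
    if a.model != b.model:
        raise ValueError(f"model mismatch: {a.model!r} vs {b.model!r}")
    if len(a.vector) != len(b.vector):
        raise ValueError(f"dimension mismatch: {len(a.vector)} vs {len(b.vector)}")


# Toy vectors; "some-older-model" is a placeholder name.
team_a = TaggedEmbedding("Qwen3-VL-Embedding-8B", (0.1, 0.2, 0.3))
team_b = TaggedEmbedding("some-older-model", (0.1, 0.2, 0.3))

try:
    assert_comparable(team_a, team_b)
except ValueError as err:
    print(f"refusing to compare: {err}")
```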

Embedding Models

We don't normally use the large models (e.g. Llama) for generating embeddings, since they are slow and the embeddings they generate are very large. Instead, smaller, more specialised embedding models are used.

Multimodal Models

There used to be different types of embedding models for different data types, e.g. for images, text, audio, relationships, etc. More recently, Multimodal Embedding Models have started to work across multiple data types.

Search Models

Embedding and Reranking Models enable data to be stored and searched based on its meaning (semantics) instead of its words (syntax).
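
A typical way to combine the two kinds of model is a two-stage search: the embedding model finds the nearest candidates, then the reranking model reorders only those candidates. The sketch below assumes this arrangement, which the text above does not spell out. All data is made up: the 2-dimensional vectors are hand-written, and toy_rerank is a stand-in lookup table, not a real reranking model.

```python
import math
from typing import Callable, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def semantic_search(
    query_text: str,
    query_vec: Sequence[float],
    index: list[tuple[str, list[float]]],
    rerank: Callable[[str, str], float],
    top_k: int = 2,
) -> list[str]:
    # Stage 1 (embedding model): keep the top_k documents nearest to the query.
    nearest = sorted(index, key=lambda doc: cosine(query_vec, doc[1]), reverse=True)[:top_k]
    # Stage 2 (reranking model): reorder only those candidates.
    texts = [text for text, _ in nearest]
    return sorted(texts, key=lambda text: rerank(query_text, text), reverse=True)


# Toy index: (document text, hand-written embedding).
INDEX = [
    ("a child playing with blocks", [0.9, 0.1]),
    ("a night at the theatre", [0.1, 0.9]),
    ("kids and their toys", [0.8, 0.3]),
]

# Hand-assigned scores standing in for a real reranking model.
TOY_RERANK_SCORES = {"kids and their toys": 0.9, "a child playing with blocks": 0.6}


def toy_rerank(query: str, document: str) -> float:
    return TOY_RERANK_SCORES.get(document, 0.0)


results = semantic_search("play with a toy", [0.85, 0.2], INDEX, toy_rerank)
print(results)
```

Neither returned document shares a word with the query "play with a toy"; they are matched purely through the (toy) vectors, which is the point of searching by meaning rather than by words.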

Embedding Model

Qwen3-VL-Embedding-8B is the default embedding model for the 26.02 release.

Input Template for Embedding:

<|im_start|>system
Represent the user’s input.
<|im_end|>
<|im_start|>user
{Instance}
<|im_end|><|endoftext|>
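
As an illustration, the template above can be filled in with ordinary string formatting. This is a minimal sketch: the helper name build_embedding_input is made up, the line breaks are copied from the template as shown, and whether the serving stack applies the template itself (so that callers send only the raw text) is not stated here and is left as an open assumption.

```python
# Template copied line by line from the text above.
EMBEDDING_TEMPLATE = (
    "<|im_start|>system\n"
    "Represent the user’s input.\n"
    "<|im_end|>\n"
    "<|im_start|>user\n"
    "{Instance}\n"
    "<|im_end|><|endoftext|>"
)


def build_embedding_input(instance: str) -> str:
    """Substitute the text to embed into the {Instance} placeholder."""
    return EMBEDDING_TEMPLATE.format(Instance=instance)


prompt = build_embedding_input("going to a play")
print(prompt)
```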

Note that we are using the 8B model instead of the 2B model. To reduce resource usage, the embedding dimension can be adjusted; the recommended value is 4096.
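
How the dimension is adjusted is not described here. One common approach, for models trained to support it, is to keep only the first N values of the vector and rescale the result to unit length. The sketch below assumes that approach applies; the function name and the toy vector are made up for illustration.

```python
import math


def truncate_embedding(vector: list[float], dim: int) -> list[float]:
    """Keep the first `dim` values and rescale them to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and the full dimension")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in head]


# Toy 3-dimensional vector reduced to 2 dimensions (illustrative only).
full = [3.0, 4.0, 12.0]
small = truncate_embedding(full, 2)
print(small)
```

A smaller dimension reduces storage and comparison cost for every stored vector, at the price of keeping less information per embedding.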

Reranking Model

Input Template for Reranking:

<|im_start|>system
Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be "yes" or "no".
<|im_end|>
<|im_start|>user
<Instruct>: {Instruction}
<Query>: {Query}
<Document>: {Document}
<|im_end|>
<|im_start|>assistant
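
The reranking template can be filled in the same way as any other string template. In the sketch below, build_rerank_input and yes_probability are made-up helper names, and the trailing newline after the assistant marker is an assumption. The scoring step is also an assumption: because the template restricts the answer to "yes" or "no", one plausible way to get a relevance score is to compare the model's scores (logits) for those two answers, but the actual scoring method used is not stated here.

```python
import math

# Template copied line by line from the text above.
RERANK_TEMPLATE = (
    "<|im_start|>system\n"
    "Judge whether the Document meets the requirements based on the Query "
    'and the Instruct provided. Note that the answer can only be "yes" or "no".\n'
    "<|im_end|>\n"
    "<|im_start|>user\n"
    "<Instruct>: {Instruction}\n"
    "<Query>: {Query}\n"
    "<Document>: {Document}\n"
    "<|im_end|>\n"
    "<|im_start|>assistant\n"  # trailing newline is an assumption
)


def build_rerank_input(instruction: str, query: str, document: str) -> str:
    """Fill the three placeholders of the reranking template."""
    return RERANK_TEMPLATE.format(Instruction=instruction, Query=query, Document=document)


def yes_probability(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax over the "yes"/"no" logits (assumed scoring method)."""
    diff = yes_logit - no_logit
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    # Numerically stable form for large negative differences.
    e = math.exp(diff)
    return e / (1.0 + e)


# Made-up instruction, query and document, for illustration only.
prompt = build_rerank_input(
    instruction="Find documents that answer the question.",
    query="what is a play?",
    document="A play is a work of drama performed in a theatre.",
)
print(prompt)
```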

Older Models

Older embedding models are still supported but should no longer be used for new applications.