Embedding Models

Models that convert text, images, or audio into dense vectors for semantic search, similarity, and retrieval-augmented generation (RAG).
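As a minimal sketch of how such vectors are used for semantic search, the snippet below ranks a few documents by cosine similarity to a query. The three-dimensional vectors are made-up stand-ins for real model output (the models listed on this page produce 384 to 1536 dimensions), and `cosine_similarity` and `rank` are illustrative helper names, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def rank(query, docs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (hypothetical values).
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "tax law": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
results = rank(query, docs)
print(results[0][0])  # the document closest to the query
```

In practice the vectors would come from one of the embedding models, and a vector database would replace the linear scan once the collection grows.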

Text Embeddings

Small & Fast (Browser-Ready)

| Model | Dims | Size | Browser? | License | Link |
|---|---|---|---|---|---|
| all-MiniLM-L6-v2 | 384 | 23 MB | Yes | Apache-2.0 | HF |
| bge-small-en-v1.5 | 384 | 33 MB | Yes | MIT | HF |
| gte-small | 384 | 67 MB | Yes | MIT | HF |
| multilingual-e5-small | 384 | 118 MB | Yes | MIT | HF |

Large & High Quality

| Model | Dims | Size | Context (tokens) | License | Link |
|---|---|---|---|---|---|
| bge-large-en-v1.5 | 1024 | 335 MB | 512 | MIT | HF |
| e5-large-v2 | 1024 | 330 MB | 512 | MIT | HF |
| nomic-embed-text-v1.5 | 768 | 274 MB | 8192 | Apache-2.0 | HF |
| jina-embeddings-v2-base | 768 | 137 MB | 8192 | Apache-2.0 | HF |
| mxbai-embed-large | 1024 | 335 MB | 512 | Apache-2.0 | HF |
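
Several of the models above accept at most 512 tokens of context, so longer documents are commonly split into overlapping chunks that are embedded separately. The sketch below illustrates that idea under a loud assumption: it counts whitespace-separated words as a rough stand-in for tokens, whereas real subword tokenizers usually produce more tokens than words. `chunk_words` and its defaults are hypothetical, not taken from any of the listed models.

```python
def chunk_words(text, max_words=512, overlap=64):
    """Split text into overlapping windows of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks

# A synthetic 1000-word document: w0 w1 ... w999.
sample = " ".join(f"w{i}" for i in range(1000))
chunks = chunk_words(sample, max_words=512, overlap=64)
print(len(chunks), [len(c.split()) for c in chunks])
```

Because words undercount tokens, a `max_words` value well below the model's context limit is the safer choice; counting with the model's own tokenizer is more accurate.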

Image Embeddings

| Model | Dims | Size | License | Link |
|---|---|---|---|---|
| CLIP ViT-B/32 | 512 | 150 MB | MIT | HF |
| CLIP ViT-L/14 | 768 | 890 MB | MIT | HF |
| SigLIP | 768 | 400 MB | Apache-2.0 | HF |
| DINOv2 (small) | 384 | 86 MB | Apache-2.0 | HF |

Multimodal Embeddings

| Model | Modalities | Dims | License | Link |
|---|---|---|---|---|
| CLIP | Text + Image | 512 / 768 | MIT | HF |
| ImageBind | Text + Image/Video + Audio + Depth + Thermal + IMU | 1024 | CC-BY-NC-SA | HF |
| ONE-PEACE | Text + Image + Audio | 1536 | Apache-2.0 | HF |

Vector Databases (for storing embeddings)

| Database | Browser? | License | Link |
|---|---|---|---|
| vectra (in-memory) | Yes | MIT | github.com/Stevenic/vectra |
| Voy (WASM) | Yes | Apache-2.0 | github.com/tantaraio/voy |
| Chroma | No (Python) | Apache-2.0 | trychroma.com |
| Qdrant | No (server) | Apache-2.0 | qdrant.tech |
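
When choosing between an in-browser store and a server, a rough estimate of raw vector size can help. The sketch below multiplies vector count by dimensions and bytes per value. It assumes uncompressed float32 storage (4 bytes per value) and ignores index structures and metadata, which add overhead that varies by database; `embedding_bytes` and `to_mib` are hypothetical helper names.

```python
def embedding_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw size of num_vectors embeddings stored as fixed-width numbers."""
    return num_vectors * dims * bytes_per_value

def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 * 1024)

# Compare the dimension sizes that appear in the tables on this page,
# for an assumed collection of 100,000 vectors.
for dims in (384, 768, 1024):
    size = embedding_bytes(100_000, dims)
    print(f"{dims} dims: {to_mib(size):.1f} MiB")
```

Under these assumptions the footprint scales linearly with dimension, so a 384-dimension model needs half the raw storage of a 768-dimension one for the same collection.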