What Are Embeddings
🧠
Converting Text into Numbers AI Understands
Computers cannot directly compare the meaning of two text strings. But they can compare numbers. Embeddings solve this by converting any piece of text into a dense numerical vector — a list of numbers that encodes the semantic meaning of that text.
💡 "Dog" and "Puppy" will have very similar embedding vectors. "Dog" and "Airplane" will have vectors that are far apart. This is meaning captured as math.
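To make "far apart" concrete, here is a minimal sketch with hand-picked three-number vectors. The values are invented for illustration and do not come from any real model; real embeddings have hundreds of dimensions. The helper name euclidean_distance is also just an illustrative choice.

```python
import math

# Hand-picked toy vectors, for illustration only (not real model output).
toy_vectors = {
    "dog": [0.90, 0.80, 0.10],
    "puppy": [0.85, 0.75, 0.20],
    "airplane": [0.10, 0.20, 0.95],
}

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

dog_puppy = euclidean_distance(toy_vectors["dog"], toy_vectors["puppy"])
dog_airplane = euclidean_distance(toy_vectors["dog"], toy_vectors["airplane"])

# The related pair sits much closer together than the unrelated pair.
print(dog_puppy < dog_airplane)  # True
```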
Vector Properties
🔢
What These Vectors Look Like
·High-dimensional: commonly 384, 768, 1536, or 3072 numbers per vector, depending on the model
·Similar meaning = similar vectors: this is what enables semantic search
·Fixed-size output: for a given model, 5 words or 500 words produce a vector of the same length
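The fixed-size property can be sketched with a toy embedding function. Note that toy_embed is a hypothetical stand-in, not a real model: it only hashes words into buckets, so it reflects word overlap rather than meaning. It is here only to show that the output length does not depend on the input length.

```python
import hashlib

DIM = 8  # tiny for readability; real models use hundreds or thousands

def toy_embed(text, dim=DIM):
    """Hypothetical stand-in for an embedding model.

    Hashes each word into one of `dim` buckets and counts the hits.
    """
    vector = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % dim] += 1.0
    return vector

short_text = "dogs bark loudly at night"                   # 5 words
long_text = " ".join(["dogs bark loudly at night"] * 100)  # 500 words

# Both inputs produce a vector of the same length.
print(len(toy_embed(short_text)), len(toy_embed(long_text)))  # 8 8
```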
Role in RAG
⚙️
Where Embeddings Are Used
1. At indexing time: every document chunk is embedded and stored in the vector store.
2. At query time: the user's question is also embedded, using the exact same model.
⚠️ Always use the same embedding model for both indexing and querying. Mixing models produces meaningless comparisons and broken retrieval.
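The two steps can be sketched as a tiny in-memory store. Everything here is an assumption made for illustration: TinyVectorStore, toy_embed, and the VOCAB list are hypothetical names, and the "model" just counts known words. The point is the shape of the flow: the store holds one embedding function and uses it for both indexing and querying.

```python
import math

# Hypothetical fixed vocabulary; one vector position per word.
VOCAB = ["refunds", "processed", "days", "office", "closed", "holidays"]

def toy_embed(text):
    """Stand-in for a real embedding model: counts known words."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    def __init__(self, embed_fn):
        self.embed_fn = embed_fn  # the single model used everywhere
        self.entries = []         # list of (chunk_text, vector)

    def index(self, chunks):
        # Indexing time: embed every chunk and store it.
        for chunk in chunks:
            self.entries.append((chunk, self.embed_fn(chunk)))

    def query(self, question, top_k=1):
        # Query time: embed the question with the same function.
        q = self.embed_fn(question)
        scored = [(cosine_similarity(q, vec), chunk)
                  for chunk, vec in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:top_k]]

store = TinyVectorStore(toy_embed)
store.index([
    "refunds are processed within five business days",
    "our office is closed on public holidays",
])
print(store.query("how long do refunds take"))
# ['refunds are processed within five business days']
```

Passing the embedding function into the store once is one simple way to make mixing models at query time hard to do by accident.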
Popular Models
🎯
Which Model to Choose
·OpenAI text-embedding-3-small: Fast and affordable, with strong quality. A common production choice.
·OpenAI text-embedding-3-large: Higher quality at higher cost, for use cases that demand precision
·Sentence Transformers: Open source, runs locally, zero API cost. A good fit for privacy-sensitive data.
·HuggingFace BAAI/bge-small-en: Strong free choice, runs locally
Similarity Measurement
📐
How Similarity Is Calculated
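Cosine similarity is the cosine of the angle between two vectors, and it ranges from -1.0 to 1.0. A score near 1.0 means the vectors point in the same direction (very similar meaning). A score near 0.0 means they are unrelated. A negative score means they point in opposite directions. In a typical RAG setup, this score is what ranks chunks for retrieval.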
💡 Geometric intuition: two vectors pointing in the same direction are similar. Vectors pointing in opposite directions are very different.
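The calculation itself is short: divide the dot product of the two vectors by the product of their lengths. A minimal sketch in plain Python, using small made-up vectors to show the three geometric cases:

```python
import math

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])

print(round(same_direction, 6), round(perpendicular, 6), round(opposite, 6))
# 1.0 0.0 -1.0
```

Only direction matters here: [1.0, 2.0] and [2.0, 4.0] differ in length but still score 1.0.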
✦
Embeddings are the core engine that makes RAG work. They translate the fuzzy concept of meaning into precise mathematics — vectors that can be compared, searched, and ranked. Use the same model consistently, choose based on your budget and privacy requirements, and your RAG system will retrieve with genuine semantic understanding.