Software Engineer

What I Learned Today About “Vector Embeddings”


Aug 10, 2025

Introduction

These are my notes on “Vector Embeddings” from the Python + AI: Level Up series at Microsoft Reactor.

I have been thinking about semantic search lately, so this session was well timed.

Here are my main takeaways:

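A vector embedding is a list of numbers that an embedding model produces from words or other data, so that inputs with similar meaning end up with vectors that are close together.
Vectors from different embedding models are not comparable, so queries and indexed data must be embedded with the same model.
Computers work with numbers, not words, which is why we convert text, images, and video into numeric form.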

Add semantic, multilingual, and multimodal search to websites using vector similarity search.
Use vector embeddings for recommendation systems and fraud detection.
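
To make similarity search concrete for myself, here is a minimal sketch in plain Python. The 4-dimensional vectors are made up by hand for illustration (real embeddings come from a model and have hundreds or thousands of dimensions), and the `search` helper is my own name, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy "embeddings" (assumption: not produced by a real model).
documents = {
    "cat": [0.9, 0.1, 0.0, 0.2],
    "kitten": [0.85, 0.15, 0.05, 0.25],
    "invoice": [0.0, 0.9, 0.4, 0.1],
}

def search(query_vector, docs, top_k=2):
    """Exhaustive search: score every document, then keep the best top_k."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

results = search([0.88, 0.12, 0.02, 0.22], documents)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The query vector sits close to "cat" and "kitten", so those two come back ahead of "invoice". The same ranking idea is what recommendation and anomaly-style use cases build on, just with real embeddings.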

Use Approximate Nearest Neighbor (ANN) algorithms such as HNSW, IVFFlat, or DiskANN instead of exhaustive search; libraries such as Faiss implement several of these.
HNSW (Hierarchical Navigable Small World) works well for frequently updated data, and its search time grows roughly logarithmically with index size.
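
HNSW is too involved for a short snippet, so the sketch below illustrates the general ANN idea with a toy IVF-style (inverted file) index instead: vectors are grouped under their nearest centroid, and a query only scans the closest group or groups. The class name `ToyIVFIndex` and the hand-picked centroids are my assumptions for illustration; a real IVFFlat index learns its centroids with k-means.

```python
import math

def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVFIndex:
    """Toy inverted-file index (illustrative only, not a library API)."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def _nearest_centroids(self, vector, count):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: distance(vector, self.centroids[i]))
        return order[:count]

    def add(self, item_id, vector):
        # Each vector is stored in the bucket of its closest centroid.
        bucket = self._nearest_centroids(vector, 1)[0]
        self.buckets[bucket].append((item_id, vector))

    def search(self, query, top_k=1, nprobe=1):
        # Only scan the nprobe closest buckets instead of every vector.
        candidates = []
        for bucket in self._nearest_centroids(query, nprobe):
            candidates.extend(self.buckets[bucket])
        candidates.sort(key=lambda item: distance(query, item[1]))
        ids = [item_id for item_id, _ in candidates[:top_k]]
        return ids, len(candidates)

index = ToyIVFIndex(centroids=[(0.0, 0.0), (10.0, 10.0)])
for item_id, vector in [("a", (1.0, 1.0)), ("b", (0.0, 2.0)),
                        ("c", (9.0, 9.0)), ("d", (11.0, 10.0))]:
    index.add(item_id, vector)

ids, scanned = index.search((8.5, 9.0))
print(ids, "after scanning", scanned, "of 4 vectors")
```

The trade-off is visible in `nprobe`: scanning more buckets costs more time but lowers the chance of missing a true neighbor that landed in an adjacent bucket.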

Vector quantization reduces the size of vectors by lowering numeric precision.
Scalar quantization converts floating-point components (typically 32-bit) into narrower types such as 16-bit, 8-bit, or 4-bit values.
Binary quantization (1-bit) gives extreme compression while still retaining much of the semantic information.
In Azure AI Search, quantization can reduce storage by ~74% (8-bit) and ~96% (1-bit).
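
Here is a rough sketch of both ideas, plus the arithmetic behind the savings. It assumes 32-bit floats as the starting point and uses a simple per-vector min/max scheme that I chose for illustration; it is not how Azure AI Search or any particular engine implements quantization, and the function names are mine.

```python
def scalar_quantize(vector, bits=8):
    """Map each float to an integer code in [0, 2**bits - 1] (min/max scaling)."""
    lo, hi = min(vector), max(vector)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale

def scalar_dequantize(codes, lo, scale):
    """Approximate reconstruction of the original floats."""
    return [lo + code * scale for code in codes]

def binary_quantize(vector):
    """Keep only the sign of each component: 1 bit per dimension."""
    return [1 if x > 0 else 0 for x in vector]

def storage_saving(original_bits, compressed_bits):
    """Theoretical fraction of raw vector storage saved per component."""
    return 1 - compressed_bits / original_bits

example = [-0.5, 0.0, 0.25, 0.5]
codes, lo, scale = scalar_quantize(example, bits=8)
restored = scalar_dequantize(codes, lo, scale)
bits_only = binary_quantize(example)

print(f"8-bit saving: {storage_saving(32, 8):.1%}")
print(f"1-bit saving: {storage_saving(32, 1):.1%}")
```

The theoretical numbers for 32-bit floats land close to the percentages quoted in the session; I assume the small gap comes from index overhead, but that is my guess rather than something stated in the talk.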
Dimensionality reduction reduces the number of vector dimensions.
Matryoshka Representation Learning (MRL) can reduce dimensions while keeping semantic meaning (for supported models).
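
For models trained with MRL, reducing dimensions can be as simple as keeping the leading components and re-normalizing. The helper below is a minimal sketch under that assumption (the name `truncate_embedding` is mine); applying it to a model that was not trained this way would likely lose much more meaning.

```python
import math

def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale to unit length."""
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        return head  # nothing to normalize
    return [x / norm for x in head]

full = [3.0, 4.0, 12.0]          # toy vector, not a real embedding
short = truncate_embedding(full, 2)
print(short)
```

Re-normalizing matters because cosine and dot-product scores assume comparable vector lengths after truncation.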

Combine quantization and dimensionality reduction carefully, because the accuracy lost to each one adds up.
Use a two-stage retrieval process:
1. Retrieve top N results from the compressed index (fast).
2. Re-score those results using uncompressed vectors (accurate).
This approach keeps most of the speed of the compressed index while recovering much of the accuracy of the full vectors.

It was a great session with a lot of insightful information.

The session link is here: https://developer.microsoft.com/en-us/reactor/events/25084/
