
🧠 Day 6: Demystifying LangChain Embeddings and Vector Stores — A Chai Tapri Story ☕

Welcome to Day 6 of my LangChain series!

Today, let’s dive into the heart of modern AI-powered search — embedding models — using a story close to home: the Indian chai tapri.


☕ The Chai Tapri Analogy

Picture this.

You’re at your favorite chai tapri (roadside tea stall) in India.

Customers stroll in and say:

  • “Bhaiya, ek kadak chai dena.”
  • “Ek strong wali chai chahiye.”
  • “Zyada patti wali chai milegi kya?”

The words change, but the meaning stays the same:

➡️ “I want a strong cup of tea.”

Now, the chaiwala doesn’t wait for exact keywords. He understands the intent behind each request.

That’s exactly what embedding models do for machines.

They allow machines to understand the meaning behind different wordings — just like our chaiwala!


🧠 What Are Embedding Models?

Embedding models convert human language into a numerical representation (called a vector) that captures its semantic meaning.

This means machines no longer need exact keyword matches to find relevant content. They can search by meaning instead.

💡 Imagine being able to retrieve tweets, documents, or FAQs — not by matching the same words — but by understanding what the text is really about.


🔑 Key Concepts of Embedding Models

(1) Embed Text as a Vector

Any input — be it a tweet, a question, or a paragraph — is transformed into a fixed-length vector of numbers.

This vector acts as a semantic fingerprint of the text.

So whether you say "strong chai" or "kadak chai", both are encoded into vectors that lie close together in the model’s vector space — because their meaning is nearly identical.
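
Here’s a tiny sketch of that idea using the open-source sentence-transformers library (one illustrative option among many — the model and library here are my choice, not prescribed by LangChain):

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small open-source embedding model

vector = model.encode("ek kadak chai dena")
print(vector.shape)  # (384,) — a fixed-length semantic fingerprint
print(vector[:5])    # the first few numbers of that fingerprint
```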


(2) Measure Similarity

Once texts are represented as vectors, we can measure how close their meanings are by comparing the vectors.

📐 Cosine Similarity is commonly used. It measures the angle between two vectors:

  • Smaller angle → More similar meaning
  • Different words with the same intent produce vectors pointing in the same direction

"strong chai" → [0.21, -0.34, 0.78, ..., 0.13]

"kadak chai" → [0.19, -0.36, 0.75, ..., 0.12]

🔍 These vectors are very close, so the cosine similarity score approaches 1.0 — their meanings are nearly the same.
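
You can verify that yourself with a minimal, dependency-free sketch — toy 4-dimensional stand-ins for the (much longer) real embeddings above:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

strong_chai = [0.21, -0.34, 0.78, 0.13]  # toy stand-in for "strong chai"
kadak_chai  = [0.19, -0.36, 0.75, 0.12]  # toy stand-in for "kadak chai"

print(round(cosine_similarity(strong_chai, kadak_chai), 4))  # ≈ 0.9991
```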


🛠️ LangChain’s Interface for Embeddings

LangChain simplifies working with embeddings by offering a unified interface, regardless of the underlying provider (OpenAI, HuggingFace, Cohere, etc.).

It provides two central methods:

  • embed_documents → For embedding multiple texts (used to search against)
  • embed_query → For embedding a single search query

💡 Think back to our chai tapri.

The customer’s request — like “ek kadak chai dena” — is your query.

The chaiwala’s mental list of chai types — kadak, zyada patti wali, masala — are your documents.

Some providers embed queries and documents differently under the hood, which is why both methods exist — and LangChain lets you handle both through a clean, consistent interface.
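
Here’s a minimal sketch of both methods using a local open-source model (import paths vary across LangChain versions — e.g., langchain_huggingface vs. langchain_community — so adjust for yours):

```python
# pip install langchain-huggingface sentence-transformers
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# The chaiwala's mental list of chai types — our documents
doc_vectors = embeddings.embed_documents(
    ["kadak chai", "zyada patti wali chai", "masala chai"]
)

# The customer's request — our query
query_vector = embeddings.embed_query("ek strong wali chai dena")

print(len(doc_vectors), len(query_vector))  # 3 document vectors; one fixed-length query vector
```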


📏 Similarity Metrics for Comparing Texts

Once texts are embedded into vectors, measuring similarity becomes a mathematical task.

Think of embeddings as coordinates in a multi-dimensional space, where similar meanings cluster together.

Here are some common similarity metrics:

  • Cosine Similarity: Measures angle between two vectors (preferred by OpenAI)
  • Euclidean Distance: Measures direct line distance between vectors
  • Dot Product: Measures how much one vector projects onto another

🔍 Back to our chai example:

If two customers say:

  • “Strong wali chai chahiye”
  • “Ek kadak chai dena”

The cosine similarity between their vector representations will be close to 1.0, indicating that the chaiwala (or your AI system!) can serve them the same strong tea — because their intent is the same, even if their words differ.
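
Here’s how those three metrics look on our toy vectors (NumPy stands in for what a vector store computes internally):

```python
import numpy as np

a = np.array([0.21, -0.34, 0.78, 0.13])  # "Strong wali chai chahiye"
b = np.array([0.19, -0.36, 0.75, 0.12])  # "Ek kadak chai dena"

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))  # angle-based, scale-invariant
euclidean = np.linalg.norm(a - b)                         # straight-line distance (smaller = closer)
dot = a @ b                                               # projection; grows with vector magnitude

print(f"cosine={cosine:.4f}  euclidean={euclidean:.4f}  dot={dot:.4f}")
# cosine ≈ 0.999 → same intent, same strong tea
```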


🧬 Types of Embedding Models

LangChain supports various embedding models, each suited to different tasks:

1. OpenAI Embeddings

Widely used, excellent performance for general-purpose tasks like semantic search and Q&A.

2. HuggingFace Transformers

Open-source and highly customizable. Suitable for on-premise use or when fine-tuning is needed.

3. Cohere Embeddings

Powerful embeddings API with strong multilingual support.

4. Google Vertex AI Embeddings

Useful for enterprises already in the GCP ecosystem.

5. Amazon Bedrock Embeddings

Managed embedding models on AWS — for example, Amazon Titan Text Embeddings (Cohere embedding models are also available through Bedrock).

☕️ Just like every chai tapri has its own special masala mix, each embedding model has its own recipe for understanding meaning.

You pick the one that best suits your taste — or in AI terms, your use case and infrastructure.


🧭 Where Do These Vectors Go? Say Hello to Vector Stores!

Now that we understand how embedding models convert text into vectors, here’s the next question:

Where do we store these vectors so we can search them later?

That’s where Vector Stores come in.

Think of a vector store as our chaiwala’s diary, where he has stored the semantic fingerprint (vector) of every chai request he’s ever heard.

So when a new customer walks in and says something new — even if it’s in a completely different phrasing — the chaiwala can quickly look through his diary, find the closest match, and serve the right cup!


🔍 How Vector Stores Work in the Real World

Let’s go beyond the metaphor.

In actual AI systems, vector stores are databases optimized to store and retrieve high-dimensional vectors efficiently. Here's how they function in practice:

🧱 1. Indexing

When you store vectors in a vector store, it builds a specialized index behind the scenes (like an address book for vectors).

This index is optimized for nearest neighbor search — helping the system quickly locate vectors that are close in meaning to a new query.

For example, if your query is “strong chai,” the index helps find all stored vectors (documents) whose meanings are closest — even if they used completely different words.

🏃‍♂️ 2. Approximate Search (ANN)

In real applications, speed matters. So vector stores use Approximate Nearest Neighbor (ANN) algorithms.

Instead of checking every vector one by one, ANN quickly narrows down to a shortlist of likely candidates based on the index — balancing speed and accuracy.

Think of it like your chaiwala instantly recalling the 3 most likely types of chai you meant — without going through his entire diary.
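
To make both ideas concrete, here’s a minimal FAISS sketch — an exact flat index next to an approximate HNSW index (random vectors stand in for real embeddings; the dimension and parameters are illustrative):

```python
# pip install faiss-cpu numpy
import numpy as np
import faiss

dim = 384                                                  # must match your embedding model
doc_vectors = np.random.rand(1000, dim).astype("float32")  # stand-ins for document embeddings
query = np.random.rand(1, dim).astype("float32")           # stand-in for the embedded query

# Exact index: compares the query against every stored vector
flat_index = faiss.IndexFlatL2(dim)
flat_index.add(doc_vectors)
_, exact_ids = flat_index.search(query, k=3)

# Approximate (ANN) index: HNSW trades a little accuracy for much faster search
ann_index = faiss.IndexHNSWFlat(dim, 32)  # 32 = graph neighbors per node
ann_index.add(doc_vectors)
_, ann_ids = ann_index.search(query, k=3)

print(exact_ids[0], ann_ids[0])  # the 3 closest matches from each index
```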

🧠 3. Metadata & Filtering

Advanced vector stores also allow you to store metadata (like tags, categories, timestamps) along with vectors.

So you can search not just by meaning, but also filter by attributes — for example,

"Show me all strong chai requests made in the evening."

This is crucial for real-world systems where search needs to be context-aware.
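
Here’s a sketch of metadata filtering with LangChain’s Chroma integration (package names vary by LangChain version, and filter syntax is store-specific — treat the details as illustrative):

```python
# pip install langchain-chroma langchain-huggingface
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

store = Chroma.from_texts(
    texts=["ek kadak chai dena", "strong wali chai chahiye", "ek halki chai dena"],
    embedding=embeddings,
    metadatas=[{"time": "evening"}, {"time": "morning"}, {"time": "evening"}],
)

# Search by meaning, but only among evening requests
results = store.similarity_search("strong chai", k=2, filter={"time": "evening"})
for doc in results:
    print(doc.page_content, doc.metadata)
```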

🔁 4. Real-Time Retrieval

When a user enters a new query (like a customer walking in with a request), the system:

  1. Embeds the query into a vector
  2. Searches the vector store for closest matches
  3. Retrieves the top relevant documents (like similar requests)
  4. Optionally, uses them in a response (in RAG or chatbot systems)

And it all happens in milliseconds — the next section shows this full loop in code.


🔌 How LangChain Integrates with Vector Stores

LangChain offers out-of-the-box support for many vector stores, such as:

  • FAISS – Lightweight, open-source, ideal for local development
  • Pinecone – Cloud-based, scalable, and easy to use
  • Weaviate – Comes with metadata filtering and schema
  • Chroma – Simple and fast local vector DB
  • Redis – With vector support using Redis modules
  • Amazon OpenSearch – For those in the AWS ecosystem

LangChain connects embedding models with vector stores seamlessly:

  1. Embed your documents
  2. Store them in a vector store
  3. When a query comes in, embed it too
  4. Then search the vector store for nearest matches

🚀 This setup forms the foundation of powerful apps like AI search assistants, chatbots with memory, and retrieval-augmented generation (RAG) pipelines.
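
Here’s that four-step loop end to end, as a minimal sketch with FAISS and a local model (import paths again vary by LangChain version):

```python
# pip install langchain-community langchain-huggingface faiss-cpu
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Steps 1-2: embed the documents and store them in the vector store
store = FAISS.from_texts(
    ["ek kadak chai dena", "zyada patti wali chai", "masala chai", "adrak wali chai"],
    embedding=embeddings,
)

# Steps 3-4: embed the incoming query and search for the nearest matches
for doc in store.similarity_search("strong wali chai chahiye", k=2):
    print(doc.page_content)
```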


🔁 Why Embeddings Matter in Real Applications

From powering chatbots that understand customer intent, to retrieval-augmented generation (RAG) systems that bring the right document when answering a question — embeddings are foundational.

They help move beyond word-matching to meaning-matching, making every AI interaction smarter, faster, and more relevant.

🥤 Just like your chaiwala doesn’t rely on exact phrases, AI systems using embeddings can understand what you meant, not just what you said — and that’s the power of vector-based intelligence.


🏁 Wrapping Up: What Our Chaiwala Taught Us

Next time you say "kadak chai" and get exactly what you meant — remember,

you just performed a semantic search without even realizing it!

That’s exactly what embedding models enable machines to do — grasp the intent behind words, not just the words themselves.

From transforming text into semantic vectors, to comparing those vectors using cosine similarity, embeddings are the secret sauce behind smart search, recommendation systems, and context-aware AI.

Thanks to LangChain, working with embedding models becomes remarkably intuitive — whether you're embedding a single query or a whole collection of documents, it abstracts away the heavy lifting behind one consistent interface that works across providers.

So, hats off to our chaiwala — the OG embedding model! ☕


🧠 Credit Where It’s Due

This blog post is inspired by and builds upon the official LangChain documentation.

All concepts are simplified and localized for better learning and relatability.


☁️ About Me

Hi, I’m Utkarsh Rastogi — an AWS Community Builder and multi-cloud certified specialist.

I write blogs that simplify complex AWS, AI, and cloud-native topics with real-world analogies, making them easy and fun to learn for everyone in tech.

🔗 Connect with me on LinkedIn for cloud insights, projects, and storytelling!


Follow for more in this series.

🚀 Day 7 coming soon!
