Build AI Apps with Python: How AI Understands Meaning — Embeddings | Episode 14

Taught by Celeste AI - AI Coding Coach
Description
How does a computer know that "cat" and "kitten" mean similar things? Embeddings. In this episode, we convert text into vectors and measure semantic similarity with cosine similarity, built from scratch in pure Python.

sentence-transformers is a free Python library that runs a neural network on your machine. No API key needed. The model, all-MiniLM-L6-v2, downloads automatically from HuggingFace on first run and caches locally. It converts any text into a 384-dimensional vector that captures meaning.

Four test sentences show it working: "cat sat on mat" and "kitten rested on rug" score 0.61 (similar!), while "cat" vs "Python programming" scores 0.03 (unrelated). The model understands meaning, not just keywords. This is how RAG searches for relevant chunks.

Student code: https://github.com/GoCelesteAI/build-ai-apps-python/tree/main/episode14

Every keystroke is shown on screen with 3-second pauses so you can follow along at your own pace.

What You'll Learn:
• What embeddings are: text to vector (list of numbers)
• Cosine similarity from scratch (dot product, magnitudes)
• sentence-transformers: what it is and how to install it
• all-MiniLM-L6-v2: auto-downloads from HuggingFace, caches locally
• 384 dimensions: each number captures an aspect of meaning
• Suppressing HuggingFace warnings for clean output
• Semantic similarity vs keyword matching
• Comparing 4 sentences: 6 similarity scores
• The foundation for vector search in RAG

Timestamps:
0:00 - Introduction
0:12 - What are Embeddings? (Preview)
0:46 - Creating embeddings.py
0:52 - Warning suppression (os, warnings, env vars)
2:18 - What is an embedding? (concept comments)
2:52 - Cosine similarity from scratch
4:35 - sentence-transformers: what it does
4:50 - The import: loads and runs the neural network
5:08 - Model auto-downloads from HuggingFace
5:45 - Four test sentences: cat, kitten, Python, weather
6:58 - Generate embeddings + compare all pairs
9:55 - Results! Cat-kitten: 0.61, Cat-Python: 0.03
11:00 - Recap: 3 Key Takeaways
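
The "cosine similarity from scratch" and "compare all pairs" steps in the description could look roughly like the sketch below. It is not the episode's actual embeddings.py. The function names and the 3-dimensional toy vectors are hypothetical stand-ins, so the scores it prints will not match the 0.61 and 0.03 reported for the real 384-dimensional model output.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))       # dot product
    mag_a = math.sqrt(sum(x * x for x in a))     # magnitude of a
    mag_b = math.sqrt(sum(y * y for y in b))     # magnitude of b
    if mag_a == 0 or mag_b == 0:
        return 0.0  # convention chosen for this sketch: zero vectors score 0
    return dot / (mag_a * mag_b)


# Hypothetical toy vectors standing in for real embeddings (assumption:
# the real ones would be 384 numbers each, produced by the model).
toy_vectors = {
    "cat": [1.0, 0.9, 0.0],
    "kitten": [0.9, 1.0, 0.1],
    "python": [0.0, 0.1, 1.0],
    "weather": [0.1, 0.0, 0.2],
}


def compare_all_pairs(vectors):
    """Score every unordered pair: 4 items give 6 pairs."""
    return {
        (a, b): cosine_similarity(vectors[a], vectors[b])
        for a, b in combinations(vectors, 2)
    }


scores = compare_all_pairs(toy_vectors)
for (a, b), score in scores.items():
    print(f"{a} vs {b}: {score:.2f}")
```

With sentence-transformers installed, real vectors would instead come from something like `SentenceTransformer("all-MiniLM-L6-v2").encode(sentences)`, and the same pairwise loop would apply unchanged.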

Tags

python embeddings, cosine similarity python, sentence transformers tutorial, text to vectors, semantic similarity, all-MiniLM-L6-v2, huggingface model, rag embeddings, ai tutorial 2026, build ai apps python, neovim tutorial, generative ai python, screenkey, code along, vector search

Duration

12:21

Published

April 3, 2026

Added to Codegiz

April 5, 2026
