
Build AI Apps with Python: How AI Understands Meaning — Embeddings | Episode 14

Celest Kim

Video: Build AI Apps with Python: How AI Understands Meaning — Embeddings | Episode 14 by Taught by Celeste AI - AI Coding Coach



Understanding how AI captures the meaning behind words is essential for building intelligent applications. This episode demonstrates how to convert text into numerical vectors called embeddings using the sentence-transformers library, enabling semantic similarity comparisons with cosine similarity—all implemented in pure Python.

Code

from sentence_transformers import SentenceTransformer
import numpy as np

# Load a pre-trained model that converts sentences to 384-dimensional embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')

# Define sentences to compare
sentences = [
    "cat sat on mat",
    "kitten rested on rug",
    "cat",
    "Python programming",
]

# Get embeddings for each sentence
embeddings = model.encode(sentences)

def cosine_similarity(vec1, vec2):
    # Compute cosine similarity between two vectors
    dot_product = np.dot(vec1, vec2)
    norm1 = np.linalg.norm(vec1)
    norm2 = np.linalg.norm(vec2)
    return dot_product / (norm1 * norm2)

# Compare semantic similarity between sentence pairs
sim_cat_kitten = cosine_similarity(embeddings[0], embeddings[1])
sim_cat_python = cosine_similarity(embeddings[2], embeddings[3])

print(f'Similarity between "cat sat on mat" and "kitten rested on rug": {sim_cat_kitten:.2f}')
print(f'Similarity between "cat" and "Python programming": {sim_cat_python:.2f}')
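The `cosine_similarity` helper can be sanity-checked without downloading a model, using small hand-made 2-dimensional vectors (the values below are illustrative toys, not real embeddings):

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    # Same helper as above: dot product divided by the product of the norms
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

a = np.array([1.0, 0.0])  # points along the x-axis
b = np.array([0.0, 1.0])  # perpendicular to a
c = np.array([2.0, 0.0])  # same direction as a, different length

print(cosine_similarity(a, c))  # identical direction -> 1.0
print(cosine_similarity(a, b))  # orthogonal -> 0.0
```

Because cosine similarity measures the angle between vectors, `a` and `c` score a perfect 1.0 even though their lengths differ, which is exactly why it is a good fit for comparing embeddings.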

Key Points

  • Embeddings convert text into high-dimensional vectors that capture semantic meaning beyond keywords.
  • The sentence-transformers library provides easy access to powerful pre-trained models without needing API keys.
  • Cosine similarity measures how close two embedding vectors are, indicating semantic similarity between sentences.
  • Similar sentences like "cat sat on mat" and "kitten rested on rug" score high, while unrelated pairs score low.
  • This technique underpins retrieval-augmented generation (RAG) by helping find relevant information chunks based on meaning.
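That retrieval step can be sketched with the same cosine-similarity idea. A real RAG system would encode each chunk with a model such as `all-MiniLM-L6-v2`; the tiny hand-made 3-dimensional vectors below are stand-ins for those embeddings:

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    return np.dot(vec1, vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# Toy "embeddings" for three document chunks (hand-made stand-ins for
# real 384-dimensional model outputs)
chunks = {
    "cats are small pets":  np.array([0.9, 0.1, 0.0]),
    "python is a language": np.array([0.0, 0.2, 0.9]),
    "kittens love to play": np.array([0.7, 0.5, 0.2]),
}

# Pretend this is model.encode("cat") for the user's query
query_embedding = np.array([0.85, 0.2, 0.05])

# Rank chunks by similarity to the query and keep the best match
best_chunk = max(chunks, key=lambda text: cosine_similarity(query_embedding, chunks[text]))
print(best_chunk)  # -> cats are small pets
```

The chunk whose embedding points in nearly the same direction as the query wins, even though none of the words match exactly; that is the "meaning, not keywords" behavior the bullet points describe.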