Rust Word Frequency Counter with HashMap 📊 | Rust by Examples #10
Video: Rust Word Frequency Counter with HashMap 📊 | Rust by Examples #10 by Taught by Celeste AI - AI Coding Coach
Learn how to build a word frequency counter in Rust using the HashMap collection and its Entry API for efficient counting. This example reads text from a file (sample.txt), counts the occurrences of each word, and prints the N most frequent words (5 here) in descending order of count.
Code
use std::collections::HashMap;
use std::fs;

// Counts the frequency of each word in the given text slice
fn count_words(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // Normalize word to lowercase for consistent counting
        let word = word.to_lowercase();
        // Use Entry API to insert or update the count efficiently
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

// Returns the top n words sorted by frequency in descending order
fn top_words(counts: &HashMap<String, usize>, n: usize) -> Vec<(String, usize)> {
    let mut word_counts: Vec<(String, usize)> = counts
        .iter()
        .map(|(word, &count)| (word.clone(), count))
        .collect();
    // Sort by count descending
    word_counts.sort_by(|a, b| b.1.cmp(&a.1));
    // Take top n results
    word_counts.into_iter().take(n).collect()
}

fn main() {
    // Read sample text file
    let text = fs::read_to_string("sample.txt").expect("Failed to read sample.txt");
    // Count words
    let counts = count_words(&text);
    // Get top 5 words
    let top = top_words(&counts, 5);
    println!("Top 5 words by frequency:");
    for (word, count) in top {
        println!("{}: {}", word, count);
    }
}
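To make the Entry API line in count_words more concrete, here is a minimal sketch that places it next to a manual check-then-update version. The function names count_with_entry and count_manual are hypothetical and exist only for this comparison; both are expected to produce the same map.

```rust
use std::collections::HashMap;

// Hypothetical helper: the Entry API form, one map lookup per word
fn count_with_entry(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    counts
}

// Hypothetical helper: the same result written by hand.
// Each word costs two lookups: contains_key, then get_mut or insert.
fn count_manual(text: &str) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for word in text.split_whitespace() {
        let word = word.to_lowercase();
        if counts.contains_key(&word) {
            // The key was just confirmed to exist, so unwrap cannot panic here
            *counts.get_mut(&word).unwrap() += 1;
        } else {
            counts.insert(word, 1);
        }
    }
    counts
}

fn main() {
    let text = "the cat saw the dog and The bird";
    let by_entry = count_with_entry(text);
    let by_hand = count_manual(text);
    // Both approaches agree on every word
    assert_eq!(by_entry, by_hand);
    println!("the: {}", by_entry["the"]); // prints "the: 3"
}
```

The manual version is longer and touches the map twice per word, which is the overhead that entry().or_insert() avoids.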
Key Points
- HashMap is ideal for counting occurrences with keys as words and values as counts.
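- The Entry API with entry().or_insert() finds or creates the slot for a word in a single lookup, so no separate contains_key or get call is needed before updating the count.
- Text processing uses split_whitespace() and lowercase normalization, so "Rust" and "rust" are counted together. Punctuation stays attached to a token, however, so "rust," and "rust" are still counted as different words.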
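- Sorting a vector of (word, count) tuples by count descending puts the most frequent words first. Because HashMap iteration order is unspecified, words with equal counts can appear in a different order from run to run.
- Keeping file reading in main and the counting and ranking logic in separate functions makes count_words and top_words reusable on any text, not just sample.txt.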
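The normalization and sorting points above can be taken one step further. The following is a sketch only, not part of the original example: count_words_trimmed and top_words_ordered are hypothetical variants that strip leading and trailing punctuation from each token and break ties alphabetically so the output is the same on every run. It assumes that "word" means a token with its surrounding non-alphanumeric characters removed, which may not suit every text (apostrophes and hyphens inside a word are left alone).

```rust
use std::collections::HashMap;

// Hypothetical variant of count_words: also strips leading and trailing
// non-alphanumeric characters, so "Rust," and "rust" share one entry.
fn count_words_trimmed(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for token in text.split_whitespace() {
        let word = token
            .trim_matches(|c: char| !c.is_alphanumeric())
            .to_lowercase();
        // Skip tokens that consisted of punctuation only, such as "--"
        if word.is_empty() {
            continue;
        }
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

// Hypothetical variant of top_words: sorts by count descending and
// breaks ties alphabetically to get a deterministic order.
fn top_words_ordered(counts: &HashMap<String, usize>, n: usize) -> Vec<(String, usize)> {
    let mut word_counts: Vec<(String, usize)> = counts
        .iter()
        .map(|(word, &count)| (word.clone(), count))
        .collect();
    word_counts.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    // Keep only the first n entries
    word_counts.truncate(n);
    word_counts
}

fn main() {
    // In-memory text, so the sketch runs without sample.txt
    let text = "Rust is fast. Rust is safe, and rust is fun!";
    let counts = count_words_trimmed(text);
    for (word, count) in top_words_ordered(&counts, 3) {
        println!("{}: {}", word, count);
    }
    // Prints:
    // is: 3
    // rust: 3
    // and: 1
}
```

Because both functions take borrowed input rather than reading a file themselves, they can be exercised directly on a string literal, as main does here.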