Part of Rust by Examples

Rust Word Frequency Counter with HashMap 📊 | Rust by Examples #10

Sandy Lane

Video: Rust Word Frequency Counter with HashMap 📊 | Rust by Examples #10, taught by Celeste AI - AI Coding Coach

Rust Word Frequency Counter with HashMap

Learn how to build a word frequency counter in Rust using the HashMap collection and its Entry API. This example reads text from a file (sample.txt), counts how often each whitespace-separated word occurs, ignoring case, and prints the five most frequent words in descending order of count.

Code

use std::collections::HashMap;
use std::fs;

// Counts the frequency of each word in the given text slice
fn count_words(text: &str) -> HashMap<String, usize> {
  let mut counts = HashMap::new();

  for word in text.split_whitespace() {
    // Normalize word to lowercase for consistent counting
    let word = word.to_lowercase();
    // Use Entry API to insert or update the count efficiently
    *counts.entry(word).or_insert(0) += 1;
  }

  counts
}

// Returns the top n words sorted by frequency in descending order
fn top_words(counts: &HashMap<String, usize>, n: usize) -> Vec<(String, usize)> {
  let mut word_counts: Vec<(String, usize)> = counts.iter()
    .map(|(word, &count)| (word.clone(), count))
    .collect();

  // Sort by count descending
  word_counts.sort_by(|a, b| b.1.cmp(&a.1));

  // Take top n results
  word_counts.into_iter().take(n).collect()
}

fn main() {
  // Read sample text file
  let text = fs::read_to_string("sample.txt").expect("Failed to read sample.txt");

  // Count words
  let counts = count_words(&text);

  // Get top 5 words
  let top = top_words(&counts, 5);

  println!("Top 5 words by frequency:");
  for (word, count) in top {
    println!("{}: {}", word, count);
  }
}
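
The program above needs a sample.txt file and keeps punctuation attached to words, so "dog." and "dog" are counted separately. The sketch below is one possible variation, not part of the original lesson: it runs on an in-memory string, trims leading and trailing non-alphanumeric characters, and breaks ties alphabetically so the output is the same on every run. The names count_words_trimmed and top_words_stable are hypothetical helpers introduced only for this illustration.

```rust
use std::collections::HashMap;

// Hypothetical variant of count_words: trims punctuation from both ends
// of each word before lowercasing and counting it.
fn count_words_trimmed(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();

    for raw in text.split_whitespace() {
        let word = raw
            .trim_matches(|c: char| !c.is_alphanumeric())
            .to_lowercase();
        // Skip tokens that were punctuation only (for example "--")
        if word.is_empty() {
            continue;
        }
        *counts.entry(word).or_insert(0) += 1;
    }

    counts
}

// Hypothetical variant of top_words: sorts by count descending, then by
// word ascending, so equal counts always appear in the same order.
fn top_words_stable(counts: &HashMap<String, usize>, n: usize) -> Vec<(String, usize)> {
    let mut word_counts: Vec<(String, usize)> = counts
        .iter()
        .map(|(word, &count)| (word.clone(), count))
        .collect();

    word_counts.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    word_counts.truncate(n);
    word_counts
}

fn main() {
    let text = "The cat saw the dog. The dog saw a cat!";
    let counts = count_words_trimmed(text);

    for (word, count) in top_words_stable(&counts, 3) {
        println!("{}: {}", word, count);
    }
}
```

With this input the three lines printed are "the: 3", "cat: 2" and "dog: 2". Trimming only the ends of each token leaves inner characters alone, so a word such as "don't" stays intact.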

Key Points

  • HashMap is ideal for counting occurrences with keys as words and values as counts.
  • The Entry API with entry().or_insert() simplifies updating counts without extra lookups.
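  • Text processing uses split_whitespace() and lowercase normalization, so "The" and "the" count as the same word. Punctuation attached to a word is not stripped, so "dog." and "dog" are counted separately.
  • Sorting a vector of (word, count) tuples by count descending finds the most frequent words. Words with equal counts appear in no particular order, because HashMap iteration order is unspecified.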
  • Reading from a file and modularizing code makes the counter reusable and clean.
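
To make the Entry API point concrete, here is a minimal sketch comparing entry().or_insert() with an explicit look-up-then-insert. The function names count_manual and count_entry are hypothetical and exist only for this comparison. Both produce the same map.

```rust
use std::collections::HashMap;

// Explicit version: look the key up, then insert if it was missing.
// A new word is hashed twice, once for get_mut and once for insert.
fn count_manual(words: &[&str]) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();

    for w in words {
        match counts.get_mut(*w) {
            Some(count) => *count += 1,
            None => {
                counts.insert(w.to_string(), 1);
            }
        }
    }

    counts
}

// Entry API version: a single lookup returns a slot that is either
// occupied (existing count) or vacant (filled with 0 by or_insert).
fn count_entry(words: &[&str]) -> HashMap<String, usize> {
    let mut counts: HashMap<String, usize> = HashMap::new();

    for w in words {
        *counts.entry(w.to_string()).or_insert(0) += 1;
    }

    counts
}

fn main() {
    let words = ["a", "b", "a", "c", "a"];

    let manual = count_manual(&words);
    let entry = count_entry(&words);

    // Both approaches build identical maps
    assert_eq!(manual, entry);
    println!("a appears {} times", entry["a"]);
}
```

One trade-off to keep in mind: entry() takes the key by value, so this version builds an owned String for every word, while the explicit version allocates only when a word is seen for the first time.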