Part of C in 100s

C in 100 Seconds: Hash Table Chaining

Celest KimCelest Kim

Video: C in 100 Seconds: Hash Table — Chaining, Insert, Lookup, Delete | Episode 49 by Taught by Celeste AI - AI Coding Coach

Take the quiz on the full lesson page
Test what you've read · interactive walkthrough

C Hash Table: Chaining, Insert, Lookup, Delete

Each bucket is a linked list. Insert prepends; lookup walks; delete unlinks. The classic implementation that handles collisions gracefully.

Episode 48 covered the hash-table concept. Today: a working implementation with chaining — each bucket holds a linked list of entries, so collisions just become chain nodes.

The basic shape

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SIZE 10

typedef struct Entry {
  char *key;
  int value;
  struct Entry *next;
} Entry;

Entry *table[SIZE] = {NULL};

int hash(char *key) {
  int sum = 0;
  while (*key) sum += *key++;
  return sum % SIZE;
}

void insert(char *key, int value) {
  int idx = hash(key);
  Entry *e = malloc(sizeof(Entry));
  e->key = key;
  e->value = value;
  e->next = table[idx];   // prepend to the chain
  table[idx] = e;
}

int lookup(char *key) {
  int idx = hash(key);
  for (Entry *e = table[idx]; e != NULL; e = e->next) {
    if (strcmp(e->key, key) == 0) return e->value;
  }
  return -1;
}

void delete(char *key) {
  int idx = hash(key);
  Entry *e = table[idx];
  Entry *prev = NULL;
  while (e != NULL) {
    if (strcmp(e->key, key) == 0) {
      if (prev == NULL) table[idx] = e->next;
      else prev->next = e->next;
      free(e);
      return;
    }
    prev = e;
    e = e->next;
  }
}

Insert: prepend to the chain

void insert(char *key, int value) {
  int idx = hash(key);
  Entry *e = malloc(sizeof(Entry));
  e->key = key;
  e->value = value;
  e->next = table[idx];
  table[idx] = e;
}

Always prepend. O(1) regardless of chain length.

A subtlety: this version doesn't check for duplicate keys. Inserting the same key twice creates two entries; lookup returns the most recent (because it's at the head). For "update or insert" semantics:

void insert_or_update(char *key, int value) {
  int idx = hash(key);
  for (Entry *e = table[idx]; e != NULL; e = e->next) {
    if (strcmp(e->key, key) == 0) {
      e->value = value;   // update existing
      return;
    }
  }
  // insert new
  Entry *e = malloc(sizeof(Entry));
  e->key = key;
  e->value = value;
  e->next = table[idx];
  table[idx] = e;
}

Lookup: walk the chain

int lookup(char *key) {
  int idx = hash(key);
  for (Entry *e = table[idx]; e != NULL; e = e->next) {
    if (strcmp(e->key, key) == 0) return e->value;
  }
  return -1;
}

Hash to find the bucket, walk the chain, strcmp each key. Return on match.

Average chain length: n / SIZE (load factor). Average lookup: O(1 + load_factor). Good hash + reasonable load → near-constant time.

Worst case (all keys collide): O(n).

Delete: walk and splice

void delete(char *key) {
  int idx = hash(key);
  Entry *e = table[idx];
  Entry *prev = NULL;
  while (e != NULL) {
    if (strcmp(e->key, key) == 0) {
      if (prev == NULL) table[idx] = e->next;
      else prev->next = e->next;
      free(e);
      return;
    }
    prev = e;
    e = e->next;
  }
}

Walk the chain tracking prev. On match, splice the entry out (update prev->next or table[idx]), free.

Same logic as linked-list delete from episode 43.

Iterating all entries

void print_table() {
  for (int i = 0; i < SIZE; i++) {
    printf("[%d]", i);
    Entry *e = table[i];
    while (e != NULL) {
      printf(" -> %s:%d", e->key, e->value);
      e = e->next;
    }
    printf("\n");
  }
}

Walk every bucket; for each, walk its chain. O(SIZE + n) — visits every bucket plus every entry.

For "give me all entries":

void each(void (*fn)(const char *, int)) {
  for (int i = 0; i < SIZE; i++) {
    for (Entry *e = table[i]; e != NULL; e = e->next) {
      fn(e->key, e->value);
    }
  }
}

Function-pointer callback for visitor pattern.

Memory ownership

Entry *e = malloc(sizeof(Entry));
e->key = key;   // does the table own this string?

Two policies:

Borrow: the caller owns the key, the table just holds a pointer. Inserting a stack-buffer pointer is dangerous — the buffer reused, the table now points at garbage.

Own: the table copies the key on insert.

void insert(char *key, int value) {
  // ...
  e->key = strdup(key);   // copy
  e->value = value;
  // ...
}

void delete(char *key) {
  // ...
  free(e->key);   // free the copy
  free(e);
}

Stricter; safer; costs an extra malloc per entry. Typical for production code.

Resizing

When load factor exceeds a threshold (e.g., 0.75), double the table:

void resize() {
  int old_size = SIZE;
  Entry **old_table = table;
  table = calloc(new_size, sizeof(Entry *));

  for (int i = 0; i < old_size; i++) {
    Entry *e = old_table[i];
    while (e != NULL) {
      Entry *next = e->next;
      // rehash with new size
      int idx = hash(e->key);
      e->next = table[idx];
      table[idx] = e;
      e = next;
    }
  }
  free(old_table);
}

Walk every old entry, rehash into the new table. O(n) — but amortized over many inserts, it's still O(1) average.

For a fixed-size table, the load factor only grows. Eventually lookups slow to O(n).

Free the whole table

void free_table() {
  for (int i = 0; i < SIZE; i++) {
    Entry *e = table[i];
    while (e != NULL) {
      Entry *next = e->next;
      free(e->key);  // if owned
      free(e);
      e = next;
    }
    table[i] = NULL;
  }
}

Walk each bucket, free every entry. Set buckets to NULL for safety.

Open addressing variant (briefly)

For comparison, open addressing stores entries directly in the array:

typedef struct {
  char *key;
  int value;
  int occupied;
} Entry;

Entry table[SIZE];

void insert(char *key, int value) {
  int idx = hash(key);
  while (table[idx].occupied) {
    idx = (idx + 1) % SIZE;
  }
  table[idx].key = key;
  table[idx].value = value;
  table[idx].occupied = 1;
}

No malloc per entry — all slots in the array. Cache-friendly. But: deletion is tricky (can't just clear a slot or you break probe sequences for other keys; use "tombstone" markers instead).

Most production hash tables use open addressing for performance. Chaining is conceptually simpler.

Common mistakes

Comparing keys with ==. Pointer equality. Use strcmp for strings, custom equals for structs.

Borrowed key going out of scope. insert(stack_buf, 5); then stack_buf reused — table holds a stale pointer. Either copy keys or document the lifetime requirement.

Leak on delete. Free the entry; if you're storing a copied key, free that too.

Not resizing. Fixed-size table eventually has long chains. Double when load_factor > 0.75.

Hash function dependent on insertion order. Some weak hashes cluster predictably. djb2 or FNV are robust.

Iterating while deleting. Modifying the table during iteration confuses pointers. Collect keys first, then delete.

What's next

Episode 50: binary tree — create, insert, traverse. Hierarchical data with two children per node.

Recap

Chaining: each bucket is a linked list. Insert: hash, prepend. Lookup: hash, walk chain, strcmp. Delete: hash, walk, splice, free. Iteration: walk every bucket and every chain — O(SIZE + n). Decide if the table owns or borrows the key strings. Resize when load factor exceeds threshold. Open addressing is faster but more complex; chaining is simpler and good for learning.

Next episode: binary trees.

Ready? Take the quiz on the full lesson page →
Test what you've learned. Watch the lesson and try the interactive quiz on the same page.