Part of C in 100s

C in 100 Seconds: Merge Sort

Celest KimCelest Kim

Video: C in 100 Seconds: Merge Sort | Episode 54 by CelesteAI

Take the quiz on the full lesson page
Test what you've read · interactive walkthrough

C Merge Sort

Recursive divide and conquer. Split the array in half, sort each half, merge. O(n log n) worst case. Stable, but uses O(n) extra memory.

Merge sort is a classic divide-and-conquer algorithm. It guarantees O(n log n) — no degenerate cases. The merge step is the heart.

The basic shape

#include <stdio.h>

void merge(int arr[], int left, int mid, int right) {
  int i = left, j = mid + 1, k = 0;
  int temp[right - left + 1];

  while (i <= mid && j <= right) {
    if (arr[i] <= arr[j])
      temp[k++] = arr[i++];
    else
      temp[k++] = arr[j++];
  }
  while (i <= mid) temp[k++] = arr[i++];
  while (j <= right) temp[k++] = arr[j++];

  for (int i = 0; i < k; i++)
    arr[left + i] = temp[i];
}

void merge_sort(int arr[], int left, int right) {
  if (left >= right) return;
  int mid = (left + right) / 2;
  merge_sort(arr, left, mid);
  merge_sort(arr, mid + 1, right);
  merge(arr, left, mid, right);
}

int main() {
  int nums[] = {38, 27, 43, 3, 9, 82, 10};
  int n = sizeof(nums) / sizeof(nums[0]);

  merge_sort(nums, 0, n - 1);

  for (int i = 0; i < n; i++) printf("%d ", nums[i]);
  printf("\n");
  // 3 9 10 27 38 43 82
}

Divide and conquer

void merge_sort(int arr[], int left, int right) {
  if (left >= right) return;          // base: single element or empty
  int mid = (left + right) / 2;
  merge_sort(arr, left, mid);          // sort left half
  merge_sort(arr, mid + 1, right);     // sort right half
  merge(arr, left, mid, right);         // combine
}

Three steps:

  1. Divide. Split at the middle.
  2. Conquer. Recursively sort each half.
  3. Combine. Merge the two sorted halves.

The base case is left >= right — a single element (or empty range) is trivially sorted.

For an array of 8: 8 → 4+4 → 2+2+2+2 → 1+1+1+1+1+1+1+1 → merge upward. log₂(8) = 3 levels of recursion.

merge — the key step

void merge(int arr[], int left, int mid, int right) {
  int i = left, j = mid + 1, k = 0;
  int temp[right - left + 1];

  while (i <= mid && j <= right) {
    if (arr[i] <= arr[j])
      temp[k++] = arr[i++];
    else
      temp[k++] = arr[j++];
  }
  while (i <= mid) temp[k++] = arr[i++];
  while (j <= right) temp[k++] = arr[j++];

  for (int i = 0; i < k; i++)
    arr[left + i] = temp[i];
}

Two sorted subranges (arr[left..mid] and arr[mid+1..right]) are merged into a temp buffer in sorted order, then copied back.

The main loop: compare the front of each subrange; take the smaller; advance that index.

The two follow-up loops: drain whichever range still has items.

The copy-back: put the result into place.

The <= (not <) preserves stability — when equal, we take from the left subrange first.

Walkthrough on [38, 27, 43, 3]

Split: [38, 27] and [43, 3].

Recursive sort on [38, 27]: - Split: [38] and [27]. - Each is sorted (length 1). - Merge: [27, 38].

Recursive sort on [43, 3]: - Split: [43] and [3]. - Each is sorted. - Merge: [3, 43].

Merge [27, 38] and [3, 43]: - Compare 27, 3 → take 3. - Compare 27, 43 → take 27. - Compare 38, 43 → take 38. - Drain right: take 43. - Result: [3, 27, 38, 43].

Complexity

Aspect Bubble Sort Merge Sort
Best O(n) O(n log n)
Average O(n²) O(n log n)
Worst O(n²) O(n log n)
Memory O(1) O(n)
Stable Yes Yes

Merge sort wins on speed (O(n log n) vs O(n²)) — for 1,000,000 elements, ~20 million ops vs 500 billion. The cost is O(n) extra memory for the temp buffer.

VLA caveat

int temp[right - left + 1];   // Variable-Length Array

This is a C99 feature. Some compilers (MSVC) don't support VLAs. C11 made them optional. For portable code, use malloc:

int *temp = malloc((right - left + 1) * sizeof(int));
// ... use temp ...
free(temp);

Or allocate one big buffer at the top of merge_sort and pass it down — avoids malloc per recursion.

In-place mergesort exists

Standard mergesort uses O(n) extra memory. There are in-place variants (e.g., block merge) — much harder to implement and slower in practice. The O(n) memory is usually fine.

Why O(n log n)

Each level of recursion processes n elements (across all subarrays). There are log₂ n levels. Total: n × log n.

The merge step at each level is linear in the size of the subarrays it merges. Sum of all merges at level k: O(n).

This is the fundamental lower bound for comparison-based sorting — you can't do better than O(n log n) in the worst case (proven via decision-tree arguments).

When to use

  • Linked lists. Mergesort works without random access. No O(n) extra memory needed (you reuse the list nodes).
  • External sorts. Sort huge files that don't fit in memory by sorting chunks and merging.
  • Stability matters. qsort is not guaranteed stable; mergesort is.
  • Worst-case guarantees. For real-time systems, the no-bad-input property of mergesort matters.

For general-purpose sorting, quicksort (next episode) is usually faster in practice — better cache locality, no extra memory.

Mergesort on linked lists

typedef struct Node { int data; struct Node *next; } Node;

Node *merge_sort_list(Node *head) {
  if (head == NULL || head->next == NULL) return head;

  // split into halves
  Node *slow = head, *fast = head, *prev = NULL;
  while (fast != NULL && fast->next != NULL) {
    prev = slow;
    slow = slow->next;
    fast = fast->next->next;
  }
  prev->next = NULL;   // sever

  Node *left = merge_sort_list(head);
  Node *right = merge_sort_list(slow);

  // merge ...
}

Linked-list mergesort doesn't need extra memory — splice nodes during merge.

Common mistakes

Off-by-one in mid. (left + right) / 2 for inclusive bounds; (left + right + 1) / 2 shifts the split. Be consistent.

Stable vs not. arr[i] <= arr[j] for stability; arr[i] < arr[j] is unstable.

Memory in temp. Forgetting that VLA isn't always available, or forgetting to free if using malloc.

Inclusive vs exclusive right. This implementation uses inclusive right (so the array length is right - left + 1). right = n - 1 for the full array.

Recursion depth. For 4-billion-element arrays, log₂ ≈ 32 stack frames — fine. For deeper recursion, iterative mergesort exists (bottom-up).

What's next

Episode 55: quicksort. Faster in practice, in-place, but worst-case O(n²) on bad pivots.

Recap

Mergesort: divide and conquer. Split at mid, recursively sort halves, merge. O(n log n) guaranteed — no bad inputs. Stable if you use <= in the merge comparison. Costs O(n) extra memory for the temp buffer. For linked lists, no extra memory needed. Use over bubble sort when n is large; choose between mergesort and quicksort based on stability needs and memory budget.

Next episode: quicksort.

Ready? Take the quiz on the full lesson page →
Test what you've learned. Watch the lesson and try the interactive quiz on the same page.