C in 100 Seconds: Dynamic Arrays
Video: C in 100 Seconds: Dynamic Arrays | Episode 56 by CelesteAI
C Dynamic Arrays: Grow on Demand
Start with a small capacity. When full, double it via
realloc. Amortized O(1) per push despite occasional O(n) reallocations. The series finale.
C arrays are fixed-size. A dynamic array (sometimes called a "vector" or "ArrayList") grows as needed. Today's implementation: a struct holding data, size, and capacity, with realloc-based growth.
The basic shape
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int *data;
int size;
int capacity;
} DynArray;
DynArray create(int cap) {
DynArray a;
a.data = malloc(cap * sizeof(int));
a.size = 0;
a.capacity = cap;
return a;
}
void push(DynArray *a, int val) {
if (a->size == a->capacity) {
a->capacity *= 2;
a->data = realloc(a->data, a->capacity * sizeof(int));
printf("(grew to %d)\n", a->capacity);
}
a->data[a->size++] = val;
}
int main() {
DynArray arr = create(2);
push(&arr, 10);
push(&arr, 20);
push(&arr, 30); // triggers grow
push(&arr, 40);
push(&arr, 50); // triggers grow
for (int i = 0; i < arr.size; i++) printf("%d ", arr.data[i]);
printf("\n");
free(arr.data);
}
The struct: data + size + capacity
typedef struct {
int *data;
int size; // number of elements currently stored
int capacity; // total allocated slots
} DynArray;
Three fields:
data— pointer to the heap buffer.size— how many slots are filled (0tocapacity - 1).capacity— total slots allocated.
size <= capacity always. When they're equal, the next push must grow.
Doubling strategy
if (a->size == a->capacity) {
a->capacity *= 2;
a->data = realloc(a->data, a->capacity * sizeof(int));
}
When full, double the capacity. Why doubling?
If we grew by 1 each time, every push is O(n). Total cost of N pushes: O(N²).
If we double, occasional pushes are O(n) (the realloc), but the cost amortizes. Total cost of N pushes: O(N).
Amortized analysis: Imagine each push pays $3.
- $1 covers the basic write.
- $2 saved for the eventual move.
When the buffer doubles from N to 2N, we move N elements. Each was paid for by an earlier push's $2 savings. The math works: every push contributes $3, and total work over N pushes is ≤ 3N = O(N). Average per push: O(1).
realloc trap
a->data = realloc(a->data, new_size); // BAD if realloc fails
If realloc fails (returns NULL), a->data is now NULL but the original buffer is still alive — leaked.
The safe form (from episode 21):
int *tmp = realloc(a->data, new_size);
if (tmp == NULL) {
// realloc failed; a->data still valid
// handle error
return;
}
a->data = tmp;
For demos we skip the check; for production, always handle.
Other operations
int get(DynArray *a, int i) {
if (i < 0 || i >= a->size) { /* error */ }
return a->data[i];
}
void set(DynArray *a, int i, int val) {
if (i < 0 || i >= a->size) return;
a->data[i] = val;
}
int pop(DynArray *a) {
if (a->size == 0) return -1;
return a->data[--a->size];
}
void clear(DynArray *a) {
a->size = 0; // capacity unchanged; data not zeroed
}
void destroy(DynArray *a) {
free(a->data);
a->data = NULL;
a->size = a->capacity = 0;
}
get and set are O(1). pop is O(1) (just decrement size; data isn't actually freed). clear is O(1) (just set size to 0; data buffer kept).
Insert and remove (middle)
void insert(DynArray *a, int i, int val) {
if (a->size == a->capacity) grow(a);
for (int j = a->size; j > i; j--) {
a->data[j] = a->data[j - 1];
}
a->data[i] = val;
a->size++;
}
void remove_at(DynArray *a, int i) {
for (int j = i; j < a->size - 1; j++) {
a->data[j] = a->data[j + 1];
}
a->size--;
}
Insert and remove in the middle are O(n) — shifting elements. Push and pop at the end are O(1).
For frequent middle inserts, a linked list might be better. For mostly-end operations, dynamic array wins.
Shrinking
void shrink_to_fit(DynArray *a) {
if (a->size < a->capacity) {
a->data = realloc(a->data, a->size * sizeof(int));
a->capacity = a->size;
}
}
If you've popped a lot, the unused capacity is wasted memory. shrink_to_fit reallocates to exactly size bytes.
Most implementations also auto-shrink when size drops below ¼ of capacity (avoids thrash if the array oscillates around half-full).
C++ vector and other equivalents
C doesn't have a built-in vector. Equivalents in other languages:
- C++:
std::vector<int>. - Python:
list. - Rust:
Vec<T>. - Java:
ArrayList<>. - Go: built-in slices.
All are some flavor of "dynamic array" with similar amortized O(1) push.
Generic dynamic arrays
For any type, parameterize via macros or void pointers:
// Macro approach
#define DYN_ARRAY(T) struct { T *data; int size, capacity; }
DYN_ARRAY(int) ints;
DYN_ARRAY(char *) strings;
The klib's kvec.h is a real-world dynamic-array library using this pattern. It's macro-heavy but type-safe and fast.
For real code, consider:
- uthash / utlist / utarray — header-only C utilities.
- klib — fast, single-header.
- std-c-libs — various.
- Or just write it inline — the code is small.
Why dynamic arrays beat linked lists
| Aspect | Dynamic Array | Linked List |
|---|---|---|
| Random access | O(1) | O(n) |
| Push back | Amortized O(1) | O(1) (with tail ptr) |
| Push front | O(n) | O(1) |
| Insert middle | O(n) | O(n) traversal + O(1) splice |
| Memory per element | 1 element | element + 8 bytes (pointer) |
| Cache locality | Excellent | Poor (scattered) |
For most workloads, dynamic arrays win. Cache locality alone makes them dramatically faster for iteration even when big-O is the same.
Linked lists are better only for: (a) frequent insertions/deletions in the middle by node pointer, (b) cases where you can't tolerate the occasional realloc cost.
The series ends here
Across 56 episodes, we've covered:
- Foundations: main, printf, variables, types, formatting, arithmetic, conditionals, loops.
- Functions and arrays: definitions, calls, scope; arrays, strings, scanf/fgets.
- Constants and casting: #define, const, enum, typedef, type conversions.
- Pointers: the core mechanic — addresses, dereferencing, arithmetic, pass-by-reference, void/double pointers, function pointers.
- Memory: stack vs heap, malloc/calloc/realloc/free, leaks, structs, unions.
- Strings: strlen, strcpy, strcmp, strcat, strchr, strstr, strtok, sprintf/snprintf.
- Files: fopen, fgets, fprintf, fread/fwrite, error handling.
- Build system: preprocessor, multi-file, headers, Makefile.
- Data structures: linked lists (singly + doubly), stacks, queues, hash tables, binary trees + BSTs.
- Algorithms: bubble sort, merge sort, quicksort.
That's a complete tour of C — enough to read most C code, write your own programs, and tackle real-world C projects (kernel modules, embedded firmware, system tools).
Next steps
Pick a project. Some ideas:
- A small interpreter or compiler. Tokenizer, parser, AST, evaluator.
- A networked tool. TCP server/client; HTTP request library.
- A game. Simple terminal game (snake, tetris).
- A data tool. Parse a custom file format; output transformations.
- A library you'd use. A hash map, JSON parser, regex matcher.
The deeper magic of C — pointer arithmetic, memory management, the relationship to the machine — only solidifies when you build something nontrivial. The episodes are scaffolding; what you build with them is the point.
Recap
Dynamic array: { T *data; int size, capacity; }. Push checks if full, doubles capacity via realloc, then writes. Amortized O(1) push despite occasional O(n) reallocations. O(1) get/set/pop. O(n) insert/remove in middle. Beats linked lists on cache locality and random access — the default growable container in modern code. Always pair create with destroy to free the buffer.
Series complete. Build something with C — the only way to truly learn it.