
C in 100 Seconds: Binary File I/O

Celest Kim

Video: C in 100 Seconds: Binary File I/O — fread and fwrite | Episode 37 by Taught by Celeste AI - AI Coding Coach


C Binary File I/O: fread and fwrite

fwrite(data, size, count, f) writes raw bytes. fread(buf, size, count, f) reads them back. Use them for structs, arrays of numbers, and any data where a text representation isn't appropriate.

Binary I/O is about writing memory contents to disk verbatim — no formatting, no parsing. Useful for serializing structs, storing data tables, or interfacing with binary file formats.

The basic shape

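#include <stdio.h>

typedef struct {
  char name[20];
  int age;
  float score;
} Student;

int main(void) {
  Student students[3] = {
    {"Alice", 20, 95.5f},
    {"Bob", 22, 87.3f},
    {"Carol", 21, 91.0f},
  };

  // Write all three structs as raw bytes
  FILE *f = fopen("/tmp/students.dat", "wb");
  if (f == NULL) {
    perror("fopen");
    return 1;
  }
  size_t written = fwrite(students, sizeof(Student), 3, f);
  if (fclose(f) != 0 || written != 3) {
    fprintf(stderr, "write failed\n");
    return 1;
  }

  // Read them back into a separate array
  Student loaded[3];
  FILE *f2 = fopen("/tmp/students.dat", "rb");
  if (f2 == NULL) {
    perror("fopen");
    return 1;
  }
  size_t count = fread(loaded, sizeof(Student), 3, f2);  // fread returns size_t
  fclose(f2);

  for (size_t i = 0; i < count; i++) {
    printf("%s, %d, %.1f\n", loaded[i].name, loaded[i].age, loaded[i].score);
  }

  return 0;
}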

fwrite

size_t fwrite(const void *ptr, size_t size, size_t count, FILE *stream);

Writes count items of size bytes each from ptr to stream. Returns the number of items successfully written.

fwrite(students, sizeof(Student), 3, f);
// writes 3 × sizeof(Student) bytes

The byte layout written matches the struct's in-memory layout — including any padding bytes the compiler added.
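Since fwrite reports how many items it actually wrote, a short count signals a failed write (a full disk, for example). The following is a minimal sketch of a checked write; save_items is a hypothetical helper name, not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: writes `count` items of `size` bytes to `path`.
   Returns 0 only if every item was written and the file closed cleanly. */
int save_items(const char *path, const void *items, size_t size, size_t count) {
  FILE *f = fopen(path, "wb");
  if (f == NULL) {
    return -1;                        /* could not open the file */
  }
  size_t written = fwrite(items, size, count, f);
  /* fclose flushes buffered data, so its result matters too */
  int close_failed = (fclose(f) != 0);
  if (written != count || close_failed) {
    return -1;
  }
  return 0;
}
```

A call such as save_items("/tmp/students.dat", students, sizeof(Student), 3) would then return 0 on success and -1 on any failure.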

fread

size_t fread(void *ptr, size_t size, size_t count, FILE *stream);

Reads up to count items of size bytes each into ptr. Returns the number actually read, which is less than count if EOF is reached early or a read error occurs.

Student loaded[3];
size_t n = fread(loaded, sizeof(Student), 3, f);  // fread returns size_t, not int
if (n != 3) {
  printf("Only got %zu students\n", n);
}

Always check the return value — partial reads are normal at EOF.
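A short count alone does not say whether the file simply ended or the read failed. One way to tell the two apart is to check ferror after the call. The sketch below assumes a file of raw int values; read_ints and its status codes are hypothetical names chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `max` ints and reports why it stopped.
   Returns the number of ints read; *status is 0 for a full read,
   1 for end-of-file, 2 for an I/O error. */
size_t read_ints(FILE *f, int *out, size_t max, int *status) {
  size_t n = fread(out, sizeof(int), max, f);
  if (n == max) {
    *status = 0;
  } else if (ferror(f)) {
    *status = 2;
  } else {
    *status = 1;   /* feof(f) is set: the file had fewer items than requested */
  }
  return n;
}
```

Asking for 5 ints from a file that holds 3 would return 3 with status 1, which is the normal "partial read at EOF" case.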

Why binary?

| Aspect | Text | Binary |
|---|---|---|
| Human-readable | Yes | No |
| Editable | Yes (any text editor) | Hex editor only |
| Portable | Yes | No (endianness, alignment) |
| Compact | Verbose | Tight |
| Fast to parse | Slow (string ops) | Fast (memcpy) |
| Cross-platform | Yes | Care needed |
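Use binary when speed and compactness matter; text when humans need to read or edit. Many widely used file formats are binary under the hood (PDF, ZIP, MP3, executables). They trade human readability for compactness, and they stay portable by specifying an exact byte layout rather than dumping native structs.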

Endianness

Little-endian (x86, ARM): least significant byte first. The integer 0x12345678 is stored as 78 56 34 12.

Big-endian (PowerPC, network protocols): most significant first. 12 34 56 78.

A binary file written on a little-endian machine and read on a big-endian one gets bytes interpreted differently → garbage values. Solutions:

  • Use a known endianness. Many formats are big-endian by convention (the network byte order).
  • Use htonl/ntohl (32-bit) or htons/ntohs (16-bit) from the POSIX header <arpa/inet.h> to convert between host and network byte order.
  • Add an endianness marker at the start of the file (Unicode BOM is one example).

For local-only use, native endianness is fine.
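The "known endianness" option can also be done by hand with shifts, which avoids depending on <arpa/inet.h>. The sketch below stores a 32-bit value as four big-endian bytes regardless of the host CPU; write_u32_be and read_u32_be are hypothetical helper names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes v as four bytes, most significant first. */
int write_u32_be(FILE *f, uint32_t v) {
  unsigned char b[4];
  b[0] = (unsigned char)(v >> 24);
  b[1] = (unsigned char)(v >> 16);
  b[2] = (unsigned char)(v >> 8);
  b[3] = (unsigned char)v;
  return fwrite(b, 1, 4, f) == 4 ? 0 : -1;
}

/* Hypothetical helper: reads four big-endian bytes back into *out. */
int read_u32_be(FILE *f, uint32_t *out) {
  unsigned char b[4];
  if (fread(b, 1, 4, f) != 4) {
    return -1;
  }
  *out = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
         ((uint32_t)b[2] << 8) | (uint32_t)b[3];
  return 0;
}
```

Because the shifts operate on values rather than on memory layout, the same code produces the bytes 12 34 56 78 for 0x12345678 on both little-endian and big-endian machines.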

Struct padding

typedef struct {
  char c;     // 1 byte
  int i;      // 4 bytes
} S;

sizeof(S)   // typically 8, not 5: 3 bytes of padding after c (with a 4-byte, 4-byte-aligned int)

The padding bytes contain whatever was in memory before — usually garbage, sometimes leaking sensitive data (security concern).

To reduce padding, order fields from largest to smallest (trailing padding may still remain). To eliminate it entirely, use compiler-specific directives:

#pragma pack(1)
typedef struct {
  char c;
  int i;
} S;
#pragma pack()

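#pragma pack(1) forces 1-byte alignment, so no padding is inserted. The cost is slower memory access on platforms that prefer aligned reads, and the pragma itself is a compiler extension (supported by GCC, Clang, and MSVC) rather than standard C.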
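To see where padding sits without packing the struct, offsetof from <stddef.h> reports each field's position. The sketch below also zeroes the whole object before filling in the fields, which in practice keeps leftover memory out of the padding bytes (the C standard does not strictly guarantee padding stays zero after a member store). Padded, padded_init, and padded_gap are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  char c;
  int i;
} Padded;

/* Hypothetical helper: zero every byte first, padding included,
   then assign the real fields. */
void padded_init(Padded *p, char c, int i) {
  memset(p, 0, sizeof *p);
  p->c = c;
  p->i = i;
}

/* Bytes of padding between c and i on this compiler and platform. */
size_t padded_gap(void) {
  return offsetof(Padded, i) - sizeof(char);
}
```

On a typical platform with a 4-byte int, padded_gap() would return 3, matching the sizeof result discussed above.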

Cursor position

After fread(arr, sizeof(T), N, f), the file cursor advances N * sizeof(T) bytes. Subsequent reads continue from there.

To rewind: rewind(f) or fseek(f, 0, SEEK_SET).

To skip: fseek(f, offset, SEEK_CUR).
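Because every record has the same size, fseek can jump straight to record number n at offset n * sizeof(T). The following sketch assumes a file of back-to-back Student records as in the first example; read_student_at is a hypothetical helper name.

```c
#include <stdio.h>

typedef struct {
  char name[20];
  int age;
  float score;
} Student;

/* Hypothetical helper: reads the record at zero-based `index`.
   Returns 0 on success, -1 if the seek or the read fails. */
int read_student_at(FILE *f, long index, Student *out) {
  if (fseek(f, index * (long)sizeof(Student), SEEK_SET) != 0) {
    return -1;
  }
  /* A read past the last record returns 0 items, reported as -1 here */
  return fread(out, sizeof(Student), 1, f) == 1 ? 0 : -1;
}
```

Calling read_student_at(f, 2, &s) on the three-student file would load Carol without reading Alice or Bob first.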

Reading until EOF

Student s;
while (fread(&s, sizeof(Student), 1, f) == 1) {
  // process s
}

Read one record at a time. The loop continues while exactly 1 item is read and stops when fread returns 0, which means EOF or a read error; call ferror(f) afterward to tell them apart.

File size

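fseek(f, 0, SEEK_END);
long size = ftell(f);        // total bytes; -1L on failure
fseek(f, 0, SEEK_SET);

size_t count = size > 0 ? (size_t)size / sizeof(Student) : 0;
Student *arr = malloc(count * sizeof(Student));  // malloc needs <stdlib.h>
if (arr != NULL) {
  count = fread(arr, sizeof(Student), count, f); // items actually read
}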

Get total size, divide by item size, allocate, read all. Standard "read whole file" idiom.

For huge files, prefer streaming (item-by-item or chunk-by-chunk) — don't malloc 10 GB.
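A chunk-by-chunk reader keeps memory use fixed no matter how large the file is. The sketch below sums every int in a file while holding at most CHUNK items at once; sum_ints_streaming and the chunk size of 1024 are illustrative choices, not something the lesson prescribes.

```c
#include <stdio.h>

enum { CHUNK = 1024 };   /* items per read; an arbitrary illustrative size */

/* Hypothetical helper: adds up all ints in f using a fixed-size buffer.
   Returns 0 on success, -1 if a read error occurred. */
int sum_ints_streaming(FILE *f, long long *total) {
  int buf[CHUNK];
  size_t n;
  *total = 0;
  /* The final chunk is usually short; fread returns how many items it got */
  while ((n = fread(buf, sizeof(int), CHUNK, f)) > 0) {
    for (size_t i = 0; i < n; i++) {
      *total += buf[i];
    }
  }
  return ferror(f) ? -1 : 0;
}
```

The same loop shape works for structs: replace int with the record type and process each of the n items in the inner loop.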

Writing arrays of numbers

int nums[] = {10, 20, 30, 40, 50};
FILE *f = fopen("/tmp/nums.dat", "wb");
fwrite(nums, sizeof(int), 5, f);
fclose(f);

sizeof(int) * 5 bytes are written: 20 on platforms where int is 4 bytes.

Reading back:

int loaded[5];
FILE *f = fopen("/tmp/nums.dat", "rb");
fread(loaded, sizeof(int), 5, f);
fclose(f);

Common mistakes

Forgetting "b" mode on Windows. Without b, Windows translates \n to \r\n on write and \r\n back to \n on read for "text" files. Binary data that happens to contain those byte values gets corrupted. Always use "rb"/"wb" for binary files.

Endianness assumptions. Files written on x86 read incorrectly on big-endian PowerPC. For portability, define a fixed byte order.

Struct padding leaks. Writing a struct can leak whatever was in its uninitialized padding bytes. Memset the struct to zero before populating it, or pack the struct.

Pointer fields in structs. fwrite writes the pointer values (memory addresses) — meaningless when read back in a different process. Serialize pointed-to data separately.

Different struct layouts across compilers. GCC and MSVC may pad differently. For cross-compiler files, define an explicit byte layout (e.g., with #pragma pack or manual byte assembly).

Mismatched read sizes. fwrite(arr, 8, 5, f) (5 items of 8 bytes) and fread(arr, 4, 10, f) (10 items of 4 bytes) both transfer 40 bytes, but the reader interprets them as a different element type. Read with the same element size that was used for writing.
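For the pointer-field mistake above, one common approach is to write a length followed by the pointed-to bytes instead of the pointer itself. The sketch below does this for a string field; Item, item_write, and item_read are hypothetical names, the length is stored in native byte order, and a production reader would also cap the length it accepts.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with a pointer field. */
typedef struct {
  int id;
  char *label;   /* heap string: the address itself must not be written */
} Item;

/* Writes id, then the label length, then the label bytes (no terminator). */
int item_write(FILE *f, const Item *it) {
  uint32_t len = (uint32_t)strlen(it->label);
  if (fwrite(&it->id, sizeof it->id, 1, f) != 1) return -1;
  if (fwrite(&len, sizeof len, 1, f) != 1) return -1;
  if (len > 0 && fwrite(it->label, 1, len, f) != len) return -1;
  return 0;
}

/* Reads one Item; on success the caller owns it->label and must free it. */
int item_read(FILE *f, Item *it) {
  uint32_t len;
  if (fread(&it->id, sizeof it->id, 1, f) != 1) return -1;
  if (fread(&len, sizeof len, 1, f) != 1) return -1;
  it->label = malloc((size_t)len + 1);
  if (it->label == NULL) return -1;
  if (len > 0 && fread(it->label, 1, len, f) != len) {
    free(it->label);
    it->label = NULL;
    return -1;
  }
  it->label[len] = '\0';   /* restore the terminator that was not stored */
  return 0;
}
```

The string read back lives at a fresh address in the reading process, which is why storing the original pointer value would have been useless.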

What's next

Episode 38: command-line arguments — argc and argv. Receiving input via the program's invocation.

Recap

fwrite(ptr, size, count, f) for raw bytes; fread to read back. Always open with "wb"/"rb" for binary. Returns number of items actually read/written. Endianness, padding, and pointers are the three traps for portable binary files. For files meant to outlive the current platform, define an explicit format with fixed byte order. For local-only data caches, native binary is fine.

Next episode: command-line arguments.
