medium · CS · Binary Number Systems · March 14, 2026 · claude-opus-4-6
Binary Endianness: Big-Endian vs. Little-Endian in Multi-Byte Systems
Endianness is simply the answer to the question: when you have a number too large to fit in one byte and you need to lay its bytes out in memory (a sequence of addresses), do you store the most-significant byte first (big-endian, like how we write numbers left-to-right) or the least-significant byte first (little-endian, like reading a number starting from the ones digit)? The concept is trivial in isolation, but it becomes a critical source of bugs the moment two systems that disagree on the convention try to exchange data.
Deep Explanation
When a value occupies more than one byte — a 32-bit integer, a 16-bit sensor reading, a 64-bit floating-point number — the hardware must decide which byte to place at the lowest memory address. Big-endian (BE) stores the most-significant byte (MSB) at the lowest address, mirroring the way humans write decimal numbers with the largest place value on the left. Little-endian (LE) stores the least-significant byte (LSB) at the lowest address. For example, the 32-bit hexadecimal value 0x44332211 would appear in memory as bytes [0x44, 0x33, 0x22, 0x11] on a big-endian machine and [0x11, 0x22, 0x33, 0x44] on a little-endian machine.
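The two layouts described above can be reproduced without touching real memory. The following is a minimal sketch using Python's built-in `int.to_bytes`; the helper `hex_bytes` is a hypothetical name introduced only for display.

```python
import sys

value = 0x44332211

# int.to_bytes produces the bytes in the requested order,
# independent of the byte order of the CPU running this script.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")


def hex_bytes(data: bytes) -> str:
    """Format a byte string as space-separated hex literals (illustrative helper)."""
    return " ".join(f"0x{b:02X}" for b in data)


# Index 0 of each byte string corresponds to the lowest memory address.
print("big-endian:   ", hex_bytes(big))
print("little-endian:", hex_bytes(little))

# For comparison, the native order of the machine running the script.
print("host order:   ", sys.byteorder)
```

Reading the bytes back with `int.from_bytes(big, "big")` or `int.from_bytes(little, "little")` recovers the original value in both cases; only a mismatched pairing goes wrong.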
Byte ordering is invisible as long as code reads and writes whole values on a single CPU architecture. It becomes visible the moment data crosses a boundary: written to a file, sent over a network, or shared between a big-endian device and a little-endian x86 server. (An ARM Cortex-M microcontroller talking to an x86 server is usually not such a case, since both normally run little-endian.) It also shows up within one machine when a multi-byte value is inspected byte by byte, for example through a type-punning pointer cast in C. Whenever two systems exchange data, both sides must agree on byte order or the values will be silently misinterpreted. A 16-bit temperature reading of 0x0100 (256 in decimal) becomes 0x0001 (1 in decimal) if the receiver assumes the opposite endianness. This class of bug is notoriously hard to catch because the program does not crash; it simply produces wrong numbers.
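The silent misinterpretation of the 16-bit reading can be sketched with Python's `struct` module, where `>` and `<` select big-endian and little-endian explicitly. The sender/receiver framing and the variable names are assumptions made for illustration.

```python
import struct

# Sender: a 16-bit reading of 0x0100 (256), serialized big-endian.
wire = struct.pack(">H", 0x0100)  # two bytes: 0x01 then 0x00

# Receiver A agrees on big-endian and recovers the original value.
(correct,) = struct.unpack(">H", wire)

# Receiver B wrongly assumes little-endian. No exception is raised;
# the result is simply a different, plausible-looking number.
(wrong,) = struct.unpack("<H", wire)

print(correct, wrong)  # 256 1
```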
Historically, Motorola 68000 processors and several classic RISC architectures (SPARC, PowerPC in its default mode) used big-endian ordering, while Intel x86 and x86-64 processors, and therefore the vast majority of desktop PCs and servers today, use little-endian. Modern ARM processors are bi-endian (they can be configured either way), but in practice ARMv7 and ARMv8 systems almost universally run in little-endian mode (Android, iOS, Linux on ARM). Network protocols, however, standardized on big-endian byte order decades ago, which is why 'network byte order' is synonymous with big-endian. The POSIX functions htons(), htonl(), ntohs(), and ntohl() exist precisely to convert 16-bit and 32-bit values between host byte order and network byte order.
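Python exposes the same conversions through its `socket` module, which makes the relationship between host order and network order easy to check. This is a small sketch; the port number and the address (192.0.2.1, from the range reserved for documentation) are arbitrary illustrative values.

```python
import socket
import struct

port = 8080        # hypothetical port number (0x1F90)
addr = 0xC0000201  # 192.0.2.1 as a 32-bit integer, illustrative only

# "!" selects network byte order (big-endian) regardless of the host CPU.
wire_port = struct.pack("!H", port)
print(wire_port.hex())  # 1f90

# htons() returns an integer whose native in-memory layout equals the
# network-order bytes. "=" packs using the host's native byte order.
host_layout = struct.pack("=H", socket.htons(port))
print(host_layout == wire_port)  # True on both little- and big-endian hosts

# The 32-bit variant behaves the same way.
print(struct.pack("=I", socket.htonl(addr)) == struct.pack("!I", addr))  # True

# ntohs() undoes htons(), so a round trip returns the original value.
print(socket.ntohs(socket.htons(port)) == port)  # True
```

On a big-endian host `htons()` and `htonl()` return their argument unchanged, which is why portable code calls them unconditionally instead of checking the host's byte order first.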
In embedded systems, endianness surfaces constantly: when parsing binary sensor data over SPI or I²C, when reading registers from a peripheral whose datasheet specifies a particular byte order, and when writing firmware that must serialize structs into flash memory or transmit them over a CAN bus. Data serialization formats handle the problem in different ways. Protocol Buffers and MessagePack each fix the byte order in their wire specification (Protocol Buffers uses least-significant-group-first varints and little-endian fixed-width fields; MessagePack uses big-endian), while formats like TIFF include an endianness marker in the file header ('II' for little-endian, 'MM' for big-endian). Understanding endianness is not about memorizing which CPUs use which order. It is about recognizing every point in your system where bytes cross a system boundary and ensuring that an explicit, documented conversion happens there.
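As one illustration of an explicit conversion at a boundary, the sketch below reads the 8-byte TIFF header (byte-order marker, the magic number 42, and the offset of the first image file directory) and lets the marker choose the `struct` byte order. The function name `parse_tiff_header` and the two sample headers are hypothetical; a real reader would go on to parse the directory entries, which this sketch does not attempt.

```python
import struct


def parse_tiff_header(header: bytes):
    """Return (byte_order, magic, first_ifd_offset) from the first 8 bytes of a TIFF file."""
    if len(header) < 8:
        raise ValueError("a TIFF header is at least 8 bytes long")

    marker = header[:2]
    if marker == b"II":
        prefix, order = "<", "little"  # 'II': little-endian
    elif marker == b"MM":
        prefix, order = ">", "big"     # 'MM': big-endian
    else:
        raise ValueError(f"unknown byte-order marker: {marker!r}")

    # Every later multi-byte field is decoded with the order the file declared,
    # never with the order of the machine running this code.
    magic, ifd_offset = struct.unpack(prefix + "HI", header[2:8])
    if magic != 42:
        raise ValueError(f"unexpected magic number: {magic}")
    return order, magic, ifd_offset


# Two hand-built headers carrying the same logical content in opposite byte orders.
le_header = b"II" + struct.pack("<HI", 42, 8)
be_header = b"MM" + struct.pack(">HI", 42, 8)

print(parse_tiff_header(le_header))  # ('little', 42, 8)
print(parse_tiff_header(be_header))  # ('big', 42, 8)
```

The byte strings differ, yet both parse to the same values because the conversion is driven by the marker and not by the host CPU.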
Real-World Examples
- TCP/IP networking: Every multi-byte IP header field (source address, destination address, total length, checksum) is transmitted in big-endian (network byte order). The BSD sockets API expects the port and address fields of a sockaddr_in structure in network byte order, so programmers must call htons() and htonl() when filling them in. Forgetting this conversion on a little-endian x86 Linux box makes the program bind or connect to the wrong port or address: port 8080 (0x1F90), for example, is interpreted as 0x901F, which is port 36895.
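- ARM Cortex-M embedded firmware reading an I²C temperature sensor (e.g., TI TMP102): The sensor transmits its temperature register MSB-first (big-endian) across two bytes; in the default 12-bit mode the reading is left-justified in those 16 bits. Firmware on a little-endian Cortex-M core that copies the two received bytes directly into a 16-bit variable gets them in the wrong order. It must swap them or, better, assemble the value explicitly (first byte shifted left by 8, ORed with the second byte) and then shift right by 4 to drop the unused low bits. Without that step the temperature will be wildly incorrect.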
- TIFF image file format: The first two bytes of every TIFF file are either 0x4949 ('II', Intel, little-endian) or 0x4D4D ('MM', Motorola, big-endian), telling any reader how to interpret every subsequent multi-byte field in the file. This self-describing endianness marker is a classic design pattern adopted by many binary formats.
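- Google Protocol Buffers (protobuf): Protobuf does not avoid byte order so much as fix it in the wire specification. Integers are usually encoded as variable-length varints, where each byte carries 7 data bits plus a continuation flag and the 7-bit groups are emitted least-significant first; fixed-width fields (fixed32, fixed64, float, double) are little-endian. Because encoders and decoders build these bytes with shifts and masks instead of copying raw memory, the result does not depend on the host CPU, and protobuf messages are portable across architectures without host-specific byte-swapping code.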
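The varint scheme mentioned in the last example can be sketched in a few lines. This is a minimal illustration for non-negative integers only, not the protobuf library's implementation: the function names are hypothetical, and signed (zigzag) values and the 64-bit length limit are deliberately left out.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, least-significant group first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        group = n & 0x7F  # low 7 bits of what remains
        n >>= 7
        if n:
            out.append(group | 0x80)  # continuation flag: more groups follow
        else:
            out.append(group)         # final group: flag bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


print(encode_varint(300).hex())        # ac02
print(decode_varint(b"\xac\x02"))      # 300
```

Both functions work purely with shifts and masks on integer values, so they produce the same bytes on any host. The group order is still a fixed convention that both ends must implement the same way.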
endianness · byte-ordering · multi-byte data · embedded systems · data serialization · network protocols