When allocation kills performance
You are building a game engine. The level data arrives as a 2GB binary file. You map it into memory. Now you need to read the header to find the texture offsets. Standard deserialization allocates a new struct, copies bytes field by field, and you just burned cycles on a memory allocation for a struct that lives for a microsecond. Or you are parsing network packets in a high-throughput proxy. Every allocation triggers pressure on the allocator. Latency spikes. You want to look at the bytes and see the struct instantly. No copy. No allocation. Just a view.
Zero-copy deserialization interprets a sequence of bytes as a structured value without moving data. You take a &[u8] and convince the compiler it is a &MyStruct. The compiler needs guarantees. The bytes must be long enough. The bytes must be aligned. The layout must be predictable. If you provide those guarantees, Rust lets you cast the pointer and dereference it. The result is a reference that points directly into the original buffer.
The lens, not the copy
Think of a library card catalog. The card tells you where the book is. Zero-copy is reading the card. Copying is photocopying the whole book. In memory, a &[u8] is a bag of bytes. Zero-copy deserialization puts a label on that bag. You are not moving data. You are changing the lens through which you view the data.
This approach requires the struct to have a known, stable layout. Rust normally reorders fields to minimize padding and optimize cache usage. That optimization breaks zero-copy because the compiler might shuffle fields between versions or optimization levels. You must lock the layout using #[repr(C)]. This attribute tells the compiler to lay out the struct exactly as a C compiler would: fields in declaration order, with padding inserted to satisfy alignment requirements.
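As a quick illustration of what #[repr(C)] pins down, the sketch below reports the size, alignment, and field offsets of a hypothetical Record struct (the struct and the record_layout helper are made up for this example). It assumes Rust 1.77 or newer for std::mem::offset_of!, and a mainstream target where u32 is 4-byte aligned.

```rust
use std::mem::{align_of, offset_of, size_of};

// Hypothetical struct used only to illustrate layout; not part of any real format.
#[allow(dead_code)]
#[repr(C)]
struct Record {
    tag: u8,    // offset 0, followed by 3 bytes of padding
    value: u32, // offset 4, because a u32 must be 4-byte aligned
    flags: u16, // offset 8, followed by 2 bytes of tail padding
}

/// Returns (size, alignment, offset of each field) for `Record`.
fn record_layout() -> (usize, usize, [usize; 3]) {
    (
        size_of::<Record>(),
        align_of::<Record>(),
        [
            offset_of!(Record, tag),
            offset_of!(Record, value),
            offset_of!(Record, flags),
        ],
    )
}

fn main() {
    let (size, align, offsets) = record_layout();
    println!("size = {size}, align = {align}, offsets = {offsets:?}");
}
```

The fields add up to 7 bytes of data, yet the struct occupies 12 because of padding. A binary format has to account for those padding bytes if it is meant to be read through this struct.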
Alignment is the hardware contract. CPUs expect certain types to start at addresses that are multiples of their alignment, which for primitive integers usually equals their size. A u32 usually needs a 4-byte aligned address. A u64 usually needs 8-byte alignment. If you cast a pointer to a struct and the address is misaligned, the CPU may raise a hardware fault, or the processor may take a slow path to fetch the data. Rust goes further: creating a misaligned reference is undefined behavior even on hardware that tolerates the access. Zero-copy code must check alignment before casting.
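A minimal sketch of such a check follows. The Aligned wrapper and the is_aligned_for helper are hypothetical names introduced for this example; the wrapper exists only to give the buffer a known starting alignment so the results are deterministic.

```rust
use std::mem::align_of;

/// Forces 8-byte alignment on the wrapped bytes so the demo is deterministic.
#[repr(C, align(8))]
struct Aligned([u8; 16]);

/// Returns true if `bytes` starts at an address suitable for a `T`.
fn is_aligned_for<T>(bytes: &[u8]) -> bool {
    (bytes.as_ptr() as usize) % align_of::<T>() == 0
}

fn main() {
    let buf = Aligned([0u8; 16]);
    // The start of the buffer is 8-byte aligned, so it is fine for a u32.
    println!("offset 0 ok for u32: {}", is_aligned_for::<u32>(&buf.0));
    // One byte in, the address is odd and cannot hold a u32.
    println!("offset 1 ok for u32: {}", is_aligned_for::<u32>(&buf.0[1..]));
    // Four bytes in, the address is a multiple of 4 again.
    println!("offset 4 ok for u32: {}", is_aligned_for::<u32>(&buf.0[4..]));
}
```

The same bytes pass or fail depending on where the slice starts, which is why the check has to run on the actual pointer rather than on the buffer type.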
Minimal implementation
The safest way to implement zero-copy is to wrap the unsafe cast behind checks that verify size and alignment. The function returns Option<&T> so the caller can handle invalid input without panicking.
#[repr(C)]
struct Header {
    magic: u32,
    version: u16,
    count: u16,
}
/// Parses a binary header from a byte slice without allocation.
/// Returns a reference to the header data inside the slice.
fn parse_header(bytes: &[u8]) -> Option<&Header> {
    // Check if the slice has enough bytes for the struct.
    if bytes.len() < std::mem::size_of::<Header>() {
        return None;
    }
    // Check alignment. The pointer must be aligned for the struct type.
    // align_offset returns 0 if the pointer is already aligned.
    if bytes.as_ptr().align_offset(std::mem::align_of::<Header>()) != 0 {
        return None;
    }
    // SAFETY:
    // 1. `bytes` is non-null and valid for reads of `size_of::<Header>()` bytes.
    // 2. The pointer is aligned to `align_of::<Header>()`.
    // 3. The memory is initialized because it comes from a &[u8].
    // 4. Header is #[repr(C)] with only integer fields and no padding,
    //    so every bit pattern is a valid Header.
    // 5. The returned reference borrows from `bytes`, so it cannot outlive it.
    unsafe {
        let ptr = bytes.as_ptr() as *const Header;
        Some(&*ptr)
    }
}
/// A 64-byte buffer forced to 4-byte alignment. A bare [u8; 64] is only
/// guaranteed 1-byte alignment, which is not enough for Header.
#[repr(C, align(4))]
struct AlignedBuffer([u8; 64]);
fn main() {
    // Create a buffer with enough space and proper alignment.
    let mut buffer = AlignedBuffer([0u8; 64]);
    let header_ptr = buffer.0.as_mut_ptr() as *mut Header;
    // SAFETY: AlignedBuffer is 4-byte aligned and 64 bytes long, which covers
    // Header's alignment (4 on mainstream targets) and size (8).
    unsafe {
        header_ptr.write(Header {
            magic: 0xDEADBEEF,
            version: 1,
            count: 42,
        });
    }
    // Parse the header from the buffer.
    if let Some(header) = parse_header(&buffer.0) {
        println!("Magic: 0x{:08X}, Version: {}, Count: {}", header.magic, header.version, header.count);
    }
}
The #[repr(C)] attribute is mandatory here. Without it, the compiler can reorder fields, and your byte offsets will be wrong. The align_offset check prevents misaligned accesses. The // SAFETY comment lists the invariants that make the cast valid. If you cannot write those invariants, you do not have a safe wrapper.
Under the hood
At compile time, size_of::<Header>() and align_of::<Header>() resolve to constants. The compiler knows the struct layout because of #[repr(C)]. The checks in parse_header are just integer comparisons. They cost nothing at runtime compared to an allocation.
The cast bytes.as_ptr() as *const Header is a no-op. It changes the type of the pointer but not the address. Dereferencing &*ptr creates a reference. The borrow checker tracks the lifetime of that reference. It ties the &Header to the lifetime of the &[u8]. If the slice drops, the reference becomes invalid. The compiler enforces this. You cannot return a &Header that outlives the buffer.
This lifetime binding is a feature, not a bug. It prevents dangling pointers. If you try to store the reference somewhere that outlives the buffer, the compiler rejects the code with E0597 (borrowed value does not live long enough). If you try to return it from the function that owns the buffer, you get E0515 (cannot return value referencing local variable). The borrow checker forces you to keep the buffer alive as long as you use the parsed struct.
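To make the lifetime tie concrete, here is a small sketch using a hypothetical Tag view over four bytes (Tag and view are illustrative names, not part of the code above). Because Tag wraps a byte array, its alignment is 1 and only the length needs checking.

```rust
/// Hypothetical four-byte view used only for this illustration.
#[repr(C)]
struct Tag {
    bytes: [u8; 4],
}

/// Borrows the first four bytes of `buf` as a `Tag`.
/// With lifetime elision this is `fn view<'a>(buf: &'a [u8]) -> Option<&'a Tag>`.
fn view(buf: &[u8]) -> Option<&Tag> {
    if buf.len() < std::mem::size_of::<Tag>() {
        return None;
    }
    // SAFETY: Tag is a repr(C) wrapper around [u8; 4], so it has size 4,
    // alignment 1, and every bit pattern is valid. The length was checked above.
    Some(unsafe { &*(buf.as_ptr() as *const Tag) })
}

fn main() {
    let buffer = vec![b'R', b'I', b'F', b'F', 0, 0];
    let tag = view(&buffer).expect("buffer holds at least 4 bytes");
    // Uncommenting the next line makes the program fail to compile with E0505,
    // because `tag` still borrows from `buffer` when it is used below.
    // drop(buffer);
    println!("{}", String::from_utf8_lossy(&tag.bytes));
}
```

The signature alone is enough for the borrow checker: the output lifetime is the input lifetime, so the buffer cannot be dropped or moved while the view is still in use.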
Real-world constraints
Real binary formats often have quirks. Endianness is the biggest one. If your data comes from a network or a file written on a different architecture, the byte order might not match your CPU. A u32 value 0x01020304 stored in big-endian order looks like 0x04030201 on a little-endian CPU. Zero-copy assumes native endianness. If the format differs, zero-copy gives you garbage values.
You can handle endianness by swapping bytes after parsing. This requires reading the fields and converting them. That is a copy, but it is a small copy of the struct fields, not a full allocation. Or you can swap bytes in place if you own the buffer. Zero-copy often trades off endianness flexibility for speed. If you need to support multiple endiannesses, you might need a wrapper type that swaps on access, or you accept the cost of a copy.
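One way to sketch the "swap on access" wrapper is to store each field as raw bytes and decode it in a getter. The U32Be and WireHeader types below are hypothetical and assume a format whose fields are big-endian on the wire. A side effect of this design is that the struct consists only of byte arrays, so its alignment is 1 and no alignment check is needed.

```rust
/// Hypothetical big-endian u32 field: stored as raw bytes, decoded on access.
#[repr(transparent)]
#[derive(Clone, Copy)]
struct U32Be([u8; 4]);

impl U32Be {
    /// Decodes the stored big-endian bytes into a native u32.
    fn get(self) -> u32 {
        u32::from_be_bytes(self.0)
    }
}

/// Hypothetical wire header whose fields are big-endian on the wire.
#[repr(C)]
struct WireHeader {
    magic: U32Be,
    length: U32Be,
}

/// Views the first 8 bytes of `bytes` as a `WireHeader` without copying.
fn parse_wire(bytes: &[u8]) -> Option<&WireHeader> {
    if bytes.len() < std::mem::size_of::<WireHeader>() {
        return None;
    }
    // SAFETY: WireHeader consists only of byte arrays, so its alignment is 1,
    // it has no padding, and every bit pattern is valid. Length checked above.
    Some(unsafe { &*(bytes.as_ptr() as *const WireHeader) })
}

fn main() {
    let bytes: [u8; 8] = [0x01, 0x02, 0x03, 0x04, 0x00, 0x00, 0x01, 0x00];
    let header = parse_wire(&bytes).expect("8 bytes available");
    println!("magic = {:#010X}, length = {}", header.magic.get(), header.length.get());
}
```

The conversion cost moves from parse time to each field read, which suits headers where only a few fields are inspected.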
Another constraint is padding. #[repr(C)] inserts padding between fields to satisfy alignment. If your binary format packs fields tightly without padding, #[repr(C)] will misinterpret the offsets. You can use #[repr(C, packed)] to remove padding. This tells the compiler to lay out fields in declaration order with zero padding, and it gives the struct an alignment of 1. However, fields in a packed struct may sit at unaligned addresses. Rust refuses to create references to such fields (E0793), so you read them by value and the compiler emits unaligned loads. On some architectures those loads are slow. Use packed structs only when the format explicitly requires it, and be prepared for performance penalties.
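The sketch below shows what that looks like for a hypothetical tightly packed record (PackedRecord and parse_packed are names invented for this example). The bytes are built with to_ne_bytes so the demo behaves the same on little-endian and big-endian hosts.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical tightly packed record: 1 + 4 + 2 = 7 bytes, no padding.
#[repr(C, packed)]
struct PackedRecord {
    tag: u8,
    value: u32,
    flags: u16,
}

/// Views the first 7 bytes of `bytes` as a `PackedRecord` without copying.
fn parse_packed(bytes: &[u8]) -> Option<&PackedRecord> {
    if bytes.len() < size_of::<PackedRecord>() {
        return None;
    }
    // SAFETY: packed structs have alignment 1, so any address is aligned.
    // All fields are integers, so every bit pattern is valid. Length checked above.
    Some(unsafe { &*(bytes.as_ptr() as *const PackedRecord) })
}

fn main() {
    println!("size = {}, align = {}", size_of::<PackedRecord>(), align_of::<PackedRecord>());
    let mut bytes = [0u8; 7];
    bytes[0] = 7;
    bytes[1..5].copy_from_slice(&0x1234_5678u32.to_ne_bytes());
    bytes[5..7].copy_from_slice(&1u16.to_ne_bytes());
    let rec = parse_packed(&bytes).expect("7 bytes available");
    // Copy fields out by value; the compiler emits unaligned loads for this.
    // Writing `&rec.value` instead is rejected (E0793) because the reference
    // would be misaligned.
    let (tag, value, flags) = (rec.tag, rec.value, rec.flags);
    println!("tag = {tag}, value = {value:#X}, flags = {flags}");
}
```

Note that macros such as println! and assert_eq! take references to their arguments, so packed fields need to be copied into locals first, as done above.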
A common community choice for zero-copy is the bytemuck crate. It provides traits like Pod (Plain Old Data) and functions like try_from_bytes that handle the checks and casts safely. Hand-rolling zero-copy is fine for learning, but bytemuck saves you from subtle bugs.
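use bytemuck::{Pod, Zeroable};
use std::mem::size_of;
/// A message header compatible with bytemuck's Pod trait.
/// The derives require bytemuck's "derive" feature.
#[repr(C)]
#[derive(Debug, Clone, Copy, Pod, Zeroable)]
struct MessageHeader {
    opcode: u16,
    // Explicit padding. #[repr(C)] would otherwise insert two hidden padding
    // bytes here, and the Pod derive rejects types with implicit padding.
    _reserved: u16,
    payload_len: u32,
}
fn parse_message(bytes: &[u8]) -> Option<&MessageHeader> {
    // try_from_bytes requires the slice length to equal the struct size exactly,
    // so take just the header prefix. It then checks alignment internally
    // and returns a reference into the slice.
    let prefix = bytes.get(..size_of::<MessageHeader>())?;
    bytemuck::try_from_bytes(prefix).ok()
}
The bytemuck crate requires you to derive Pod and Zeroable, and Pod in turn requires the type to be Copy. These traits assert that the type has no padding bytes, no invalid bit patterns, and a fixed layout such as #[repr(C)]. The derive macros check these assertions at compile time and reject the struct if one fails, which is why the gap between opcode and payload_len is spelled out as a field. Using bytemuck is a widely used way to do zero-copy in production Rust. It reduces the unsafe surface and makes the intent clear.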
Pitfalls and errors
Zero-copy deserialization exposes you to undefined behavior if you get the checks wrong. The compiler cannot protect you inside unsafe. You must verify every invariant.
Alignment errors are the most common cause of crashes. Creating a misaligned reference is undefined behavior in Rust on every target, whatever the hardware does with the access. In practice, on x86, misaligned accesses often work but are slower. On some ARM or RISC-V systems, they can trap and kill the process with a bus error. Always check alignment with align_offset. If the buffer is unaligned, you cannot cast to a reference. You must copy the data to an aligned buffer, or copy the value out with std::ptr::read_unaligned, which works on every architecture.
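A minimal sketch of the unaligned fallback, assuming a Header with the same fields as the earlier example (redeclared here so the block is self-contained; read_header_unaligned is a hypothetical helper name). It returns an owned copy rather than a reference, so it gives up the zero-copy property for this one small struct.

```rust
use std::mem::size_of;
use std::ptr;

#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Header {
    magic: u32,
    version: u16,
    count: u16,
}

/// Copies a `Header` out of `bytes` regardless of the slice's alignment.
fn read_header_unaligned(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < size_of::<Header>() {
        return None;
    }
    // SAFETY: the slice holds at least size_of::<Header>() initialized bytes,
    // Header has no padding and accepts every bit pattern, and read_unaligned
    // places no alignment requirement on the source pointer.
    Some(unsafe { ptr::read_unaligned(bytes.as_ptr() as *const Header) })
}

fn main() {
    // Place the header at offset 1 so the source is not guaranteed to be
    // aligned for a u32.
    let mut buf = [0u8; 9];
    buf[1..5].copy_from_slice(&0xDEADBEEFu32.to_ne_bytes());
    buf[5..7].copy_from_slice(&1u16.to_ne_bytes());
    buf[7..9].copy_from_slice(&42u16.to_ne_bytes());
    let header = read_header_unaligned(&buf[1..]).expect("8 bytes available");
    println!("{header:?}");
}
```

A reasonable pattern is to try the reference cast first and fall back to this copy only when the alignment check fails.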
Endianness errors cause silent data corruption. The code compiles and runs, but the values are wrong. You might see a timestamp that is years in the future, or a count that is zero when it should be large. Test with known binary data. Print the raw bytes and the parsed values. Verify the byte order matches your expectations.
Lifetime errors cause dangling pointers. If you parse a header from a temporary slice and try to keep the reference, the compiler stops you. If you use unsafe to bypass the borrow checker, you create a dangling pointer. Accessing it reads garbage or crashes. Trust the borrow checker. It usually has a point. If the lifetime system rejects your code, the data is not safe to reference.
Dereferencing a raw pointer without unsafe triggers E0133 (dereference of raw pointer is unsafe and requires an unsafe function or block). The compiler forces you to acknowledge the risk. If you try to return a reference to a local variable, you get E0515. If you pass a *const u8 where a *const Header is expected without casting, you get E0308 (mismatched types). An explicit as cast between raw pointers to sized types always compiles, though, so casting to the wrong type produces no error at all, only silent misinterpretation when the layout is wrong.
Choosing your strategy
Zero-copy is a tool for specific problems. It is not a replacement for all deserialization. Use the right tool for the data and the constraints.
Use zero-copy deserialization when you are parsing large binary blobs where allocation overhead is measurable and the data layout is fixed. Use zero-copy when the input data is already aligned or you can control the alignment of the buffer. Use zero-copy when the endianness matches the target platform, or you can handle byte swapping efficiently. Use bytemuck for zero-copy in production code to reduce unsafe surface and leverage community-tested checks.
Use standard deserialization with Serde when the format is complex, nested, or text-based like JSON or YAML. Serde handles validation, type conversion, and error reporting. It allocates, but the overhead is negligible for small payloads or text formats. Use Serde when you need to deserialize into owned types that outlive the input buffer.
Use #[repr(C, packed)] only when you are interfacing with a legacy binary format that explicitly forbids padding, and you accept the performance penalty of unaligned accesses. Reach for std::mem::transmute only as a last resort for type punning; prefer pointer casts or bytemuck for clarity and safety.
Pick the tool that matches your data. Zero-copy is a scalpel, not a hammer. Use it where precision and speed matter, and let the allocator handle the rest.