How to reduce binary size

The 15-megabyte surprise

You just finished your Rust CLI tool. It parses arguments, fetches data, and formats output. You run cargo build and check the file size. The binary is 15 megabytes. You are trying to ship this to a Raspberry Pi with limited storage, or you are baking it into a Docker image and your CI pipeline is timing out on the upload. Rust binaries have a reputation for being large, but most of that size comes from the default build, which favors fast compiles and easy debugging over a small file. Shrinking it is a matter of telling the compiler to leave the debugging aids behind.

Why Rust binaries start big

Cargo's default build profile is dev, usually called the debug build. This mode prioritizes compile speed and debuggability over binary size. The compiler emits full debug information that maps machine instructions back to source lines, variables, and types. It also runs at opt-level 0, skipping the optimization passes that would shrink the code but take time to compute.

Think of it like shipping a piece of furniture. The debug build ships the furniture, the assembly instructions, the original packaging, the spare screws, and a video tutorial on how to build it. The release build ships just the furniture, assembled and wrapped in shrink wrap. You don't need the instructions once the furniture is built. You certainly don't need them on the customer's shelf.

Debug builds are for fixing bugs. Release builds are for running code.

The release profile

The single most effective step is switching to the release profile. The --release flag tells Cargo to use a different set of compiler flags. It changes the optimization level, enables more aggressive inlining, and prepares the binary for distribution.

# Debug build includes symbols and skips optimizations.
$ cargo build
$ ls -lh target/debug/my_tool
-rwxr-xr-x 1 user staff 12M ... my_tool

# Release build optimizes and prepares for distribution.
$ cargo build --release
$ ls -lh target/release/my_tool
-rwxr-xr-x 1 user staff 4.2M ... my_tool

The size drop comes from two sources. First, the compiler runs LLVM optimization passes that remove dead code, inline small functions, and simplify control flow. Second, the release profile sets debug = false, so the compiler stops emitting the debug information that accounts for much of a debug build. The release profile is the baseline for any production build. Never ship a debug binary.
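
To put a number on the drop, the arithmetic is simply (before - after) / before. The sketch below is a hypothetical helper (the name sizecmp and its output format are assumptions, not part of Cargo) that reads two file sizes and reports the percentage saved. For the 12M and 4.2M figures in the listing above, that works out to roughly 65%.

```rust
use std::env;
use std::fs;

/// Percentage of `before` that was removed to reach `after`.
/// Returns 0.0 when `before` is zero to avoid dividing by zero.
fn percent_saved(before: u64, after: u64) -> f64 {
    if before == 0 {
        return 0.0;
    }
    (before as f64 - after as f64) / before as f64 * 100.0
}

fn main() {
    // Expect exactly two paths: the larger build first, the smaller one second.
    let paths: Vec<String> = env::args().skip(1).collect();
    if paths.len() != 2 {
        eprintln!("usage: sizecmp <before-binary> <after-binary>");
        return;
    }

    let before = fs::metadata(&paths[0]).map(|m| m.len());
    let after = fs::metadata(&paths[1]).map(|m| m.len());

    match (before, after) {
        (Ok(b), Ok(a)) => {
            println!("{} -> {} bytes ({:.1}% smaller)", b, a, percent_saved(b, a));
        }
        _ => eprintln!("could not read one of the files"),
    }
}
```

Running it against target/debug/my_tool and target/release/my_tool after each profile change gives a consistent way to see whether a setting actually paid off.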

Stripping the symbols

Even in release mode, the binary retains a symbol table. This table maps addresses back to function names, which lets debuggers, profilers, and backtraces label stack frames. In production, you rarely debug a running binary. You debug on your machine. For a shipped binary, the symbol table is mostly dead weight.

Stripping removes the symbol table after linking. The modern way to strip is using the strip setting in your profile. This is cleaner than passing linker arguments and works consistently across platforms.

[profile.release]
# Strip symbols from the final binary.
# This removes the symbol table, shrinking the file significantly.
# true is the same as "symbols"; "debuginfo" removes only debug info.
strip = true

The community convention is to use strip = true in Cargo.toml rather than RUSTFLAGS="-C link-arg=-s". The profile setting is explicit, version-controlled, and doesn't require environment variables. It also handles platform-specific stripping tools automatically.

Ship the binary, not the map.

Link Time Optimization

Rust compiles crates independently. When you compile my_tool, the compiler sees the interface of serde, but not the implementation details of every function inside serde. This isolation speeds up compilation but prevents the optimizer from inlining functions across crate boundaries.

Link Time Optimization (LTO) changes this. With LTO enabled, the compiler keeps LLVM's intermediate representation for each crate instead of finalizing machine code crate by crate. At link time, the crates are merged and the optimization passes run on the combined code. This allows the optimizer to inline functions from dependencies into your code, removing function call overhead and eliminating unused code more aggressively.

[profile.release]
# Enable Link Time Optimization.
# This allows the linker to optimize across crate boundaries.
# lto = true is the same as lto = "fat": slowest to build, often smallest.
# lto = "thin" builds faster and usually gets most of the benefit.
lto = true

LTO comes with a cost. It increases compile time significantly because the linker has to do more work. It also increases memory usage during linking. The size reduction varies by project. Projects with many small dependencies see the biggest gains. Projects with a single large crate see less benefit.

LTO trades compile time for binary size. Pay the tax if you need the space.

Codegen units and optimization levels

Rust uses codegen units to parallelize compilation. A crate is split into multiple units, each compiled by a separate thread. This speeds up builds but limits optimization. The optimizer works within a unit and cannot see across unit boundaries.

Reducing codegen units to one forces the compiler to treat the entire crate as a single unit. This allows cross-function optimization but kills parallelism. Compile time increases, but binary size and runtime performance often improve.

[profile.release]
# Reduce codegen units to improve optimization.
# Default is 16 for release builds (256 for incremental dev builds).
# 1 gives best optimization but slowest compile.
codegen-units = 1

The optimization level controls how hard the compiler works and what it works toward. Level 3 is the default for release and optimizes for speed. Level s optimizes for size. Level z optimizes for size more aggressively and also turns off loop vectorization, often at the cost of runtime speed.

[profile.release]
# Optimize for size.
# Level z is aggressive and may hurt runtime performance.
# Level s is a safer bet for size without major speed loss.
opt-level = "z"

Use opt-level = "z" for embedded systems or WASM modules where every kilobyte counts. Keep the default opt-level = 3 for general applications where runtime speed matters more than binary size.

One codegen unit gives the optimizer the full picture. More units give you faster builds. Pick your poison.

Panic behavior

By default, Rust unwinds the stack on panic. Unwinding runs destructors frame by frame and lets std::panic::catch_unwind recover from the panic. It requires cleanup code and metadata tables that increase binary size. Setting panic = "abort" tells the runtime to terminate the process immediately on panic. This removes most of that machinery and shrinks the binary.

[profile.release]
# Abort on panic instead of unwinding.
# This drops the unwinding machinery and shrinks the binary.
# Destructors do not run on panic and catch_unwind cannot recover.
panic = "abort"

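This setting changes behavior. Destructors won't run on panic, which can leave behind resources that outlive the process, such as temporary files or lock files, if you rely on RAII for cleanup. std::panic::catch_unwind can no longer recover, so a panic in any thread takes down the whole process. The panic message is still printed, but you get no chance to log extra context or shut down gracefully, which makes crashes harder to diagnose. This is acceptable for servers that handle errors explicitly and never panic. It is risky for interactive tools where a panic might indicate a bug you need to diagnose.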
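
The difference is easy to see in a small program. The sketch below is illustrative only: TempResource and risky are made-up names, and the static flag stands in for real cleanup such as deleting a temporary file. Built with the default unwinding strategy, the destructor runs and catch_unwind reports the panic. Built with panic = "abort", the process terminates at the panic! call, so the destructor and the final println! never run.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Records whether the destructor ran. A real program would do actual
// cleanup here, such as removing a temporary file.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

// Hypothetical RAII guard used only for this illustration.
struct TempResource;

impl Drop for TempResource {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

fn risky() {
    let _guard = TempResource;
    panic!("something went wrong");
}

/// Returns (did the call panic, did the destructor run).
fn run_and_report() -> (bool, bool) {
    // With panic = "unwind", the panic is caught here after `_guard` is dropped.
    // With panic = "abort", the process terminates inside `risky` instead.
    let result = panic::catch_unwind(risky);
    (result.is_err(), CLEANED_UP.load(Ordering::SeqCst))
}

fn main() {
    let (panicked, cleaned_up) = run_and_report();
    println!("panicked: {}, cleaned up: {}", panicked, cleaned_up);
}
```

If your tool depends on this kind of cleanup or recovery, that is a reason to keep the default and look for savings elsewhere.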

Abort on panic. You get a smaller binary and a harder life if you crash.

Dependency bloat

Optimizing the compiler flags only goes so far. If your dependencies pull in heavy optional features, no amount of stripping will fix the size. Many crates enable features by default that you don't need. Disabling default features is the most effective way to reduce dependency bloat.

[dependencies]
# Disable default features to avoid pulling in heavy optional deps.
# Enable only the features you actually use.
serde = { version = "1.0", default-features = false, features = ["alloc"] }
regex = { version = "1.10", default-features = false, features = ["std"] }
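
From the crate author's side, a feature is a conditional-compilation switch, which is why turning one off can remove code entirely. The sketch below assumes a hypothetical feature named color, not taken from any real crate; in a real project it would have to be declared under [features] in that crate's Cargo.toml. When the feature is off, the ANSI-formatting version of render is never compiled, so it adds nothing to the binary.

```rust
// Compiled only when the hypothetical "color" feature is enabled.
#[cfg(feature = "color")]
fn render(msg: &str) -> String {
    // Wrap the message in ANSI escape codes for green text.
    format!("\x1b[32m{}\x1b[0m", msg)
}

// Compiled only when the feature is disabled: a plain pass-through.
#[cfg(not(feature = "color"))]
fn render(msg: &str) -> String {
    msg.to_string()
}

fn main() {
    // Built without the feature, this prints the message unchanged.
    println!("{}", render("build finished"));
}
```

Real crates gate whole modules and optional dependencies the same way, which is where the larger savings tend to come from.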

The community convention is to audit dependencies with cargo bloat. This tool analyzes the binary and shows which functions take the most space. It often reveals that serde or regex is dominating the size due to unused features.

# Install the analysis tool.
$ cargo install cargo-bloat

# Analyze the release binary.
$ cargo bloat --release

The output shows a table of functions sorted by size. Pass --crates to group the table by crate rather than by function. If you see a function from a dependency that you don't use, check the feature flags. Disable the feature that pulls it in.

Audit your dependencies. Every feature flag is a kilobyte waiting to happen.

Pitfalls and trade-offs

Every setting above has a cost. LTO and codegen-units = 1 can make compilation take minutes instead of seconds. opt-level = "z" turns off loop vectorization and can make loops slower. panic = "abort" removes safety nets.

The compiler errors you see during development don't change based on profile. A type mismatch is still E0308 (mismatched types) in release mode. A trait bound violation is still E0277 (trait bound not satisfied). Optimization doesn't fix logic errors. It only changes how the code runs and how much space it takes.

Be careful with opt-level = "z". Because it turns off loop vectorization and favors small code over fast code, hot loops can lose the fast paths they get at level 3. It is also not guaranteed to produce the smallest binary; s is sometimes smaller. Benchmark your application and compare sizes after switching to z. If performance drops too much, fall back to s or 3.
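
A rough timing harness is enough for a first comparison. The sketch below is a minimal example, not a rigorous benchmark: sum_of_squares is a made-up stand-in for your own hot loop, and the data size and repetition count are arbitrary. Build it once per opt-level and compare the elapsed times.

```rust
use std::hint::black_box;
use std::time::Instant;

/// A stand-in hot loop: an integer reduction over a slice, the kind of
/// code that loop vectorization typically speeds up.
fn sum_of_squares(data: &[u64]) -> u64 {
    data.iter()
        .fold(0u64, |acc, &x| acc.wrapping_add(x.wrapping_mul(x)))
}

fn main() {
    // One million elements and ten repetitions are arbitrary choices.
    let data: Vec<u64> = (0..1_000_000u64).map(|i| i % 100).collect();

    let start = Instant::now();
    let mut total = 0u64;
    for _ in 0..10 {
        // black_box discourages the optimizer from hoisting or deleting the call.
        total = total.wrapping_add(sum_of_squares(black_box(&data)));
    }
    let elapsed = start.elapsed();

    println!("total = {}, elapsed = {:?}", black_box(total), elapsed);
}
```

Single runs are noisy, so repeat the measurement a few times before drawing conclusions. For anything that will drive a decision, a dedicated benchmarking harness is the better tool.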

Optimization is a trade-off. You pay in compile time or runtime speed to save disk space.

Decision matrix

Use cargo build --release when you are building for production. Debug binaries are for development only.

Use strip = true when you are shipping to users and don't need local debugging symbols.

Use lto = true when binary size matters more than compile time, or when you have many small crates that benefit from cross-crate inlining.

Use opt-level = "z" when every kilobyte counts, such as embedded systems or WASM modules, and you have verified that the performance hit is acceptable.

Use panic = "abort" when you want the smallest possible binary and your application handles errors explicitly without panicking.

Use codegen-units = 1 when you are doing a final release build and want the absolute best optimization, accepting the slower compile time.

Reach for cargo bloat when you need to find which dependencies are inflating your binary.

Measure before you optimize. A 10-megabyte binary is fine for a desktop app. A 10-megabyte binary is a disaster for a microcontroller.

Where to go next

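Think of this like packing a suitcase for a trip. A development build packs extra tools and maps (debug info) that help you fix problems but take up space. A release build with these settings packs only what the program needs to run. Apply the settings in order of risk: start with --release and strip = true, add lto and codegen-units = 1 if you can afford the compile time, and reach for opt-level = "z" and panic = "abort" last, because they change how the program behaves. Measure the binary after each step.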