The bare-metal starting line
You just plugged a microcontroller board into your laptop. The board has no operating system. No file system. No standard library. Just a tiny ARM core, a few kilobytes of RAM, and some flash memory waiting for instructions. Your goal is to turn Rust code into a binary that runs directly on that silicon, controlling LEDs, reading sensors, or talking to other chips.
This is embedded development. It strips away the comfort of an OS and forces you to manage memory, startup sequences, and hardware registers yourself. The good news is that Rust handles the heavy lifting. The compiler guarantees memory safety, the borrow checker prevents data races, and the toolchain cross-compiles your code for architectures that look nothing like your laptop.
Setting up the environment is just three steps: installing the toolchain, telling it which microcontroller architecture you are targeting, and configuring the build profiles for size and speed. Once those pieces click into place, you can write code that runs at the metal.
What you are actually building
Embedded Rust does not link against std. The standard library assumes an operating system that provides threads, file I/O, and dynamic memory allocation. Microcontrollers have none of that. You are building a no_std application. The compiler still gives you core language features, slices, iterators, and Option, but you must provide your own entry point, panic handler, and memory layout.
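To make that concrete, here is a minimal sketch of code that leans only on items from core: slices, iterators, and Option. The function names peak and mean are hypothetical, chosen for illustration. Nothing here allocates or touches an OS, so the same source should compile for a no_std target as well as for your laptop.

```rust
/// Returns the largest sample in a sensor buffer, or None if it is empty.
pub fn peak(samples: &[u16]) -> Option<u16> {
    samples.iter().copied().max()
}

/// Averages the samples with integer math only (no heap, no floats).
/// The u32 accumulator is an assumption that suits short buffers.
pub fn mean(samples: &[u16]) -> Option<u16> {
    if samples.is_empty() {
        return None;
    }
    let sum: u32 = samples.iter().map(|&s| u32::from(s)).sum();
    Some((sum / samples.len() as u32) as u16)
}
```

Calling mean(&[10, 20, 30]) yields Some(20), and an empty slice yields None instead of dividing by zero.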
Cross-compilation is the bridge between your development machine and the target chip. Your laptop likely runs on x86_64 or ARM64. The microcontroller runs on a 32-bit ARM Cortex-M core or a RISC-V core. The Rust toolchain uses LLVM to translate your high-level code into machine instructions for the target architecture. It also links a copy of the core library precompiled for that architecture. The result is an ELF executable that contains only your code and the minimal runtime needed to start execution.
Think of it like packing a survival kit. You leave behind the heavy camping gear (the OS, the standard library, dynamic allocators) and pack only what fits in a small backpack. Every byte of flash memory is accounted for. Every register address is mapped. The compiler refuses to build if you try to use features that do not exist on the target.
The toolchain setup
You need rustup to manage the compiler and target specifications. rustup handles multiple toolchains, downloads the precompiled core library for foreign architectures, and keeps everything synchronized.
# Install rustup if you do not have it yet.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Add the Cortex-M0 target. This downloads the no_std core library for that architecture.
rustup target add thumbv6m-none-eabi
# Create a new binary crate and move into it.
cargo new --bin blinky
cd blinky
The target triple thumbv6m-none-eabi tells the compiler exactly what you are building. thumb means the ARM Thumb instruction set. v6m is the ARMv6-M architecture, which Cortex-M0 and Cortex-M0+ cores implement. none means no operating system. eabi selects the ARM embedded ABI, which defines the calling convention and data alignment rules. When you run cargo build --target thumbv6m-none-eabi, the compiler emits code for that instruction set and links against the core library built for it.
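The breakdown above can be sketched as a small parser. This is an illustration only, not how rustc reads triples: the Triple struct and parse_triple function are hypothetical, and the sketch assumes a triple has either three parts (arch, os, abi) or four (arch, vendor, os, abi).

```rust
/// Parts of a target triple such as `thumbv6m-none-eabi`.
#[derive(Debug, PartialEq)]
pub struct Triple<'a> {
    pub arch: &'a str,
    pub vendor: Option<&'a str>,
    pub os: &'a str,
    pub abi: &'a str,
}

/// Splits a triple on '-' and labels the pieces.
/// Returns None for anything that is not three or four components long.
pub fn parse_triple(triple: &str) -> Option<Triple<'_>> {
    let mut parts = triple.split('-');
    let (first, second, third) = (parts.next()?, parts.next()?, parts.next()?);
    let fourth = parts.next();
    if parts.next().is_some() {
        return None; // more than four components
    }
    Some(match fourth {
        // Three parts: the vendor field is omitted.
        None => Triple { arch: first, vendor: None, os: second, abi: third },
        // Four parts: the second component is the vendor.
        Some(abi) => Triple { arch: first, vendor: Some(second), os: third, abi },
    })
}
```

For thumbv6m-none-eabi this gives arch thumbv6m, os none, and abi eabi, with no vendor field.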
You also need to configure the build profiles. Embedded chips have strict flash limits. The default release profile optimizes for speed, which often inflates binary size. You want size optimization instead.
# Cargo.toml
[package]
name = "blinky"
version = "0.1.0"
edition = "2024"
[dependencies]
# cortex-m provides low-level ARM registers and interrupt handling.
cortex-m = "0.7"
# cortex-m-rt provides the reset handler and exception vectors.
cortex-m-rt = "0.7"
# panic-halt stops execution when a panic occurs.
panic-halt = "0.2"
[profile.release]
# Optimize for size. z is more aggressive than s.
opt-level = "z"
# Enable link-time optimization to remove dead code across crate boundaries.
lto = true
# Strip symbols and debug info from the ELF file. This shrinks the file on disk; those sections are never written to flash.
strip = true
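The opt-level = "z" setting tells LLVM to prioritize binary size over execution speed. lto = true lets LLVM optimize across crate boundaries and discard unused functions. strip = true removes symbols and debug information from the ELF file. That shrinks the file on disk, but those sections are never written to flash, so the savings on the chip come from the first two settings. Stripping also removes the symbols a debugger needs, so drop it while you are debugging.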
Keep your Cargo.toml clean. Embedded projects thrive on minimal dependencies. Every crate you add increases compile time and binary size. Only pull in what you need.
How the build pipeline works
When you run cargo build --release --target thumbv6m-none-eabi, the toolchain follows a strict sequence. First, rustc compiles your source files and dependencies into object code for the ARM target. It checks types, enforces borrowing rules, and hands the result to LLVM for optimization and inlining. Second, it pulls in the precompiled core library that rustup target add installed for the same target. Third, the linker merges everything into a single ELF file.
The linker needs to know where your code and data live in memory. Microcontrollers have fixed memory maps. Flash starts at a specific address. RAM starts at another. The linker script tells the linker exactly how to place each section. Without it, the linker falls back to a default layout that does not match the chip, and the binary will not run.
// src/main.rs
//! Minimal no_std entry point for a Cortex-M microcontroller.
#![no_std]
#![no_main]
use cortex_m_rt::entry;
use panic_halt as _;
#[entry]
fn main() -> ! {
    // The entry attribute replaces the standard main function.
    // It runs after the runtime initializes RAM and sets up vectors.
    loop {
        // Infinite loop keeps the CPU from executing garbage memory.
        // Real code would toggle GPIO or handle interrupts here.
    }
}
The #![no_std] attribute disables the standard library. The #![no_main] attribute tells the compiler you will provide your own entry point. The #[entry] macro from cortex-m-rt marks the function that runs after reset. The -> ! return type indicates the function never returns, which matches the infinite loop. The panic_halt crate implements the panic handler. When a panic occurs, the CPU enters an infinite loop instead of trying to print a stack trace to a console that does not exist.
This setup gives you the source for a working foundation. It will not link yet. The next step is mapping memory so the linker knows where to put things.
A realistic project skeleton
Every embedded project needs a linker script. The cortex-m-rt crate expects a file named memory.x in your project root. It defines the start addresses and sizes of flash and RAM.
/* memory.x */
/* Defines the memory layout for the linker. */
/* Values are for the STM32F030C8 (64K flash, 8K RAM); check your datasheet. */
MEMORY
{
  /* Flash starts at 0x08000000 on most STM32 chips. */
  FLASH : ORIGIN = 0x08000000, LENGTH = 64K
  /* RAM starts at 0x20000000. */
  RAM : ORIGIN = 0x20000000, LENGTH = 8K
}
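Each line in the MEMORY block is just a start address and a length, and the arithmetic is worth checking by hand once. The sketch below models the FLASH entry in Rust. The Region type and its methods are hypothetical helpers for illustration, not part of cortex-m-rt.

```rust
/// One entry of a MEMORY block: a start address and a length in bytes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Region {
    pub origin: u32,
    pub length: u32,
}

impl Region {
    /// First address past the end of the region.
    pub const fn end(&self) -> u32 {
        self.origin + self.length
    }

    /// True if `addr` falls inside the region.
    pub const fn contains(&self, addr: u32) -> bool {
        addr >= self.origin && addr < self.end()
    }
}

/// The FLASH line from memory.x: 64K starting at 0x08000000.
pub const FLASH: Region = Region { origin: 0x0800_0000, length: 64 * 1024 };
```

Here FLASH.end() is 0x08010000, so any code address at or beyond that value signals a layout mistake. The same end() calculation applied to the RAM entry gives the address where, by default, cortex-m-rt places the initial stack pointer.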
You do not pass memory.x to the linker directly. Instead, tell cargo to pass link.x, the script that cortex-m-rt provides, by adding a .cargo/config.toml file to your project.
# .cargo/config.toml
[target.thumbv6m-none-eabi]
# Pass the linker script to the linker via the -T flag.
rustflags = [
"-C", "link-arg=-Tlink.x",
]
The link.x file is provided by cortex-m-rt. It includes memory.x and sets up the vector table, stack pointer, and data section copying. When you build, the linker reads link.x, which reads memory.x, and places your code exactly where the microcontroller expects it.
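The data section copying mentioned above is easier to picture with a simulation. The sketch below is not the cortex-m-rt implementation, which works on raw addresses taken from the linker script. Here plain slices stand in for RAM and for the initial values stored in flash, and init_ram is a hypothetical name.

```rust
/// Simulates the RAM setup a runtime performs before `main` runs.
///
/// `ram` stands in for the start of RAM, `data_image` for the initial
/// values of .data stored in flash, and `bss_words` for the size of .bss.
/// Panics if `ram` is too small to hold both sections.
pub fn init_ram(ram: &mut [u32], data_image: &[u32], bss_words: usize) {
    let (data, rest) = ram.split_at_mut(data_image.len());
    // .data: initialized statics get their starting values from flash.
    data.copy_from_slice(data_image);
    // .bss: zero-initialized statics are cleared.
    rest[..bss_words].fill(0);
}
```

Anything past the two sections is left untouched, which is why uninitialized RAM on a real chip holds whatever was there at power-up.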
You also need a way to flash the binary. The community standard is probe-rs. It talks to debug probes like ST-Link, J-Link, or CMSIS-DAP over SWD or JTAG.
# Install probe-rs tools.
cargo install probe-rs-tools
# Flash and reset the target.
# The ELF path is a positional argument. Run `probe-rs chip list` to find your chip's exact name.
probe-rs run --chip STM32F030C8Tx target/thumbv6m-none-eabi/release/blinky
The probe-rs run command flashes the binary, resets the chip, and stays attached so it can stream log output from the target. For step-through debugging, probe-rs gdb starts a GDB server that gdb can connect to. The toolchain ecosystem handles the low-level protocol details. You just point it at your chip and binary.
Keep the linker script close to your hardware datasheet. If you change the microcontroller, update memory.x immediately. Mismatched addresses cause hard faults that are nearly impossible to debug.
Common traps and compiler signals
Embedded Rust rejects mistakes early. The compiler and linker will stop you before you waste hours chasing silicon bugs.
If you forget #![no_std], the compiler tries to link against std and fails with E0463 (can't find crate for std). The fix is adding the attribute and removing any std imports. If you use println! or Vec in a no_std crate, the compiler reports that it cannot find them in scope, because neither is part of core. Replace them with defmt for logging and heapless for fixed-size collections.
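To see what a fixed-size collection trades away, here is a minimal sketch of a fixed-capacity buffer built on a const-generic array. It is not the heapless API. FixedVec is a hypothetical type for illustration, and the Copy + Default bound is an assumption that keeps the sketch free of unsafe code.

```rust
/// A hypothetical fixed-capacity buffer of `T`, stored inline with no heap.
pub struct FixedVec<T: Copy + Default, const N: usize> {
    items: [T; N],
    len: usize,
}

impl<T: Copy + Default, const N: usize> FixedVec<T, N> {
    /// Creates an empty buffer with room for exactly N items.
    pub fn new() -> Self {
        Self { items: [T::default(); N], len: 0 }
    }

    /// Appends a value, or hands it back if the buffer is full.
    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value);
        }
        self.items[self.len] = value;
        self.len += 1;
        Ok(())
    }

    /// Views the items pushed so far.
    pub fn as_slice(&self) -> &[T] {
        &self.items[..self.len]
    }
}
```

The capacity is part of the type, so the compiler knows the memory cost up front. Running out of room becomes a Result you handle instead of an allocation failure at runtime.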
If you omit memory.x, the linker stops with an error saying it cannot find the file that link.x includes. If you misconfigure .cargo/config.toml and the -Tlink.x flag never reaches the linker, the failure is quieter: the build can succeed, but no vector table is placed and the resulting image will not boot. Double-check the rustflags array. The -Tlink.x flag is mandatory for cortex-m-rt.
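Debug builds are often too large for small chips. The dev profile compiles with opt-level = 0, so almost nothing is inlined or removed, and overflow checks stay in. Debug info itself lives in ELF sections that are never flashed, so it is not the culprit. Even a simple blinky program can outgrow a small chip's flash in debug mode. Always test with --release before flashing. If you still hit flash limits, set codegen-units = 1 in the release profile. It makes rustc compile each crate as a single unit instead of splitting it for parallel builds, which gives LLVM more room to optimize for size but increases compile time.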
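Whether a build fits comes down to simple arithmetic on section sizes. The sketch below assumes you have read those sizes from a tool such as size. The Sections struct and the helper names are hypothetical, and text is assumed to include the vector table. Initialized data counts twice: its starting values sit in flash, and the live copy sits in RAM.

```rust
/// Section sizes in bytes, as reported by a size-inspection tool.
pub struct Sections {
    pub text: u32,   // code, assumed to include the vector table
    pub rodata: u32, // constants
    pub data: u32,   // initialized statics
    pub bss: u32,    // zero-initialized statics
}

/// Bytes written to flash: code, constants, and initial values for .data.
pub fn flash_used(s: &Sections) -> u32 {
    s.text + s.rodata + s.data
}

/// Bytes of RAM taken by statics, before the stack is counted.
pub fn ram_used(s: &Sections) -> u32 {
    s.data + s.bss
}

/// True if the image fits, leaving `stack_reserve` bytes of RAM for the stack.
pub fn fits(s: &Sections, flash_len: u32, ram_len: u32, stack_reserve: u32) -> bool {
    flash_used(s) <= flash_len && ram_used(s) + stack_reserve <= ram_len
}
```

Pass the LENGTH values from your memory.x as flash_len and ram_len. The stack reserve is a judgment call that depends on how deep your call chains and interrupt handlers go.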
Panic handlers must be implemented. If you do not provide one, compilation fails with the error "`#[panic_handler]` function required, but not found". The panic-halt crate solves this. If you want more diagnostics, use panic-probe with defmt to print panic messages over the debug probe.
Treat the linker script as a contract. If the addresses do not match the datasheet, the chip will not boot. Verify the memory map before your first flash.
When to pick which target
Use thumbv6m-none-eabi when you are targeting Cortex-M0 or M0+ cores, which lack hardware floating point and usually have tight flash constraints. Use thumbv7m-none-eabi when you are targeting a Cortex-M3, which has a richer instruction set but still no FPU. Use thumbv7em-none-eabihf when your chip has an M4 or M7 core with a hardware FPU and you want the compiler to generate FPU instructions. Use riscv32imac-unknown-none-elf when you are working with RISC-V microcontrollers that support atomic operations and compressed instructions. Use x86_64-unknown-linux-gnu, or whatever your host triple is, when you are writing host-side tools, simulators, or test harnesses that run on your laptop instead of the microcontroller.
Pick the target that matches your silicon. The compiler emits only instructions the named target supports, but it has no way to check the chip on your desk. Code built for the wrong target faults as soon as the CPU meets an instruction it cannot execute.
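The ARM choices above reduce to a small lookup. The sketch below encodes them as a match. The Core enum and target_for function are hypothetical names, and the sketch adds one triple the text does not mention: thumbv7em-none-eabi, the soft-float target for M4 and M7 parts without an FPU.

```rust
/// The Cortex-M cores discussed above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Core {
    CortexM0,
    CortexM0Plus,
    CortexM3,
    CortexM4,
    CortexM7,
}

/// Picks a target triple for a core, given whether the chip has an FPU.
pub fn target_for(core: Core, has_fpu: bool) -> &'static str {
    match (core, has_fpu) {
        // M0, M0+, and M3 have no FPU, so the flag is ignored for them.
        (Core::CortexM0 | Core::CortexM0Plus, _) => "thumbv6m-none-eabi",
        (Core::CortexM3, _) => "thumbv7m-none-eabi",
        (Core::CortexM4 | Core::CortexM7, true) => "thumbv7em-none-eabihf",
        (Core::CortexM4 | Core::CortexM7, false) => "thumbv7em-none-eabi",
    }
}
```

The match is exhaustive, so adding a new core to the enum forces you to decide on its target before the code compiles again.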