When startup speed matters more than first-request latency
You are building a CLI tool that validates user input against a complex regular expression. Compiling that regex takes 40 milliseconds. You do not want to pay that cost every time the user runs a command. You also do not want to pay it when the user just runs --help and exits immediately. You need a value that initializes the first time you actually need it, and stays ready for the rest of the program.
This is the lazy initialization problem. You want to defer expensive work until the moment of first use, while guaranteeing that work happens exactly once, even if multiple threads try to access the value simultaneously.
The once_cell crate solves this with Lazy. It wraps a value and a closure. The closure runs once on first access. The result is stored. Subsequent accesses return the stored result instantly.
The lazy contract: compute once, serve forever
once_cell::sync::Lazy is a type that holds two things: a slot for a value of type T, and a closure that knows how to produce that value. When you create a Lazy, you provide the closure. The closure does not run yet. It sits dormant.
The first time you access the Lazy instance, the crate checks the slot. If the slot is empty, it runs the closure, stores the result, and marks the slot as filled. If the slot is already filled, it skips the closure and returns the reference immediately.
Think of it like a vending machine that manufactures the snack inside. The machine is built with the recipe, but it does not make the snack until you press the button. Once the snack is made, it sits in the tray. The next person just takes it. The machine never runs the recipe again.
This pattern trades startup time for first-access latency. Your program starts faster because the heavy work is deferred. The first user or thread pays the cost. Everyone else gets the cached result.
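The contract can be observed directly by counting how often the initializer runs. The sketch below is illustrative rather than definitive: it uses std::sync::LazyLock (Rust 1.80+), the standard-library counterpart of once_cell::sync::Lazy, so that it runs without dependencies. The names INIT_RUNS, GREETING, and init_runs are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

// Counts how many times the initializer has run (for illustration only).
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical expensive value; the closure stays dormant until first access.
static GREETING: LazyLock<String> = LazyLock::new(|| {
    INIT_RUNS.fetch_add(1, Ordering::SeqCst);
    String::from("hello")
});

fn init_runs() -> usize {
    INIT_RUNS.load(Ordering::SeqCst)
}

fn main() {
    // Nothing has touched GREETING yet, so the closure has not run.
    println!("runs before first access: {}", init_runs());
    // First access runs the closure and fills the slot.
    println!("first access: {}", *GREETING);
    // Second access reads the stored value; the closure is skipped.
    println!("second access: {}", *GREETING);
    println!("runs after two accesses: {}", init_runs());
}
```

With once_cell, the same shape should work by importing once_cell::sync::Lazy and replacing LazyLock with Lazy.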
Minimal example: sync::Lazy
The sync variant is thread-safe. It uses internal synchronization primitives to ensure only one thread runs the closure, and all other threads wait until the value is ready.
use once_cell::sync::Lazy;
use regex::Regex;
/// A regex that compiles once on first access.
static EMAIL_REGEX: Lazy<Regex> = Lazy::new(|| {
    // This closure runs exactly once.
    // If compilation fails, the panic propagates.
    Regex::new(r"^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$").unwrap()
});
fn validate_email(input: &str) -> bool {
    // First call triggers compilation.
    // Subsequent calls use the cached Regex.
    EMAIL_REGEX.is_match(input)
}
fn main() {
    println!("Valid: {}", validate_email("test@example.com"));
    println!("Valid: {}", validate_email("bad-email"));
}
The Lazy::new call takes a closure, and the closure returns the value you want to store. Because EMAIL_REGEX is a static, its type must be written out in full as Lazy<Regex>; Rust does not infer the types of statics, and the closure's return type has to match.
Accessing the value goes through Deref. EMAIL_REGEX is a Lazy<Regex>, and Lazy<T> implements Deref<Target = T>, so method calls such as is_match auto-dereference to the inner Regex. When you need an explicit &Regex, write &*EMAIL_REGEX or call Lazy::force(&EMAIL_REGEX).
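The different ways of reaching the inner value can be compared side by side. This is a minimal sketch using std::sync::LazyLock (Rust 1.80+) in place of once_cell::sync::Lazy so it needs no external crate; once_cell's Lazy offers the same Deref behavior and a force function of the same shape. NAME and the three helper functions are hypothetical.

```rust
use std::sync::LazyLock;

static NAME: LazyLock<String> = LazyLock::new(|| String::from("lazy"));

fn name_len() -> usize {
    // Method call: auto-deref goes from LazyLock<String> to String.
    NAME.len()
}

fn name_ref() -> &'static String {
    // Explicit reborrow: `*NAME` names the inner String, `&` borrows it.
    &*NAME
}

fn name_forced() -> &'static String {
    // Same reference, obtained without operator syntax.
    LazyLock::force(&NAME)
}

fn main() {
    println!("len = {}", name_len());
    println!("same address: {}", std::ptr::eq(name_ref(), name_forced()));
}
```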
What happens under the hood
When you access a sync::Lazy, the implementation performs a check using atomic operations. It reads a flag to see if initialization is complete.
If the flag indicates the value is ready, the access returns immediately with no locking overhead. This is the fast path.
If the flag indicates the value is not ready, the implementation takes a slow path that behaves like acquiring a mutex. The exact primitive is an implementation detail and depends on the crate's feature flags. Under that lock it checks the flag again, which handles the race where another thread finished initialization just before this thread got in. If the value is still uninitialized, it runs the closure, stores the result, sets the flag, and releases the lock.
This protocol guarantees that the closure runs exactly once, even under heavy contention. It also guarantees that no thread sees a partially constructed value. The value appears atomically to all threads.
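The check, lock, and re-check sequence can be sketched by hand. The code below is a simplified illustration of the protocol, not once_cell's actual implementation: it assumes a plain std Mutex and an AtomicBool flag, and the names SketchLazy, expensive, and run_contended are hypothetical.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Simplified double-checked lazy slot (illustration only).
struct SketchLazy<T> {
    ready: AtomicBool,
    lock: Mutex<()>,
    slot: UnsafeCell<Option<T>>,
    init: fn() -> T,
}

// SAFETY: `slot` is written once, under `lock`, before `ready` is published
// with Release ordering; after that it is only read.
unsafe impl<T: Send + Sync> Sync for SketchLazy<T> {}

impl<T> SketchLazy<T> {
    fn new(init: fn() -> T) -> Self {
        SketchLazy {
            ready: AtomicBool::new(false),
            lock: Mutex::new(()),
            slot: UnsafeCell::new(None),
            init,
        }
    }

    fn get(&self) -> &T {
        // Fast path: a single atomic load, no lock.
        if !self.ready.load(Ordering::Acquire) {
            // Slow path: serialize would-be initializers.
            let _guard = self.lock.lock().unwrap();
            // Re-check: another thread may have finished while we waited.
            if !self.ready.load(Ordering::Acquire) {
                let value = (self.init)();
                // SAFETY: we hold the lock and `ready` is still false,
                // so no other thread reads or writes the slot.
                unsafe { *self.slot.get() = Some(value) };
                self.ready.store(true, Ordering::Release);
            }
        }
        // SAFETY: `ready` is true, so the slot is filled and never written again.
        unsafe { (*self.slot.get()).as_ref().expect("slot is filled") }
    }
}

static RUNS: AtomicUsize = AtomicUsize::new(0);

fn expensive() -> u64 {
    RUNS.fetch_add(1, Ordering::SeqCst);
    42
}

/// Reads one SketchLazy from `threads` threads at once.
/// Returns how many times the initializer ran and the values each thread saw.
fn run_contended(threads: usize) -> (usize, Vec<u64>) {
    let before = RUNS.load(Ordering::SeqCst);
    let lazy = SketchLazy::new(expensive);
    let seen = thread::scope(|s| {
        let handles: Vec<_> = (0..threads).map(|_| s.spawn(|| *lazy.get())).collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect::<Vec<u64>>()
    });
    (RUNS.load(Ordering::SeqCst) - before, seen)
}

fn main() {
    let (runs, seen) = run_contended(8);
    println!("initializer runs: {runs}, values: {seen:?}");
}
```

The sketch leaves out concerns a production implementation has to handle, such as an initializer that calls get on the same instance, which would deadlock here.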
The unsync::Lazy variant skips the atomic checks and the locking. It uses a simple flag and relies on single-threaded access. This makes it cheaper for single-threaded workloads. It is also safe: unsync::Lazy does not implement Sync, so an attempt to share one across threads is rejected at compile time rather than causing undefined behavior at runtime.
Realistic scenario: heavy config parsing
Consider a server that loads configuration from a YAML file. Parsing the file and validating the structure takes time. You want the config to be available globally, but you want to avoid parsing if the server shuts down before any request arrives.
use once_cell::sync::Lazy;
use serde::Deserialize;
#[derive(Deserialize, Debug)]
struct Config {
    port: u16,
    database_url: String,
    max_connections: usize,
}
/// Global configuration, parsed lazily.
static CONFIG: Lazy<Config> = Lazy::new(|| {
    // Read and parse only when first accessed.
    let content = std::fs::read_to_string("config.yaml")
        .expect("Config file must exist");
    serde_yaml::from_str(&content)
        .expect("Config file must be valid YAML")
});
fn get_port() -> u16 {
    // Access triggers parsing if not done yet.
    CONFIG.port
}
fn main() {
    // If this function is never called, parsing never happens.
    println!("Starting on port {}", get_port());
}
This pattern is common in libraries. A library might provide a global logger or a shared connection pool. Using Lazy ensures the resource is created only if the library is actually used, and it handles thread safety automatically.
Pitfalls: panics, threads, and types
Lazy initialization has subtle behaviors that can trip you up.
Panics poison the Lazy
If the closure passed to Lazy::new panics, initialization fails and the panic propagates to the caller. Lazy does not catch it. The slot stays empty, but the closure has already been consumed, so it cannot run a second time.
The next access does not retry. It panics again, this time with a message saying the Lazy instance was previously poisoned. A single failed initialization leaves the Lazy unusable for the rest of the program. std::sync::LazyLock poisons in the same way.
This means Lazy cannot recover from transient failures. If the config file is temporarily locked on first access, later accesses still panic even after the file becomes readable.
Guard against this by keeping fallible work out of the closure, or by returning a fallback value instead of panicking. If initialization can fail, consider using OnceCell with get_or_try_init instead: it hands the error back as a Result and leaves the cell empty, so a later call can try again.
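What a panicking initializer does in practice can be checked with a small experiment. The sketch below uses std::sync::LazyLock (Rust 1.80+) rather than once_cell so that it runs without dependencies; it counts initializer runs and catches the panic from each access. ATTEMPTS, BROKEN, and the helper functions are hypothetical names.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

// Counts how many times the initializer body actually ran.
static ATTEMPTS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical initializer that always fails.
static BROKEN: LazyLock<u32> = LazyLock::new(|| {
    ATTEMPTS.fetch_add(1, Ordering::SeqCst);
    panic!("initialization failed")
});

/// Accesses BROKEN and reports whether that access panicked.
fn access_panics() -> bool {
    panic::catch_unwind(|| *BROKEN).is_err()
}

fn attempts() -> usize {
    ATTEMPTS.load(Ordering::SeqCst)
}

fn main() {
    let first = access_panics();
    let second = access_panics();
    println!("first access panicked: {first}");
    println!("second access panicked: {second}");
    println!("initializer ran {} time(s)", attempts());
}
```

Both accesses panic, yet the initializer body runs only once: the second panic comes from the poisoned state, not from a retry. The experiment assumes the default unwinding panic strategy, since catch_unwind cannot observe panics when the program is built with panic = "abort".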
Moving out of Lazy
Lazy holds the value. You cannot move the value out of a Lazy that lives in a static; you can only borrow it. If you try to move it, the compiler rejects the code with error E0507, because the value sits behind a shared borrow.
use once_cell::sync::Lazy;
static DATA: Lazy<Vec<i32>> = Lazy::new(|| vec![1, 2, 3]);
fn bad_function() {
    // Error E0507: the Vec cannot be moved out through the Lazy's Deref borrow.
    let _moved: Vec<i32> = *DATA;
}
You must clone the value if you need ownership, or work with references. Deref only hands out a shared reference, &Vec<i32>, so *DATA names a place you can borrow from but not move out of. Lazy does implement DerefMut, but that needs &mut access, which an immutable static never provides.
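The two working alternatives, cloning and borrowing, look like this. It is a minimal sketch that swaps in std::sync::LazyLock (Rust 1.80+) for once_cell::sync::Lazy to stay dependency-free; owned_copy and borrowed are hypothetical helper names.

```rust
use std::sync::LazyLock;

static DATA: LazyLock<Vec<i32>> = LazyLock::new(|| vec![1, 2, 3]);

fn owned_copy() -> Vec<i32> {
    // Clone when the caller needs its own Vec; DATA keeps the original.
    DATA.clone()
}

fn borrowed() -> &'static [i32] {
    // Borrow when read-only access is enough; nothing is allocated.
    DATA.as_slice()
}

fn main() {
    let mut mine = owned_copy();
    mine.push(4);
    println!("owned copy: {:?}", mine);
    println!("shared original: {:?}", borrowed());
}
```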
Unsync across threads
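once_cell::unsync::Lazy is not thread-safe, and the type system enforces that. It does not implement Sync, so you cannot store it in a static or share a reference to it across threads; the compiler rejects the code instead of letting a data race happen. You can still move one into another thread when its contents are Send, because then only that thread can touch it.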
Use unsync::Lazy only when you are certain the value is accessed from a single thread. This is common in game loops, single-threaded event loops, or worker threads that own their own state.
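A single-threaded lazy value usually lives in a local variable or a struct field rather than a static. The sketch below uses std::cell::LazyCell (Rust 1.80+), the standard-library counterpart of once_cell::unsync::Lazy, to avoid an external dependency; the demo function and its squares computation are hypothetical.

```rust
use std::cell::{Cell, LazyCell};

/// Builds a lazy sum of squares and reads it only when `read` is true.
/// Returns how often the initializer ran and the value, if it was read.
fn demo(read: bool) -> (usize, Option<u64>) {
    let runs = Cell::new(0usize);
    // A lazy value owned by this stack frame; no atomics, no locks.
    let squares = LazyCell::new(|| {
        runs.set(runs.get() + 1);
        (1..=10u64).map(|n| n * n).sum::<u64>()
    });

    let value = if read {
        let first = *squares; // runs the closure
        let second = *squares; // reuses the stored value
        assert_eq!(first, second);
        Some(first)
    } else {
        None
    };

    // Sharing `&squares` with another thread would not compile,
    // because LazyCell does not implement Sync.
    (runs.get(), value)
}

fn main() {
    println!("{:?}", demo(true));
    println!("{:?}", demo(false));
}
```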
Decision: picking the right primitive
The ecosystem offers several ways to do lazy initialization. Choose based on your Rust version, thread requirements, and error handling needs.
Use std::sync::LazyLock when you are on Rust 1.80 or newer and need a thread-safe lazy static value. LazyLock is the standard library equivalent of once_cell::sync::Lazy. It covers the same core API (new, force, and Deref) with comparable performance. Prefer it to avoid external dependencies.
Use once_cell::sync::Lazy when you are on an older Rust version, or when you already depend on once_cell for other primitives like OnceCell. For typical use it behaves the same as LazyLock.
Use once_cell::unsync::Lazy when you are in a single-threaded context and want to avoid the overhead of atomic operations and mutexes. This is useful in performance-critical inner loops or single-threaded workers where every nanosecond counts.
Use once_cell::sync::OnceCell when you need lazy initialization but the value might fail to initialize, or you need to set the value from multiple places. OnceCell provides get_or_try_init, which returns a Result instead of panicking on failure. It also allows setting the value without a closure, which is useful when the value comes from external input.
Use std::sync::Once when you need to run a side effect exactly once, but do not need to store a value. once_cell does not ship a Once type; this one lives in the standard library. Once is lower-level than Lazy. It is useful for registering callbacks or initializing global state that does not produce a return value.
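For the side-effect-only case, a minimal sketch with the standard library's std::sync::Once might look like the following. The names SETUP, SETUP_RUNS, and ensure_setup are hypothetical, and the counter stands in for whatever one-time work you need.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

static SETUP: Once = Once::new();
static SETUP_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical one-time side effect, such as registering a callback.
/// Safe to call from any thread, any number of times.
fn ensure_setup() {
    SETUP.call_once(|| {
        SETUP_RUNS.fetch_add(1, Ordering::SeqCst);
    });
}

fn setup_runs() -> usize {
    SETUP_RUNS.load(Ordering::SeqCst)
}

fn main() {
    for _ in 0..3 {
        ensure_setup();
    }
    println!("setup ran {} time(s)", setup_runs());
}
```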
Convention asides
The community convention for Lazy statics is to use SCREAMING_SNAKE_CASE for the name, just like any other static. This signals that the variable is global and immutable after initialization.
When writing the closure, keep it small and focused. The closure should contain only the initialization logic. If the logic is complex, extract it into a helper function. This makes the Lazy definition readable and easier to test.
static CONFIG: Lazy<Config> = Lazy::new(load_config);
fn load_config() -> Config {
    // Complex logic here.
    // ...
}
The once_cell crate is still widely used, even though LazyLock and OnceLock are now in std. The crate provides OnceCell, which has features like get_or_try_init that OnceLock does not yet offer on stable Rust. If you need fallible initialization, once_cell is still the go-to choice.