The idle thread problem
You write a function that fetches a webpage. It takes three seconds. While it waits for the network, your program sits idle. In a single-threaded app, that means the UI freezes. In a server, that means you drop requests. You want the program to do something else while the network request is in flight, then jump back to your code the moment the data arrives.
Rust handles this with async and await. The keywords do not create threads. They do not magically make code faster. They hand control back to an event loop so other tasks can run. Think of a restaurant kitchen. A synchronous chef starts a dish, stands over the stove, and does nothing until the timer dings. An asynchronous chef starts the dish, walks away to chop vegetables, and returns exactly when the timer dings. The kitchen throughput multiplies without hiring more chefs.
In Rust, async fn does not run immediately. It builds a blueprint of the work and returns it. That blueprint is called a Future. A Future is just a state machine that knows how to advance itself when data is ready. You need an executor to run that state machine. The standard library does not ship with one. You bring your own runtime, like tokio or async-std.
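To make the executor's role concrete, here is a minimal sketch of one built only from the standard library. The names block_on and ThreadWaker are invented for this illustration and are not tokio's implementation; a real runtime adds I/O drivers, timers, and task queues on top of the same idea.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker: unparks the thread that is blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical toy executor: drives one future to completion on this thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    // Pin the future on the stack so it can be polled.
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until some waker calls `unpark` on this thread.
            Poll::Pending => thread::park(),
        }
    }
}

async fn add(a: u32, b: u32) -> u32 {
    a + b
}

fn main() {
    // Calling `add` only builds the future; nothing has run yet.
    let fut = add(2, 3);
    // The executor polls it until it reports Ready.
    let sum = block_on(fut);
    println!("sum = {sum}");
}
```

The loop is the whole executor: poll, and if the future is not ready, go to sleep until a waker says it is worth polling again.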
How async actually works
An async function is a lazy computation. Calling it runs none of the body. It only constructs an anonymous value that implements Future, and that value is not heap-allocated unless you box it or hand it to something like tokio::spawn. The value holds the current state plus every local variable that must survive across a suspension point. The function body gets sliced into segments at every .await point. Each segment becomes a state. The compiler generates code that matches on the current state, runs the next segment, and returns control to the runtime.
When you write .await, the current future polls the future it is waiting on. If that inner future is not ready, it arranges for the task's waker to be notified when the underlying I/O completes, and the whole task returns Poll::Pending to the runtime, which goes off to run something else. When the operating system signals that the socket is ready or the timer expired, the waker puts the task back on the runtime's run queue. The runtime calls poll() again, and execution resumes from the exact .await line. Local variables are still there because they live inside the future. The next segment runs. If the function finishes, poll() returns Poll::Ready(value). If it needs to wait again, it returns Poll::Pending.
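The Pending/Ready handshake is easier to see with a future written by hand and polled manually. This is a sketch under stated assumptions: Countdown, NoopWaker, and poll_to_completion are hypothetical names, and the busy loop stands in for a real runtime that would sleep until woken.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical leaf future: reports Pending `remaining` times, then Ready.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            // Ask to be polled again. A real I/O future would instead store
            // the waker and call it when the socket or timer is ready.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Polls the future until it is Ready; returns (number of polls, output).
fn poll_to_completion(mut fut: Countdown) -> (u32, &'static str) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (polls, out);
        }
    }
}

fn main() {
    let (polls, out) = poll_to_completion(Countdown { remaining: 2 });
    println!("{out} after {polls} polls");
}
```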
This model gives you concurrency without thread overhead. A single OS thread can juggle thousands of async tasks because the context switch happens in user space, not in the kernel. You pay for the state machine allocation and the queue management, but you avoid the cost of creating and scheduling heavy OS threads.
A minimal working example
Rust refuses to let main be async (error E0752). An async main would only return a Future, and neither the language nor the standard library includes an executor to drive it. You solve this by using a runtime attribute that wraps your async main in a synchronous bootstrap function.
use std::time::Duration;

/// Simulates a network request that takes one second.
async fn fetch_data() -> String {
    // Pause execution here without blocking the thread.
    // The runtime will run other tasks while we wait.
    tokio::time::sleep(Duration::from_secs(1)).await;
    // Return the result once the sleep finishes.
    "Data received".to_string()
}

/// The runtime attribute transforms this into a sync main.
/// It spins up the executor and blocks until the future completes.
#[tokio::main]
async fn main() {
    // Call the async function. It returns a Future immediately.
    // The .await operator hands control to the runtime.
    let result = fetch_data().await;
    println!("{}", result);
}
The #[tokio::main] attribute is a macro. It generates a synchronous main that builds a tokio::runtime::Runtime, calls block_on() on your async block, and exits when the future resolves. You never write the boilerplate yourself. The community convention is to keep the attribute on main and leave your library functions as plain async fn. Do not mark every function with runtime attributes. That couples your code to a specific executor.
Under the hood: the state machine
The compiler rewrites async fn into an anonymous type that implements the Future trait. The trait has an associated type, Output, and one required method: poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>. The Pin type prevents the future from being moved in memory after it starts polling. Moving a future would invalidate internal self-references that the compiler creates for borrowed data. Pinning guarantees the memory address stays fixed.
Each .await point becomes a state transition. The compiler generates the equivalent of a match statement inside poll(). It checks the current state, runs the code up to the next .await, and polls the awaited future. If that future is not ready, it has kept a copy of the waker from the Context, and poll() returns Poll::Pending. The waker is a lightweight handle that tells the runtime which task to resume when the I/O completes. When the runtime calls poll() again, the match statement jumps back to the saved state, polls the awaited future again, and once it is ready runs the following segment.
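The transformation described above can be imitated by hand. The sketch below is an approximation written for illustration, not the code rustc emits: DoubleAfterYield, YieldOnce, NoopWaker, and drive are hypothetical names. The enum plays the role of the state machine for an async function that awaits once and then doubles its argument.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical leaf future: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Hand-written stand-in for an async fn that awaits YieldOnce, then returns x * 2.
enum DoubleAfterYield {
    Start { x: u32 },
    Waiting { x: u32, inner: YieldOnce },
    Done,
}

impl Future for DoubleAfterYield {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // Every field is Unpin, so a plain mutable reference is allowed here.
        let this = self.get_mut();
        loop {
            match &mut *this {
                // Segment before the .await: set up and move to the next state.
                DoubleAfterYield::Start { x } => {
                    let x = *x;
                    *this = DoubleAfterYield::Waiting { x, inner: YieldOnce { yielded: false } };
                }
                // The .await itself: poll the inner future, suspend if not ready.
                DoubleAfterYield::Waiting { x, inner } => match Pin::new(inner).poll(cx) {
                    Poll::Pending => return Poll::Pending,
                    Poll::Ready(()) => {
                        // Segment after the .await: the saved local is still here.
                        let result = *x * 2;
                        *this = DoubleAfterYield::Done;
                        return Poll::Ready(result);
                    }
                },
                DoubleAfterYield::Done => panic!("polled after completion"),
            }
        }
    }
}

/// Hypothetical waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls until Ready; returns (number of polls, output).
fn drive(mut fut: DoubleAfterYield) -> (u32, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = Pin::new(&mut fut).poll(&mut cx) {
            return (polls, value);
        }
    }
}

fn main() {
    let (polls, value) = drive(DoubleAfterYield::Start { x: 21 });
    println!("value = {value} after {polls} polls");
}
```

Compiler-generated futures are usually not Unpin because they may hold references into their own fields, which is why the real poll signature takes Pin<&mut Self> rather than &mut Self.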
This transformation means async functions can grow large. Every local variable lives inside the generated struct. If you declare a heavy struct or a large buffer inside an async function, the state machine carries it around between .await points. The convention is to keep async functions small. Extract heavy setup into synchronous helper functions. Move large allocations outside the async boundary. You reduce the state machine footprint and make the generated code easier to debug.
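A quick way to see the footprint is to measure the future with std::mem::size_of_val. In this sketch, buffer_across_await, buffer_in_helper, and checksum are hypothetical functions, and std::future::ready stands in for a real I/O wait. Exact sizes depend on the compiler version, so the example only compares them against the buffer size.

```rust
use std::future::ready;
use std::mem::size_of_val;

/// The buffer is used after the .await, so it must live inside the future.
async fn buffer_across_await(seed: u8) -> usize {
    let buf = [seed; 4096];
    ready(()).await; // stand-in for a real I/O wait
    buf.iter().map(|&b| usize::from(b)).sum()
}

/// Synchronous helper: the buffer lives and dies on this function's stack.
fn checksum(seed: u8) -> usize {
    let buf = [seed; 4096];
    buf.iter().map(|&b| usize::from(b)).sum()
}

/// Only the small result crosses the .await, so the future stays small.
async fn buffer_in_helper(seed: u8) -> usize {
    let total = checksum(seed);
    ready(()).await; // stand-in for a real I/O wait
    total
}

fn main() {
    // Creating the futures runs nothing; we only inspect their size.
    let big = size_of_val(&buffer_across_await(1));
    let small = size_of_val(&buffer_in_helper(1));
    println!("buffer held across .await: {big} bytes");
    println!("buffer kept in sync helper: {small} bytes");
}
```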
Realistic example: concurrent I/O
Async shines when you have multiple independent I/O operations. You start them all, then wait for the results. The runtime interleaves the tasks automatically.
use std::time::Duration;

/// Fetches a page and returns its simulated length.
async fn get_page_length(url: &str) -> usize {
    // Simulate network delay. Real code would use reqwest::get(url).await.
    // The .await yields control so other tasks can run.
    tokio::time::sleep(Duration::from_millis(500)).await;
    // Return a dummy size once the simulated request completes.
    1024
}

#[tokio::main]
async fn main() {
    // Start both fetches simultaneously.
    // The runtime polls both futures, switching at each .await.
    let (len1, len2) = tokio::join!(
        get_page_length("https://example.com"),
        get_page_length("https://rust-lang.org")
    );
    // Print results after both futures resolve.
    println!("Page 1: {} bytes, Page 2: {} bytes", len1, len2);
}
The tokio::join! macro takes multiple futures and polls them concurrently within the current task. It returns a tuple of results in the same order you passed them. The total execution time is roughly 500 milliseconds, not 1000, because the two sleeps overlap. Both timers expire at about the same moment, and join! returns once both futures have completed. No extra threads are created for the two fetches, and switching between them never goes through the OS scheduler.
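The interleaving can be reproduced with a small hand-rolled join built on std::future::poll_fn. This is a rough sketch of the idea, not tokio's implementation: join2, step, YieldOnce, NoopWaker, and interleaving are hypothetical names, and YieldOnce stands in for a pending network call.

```rust
use std::cell::RefCell;
use std::future::{poll_fn, Future};
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical leaf future: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Logs when it starts and finishes, suspending once in between.
async fn step(name: &'static str, log: &RefCell<Vec<String>>) -> &'static str {
    log.borrow_mut().push(format!("{name}: start"));
    YieldOnce { yielded: false }.await;
    log.borrow_mut().push(format!("{name}: finish"));
    name
}

/// Hypothetical two-way join: polls both futures on every wake-up.
async fn join2<A: Future, B: Future>(a: A, b: B) -> (A::Output, B::Output) {
    let mut a = pin!(a);
    let mut b = pin!(b);
    let mut out_a = None;
    let mut out_b = None;
    poll_fn(|cx| {
        // Poll each future only until it has produced its output.
        if out_a.is_none() {
            if let Poll::Ready(v) = a.as_mut().poll(cx) {
                out_a = Some(v);
            }
        }
        if out_b.is_none() {
            if let Poll::Ready(v) = b.as_mut().poll(cx) {
                out_b = Some(v);
            }
        }
        if out_a.is_some() && out_b.is_some() {
            Poll::Ready((out_a.take().unwrap(), out_b.take().unwrap()))
        } else {
            Poll::Pending
        }
    })
    .await
}

/// Runs two steps through join2 and returns the order of log entries.
fn interleaving() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let mut fut = pin!(join2(step("a", &log), step("b", &log)));
        let waker = Waker::from(Arc::new(NoopWaker));
        let mut cx = Context::from_waker(&waker);
        // Busy-poll for simplicity; a real runtime would sleep until woken.
        while fut.as_mut().poll(&mut cx).is_pending() {}
    }
    log.into_inner()
}

fn main() {
    for line in interleaving() {
        println!("{line}");
    }
}
```

Both steps start before either finishes. Awaiting them one after the other would instead log the start and finish of the first step before the second one begins.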
If you need to run tasks independently and forget about them, use tokio::spawn. It returns a JoinHandle and schedules the future on the runtime's thread pool. The original task continues immediately. You await the handle later if you need the result.
Pitfalls and compiler traps
Async Rust introduces a few sharp edges. The compiler catches most of them, but the error messages can feel opaque until you understand the underlying model.
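Forgetting .await is the most common mistake. Calling an async function returns a Future. If you ignore it, the compiler warns about an unused implementer of Future that must be used and notes that futures do nothing unless you .await or poll them. This comes from the unused_must_use lint. It is a warning, not a hard error, so the program still builds and the network request simply never happens. Add .await to run the work. Writing let _ = fetch_data(); silences the warning, but it drops the future without ever running it, so use it only when you really mean to skip the work.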
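The effect of a missing .await can be checked directly. In this sketch, send_request, discarded, polled_once, and NoopWaker are hypothetical names; a flag records whether the body of the async function ever ran.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Hypothetical waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Stand-in for a network call: records that its body actually ran.
async fn send_request(sent: &AtomicBool) {
    sent.store(true, Ordering::SeqCst);
}

/// Creates the future and throws it away without polling it.
fn discarded() -> bool {
    let sent = AtomicBool::new(false);
    // No warning thanks to `let _`, but the body never executes.
    let _ = send_request(&sent);
    sent.load(Ordering::SeqCst)
}

/// Creates the future and polls it once, which is what .await would trigger.
fn polled_once() -> bool {
    let sent = AtomicBool::new(false);
    let mut fut = pin!(send_request(&sent));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let _ = fut.as_mut().poll(&mut cx);
    sent.load(Ordering::SeqCst)
}

fn main() {
    println!("discarded future ran its body: {}", discarded());
    println!("polled future ran its body: {}", polled_once());
}
```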
Blocking the runtime is a silent killer. If you call a synchronous function that takes two seconds, like std::thread::sleep or a heavy CPU loop, you freeze the entire executor thread. Other tasks cannot run. The runtime has no way to pause your blocking call and switch to another task. Offload heavy work to tokio::task::spawn_blocking. That function runs the closure on a separate thread pool designed for blocking operations. The async task yields control while the blocking thread works.
Thread safety bounds trip up multi-threaded runtimes. tokio::main uses a multi-threaded scheduler by default. A future passed to tokio::spawn must implement Send so it can move between worker threads. If it holds a non-Send value, like Rc<T> or a raw pointer, across an .await, the compiler rejects it with an error along the lines of "future cannot be sent between threads safely" and points at the offending value. Replace Rc with Arc in async code, or make sure the value goes out of scope before the .await. Arc is thread-safe. The compiler will guide you to the right type if you pay attention to the trait bound error.
Deadlocks happen when you hold a lock across an .await point. If you acquire a std::sync::Mutex, then .await a network call, the lock stays held while the task is suspended. Another task that calls lock() blocks its whole executor thread, and if the suspended task needs that thread to resume, nothing makes progress. Keep the guard in a scope that ends before the .await. Reacquire the lock after the I/O completes. Async mutexes like tokio::sync::Mutex exist and can be held across an .await, but they are heavier and usually unnecessary if you structure your code correctly.
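The Send requirement can be checked at compile time without a runtime. In the sketch below, is_send is a hypothetical helper that stands in for the Send bound on tokio::spawn, and std::future::ready stands in for real I/O. The failing variant is left in a comment because it does not compile.

```rust
use std::future::ready;
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical helper: compiles only if the value's type is Send.
fn is_send<T: Send>(_: &T) -> bool {
    true
}

/// Arc is Send (for Send + Sync contents), so it may live across an .await.
async fn shared_arc() -> usize {
    let data = Arc::new(vec![1, 2, 3]);
    ready(()).await; // stand-in for real I/O
    data.len()
}

/// Rc is fine as long as it goes out of scope before the .await.
async fn scoped_rc() -> usize {
    let len = {
        let data = Rc::new(vec![1, 2, 3]);
        data.len()
    }; // the Rc is dropped here
    ready(()).await; // stand-in for real I/O
    len
}

// This variant keeps the Rc alive across the .await, so its future is not
// Send and `is_send(&rc_across_await())` would fail to compile:
//
// async fn rc_across_await() -> usize {
//     let data = Rc::new(vec![1, 2, 3]);
//     ready(()).await;
//     data.len()
// }

fn main() {
    println!("Arc across .await is Send: {}", is_send(&shared_arc()));
    println!("Rc dropped before .await is Send: {}", is_send(&scoped_rc()));
}
```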
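Scoping the guard is usually all it takes. The sketch below uses only the standard library: increment_then_wait, lock_is_free_while_suspended, YieldOnce, and NoopWaker are hypothetical names, and YieldOnce stands in for a network call that suspends the task once. The test function polls the task up to its .await and then checks that the mutex can be taken while the task is suspended.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical waker that does nothing; enough for polling by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical leaf future: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Updates the counter, releases the lock, and only then awaits.
async fn increment_then_wait(counter: &Mutex<u32>) -> u32 {
    let snapshot = {
        let mut guard = counter.lock().unwrap();
        *guard += 1;
        *guard
    }; // guard dropped here, before the suspension point
    YieldOnce { yielded: false }.await; // stand-in for a network call
    snapshot
}

/// Returns (lock was free while the task was suspended, task output).
fn lock_is_free_while_suspended() -> (bool, u32) {
    let counter = Mutex::new(0);
    let mut fut = pin!(increment_then_wait(&counter));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    // First poll runs up to the .await and suspends there.
    let suspended = fut.as_mut().poll(&mut cx).is_pending();
    // While the task is parked, another user can still take the lock.
    let free = suspended && counter.try_lock().is_ok();

    // Second poll resumes after the .await and finishes.
    let value = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => v,
        Poll::Pending => unreachable!("YieldOnce is ready on the second poll"),
    };
    (free, value)
}

fn main() {
    let (free, value) = lock_is_free_while_suspended();
    println!("lock free while suspended: {free}, value: {value}");
}
```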
When to use async versus threads
Pick the right tool for the workload. Async is not a replacement for threads. It is a different concurrency model optimized for I/O bound work.
Use async I/O when you are waiting on network, disk, or database operations and want to handle thousands of connections on a few threads.
Use standard threads when your workload is CPU-bound, like image processing or cryptographic hashing, and you want to keep every core busy.
Use synchronous code when your program is simple, sequential, and does not need to juggle multiple I/O operations at once.
Use tokio::spawn (also exported as tokio::task::spawn) when you need to fire off a background task and forget about it, rather than waiting for the result inline.
Use tokio::sync::mpsc when you need to pass messages between async tasks without shared state.
Use std::thread::spawn when you need to run long-lived blocking legacy code alongside async tasks without starving the executor. For shorter blocking calls, tokio::task::spawn_blocking is usually enough.
The decision comes down to what your program spends its time doing. Waiting on external systems calls for async. Crunching numbers calls for threads. Mixing them requires careful boundaries. Keep the async layer thin. Push blocking work to dedicated thread pools. Trust the executor to schedule your tasks efficiently.