How to Implement the Typestate Pattern in Rust

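Implement the typestate pattern by giving each state its own type, typically a marker type used as a generic parameter, and writing each state transition as a method that consumes `self` and returns the value in its next state. The compiler then rejects any call that is not valid for the current state.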

You are building a library for a hardware device. The device requires a strict sequence: initialize, calibrate, run, shutdown. A user writes code that calls run() immediately after creating the device, skipping calibration. In Python, the mistake only shows up at runtime: the device overheats three seconds later. In Rust, you want the compiler to reject that code before it ever runs. The typestate pattern makes invalid states unrepresentable. It encodes the current state of an object into its type, so the compiler enforces valid transitions at compile time.

The core idea: state as type

In object-oriented languages, an object has a type and internal fields that track state. A TcpStream might have a state field set to Unconnected or Connected. Methods check this field at runtime and panic if the state is wrong. Rust offers a different approach. Instead of a single type with internal state, you define distinct types for each state. TcpStream<Unconnected> is a different type from TcpStream<Connected>. Methods are implemented only for the types where they make sense. The compiler tracks which type you hold. If you try to call a method that doesn't exist for the current type, the code fails to compile.
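For contrast, here is a minimal sketch of the runtime-checked design described above. The names RuntimeStream, State, and send are hypothetical, and the sketch returns an error where other designs might panic. The point is that the misuse compiles and is only caught when the offending line runs.

```rust
// Hypothetical runtime-checked stream, shown for contrast with typestate.
#[derive(Debug, PartialEq)]
enum State {
    Unconnected,
    Connected,
}

struct RuntimeStream {
    state: State,
}

impl RuntimeStream {
    fn new() -> Self {
        RuntimeStream { state: State::Unconnected }
    }

    fn connect(&mut self) {
        self.state = State::Connected;
    }

    // The state is checked at runtime, on every call.
    fn send(&self, data: &[u8]) -> Result<usize, String> {
        if self.state != State::Connected {
            return Err("send called on an unconnected stream".to_string());
        }
        Ok(data.len())
    }
}

fn main() {
    let mut stream = RuntimeStream::new();
    // This is a logic error, yet it compiles.
    assert!(stream.send(b"hello").is_err());
    stream.connect();
    assert_eq!(stream.send(b"hello"), Ok(5));
}
```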

Think of a multi-step form on a website. You cannot click "Submit" until you fill the address field. The "Submit" button is disabled until the address is valid. Typestate is like that, but the compiler enforces the button state. If the type is FormAddress, the submit method does not exist. You must call add_address first, which returns a FormComplete type that has submit. The state is baked into the type system.
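The form analogy can be sketched with two plain structs. FormAddress, add_address, FormComplete, and submit come from the paragraph above; the fields, the new constructor, and the string returned by submit are assumptions made only for illustration.

```rust
// State in which the address is still missing. It has no `submit` method.
struct FormAddress {
    name: String,
}

// State in which every required field is present.
struct FormComplete {
    name: String,
    address: String,
}

impl FormAddress {
    fn new(name: &str) -> Self {
        FormAddress { name: name.to_string() }
    }

    // Consumes the incomplete form and returns the complete one.
    fn add_address(self, address: &str) -> FormComplete {
        FormComplete {
            name: self.name,
            address: address.to_string(),
        }
    }
}

impl FormComplete {
    // Only reachable after `add_address`. The return value is a placeholder summary.
    fn submit(self) -> String {
        format!("{} @ {}", self.name, self.address)
    }
}

fn main() {
    let form = FormAddress::new("Ada");
    // form.submit(); // Does not compile: FormAddress has no `submit` method.
    let form = form.add_address("12 Example Street");
    println!("{}", form.submit());
}
```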

Minimal example: a TCP stream

The typestate pattern relies on generic type parameters and marker types. Marker types are empty structs that carry no data. They exist only to distinguish states. PhantomData is a standard library type that tells the compiler you care about a type parameter, even if you don't store any data of that type.

use std::marker::PhantomData;

// Marker types for states. They hold no data.
struct Unconnected;
struct Connected;
struct Writing;

// The stream carries a state type parameter.
struct TcpStream<S> {
    // PhantomData<S> marks S as used without storing a value of type S.
    // Without it, the unused parameter S is a compile error (E0392), not a warning.
    _state: PhantomData<S>,
}

impl TcpStream<Unconnected> {
    /// Connects the stream. Consumes self and returns a new type.
    fn connect(self) -> TcpStream<Connected> {
        // Simulate connection logic.
        TcpStream { _state: PhantomData }
    }
}

impl TcpStream<Connected> {
    /// Starts writing. Only available when Connected.
    fn start_write(self) -> TcpStream<Writing> {
        TcpStream { _state: PhantomData }
    }
}

impl TcpStream<Writing> {
    /// Sends data. Only available when Writing.
    fn send(&self, _data: &[u8]) {
        // Simulate sending.
    }
}

fn main() {
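    // Annotate the starting state explicitly so it is not inferred from later calls.
    let stream: TcpStream<Unconnected> = TcpStream { _state: PhantomData };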
    
    // stream.start_write(); // ERROR: method not found.
    // stream.send(b"hello"); // ERROR: method not found.
    
    let stream = stream.connect();
    // stream.connect(); // ERROR: method not found on TcpStream<Connected>.
    
    let stream = stream.start_write();
    stream.send(b"hello"); // OK.
}

The TcpStream struct has a generic parameter S. Each state is a distinct marker type. The connect method is implemented only for TcpStream<Unconnected>. It takes self by value, consuming the Unconnected stream, and returns a TcpStream<Connected>. The variable stream is now a different type. If you try to call connect again, the compiler rejects the code because TcpStream<Connected> does not have a connect method. The state transition is enforced by the type system.

How the compiler enforces the rules

Typestate works because Rust's type system tracks the exact type of every variable. When you call a method that takes self by value, the variable is moved and can no longer be used. The method returns a new value, which may have a different type. Binding that value, often by shadowing the old name with let, gives you a variable of the new type. If you try to use it in a way that doesn't match its new type, the compiler emits an error.

This pattern requires consuming self to change state. You cannot use &mut self to change the type. A mutable reference keeps the same type. If connect took &mut self, the variable would remain TcpStream<Unconnected>, and the compiler would allow calling connect again. Consuming self forces the caller to acknowledge the state change. The old type is gone. The new type is what you have.
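As a sketch of consuming transitions, the hardware device from the introduction could look like the following. The marker types Created, Initialized, Calibrated, and Running and the method bodies are hypothetical; only the sequence initialize, calibrate, run, shutdown comes from the text.

```rust
use std::marker::PhantomData;

// Hypothetical marker types for the device lifecycle.
struct Created;
struct Initialized;
struct Calibrated;
struct Running;

struct Device<S> {
    _state: PhantomData<S>,
}

impl Device<Created> {
    fn new() -> Self {
        Device { _state: PhantomData }
    }

    fn initialize(self) -> Device<Initialized> {
        Device { _state: PhantomData }
    }
}

impl Device<Initialized> {
    fn calibrate(self) -> Device<Calibrated> {
        Device { _state: PhantomData }
    }
}

impl Device<Calibrated> {
    fn run(self) -> Device<Running> {
        Device { _state: PhantomData }
    }
}

impl Device<Running> {
    // Consuming `self` without returning a device ends the lifecycle.
    fn shutdown(self) {}
}

fn main() {
    let created = Device::<Created>::new();
    // created.run(); // Does not compile: `run` exists only on Device<Calibrated>.
    let initialized = created.initialize();
    // created.initialize(); // Does not compile: `created` was moved (E0382).
    let running = initialized.calibrate().run();
    running.shutdown();
}
```

Because every transition takes self by value, the old binding cannot be reused after the call, which is what keeps a stale state from lingering in scope.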

Convention aside: PhantomData is the standard way to tie a type parameter to a struct without storing a value of that type. It is zero-sized, so it adds nothing to the struct's memory layout. The community expects PhantomData in typestate implementations. Storing a dummy value of the state type instead is discouraged. PhantomData signals intent clearly.
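The layout claim can be checked with std::mem::size_of. The Handle struct and its fd field below are hypothetical and exist only for the measurement.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Connected;

// A hypothetical handle with one real field and one state marker.
struct Handle<S> {
    fd: u32,
    _state: PhantomData<S>,
}

fn main() {
    // PhantomData<S> is zero-sized, so the marker adds nothing to the layout.
    assert_eq!(size_of::<PhantomData<Connected>>(), 0);
    assert_eq!(size_of::<Handle<Connected>>(), size_of::<u32>());
}
```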

Realistic example: a file handle with error recovery

Real-world typestate often involves error handling. A file handle might start unopened. Opening it can fail. If opening fails, you might want to recover the unopened handle to retry. This requires returning the original state on error.

use std::marker::PhantomData;

// Marker trait implemented by every state type.
trait FileState {}

// Marker types for states. They hold no data.
struct Unopened;
impl FileState for Unopened {}
struct Open;
impl FileState for Open {}

struct FileHandle<S: FileState> {
    path: String,
    fd: Option<i32>,
    _state: PhantomData<S>,
}

impl FileHandle<Unopened> {
    fn new(path: &str) -> Self {
        FileHandle {
            path: path.to_string(),
            fd: None,
            _state: PhantomData,
        }
    }

    /// Opens the file. Returns Result with Open handle or Unopened handle on error.
    fn open(self) -> Result<FileHandle<Open>, FileHandle<Unopened>> {
        // Simulate OS call that might fail.
        let success = true;
        if success {
            Ok(FileHandle {
                path: self.path,
                fd: Some(42),
                _state: PhantomData,
            })
        } else {
            // Return the Unopened handle so the caller can retry or clean up.
            Err(self)
        }
    }
}

impl FileHandle<Open> {
    fn read(&self) -> Result<String, String> {
        Ok("data".to_string())
    }

    fn close(self) -> FileHandle<Unopened> {
        FileHandle {
            path: self.path,
            fd: None,
            _state: PhantomData,
        }
    }
}

fn main() {
    let handle = FileHandle::new("/tmp/test.txt");
    
    match handle.open() {
        Ok(open_handle) => {
            let data = open_handle.read().unwrap();
            println!("Read: {}", data);
            let _closed = open_handle.close();
        }
        Err(unopened_handle) => {
            // Handle can be retried or dropped.
            println!("Failed to open: {}", unopened_handle.path);
        }
    }
}

The open method returns Result<FileHandle<Open>, FileHandle<Unopened>>. On success, you get an Open handle. On failure, you get the Unopened handle back. This allows the caller to retry or clean up without losing the handle. The type system ensures you cannot read from an unopened file. The only way to obtain an Open handle is to take it out of the Ok variant, for example with match. The compiler enforces the workflow.
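One way a caller might use the returned Unopened handle is a bounded retry loop. This is a sketch under assumptions: the failures_left field simulates an OS call that fails a fixed number of times, and open_with_retries is a hypothetical helper, not part of the example above.

```rust
use std::marker::PhantomData;

struct Unopened;
struct Open;

struct FileHandle<S> {
    path: String,
    // Hypothetical counter: how many more `open` calls fail in this simulation.
    failures_left: u32,
    _state: PhantomData<S>,
}

impl FileHandle<Unopened> {
    fn new(path: &str, failures_left: u32) -> Self {
        FileHandle { path: path.to_string(), failures_left, _state: PhantomData }
    }

    // Simulated open: fails while `failures_left` is nonzero, then succeeds.
    fn open(mut self) -> Result<FileHandle<Open>, FileHandle<Unopened>> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            Err(self)
        } else {
            Ok(FileHandle { path: self.path, failures_left: 0, _state: PhantomData })
        }
    }
}

// Tries up to `max_attempts` times, reusing the handle returned in `Err`.
fn open_with_retries(
    mut handle: FileHandle<Unopened>,
    max_attempts: u32,
) -> Result<FileHandle<Open>, FileHandle<Unopened>> {
    for _ in 0..max_attempts {
        match handle.open() {
            Ok(open) => return Ok(open),
            Err(unopened) => handle = unopened,
        }
    }
    Err(handle)
}

fn main() {
    let handle = FileHandle::new("/tmp/test.txt", 2);
    match open_with_retries(handle, 3) {
        Ok(open) => println!("opened {}", open.path),
        Err(unopened) => println!("gave up on {}", unopened.path),
    }
}
```

Since open consumes the handle, the loop has to take it back from the Err variant before trying again. The retry logic cannot accidentally keep using a handle that has already been opened.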

Pitfalls and compiler errors

Typestate adds complexity. The number of types grows with the number of states. If you have many states, you might end up with a combinatorial explosion of types. Use typestate when the state machine is linear or has a small number of states. For complex state machines, consider an enum with runtime checks.

If you try to call a method that doesn't exist for the current state, the compiler emits E0599 (no method named connect found for struct TcpStream<Connected>). The error message tells you exactly which type you have and which methods are available. This is helpful. It guides you to the correct state transition.

If you try to assign a value of one state to a variable expecting another state, the compiler emits E0308 (mismatched types). This prevents accidental state confusion. The compiler tracks types precisely. You cannot hide a state mismatch behind a type cast.
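A small sketch of the mismatched-types case follows, using a hypothetical flush helper that accepts only a stream in the Writing state. The types are redefined here so the snippet stands alone.

```rust
use std::marker::PhantomData;

struct Connected;
struct Writing;

struct TcpStream<S> {
    _state: PhantomData<S>,
}

impl TcpStream<Connected> {
    fn start_write(self) -> TcpStream<Writing> {
        TcpStream { _state: PhantomData }
    }
}

// Hypothetical helper that only accepts a stream in the Writing state.
fn flush(_stream: &TcpStream<Writing>) -> &'static str {
    "flushed"
}

fn main() {
    let connected: TcpStream<Connected> = TcpStream { _state: PhantomData };
    // flush(&connected); // Does not compile: E0308, mismatched types.
    let writing = connected.start_write();
    assert_eq!(flush(&writing), "flushed");
}
```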

Convention aside: Keep PhantomData fields private. Exposing them allows users to construct invalid states. The struct should be opaque. Users interact only through the provided methods. This preserves the safety guarantees.
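Privacy in Rust is enforced at module boundaries, so the guarantee only holds once the type lives in its own module. The sketch below assumes a hypothetical module named stream with a public new constructor. Code outside the module can no longer build a stream in an arbitrary state by writing a struct literal.

```rust
mod stream {
    use std::marker::PhantomData;

    pub struct Unconnected;
    pub struct Connected;

    pub struct TcpStream<S> {
        // Private: code outside this module cannot name this field.
        _state: PhantomData<S>,
    }

    impl TcpStream<Unconnected> {
        // The only public way to obtain a stream.
        pub fn new() -> Self {
            TcpStream { _state: PhantomData }
        }

        pub fn connect(self) -> TcpStream<Connected> {
            TcpStream { _state: PhantomData }
        }
    }

    impl TcpStream<Connected> {
        // Simulated send that reports how many bytes it was given.
        pub fn send(&self, data: &[u8]) -> usize {
            data.len()
        }
    }
}

use stream::{Connected, TcpStream};

fn main() {
    // Does not compile outside `stream`: field `_state` is private (E0451).
    // let forged: TcpStream<Connected> = TcpStream { _state: std::marker::PhantomData };

    let connected: TcpStream<Connected> = TcpStream::new().connect();
    assert_eq!(connected.send(b"hello"), 5);
}
```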

Typestate turns runtime panics into compile-time red squiggles. If the code compiles, every transition encoded in the types is valid. Trust the type system to track your state.

When to use typestate

Use typestate when invalid state transitions must be impossible at compile time, such as cryptographic keys or file handles. Use typestate for builder patterns where configuration steps must happen in a specific order. Use an enum with match when state changes are frequent, dynamic, or when you need to inspect the current state at runtime. Use traits when you need to treat different states polymorphically, though typestate often replaces this need. Reach for simple flags when the state logic is trivial and compile-time enforcement adds more friction than value.
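For the builder case, a minimal sketch might look like this. RequestBuilder, NoUrl, HasUrl, and the string returned by build are all hypothetical. The idea is that build exists only once the required URL has been set, while optional settings are available in every state.

```rust
use std::marker::PhantomData;

// Hypothetical builder states: the URL must be set before `build` exists.
struct NoUrl;
struct HasUrl;

struct RequestBuilder<S> {
    url: String,
    timeout_secs: u64,
    _state: PhantomData<S>,
}

impl RequestBuilder<NoUrl> {
    fn new() -> Self {
        RequestBuilder { url: String::new(), timeout_secs: 30, _state: PhantomData }
    }

    // Setting the URL moves the builder into the HasUrl state.
    fn url(self, url: &str) -> RequestBuilder<HasUrl> {
        RequestBuilder {
            url: url.to_string(),
            timeout_secs: self.timeout_secs,
            _state: PhantomData,
        }
    }
}

// Optional settings are available in every state.
impl<S> RequestBuilder<S> {
    fn timeout_secs(mut self, secs: u64) -> Self {
        self.timeout_secs = secs;
        self
    }
}

impl RequestBuilder<HasUrl> {
    // Placeholder result: a real builder would return a request value.
    fn build(self) -> String {
        format!("GET {} (timeout {}s)", self.url, self.timeout_secs)
    }
}

fn main() {
    // RequestBuilder::new().build(); // Does not compile: no `build` on RequestBuilder<NoUrl>.
    let request = RequestBuilder::new()
        .timeout_secs(5)
        .url("https://example.com")
        .build();
    println!("{}", request);
}
```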

Typestate makes the impossible unrepresentable. If the code compiles, the state machine is valid.

Summary

The typestate pattern uses Rust's type system to ensure that an object can only perform certain actions when it is in the right state. Think of it like a vending machine: you can only get a snack after inserting money, and you can't insert money again until you've taken the snack. This prevents mistakes by making invalid actions fail to compile.