How to Parse YAML in Rust

Parse YAML in Rust using the serde_yaml crate and the from_str function to deserialize text into structs.

The config file problem

You are building a tool that reads a configuration file. The file is YAML. It has nested maps, lists, and maybe a comment or two. You need to turn that text into a Rust struct so your code can access config.database.port without string parsing or manual indexing.

In Python, you import yaml and call safe_load. In Rust, you don't write a parser. You use serde_yaml. The pattern mirrors JSON parsing: define the struct, derive the trait, call the deserializer. The difference is YAML's flexibility. The same text can mean different types depending on how it is written: an unquoted 8080 is an integer, while a quoted "8080" is a string. serde_yaml handles the negotiation between that looseness and Rust's strict type system.

Serde is the engine, serde_yaml is the adapter

Rust's serialization ecosystem revolves around serde. serde is a framework that defines traits like Serialize and Deserialize. It does not parse any format on its own. You need a format-specific crate to bridge serde to the wire format.

serde_yaml is that bridge for YAML. It implements serde's deserializer traits for YAML input. When you call serde_yaml::from_str, you are asking serde_yaml to read the YAML, walk the structure, and call methods on your struct's Deserialize implementation to fill in the fields.

Think of serde as a universal socket adapter and serde_yaml as the specific plug for YAML outlets. You can swap the plug for serde_json or toml and keep the same socket. Your structs stay the same. Only the parsing crate changes.

Minimal example

Start with a struct that matches your YAML keys. Derive Deserialize. Call from_str.

use serde::Deserialize;

// Derive Deserialize so serde_yaml can fill this struct.
#[derive(Deserialize)]
struct Config {
    name: String,
    port: u16,
}

fn main() {
    let yaml = r#"
name: myapp
port: 8080
"#;

    // from_str returns Result<Config, serde_yaml::Error>.
    // unwrap panics if the YAML is malformed or types don't match.
    let config: Config = serde_yaml::from_str(yaml).unwrap();
    println!("{}:{}", config.name, config.port);
}

Add these dependencies to your Cargo.toml:

[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_yaml = "0.8"

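The derive feature in serde is essential. Without it, #[derive(Deserialize)] fails: the compiler reports that it cannot find the derive macro Deserialize in this scope. You may also see E0277 (the trait Deserialize is not implemented for Config) at the from_str call, because the struct never received an implementation.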

Convention aside: serde_yaml versioning has been quirky. The examples here target 0.8. Version 0.9 introduced breaking changes in error handling and API surface, including a different representation for enums, and the crate's final 0.9 release is marked deprecated because it is no longer maintained. Check the changelog and the maintenance status before choosing or upgrading a version.

Trust the struct definition. If the YAML drifts, the parse fails. That's a feature, not a bug.

Walk through the deserialization

When serde_yaml::from_str(yaml) runs, the crate tokenizes the YAML string. It builds an internal representation of the document. Then it invokes the Deserialize implementation on Config.

The deserializer expects a map because Config is a struct. It reads the first key name. It looks for a field named name in the struct. It finds a String field. It reads the value myapp as a scalar string and stores it. It moves to port. It finds a u16 field. It reads 8080. YAML treats unquoted 8080 as an integer. The deserializer coerces the integer into u16 and stores it.

If the YAML contains a key that the struct does not have, serde_yaml ignores it by default. Extra keys are noise. If the struct has a field that the YAML does not contain, the deserialization fails. Rust requires all fields to be initialized. Missing keys cause a runtime error, not a compile error.

Use Option<T> for fields that might be missing. The deserializer sets the field to None when the key is absent. This is the idiomatic way to handle optional configuration.

Make your config struct the source of truth. Use Option for everything that might be missing.

Realistic configuration

Real configs have nested structures, optional sections, and naming mismatches. YAML keys often use camelCase or kebab-case, while Rust fields use snake_case. You can align them with serde attributes.

use serde::Deserialize;

#[derive(Deserialize)]
struct AppConfig {
    // Rename the YAML key to match a different Rust field name.
    #[serde(rename = "appName")]
    app_name: String,

    // Nested struct. serde_yaml recurses into the map.
    database: DatabaseConfig,

    // Optional field. Missing key results in None.
    log_level: Option<String>,
}

#[derive(Deserialize)]
struct DatabaseConfig {
    host: String,
    // Default value if the key is missing.
    #[serde(default = "default_port")]
    port: u16,
}

fn default_port() -> u16 {
    5432
}

fn main() {
    let yaml = r#"
appName: production
database:
  host: db.example.com
  # port is missing, so default_port runs.
log_level: debug
"#;

    let config: AppConfig = serde_yaml::from_str(yaml).unwrap();
    println!("App: {}", config.app_name);
    println!("DB: {}:{}", config.database.host, config.database.port);
    println!("Log: {:?}", config.log_level);
}

The #[serde(rename = "...")] attribute maps a YAML key to a Rust field with a different name. The #[serde(default = "function")] attribute provides a fallback value when the key is absent. The function must return the field type and take no arguments.

Convention aside: Use #[serde(rename_all = "camelCase")] at the struct level if your YAML uses camelCase keys throughout. This avoids repeating rename on every field. The community prefers rename_all for bulk alignment and rename for individual exceptions.

Don't fight the naming mismatch. Use rename_all to bridge the gap.

Dynamic YAML and Value

Sometimes you cannot define a struct. The YAML structure varies per document. Or you are building a tool that inspects arbitrary YAML. In these cases, parse into serde_yaml::Value.

Value is an enum that represents any YAML node: null, boolean, number, string, sequence, or mapping. It works like a dynamic type. You parse the YAML into Value, then inspect and extract data at runtime.

use serde_yaml::Value;

fn main() {
    let yaml = r#"
servers:
  - host: alpha
    port: 80
  - host: beta
    port: 443
metadata:
  version: 1
"#;

    // Parse into Value instead of a struct.
    let doc: Value = serde_yaml::from_str(yaml).unwrap();

    // Access nested fields using get and type-specific methods.
    if let Some(servers) = doc.get("servers").and_then(|v| v.as_sequence()) {
        for server in servers {
            let host = server.get("host").and_then(|v| v.as_str()).unwrap_or("unknown");
            let port = server.get("port").and_then(|v| v.as_u64()).unwrap_or(0);
            println!("Server {} on port {}", host, port);
        }
    }

    // Check metadata type.
    if let Some(version) = doc.get("metadata").and_then(|m| m.get("version")).and_then(|v| v.as_u64()) {
        println!("Version: {}", version);
    }
}

Value methods return Option. as_str() returns Some(&str) if the value is a string. as_sequence() returns Some(&Vec<Value>) if it's a list. Chaining and_then lets you drill down safely without panics.

When the structure is unknown, Value is your safety net. Parse into Value first, then inspect.

Pitfalls and error handling

YAML is permissive. Rust is not. Mismatches surface as runtime errors from from_str. The function returns Result<T, serde_yaml::Error>. Using unwrap() in production code is a recipe for crashes. Handle the error explicitly.

match serde_yaml::from_str::<Config>(yaml) {
    Ok(config) => {
        // Use config.
    }
    Err(e) => {
        eprintln!("Failed to parse config: {}", e);
        // Exit or fallback.
    }
}

Common pitfalls:

  • Type coercion limits: serde_yaml converts between types only in narrow cases. An unquoted integer can fill a float field, and an unquoted scalar such as 8080 can fill a String field, but a quoted string is never turned into a number. If your struct expects u16 and the YAML has "8080" (a quoted string), the parse fails with an invalid-type error. Unquoted numbers are safe. Quoted numbers require explicit parsing or a custom deserializer.
  • Boolean ambiguity: YAML 1.1 treats yes, no, on, and off as booleans, and many other YAML tools still do. serde_yaml does not: it reads them as plain strings, so enabled: yes fails to deserialize into a bool field. Write true or false in files that serde_yaml will read, and watch for configs that were authored for a YAML 1.1 parser.
  • Performance: serde_yaml is slower than serde_json. YAML parsing involves more state tracking and comment handling. If performance is critical and the format is negotiable, switch to JSON or TOML.
  • Empty documents: An empty YAML string parses to Value::Null. Parsing an empty string into a struct with required fields fails. Check for empty input before parsing.

Treat the Result as a contract. Handle the error path before you ship.

Decision matrix

  • Use serde_yaml when you need to parse YAML configuration files and want the ergonomics of serde with automatic struct mapping.
  • Use serde_json when your data is already JSON or you need faster parsing and smaller payloads for APIs.
  • Use toml when you are writing a configuration file for a Rust project and want a format that is easier to read and less ambiguous than YAML.
  • Reach for serde_yaml::Value when the YAML structure is dynamic and you cannot define a fixed Rust struct ahead of time.
  • Pick manual parsing only when you are dealing with a YAML subset that serde_yaml rejects and you need to implement a custom tolerance for malformed input.

Where to go next