How to read a file line by line

The problem with slurping

You are writing a tool to parse a configuration file, or a script that scans a log for errors. The instinct is to read the whole file into a String and split it. That works for a 50KB file. It fails catastrophically for a 4GB log dump. fs::read_to_string reads the entire file into one heap-allocated String. If the file is larger than your available memory, the allocation fails and the process aborts, or the OS kills your process. Even if you have enough memory, you waste cycles loading data you might never use.

Reading line by line keeps memory usage flat. You process one line, drop it, and move to the next. The memory footprint is bounded by the longest line plus the buffer, not by the file size. Rust makes this pattern ergonomic and safe, but it forces you to think about buffering. The lines() method is only available on buffered readers. This restriction exists for a reason. Disk I/O is orders of magnitude slower than RAM access. Reading byte by byte from an unbuffered file triggers a system call for every byte. System calls have overhead. The overhead adds up fast.

The buffering layer

Rust separates the raw file handle from the buffering logic. std::fs::File implements the Read trait. It gives you access to bytes, but it does not buffer. If you read from a File directly, you pay the syscall cost on every read.

BufReader wraps any Read source and adds a buffer. It allocates a chunk of memory, 8 KiB by default in current releases. When you ask for data, BufReader fills the buffer from the underlying source in one large read. Subsequent reads come from the buffer in RAM. This batches the expensive disk operations. BufReader implements the BufRead trait. The lines() method lives on BufRead, not on Read. This trait hierarchy is intentional. You cannot call lines() on a raw File because File does not implement BufRead, so the method is simply not there until you add a buffer.
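To make the batching visible, here is a small sketch that counts how many read calls reach the underlying source with and without a BufReader. CountingReader is a hypothetical helper written for this illustration, not a std type, and an in-memory byte slice stands in for the file, so the numbers are calls to read rather than real system calls.

```rust
use std::io::{self, BufReader, Read};

/// Hypothetical helper (not part of std): wraps a reader and counts
/// how many times `read` is called on it.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Pulls every byte out of `reader`, one byte per call.
fn drain_byte_by_byte<R: Read>(reader: &mut R) -> io::Result<()> {
    let mut byte = [0u8; 1];
    while reader.read(&mut byte)? > 0 {}
    Ok(())
}

/// Number of `read` calls that reach the source without a buffer.
fn calls_unbuffered(data: &[u8]) -> io::Result<usize> {
    let mut source = CountingReader { inner: data, calls: 0 };
    drain_byte_by_byte(&mut source)?;
    Ok(source.calls)
}

/// Number of `read` calls that reach the source through a BufReader.
fn calls_buffered(data: &[u8]) -> io::Result<usize> {
    let mut reader = BufReader::new(CountingReader { inner: data, calls: 0 });
    drain_byte_by_byte(&mut reader)?;
    Ok(reader.get_ref().calls)
}
```

Without the buffer, the source sees one call per byte plus a final call that reports end of input. With the buffer, it sees roughly one call per buffer refill. With a real File, each of those calls would be a system call.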

Think of BufReader like a water filter attached to a pipe. The pipe delivers water in massive surges. The filter holds a reservoir. You drink from the reservoir sip by sip. When the reservoir empties, the filter refills from the pipe. You never have to hold the whole pipe's output in your mouth.

Convention aside: The community defaults to the standard 8KB buffer size. You can specify a custom capacity with BufReader::with_capacity, but profiling rarely shows a benefit for general text files. Stick to the default unless you have measured data showing otherwise.
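If measurements do point to the buffer size, a custom capacity is a one-line change. The sketch below assumes a hypothetical helper named count_lines_with_capacity; any capacity you pass in, such as 64 KiB, is an arbitrary example value, not a recommendation.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Counts lines using a caller-chosen buffer size.
/// Hypothetical helper for illustration; `source` can be a File or any Read.
fn count_lines_with_capacity<R: Read>(source: R, capacity: usize) -> io::Result<usize> {
    // Same as BufReader::new, except the buffer size is explicit.
    let reader = BufReader::with_capacity(capacity, source);

    let mut count = 0;
    for line in reader.lines() {
        line?; // Propagate read errors; the content itself is not needed here.
        count += 1;
    }
    Ok(count)
}
```

The result does not depend on the capacity. Only the number of refills changes, which is why this is a tuning knob to reach for after profiling rather than before.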

Minimal example

The standard pattern uses File::open, wraps the file in BufReader, and iterates with lines(). The lines() method returns an iterator of io::Result<String>. Each String contains the line content without the trailing newline (\n or \r\n).

use std::fs::File;
use std::io::{self, BufRead, BufReader};

fn main() -> io::Result<()> {
    // Open the file. The ? operator propagates errors immediately.
    // If the file doesn't exist, main returns early with an error.
    let file = File::open("input.txt")?;

    // Wrap the file in a BufReader.
    // This allocates a buffer and takes ownership of the file.
    let reader = BufReader::new(file);

    // lines() returns an iterator over Result<String>.
    // Each item is a line without the trailing newline.
    for line in reader.lines() {
        // Extract the String with the ? operator.
        // If reading fails mid-file, main returns early with that error.
        let line = line?;
        println!("{line}");
    }

    Ok(())
}

How the iterator works

When you call reader.lines(), Rust creates an iterator. Iterators in Rust are lazy. Nothing happens until you call next(). The for loop calls next() repeatedly. Under the hood, lines() uses the BufRead interface to scan the buffer for newline characters.

The iterator itself holds very little state. It owns the reader, and the reader tracks the current position in its buffer. On each call to next(), the bytes up to and including the next newline are copied into a fresh String, checked for valid UTF-8, stripped of the trailing newline, and yielded. If the buffer runs out before a newline is found, BufReader refills from the file and scanning continues. This mechanism handles lines that span buffer boundaries seamlessly.
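One way to picture a call to next() is as a hand-written iterator built on read_line(). The sketch below is an approximation for illustration; SketchLines is a hypothetical name, and the real std::io::Lines may differ in detail.

```rust
use std::io::{self, BufRead};

/// Hypothetical stand-in for the iterator returned by `lines()`.
struct SketchLines<B> {
    reader: B,
}

impl<B: BufRead> Iterator for SketchLines<B> {
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        // A fresh String per line: this is the allocation lines() pays for.
        let mut buf = String::new();
        match self.reader.read_line(&mut buf) {
            // Zero bytes read means end of input.
            Ok(0) => None,
            Ok(_) => {
                // Strip one trailing "\n" or "\r\n" if present.
                if buf.ends_with('\n') {
                    buf.pop();
                    if buf.ends_with('\r') {
                        buf.pop();
                    }
                }
                Some(Ok(buf))
            }
            // I/O and UTF-8 errors are yielded, not swallowed.
            Err(e) => Some(Err(e)),
        }
    }
}
```

Nothing is read until next() is called, and the final line is still yielded when the input does not end with a newline.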

Each item the iterator yields is an io::Result<String>. The Result exists because I/O can fail at any point. The disk might disconnect. The bytes might not be valid UTF-8. Returning Result forces you to handle these cases. You cannot accidentally ignore a read error.

Convention aside: lines() strips the trailing \n or \r\n from each line. If you need to preserve newlines, use read_line() instead. lines() is for logical lines; read_line() is for raw text preservation. This distinction catches many beginners. lines() yields io::Result<String> items. read_line() returns io::Result<usize>, the number of bytes read, and appends to a string you provide. The return types signal different use cases.
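The difference is easy to see side by side. This sketch reads the first line of the same text both ways; first_line_both_ways is a hypothetical helper, and a Cursor over a string stands in for a buffered file.

```rust
use std::io::{self, BufRead, Cursor};

/// Returns the first line as produced by lines() and by read_line().
fn first_line_both_ways(text: &str) -> io::Result<(String, String)> {
    // lines(): the newline is stripped from the yielded String.
    let stripped = match Cursor::new(text).lines().next() {
        Some(line) => line?,
        None => String::new(),
    };

    // read_line(): the newline stays in the String,
    // and the return value is the number of bytes appended.
    let mut reader = Cursor::new(text);
    let mut raw = String::new();
    let bytes_read = reader.read_line(&mut raw)?;
    debug_assert_eq!(bytes_read, raw.len());

    Ok((stripped, raw))
}
```

For the input "alpha\nbeta\n", the first element is "alpha" and the second is "alpha\n".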

Realistic example: Log parser

Real code often needs more than printing. You might want to filter lines, extract data, or report line numbers. The iterator pattern composes well with functional adapters. You can chain filter, map, and enumerate to build complex pipelines.

use std::fs::File;
use std::io::{BufRead, BufReader};

/// Extracts lines containing a keyword from a file.
/// Returns each matching line formatted as "Line N: text", with N counted from 1.
fn find_lines(filename: &str, keyword: &str) -> std::io::Result<Vec<String>> {
    let file = File::open(filename)?;
    let reader = BufReader::new(file);
    let mut matches = Vec::new();

    // enumerate() pairs each item with a zero-based index,
    // which we convert to a one-based line number below.
    // The iterator yields (index, io::Result<String>).
    for (idx, line_result) in reader.lines().enumerate() {
        // Handle I/O errors per line.
        // This allows the parser to skip bad lines if desired,
        // though here we propagate the error to stop processing.
        let line = line_result?;

        // Check if the line contains the keyword.
        if line.contains(keyword) {
            matches.push(format!("Line {}: {}", idx + 1, line));
        }
    }

    Ok(matches)
}

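This function demonstrates error handling in a loop. The ? operator inside the loop causes the function to return early if any line fails to read. This is the safe default. If you need to tolerate errors, you can match on line_result and skip bad lines. Be selective about what you skip: an invalid UTF-8 line is consumed and the next call moves on, but some I/O errors repeat on every call, so a blanket if let Ok(line) = line_result can loop forever. The choice depends on your requirements. Strict parsers fail fast. Lenient parsers skip and continue.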
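A lenient variant might look like the sketch below. It is an illustration under stated assumptions, not a drop-in replacement: find_lines_lenient is a hypothetical name, it takes any BufRead instead of a filename so it can be exercised without a file, and it chooses to skip only InvalidData errors while still propagating everything else.

```rust
use std::io::{self, BufRead, ErrorKind};

/// Like find_lines, but skips lines that are not valid UTF-8.
/// Returns the matches and the number of skipped lines.
fn find_lines_lenient<R: BufRead>(reader: R, keyword: &str) -> io::Result<(Vec<String>, usize)> {
    let mut matches = Vec::new();
    let mut skipped = 0;

    for (idx, line_result) in reader.lines().enumerate() {
        match line_result {
            Ok(line) => {
                if line.contains(keyword) {
                    matches.push(format!("Line {}: {}", idx + 1, line));
                }
            }
            // The bad line has already been consumed, so it is safe to continue.
            Err(e) if e.kind() == ErrorKind::InvalidData => skipped += 1,
            // Any other error may repeat forever; stop instead of spinning.
            Err(e) => return Err(e),
        }
    }

    Ok((matches, skipped))
}
```

Skipped lines still count toward the line numbers, so the reported numbers keep matching what an editor would show.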

Pitfalls and compiler errors

Rust's type system prevents common mistakes, but it requires you to understand the traits and ownership rules.

Missing the buffer

If you try to call lines() directly on a File, the compiler rejects you. File implements Read, not BufRead. The method lines() does not exist on File.

error[E0599]: no method named `lines` found for struct `std::fs::File`

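This error, E0599, tells you that the type you have does not support the operation. The fix is to wrap the file in BufReader. The same error appears on a BufReader if you forget to import std::io::BufRead, because trait methods are only callable when the trait is in scope. The type system steers you toward buffering because unbuffered line reading is inefficient. Don't fight the compiler here. Wrap the file.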

Moved value

BufReader::new(file) takes ownership of the file. You cannot use the file variable after passing it to the buffer.

error[E0382]: use of moved value: `file`

This error, E0382, appears if you try to access file after creating the reader. The buffer owns the file handle. When the buffer drops, it drops the file, closing the handle. This prevents double-closing and ensures the file stays open while buffered. Let the buffer own the file. It manages the lifecycle correctly.
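If you do need the underlying handle back, BufReader::into_inner() returns it. The sketch below uses a hypothetical helper, first_line_then_release, and works with any Read source. The caveat it illustrates: bytes that were already pulled into the buffer but not yet consumed are discarded.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Reads one line through a BufReader, then hands the source back.
fn first_line_then_release<R: Read>(source: R) -> io::Result<(String, R)> {
    // The reader takes ownership of `source` here.
    let mut reader = BufReader::new(source);

    let mut first = String::new();
    reader.read_line(&mut first)?;

    // into_inner() consumes the reader and returns the source.
    // Anything still sitting in the buffer is lost.
    Ok((first, reader.into_inner()))
}
```

With a small input, the first refill pulls in everything, so the returned source is already at its end even though only one line was used. Another option, when you only need to keep the variable usable, is to pass a reference: Read is also implemented for &File, so BufReader::new(&file) borrows the file instead of moving it.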

UTF-8 validation

lines() assumes the file contains valid UTF-8. Rust String values are guaranteed to be valid UTF-8. If the file contains invalid byte sequences, lines() returns an error.

io::Error { kind: InvalidData, message: "stream did not contain valid UTF-8" }

This happens often with legacy files or binary data. If you read a binary file line by line, the iterator will error on the first line that contains an invalid sequence. Check your encoding before assuming lines() works. For non-UTF-8 files, either work on raw bytes with the standard library's BufRead::read_until or BufRead::split, or decode a known legacy encoding with a crate like encoding_rs.
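When the encoding is unknown and replacing bad bytes is acceptable, one option is to read raw bytes and convert lossily. The sketch below is one possible approach, not the only one: lossy_lines is a hypothetical helper, and it collects into a Vec only to keep the example short. A real tool would process each line inside the loop to keep memory flat.

```rust
use std::io::{self, BufRead};

/// Reads lines as raw bytes and replaces invalid UTF-8 with U+FFFD.
fn lossy_lines<R: BufRead>(mut reader: R) -> io::Result<Vec<String>> {
    let mut out = Vec::new();
    let mut buf = Vec::new();

    loop {
        // read_until appends, so clear the buffer before each line.
        buf.clear();
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break; // End of input.
        }

        // Strip one trailing "\n" or "\r\n", mirroring lines().
        if buf.ends_with(b"\n") {
            buf.pop();
            if buf.ends_with(b"\r") {
                buf.pop();
            }
        }

        out.push(String::from_utf8_lossy(&buf).into_owned());
    }

    Ok(out)
}
```

This never fails on encoding, at the price of losing the original bytes wherever a replacement character appears.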

The `read_line` append trap

If you switch to read_line() for performance, remember that it appends to the string. It does not clear the string first.

let mut line = String::new();
while reader.read_line(&mut line)? > 0 {
    // line accumulates content if you don't clear it.
    // This is a common bug.
}

If you reuse the string across iterations without clearing it, the content grows. Each iteration then sees every previous line plus the new one. You must call line.clear() at the start of each iteration or allocate a new string. lines() handles this for you by allocating a fresh String per line.
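A corrected loop clears the string before every read. The sketch below uses a hypothetical helper, longest_line_len, that reuses one String for the whole input and reports the longest line in bytes.

```rust
use std::io::{self, BufRead};

/// Length in bytes of the longest line, not counting line endings.
fn longest_line_len<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut line = String::new();
    let mut longest = 0;

    loop {
        // Reset the contents but keep the allocation for reuse.
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            break; // End of input.
        }

        // read_line keeps the line ending, so trim it before measuring.
        let trimmed = line.trim_end_matches(|c| c == '\n' || c == '\r');
        longest = longest.max(trimmed.len());
    }

    Ok(longest)
}
```

Because clear() keeps the capacity, the String grows to the size of the longest line once and is then reused without further allocation.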

Decision matrix

Choose the right tool based on your needs. The trade-offs involve memory, performance, and ergonomics.

Use lines() when you want clean strings without newline characters and the file is valid UTF-8. It provides the best ergonomics for text processing and handles allocation automatically.

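Use read_line() when you need to preserve newline characters or want to reuse one String across lines. It gives you control over the buffer and avoids a per-line allocation if you clear and reuse the string. Like lines(), it requires valid UTF-8; for binary or non-UTF-8 data, use read_until() with a Vec<u8> instead.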

Use read_to_string() when the file is small, fits in memory easily, and you need random access to the content. It loads everything at once, which simplifies indexing and searching.

Use BufReader when you are reading from any Read source and want to avoid the performance penalty of unbuffered I/O. It is essential for network streams, pipes, and large files.

Allocation is cheap until it isn't. Profile before optimizing lines(). For most applications, the allocation overhead is negligible compared to disk I/O. Only switch to read_line() with a reused string if profiling shows allocation is the bottleneck.

Where to go next

Reading a file line by line processes the file one line at a time instead of loading the whole thing into memory. It is like reading a book page by page rather than trying to hold the entire book in your hands at once. Use it for large text files, where it keeps memory usage flat and lets you start working before the whole file has been read.