How to Get a Substring (Slice) from a String in Rust

Rust strings are UTF-8, so you cannot index a `String` or `&str` with a single integer (`text[0]` does not compile). You can slice with a byte range such as `&text[0..5]`, but the bounds are byte offsets, not character positions, and both must fall on character boundaries. Slicing at an offset inside a multi-byte character panics at runtime, so use `char_indices()` to find valid byte boundaries, or use `split_at` with a byte offset you already know is valid.

Here is the safe, idiomatic way to extract a substring by character count using char_indices():

fn main() {
    let text = "Hello, 世界"; // Contains multi-byte characters

    // Find the byte offset where the character at index 7 (the 8th character) starts.
    // If the string has fewer characters, fall back to text.len(),
    // which produces an empty slice instead of the whole string.
    let byte_index = text
        .char_indices()
        .nth(7)
        .map(|(i, _)| i)
        .unwrap_or(text.len());

    // Safe slicing using the calculated byte index
    let substring = &text[byte_index..];
    println!("Substring: {}", substring); // Output: "Substring: 世界"
}
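
The same idea extends to a character range with both a start and an end. The following is a minimal sketch, not a standard library function: `substring_by_chars` is a hypothetical helper name, and clamping out-of-range positions to the end of the string is an assumption made here for illustration.

```rust
/// Returns the substring covering characters `start..end`, counted in `char`s
/// rather than bytes. Positions past the end are clamped to the end of the string.
/// Hypothetical helper for illustration; not part of the standard library.
fn substring_by_chars(s: &str, start: usize, end: usize) -> &str {
    // Translate a character position into the byte offset where that character starts.
    let byte_offset = |char_pos: usize| {
        s.char_indices()
            .nth(char_pos)
            .map(|(i, _)| i)
            .unwrap_or(s.len())
    };
    let from = byte_offset(start);
    // Guard against end < start so the byte range is never reversed.
    let to = byte_offset(end.max(start));
    &s[from..to]
}

fn main() {
    let text = "Hello, 世界";
    println!("{}", substring_by_chars(text, 7, 9)); // the last two characters
    println!("{}", substring_by_chars(text, 0, 5)); // the first five characters
}
```

Each call walks the string from the start, so this costs time proportional to the character positions involved. That is usually fine for short strings, but for repeated slicing of long text it may be worth computing the byte offsets once and reusing them.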

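If you already know the exact byte offset (for example, from a previous calculation or because the text is known to be ASCII-only), you can use `split_at`, which returns the two halves as a tuple. `split_at` is no safer than direct indexing: it does not return an `Option`, and it panics if the offset is past the end of the string or not on a character boundary. To handle an invalid offset without panicking, use `get` with a range, which returns an `Option`, or `split_at_checked` (available since Rust 1.80). The example below uses `split_at` with a known-valid offset: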

fn main() {
    let text = "Rust is great";
    
    // Split at byte index 4 (after "Rust")
    // This will panic if 4 is not a valid UTF-8 boundary
    let (prefix, suffix) = text.split_at(4);
    
    println!("Prefix: {}", prefix); // "Rust"
    println!("Suffix: {}", suffix); // " is great"
    
    // To get just the substring without the prefix:
    let sub = &text[4..];
    println!("Direct slice: {}", sub);
}
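
When the byte offset comes from somewhere you do not control, a non-panicking variant may be preferable. The sketch below relies on `str::get`, which returns `None` instead of panicking. The wrapper name `tail_from_byte` is hypothetical and exists only to make the behavior easy to see.

```rust
/// Hypothetical helper: returns the part of `s` from byte offset `start` onward,
/// or `None` if `start` is past the end or falls inside a multi-byte character.
fn tail_from_byte(s: &str, start: usize) -> Option<&str> {
    s.get(start..)
}

fn main() {
    let text = "héllo"; // 'é' occupies bytes 1 and 2

    for offset in [1, 2, 3, 10] {
        match tail_from_byte(text, offset) {
            Some(tail) => println!("offset {offset}: {tail:?}"),
            None => println!("offset {offset}: not a valid slice position"),
        }
    }
}
```

Here offsets 1 and 3 are character boundaries and yield a slice, while offset 2 (inside `é`) and offset 10 (past the end) yield `None`.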

Remember that string indices are byte offsets, not character counts. Iterating with `.chars()` counts characters (Unicode scalar values), but `String` and `&str` store their data as UTF-8 bytes, where a single character takes one to four bytes. Slicing requires that the start and end indices fall exactly on character boundaries. If you slice in the middle of a multi-byte character (like the Chinese characters in the first example), Rust panics rather than produce invalid UTF-8. Always calculate byte offsets carefully when working with user input or variable-length text.
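
To make the byte-versus-character distinction concrete, here is a small sketch that compares the two counts and checks which offsets are valid slice positions. The helper name `byte_and_char_len` is hypothetical; `len`, `chars`, and `is_char_boundary` are standard `str` methods.

```rust
/// Hypothetical helper: returns (length in bytes, length in chars) for `s`.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let text = "Hello, 世界";

    // len() counts bytes; chars().count() counts Unicode scalar values.
    let (bytes, chars) = byte_and_char_len(text);
    println!("bytes: {bytes}, chars: {chars}");

    // Each of the two CJK characters takes three bytes,
    // so only some byte offsets are character boundaries.
    for offset in 7..=10 {
        println!("offset {offset} is a boundary: {}", text.is_char_boundary(offset));
    }
}
```

Checking `is_char_boundary` before slicing is one way to validate an offset you did not compute yourself.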