How to Call Python from Rust for Data Science (PyO3)

Use the PyO3 crate to initialize the Python interpreter and execute Python code directly within a Rust application.

When Rust needs Python's ecosystem

You spent weeks optimizing a data pipeline in Rust. It's fast, safe, and handles memory like a champ. Then you hit a wall. The final step requires a specific machine learning model that only exists in Python. Rewriting the model in Rust takes months. Copying data back and forth via files is slow and messy. You need to call Python directly from your Rust binary, passing data structures seamlessly and getting results back without the overhead of spawning a subprocess.

PyO3 is the bridge. It embeds the Python interpreter inside your Rust process. You don't launch a separate python executable. The interpreter lives in your memory. This means you can pass Rust types to Python and get Python objects back, all in-process. You skip process startup and pipe or file I/O on every call, though values are still converted between Rust and Python representations at the boundary.

The bridge: embedding the interpreter

PyO3 works by wrapping the Python C API. Python exposes a C interface for everything. PyO3 builds safe Rust abstractions on top of those C calls. You get Rust's type safety and error handling, but under the hood, you're talking to the Python runtime.

The biggest mental shift is the Global Interpreter Lock. Python protects its internal state with a single lock. Only one thread can execute Python bytecode at a time. Rust's concurrency model assumes you can run threads freely. PyO3 bridges this gap by forcing you to acquire the GIL explicitly before touching any Python object. You hold the key, you do the work, you release the key.

Trust the GIL guard. It's your safety net against Python's internal chaos.

Minimal example: running a script

Here's the simplest way to run Python code from Rust. You add pyo3 to your dependencies and use the auto-initialize feature to handle interpreter startup.

use pyo3::prelude::*;

/// Runs a Python expression and prints the result.
fn main() -> PyResult<()> {
    // Python::with_gil acquires the Global Interpreter Lock.
    // You cannot touch Python objects without holding this lock.
    // The closure receives a `py` token that proves you hold the lock.
    Python::with_gil(|py| {
        // Create a dictionary to hold local variables for the script.
        // This avoids polluting the global namespace and keeps execution isolated.
        let locals = pyo3::types::PyDict::new(py);

        // Run a Python statement.
        // The `?` operator propagates Python exceptions as Rust errors.
        // If `import math` fails, the function returns early with an error.
        py.run("import math", None, Some(locals))?;

        // Evaluate an expression and extract the result.
        // `eval` returns a Python object. `extract` converts it to a Rust type.
        // If the result isn't a float, `extract` fails with a type error.
        let result = py.eval("math.sqrt(2)", None, Some(locals))?;
        let value: f64 = result.extract()?;

        println!("Square root of 2: {}", value);
        Ok(())
    })
}

# Cargo.toml
[dependencies]
# auto-initialize starts the interpreter the first time you acquire the GIL.
# Without it, you'd call pyo3::prepare_freethreaded_python() yourself first.
# This feature is convenient for simple binaries and tests.
pyo3 = { version = "0.20", features = ["auto-initialize"] }

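Convention aside: Pass a dedicated locals dictionary when you run snippets, and pass None for globals if you don't need them. With None for globals, PyO3 falls back to the dictionary of Python's __main__ module, and any names the script assigns land in your locals dictionary instead. Using a fresh locals dictionary per run prevents accidental variable leakage between script executions.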

How the GIL guard works

The Python::with_gil closure is the heart of the interaction. The py argument is a guard token. As long as py exists, you hold the GIL. Any Python object you create must borrow from py. This ties the lifetime of the Python object to the lock.

If you try to return a GIL-bound reference such as &PyAny out of the closure, the compiler stops you. You'll see a lifetime error because the reference is tied to py, which is local to the closure. You can't hold a borrowed Python reference after releasing the lock. This prevents use-after-free bugs where Python frees an object while Rust still thinks it's valid.

The extract method is the reverse bridge. It takes a Python object and tries to convert it to a Rust type. If the conversion fails, it returns an error. This keeps Python exceptions in the Rust error domain. You handle them with match or ?, just like any other Rust error.

Don't fight the lifetime system. Convert to Rust types before the lock drops.
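
To see why the compiler can enforce this, here is a toy model of the token pattern built only on the standard library. It is a sketch, not PyO3's implementation: LOCK, Token, Borrowed, with_lock, and shout are hypothetical names invented for illustration. The one detail borrowed from PyO3 is the shape of the closure bound, which matches the documented signature of Python::with_gil.

```rust
use std::marker::PhantomData;
use std::sync::Mutex;

// Stand-in for the interpreter lock. This is a toy model, not PyO3 code.
// Unlike PyO3's with_gil, this lock is not reentrant, so don't nest calls.
static LOCK: Mutex<()> = Mutex::new(());

/// Zero-sized proof that the lock is held for the lifetime 'py.
#[derive(Clone, Copy)]
pub struct Token<'py>(PhantomData<&'py ()>);

/// A value that may only be used while the lock is held.
pub struct Borrowed<'py> {
    text: String,
    _token: Token<'py>,
}

impl<'py> Token<'py> {
    /// Creates a lock-bound value, standing in for a Python object.
    pub fn make(self, text: &str) -> Borrowed<'py> {
        Borrowed { text: text.to_owned(), _token: self }
    }
}

impl<'py> Borrowed<'py> {
    /// Copies the data into an owned Rust value that can outlive the lock.
    pub fn extract(&self) -> String {
        self.text.clone()
    }
}

/// Runs `f` with the lock held. Because the bound is `for<'py>`, the
/// return type `R` cannot mention 'py, so no `Borrowed` can escape.
pub fn with_lock<F, R>(f: F) -> R
where
    F: for<'py> FnOnce(Token<'py>) -> R,
{
    let _guard = LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    f(Token(PhantomData))
}

/// Converts to an owned `String` before the closure returns.
pub fn shout(input: &str) -> String {
    with_lock(|token| token.make(input).extract().to_uppercase())
}
```

Calling shout("hello") returns "HELLO". If you change the closure to return token.make(input) directly, the model stops compiling with a lifetime error, which is the same class of error you get when a GIL-bound reference tries to leave with_gil.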

Realistic example: calling functions with arguments

Real code rarely just runs strings. You usually call specific functions with structured data. Here's how to import a module, create a list, and call a method.

use pyo3::prelude::*;
use pyo3::types::PyList;

/// Calls json.dumps on a Rust vector and prints the JSON string.
fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        // Import a module by name.
        // This mirrors `import json` in Python.
        // `import` returns a Python object representing the module.
        let json = py.import("json")?;

        // Create a Rust vector and build a Python list from it.
        // `PyList::new` copies each element into a new Python object.
        // In PyO3 0.20 it returns a GIL-bound `&PyList`.
        let data = vec![1, 2, 3, 4, 5];
        let py_list: &PyList = PyList::new(py, &data);

        // Call a method on the module.
        // `call_method` takes the method name and arguments as a tuple.
        // Even a single argument needs parentheses: `(arg,)`.
        let json_str = json.call_method("dumps", (py_list,), None)?;

        // Extract the result back to a Rust String.
        // Python strings convert directly to Rust Strings.
        let result: String = json_str.extract()?;
        println!("JSON: {}", result);
        Ok(())
    })
}

Convention aside: When calling Python methods, positional arguments are passed as a Rust tuple. The trailing comma is mandatory for single-element tuples in Rust; without it, (py_list) is just py_list in parentheses. The compiler rejects call_method("dumps", py_list, None) with an unsatisfied trait bound error, because a list reference does not convert into an argument tuple. Always use (py_list,). If you have no keyword arguments, call_method1("dumps", (py_list,)) saves you the trailing None.

Surviving the lock: Py vs references

Python objects have reference counts. Rust references don't. When you hold a &PyAny, you're borrowing the object. If the GIL drops, the borrow is invalid. PyO3 provides Py<T> to solve this. Py<T> holds a reference-counted pointer. You can pass it across threads or store it in a struct.

When you need to use the object again, you acquire the GIL and call as_ref(py) on the handle (this is the PyO3 0.20 name). This pattern lets you cache Python objects without holding the lock forever. It's essential for long-running applications where you don't want to re-import modules or recreate objects every time you need them.

use pyo3::prelude::*;
use pyo3::types::PyModule;

/// Caches a Python module and reuses it across calls.
fn main() -> PyResult<()> {
    // Import the module once and return an owned handle from the closure.
    let cached_json: Py<PyModule> = Python::with_gil(|py| -> PyResult<Py<PyModule>> {
        let json = py.import("json")?;

        // Convert the GIL-bound `&PyModule` into `Py<PyModule>`.
        // This takes a new strong reference that is not tied to `py`.
        // You can store it in a struct or return it from a function.
        Ok(json.into())
    })?;

    // The closure has ended, so the GIL is released here.
    // cached_json is still valid because it owns a reference.

    // Re-acquire the lock to use the cached object.
    Python::with_gil(|py| {
        // Borrow the module through the cached handle.
        // This doesn't re-import the module; it just borrows the existing object.
        let json = cached_json.as_ref(py);

        // Use the module as normal.
        let result = json.call_method("dumps", ("hello",), None)?;
        println!("{}", result.extract::<String>()?);
        Ok(())
    })
}

Convention aside: Caching Python modules or singletons is common. PyO3 provides pyo3::sync::GILOnceCell for this. You call get_or_init with the py token; the first call runs the initializer and later calls return the cached value. Be careful with lazy_static or OnceLock for Python objects: they block other threads during initialization, which can deadlock against the GIL. GILOnceCell relies on the GIL itself to mediate access, so initialization always happens with the lock held.

Pitfalls and compiler errors

The compiler will reject code that tries to escape the GIL scope. You'll see errors about lifetimes if you try to return a &PyAny from with_gil. The lifetime of the reference is tied to the py token, which is local to the closure. You can't return it. Solution: Convert to a Rust type using extract before leaving the scope, or use Py<PyAny> which is a reference-counted handle that survives the lock release.

Another pitfall is the auto-initialize feature. It works for simple cases, but if you're embedding in a larger application or sharing the process with other code that manages Python, you may need to control initialization yourself. The feature hides the interpreter setup that the C API exposes as Py_Initialize and Py_Finalize. If you see crashes on shutdown or conflicts with other libraries, drop the feature and call pyo3::prepare_freethreaded_python() at a known point during startup.

Python exceptions become Rust errors. If a Python function raises an exception, PyO3 catches it and returns a PyErr. You can inspect the error type and message. Don't ignore PyErr. It carries the exception's type, value, and traceback. If you need fine-grained control, test for a specific exception type with is_instance_of, for example err.is_instance_of::<pyo3::exceptions::PyRuntimeError>(py).

Treat the GIL as a resource. Acquire it only when necessary, and release it as soon as you're done.

Decision: PyO3 vs alternatives

Use PyO3 when you need to call existing Python libraries from Rust without spawning a subprocess. Use PyO3 as well when you are building a Rust extension for Python and want to expose Rust functions to Python scripts. Use a subprocess when you only need to run a Python script occasionally and don't care about the overhead of process creation. Use a serialized format such as JSON (via serde_json) or Protocol Buffers when Rust and Python can run as separate processes and you want to keep the GIL and the interpreter out of your Rust binary entirely.
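
For comparison, here is a minimal sketch of the subprocess route using only the standard library. The helper names python_command and run_python are hypothetical, and "python3" is an assumed executable name that may differ on your platform. Every call pays for a new process and exchanges data as text, which is the overhead PyO3 avoids.

```rust
use std::io;
use std::process::Command;

/// Builds the command without running it. "python3" is an assumed
/// executable name; adjust it for your platform.
pub fn python_command(code: &str) -> Command {
    let mut cmd = Command::new("python3");
    cmd.arg("-c").arg(code);
    cmd
}

/// Runs a one-off Python snippet in a child process and returns its stdout.
pub fn run_python(code: &str) -> io::Result<String> {
    let output = python_command(code).output()?;
    if !output.status.success() {
        // Surface the child's stderr as the error message.
        let stderr = String::from_utf8_lossy(&output.stderr).into_owned();
        return Err(io::Error::new(io::ErrorKind::Other, stderr));
    }
    Ok(String::from_utf8_lossy(&output.stdout).trim_end().to_owned())
}
```

With python3 on your PATH, run_python("print(1 + 1)") should come back as Ok("2"). There is no GIL to manage here, but there are also no shared objects: everything crosses the boundary as bytes.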

PyO3 is the bridge. Build it carefully, or don't cross it.

Where to go next