How to Build a REST API for ML Model Serving in Rust

Use the axum web framework with the tokio async runtime to build a high-performance, asynchronous REST API in Rust that loads your ML model and exposes a prediction endpoint.

```toml
[dependencies]
axum = "0.7"
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
```
```rust
use axum::{routing::post, Json, Router};
use serde::{Deserialize, Serialize};
use std::net::SocketAddr;

/// Request body: a flat vector of input features for the model.
#[derive(Deserialize)]
struct InputData { features: Vec<f32> }

/// Response body: the model's prediction.
#[derive(Serialize)]
struct Prediction { result: f32 }

async fn predict(Json(input): Json<InputData>) -> Json<Prediction> {
    // Load your model and run inference on `input.features` here.
    // This placeholder ignores the input and returns a fixed score.
    let _ = input.features;
    Json(Prediction { result: 0.95 })
}

#[tokio::main]
async fn main() {
    // Route POST /predict to the handler above.
    let app = Router::new().route("/predict", post(predict));
    let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
    println!("Listening on {}", addr);
    let listener = tokio::net::TcpListener::bind(addr).await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```
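The handler above stubs out the inference step. As a minimal sketch of what that step could look like, assuming a simple linear model with hypothetical hard-coded weights (real serving would deserialize trained weights from disk instead):

```rust
/// A hypothetical linear model: prediction = dot(weights, features) + bias.
struct LinearModel {
    weights: Vec<f32>,
    bias: f32,
}

impl LinearModel {
    /// Compute the dot product of weights and features, plus the bias term.
    fn predict(&self, features: &[f32]) -> f32 {
        self.weights
            .iter()
            .zip(features)
            .map(|(w, x)| w * x)
            .sum::<f32>()
            + self.bias
    }
}

fn main() {
    // Placeholder weights; a real service would load these once at startup
    // and share them across handlers (e.g. via axum's State extractor).
    let model = LinearModel { weights: vec![0.2, 0.5, 0.3], bias: 0.1 };
    let score = model.predict(&[1.0, 2.0, 3.0]);
    println!("prediction: {}", score);
}
```

Loading the model once at startup and sharing it across requests avoids re-reading weights on every call, which matters under concurrent load.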
  1. Create a new Rust project and add the dependencies listed in the first code block to your Cargo.toml.
  2. Replace the contents of src/main.rs with the server code provided in the second code block.
  3. Run the server with cargo run, then send a POST request to http://127.0.0.1:3000/predict with a JSON body such as {"features": [1.0, 2.0, 3.0]}.
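With the server running, the endpoint can be exercised with curl; the feature values here are arbitrary:

```shell
# POST a JSON feature vector to the prediction endpoint.
curl -s -X POST http://127.0.0.1:3000/predict \
  -H "Content-Type: application/json" \
  -d '{"features": [1.0, 2.0, 3.0]}'
# The stub handler responds with: {"result":0.95}
```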