
Rust Embedded Programming: Building Firmware from Scratch

Master embedded Rust development. Build real-world firmware with no_std, embedded-hal, and hardware abstraction for microcontrollers.

By Luis Soares · March 3, 2026 · Originally published on Medium

Embedded systems power the modern world—from the microcontroller in your smart thermostat to the flight control systems in aircraft. As these systems demand higher reliability and memory safety, Rust embedded programming has emerged as a compelling alternative to traditional C/C++ development. Rust's zero-cost abstractions, memory safety guarantees, and growing ecosystem make it an ideal choice for building robust firmware that can run on resource-constrained devices without sacrificing performance or safety.

Unlike traditional systems programming languages, Rust prevents entire classes of bugs at compile time, including buffer overflows, null pointer dereferences, and data races—critical issues that can be catastrophic in embedded contexts. This article explores how to leverage Rust's unique features for embedded development, from bare-metal programming to building complete firmware solutions.

Understanding the Rust Embedded Ecosystem

The Rust embedded ecosystem has matured significantly, offering a comprehensive toolkit for developing firmware across various microcontroller families. At its core, the ecosystem is built around several key components that work together to provide a seamless development experience.

The Embedded Working Group maintains official crates that form the foundation of Rust embedded development. The embedded-hal crate defines hardware abstraction layer traits that allow code to be portable across different microcontroller families. This abstraction means you can write driver code once and use it across ARM Cortex-M, RISC-V, and other architectures.

use embedded_hal::digital::v2::OutputPin;
use embedded_hal::timer::CountDown;
use nb::block;

// Generic LED blinker that works with any HAL implementation.
// The timer is assumed to have already been started in periodic mode.
pub fn blink_led<LED, TIMER>(
    mut led: LED,
    mut timer: TIMER,
) -> !
where
    LED: OutputPin,
    TIMER: CountDown,
{
    loop {
        led.set_high().ok();
        block!(timer.wait()).ok();
        led.set_low().ok();
        block!(timer.wait()).ok();
    }
}

This example demonstrates the power of Rust's trait system in embedded contexts. The blink_led function is generic over any type that implements the OutputPin and CountDown traits, making it reusable across different hardware platforms.
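A side benefit of this trait-generic style is that driver logic can be unit-tested on the host with mock implementations. The sketch below uses a local stand-in for the OutputPin trait (mirroring the shape of embedded-hal 0.2, not the real crate) plus a mock pin that records every state change; `pulse` is a hypothetical driver function written only against the trait:

```rust
// Local stand-in for embedded-hal's OutputPin trait (illustrative only)
trait OutputPin {
    type Error;
    fn set_high(&mut self) -> Result<(), Self::Error>;
    fn set_low(&mut self) -> Result<(), Self::Error>;
}

// Mock pin that records every state transition for later inspection
struct MockPin {
    states: Vec<bool>,
}

impl OutputPin for MockPin {
    type Error = ();
    fn set_high(&mut self) -> Result<(), ()> {
        self.states.push(true);
        Ok(())
    }
    fn set_low(&mut self) -> Result<(), ()> {
        self.states.push(false);
        Ok(())
    }
}

// Driver logic written only against the trait, like blink_led above
fn pulse<P: OutputPin>(pin: &mut P, times: usize) -> Result<(), P::Error> {
    for _ in 0..times {
        pin.set_high()?;
        pin.set_low()?;
    }
    Ok(())
}

fn main() {
    let mut pin = MockPin { states: Vec::new() };
    pulse(&mut pin, 2).unwrap();
    // Two pulses produce four transitions: high, low, high, low
    assert_eq!(pin.states, vec![true, false, true, false]);
    println!("{:?}", pin.states);
}
```

On target hardware the same `pulse` function would be handed a real GPIO pin; on the host it gets the mock, and the logic is verified without any hardware in the loop.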

Memory Management Without a Heap

One of the most significant advantages of Rust embedded programming is the ability to write safe code without requiring heap allocation. Most embedded systems operate with limited RAM and no memory management unit, making heap allocation either impossible or undesirable due to fragmentation concerns.

Rust's ownership system and compile-time memory management eliminate the need for garbage collection or manual memory management. The heapless crate provides data structures that work entirely on the stack or in static memory:

use core::mem::MaybeUninit;
use heapless::pool::{Node, Pool};
use heapless::{String, Vec};

fn heapless_demo() {
    // Stack-allocated vector with compile-time capacity
    let mut buffer: Vec<u8, 32> = Vec::new();
    buffer.push(0x42).unwrap();

    // Fixed-capacity string
    let mut message: String<64> = String::new();
    message.push_str("Sensor reading: ").unwrap();

    // Memory pool for fixed-size blocks without heap fragmentation
    // (heapless 0.7 `pool` API; only available on targets with the
    // required atomic support)
    static mut MEMORY: MaybeUninit<[Node<[u8; 64]>; 16]> = MaybeUninit::uninit();
    let pool: Pool<[u8; 64]> = Pool::new();
    unsafe { pool.grow_exact(&mut MEMORY) };

    // Acquire a block from the pool, initialize it, and use it
    if let Some(block) = pool.alloc() {
        let mut block = block.init([0u8; 64]);
        block[0] = 0xFF;
        // Return the block to the pool explicitly when done
        pool.free(block);
    }
}

Setting Up Your First Embedded Rust Project

Getting started with embedded Rust development requires specific tooling and configuration. The process differs from standard Rust development due to the cross-compilation requirements and the need to generate firmware binaries for specific microcontroller architectures.

Toolchain Configuration

First, you'll need to install the appropriate target for your microcontroller. For ARM Cortex-M devices, which represent a large portion of the embedded market:

rustup target add thumbv7em-none-eabihf  # For ARM Cortex-M4F/M7F
rustup target add thumbv6m-none-eabi     # For ARM Cortex-M0/M0+
rustup target add thumbv7m-none-eabi     # For ARM Cortex-M3

A typical Cargo.toml for an embedded project includes specific dependencies and configuration:

[package]
name = "firmware-example"
version = "0.1.0"
edition = "2021"

[dependencies]
cortex-m = "0.7"
cortex-m-rt = "0.7"
panic-halt = "0.2"
nb = "1.0"
embedded-hal = "0.2"

# Device-specific HAL (example for STM32F4xx)
stm32f4xx-hal = { version = "0.19", features = ["rt", "stm32f401"] }

[profile.release]
debug = true        # Enable debug symbols
lto = true         # Link-time optimization
codegen-units = 1  # Better optimization
panic = "abort"    # Don't unwind on panic

The cortex-m-rt crate provides the runtime and startup code, while device-specific HAL crates offer safe abstractions over peripheral registers. The panic-halt crate defines panic behavior—in this case, halting execution rather than attempting to unwind.
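Putting these pieces together, a minimal main.rs for a Cortex-M target looks like the following sketch (written against cortex-m-rt 0.7 and panic-halt; it cross-compiles for the thumbv* targets rather than running on a host):

```rust
#![no_std]  // No standard library: only `core` is available
#![no_main] // No standard `main`; cortex-m-rt provides the entry point

use cortex_m_rt::entry;
use panic_halt as _; // Panic handler: halt the core on panic

#[entry]
fn main() -> ! {
    // Clock, GPIO, and peripheral initialization would go here

    loop {
        // Application main loop; firmware never returns
    }
}
```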

Memory Layout and Linker Scripts

Embedded devices have specific memory layouts that must be defined for the linker. A typical memory.x file for an STM32F401 microcontroller:

MEMORY
{
  FLASH : ORIGIN = 0x08000000, LENGTH = 512K
  RAM : ORIGIN = 0x20000000, LENGTH = 96K
}

This file tells the linker where code and data can be placed in the device's memory map. The build system uses this information along with the runtime crate to generate appropriate startup code and vector tables.
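Alongside memory.x, a .cargo/config.toml usually selects the default target and tells the linker to use cortex-m-rt's link.x script (which in turn includes memory.x). A typical configuration for a Cortex-M4F part might look like this (the runner and chip name are common choices, not requirements):

```toml
[target.thumbv7em-none-eabihf]
# Use cortex-m-rt's linker script, which pulls in memory.x
rustflags = ["-C", "link-arg=-Tlink.x"]
# Optional: flash and run on hardware with probe-rs
runner = "probe-rs run --chip STM32F401RETx"

[build]
target = "thumbv7em-none-eabihf"
```

With this in place, a plain `cargo build --release` produces a firmware ELF for the microcontroller without passing the target on every invocation.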

Building Hardware Drivers in Rust

Creating hardware drivers is a fundamental aspect of Rust embedded programming. Rust's type system allows you to create drivers that are both safe and performant, with many safety checks happening at compile time rather than runtime.

Register-Level Programming

Modern embedded Rust development typically uses generated register access crates created from SVD (System View Description) files. These crates provide type-safe access to peripheral registers:

use stm32f4xx_hal::pac::Peripherals;
use cortex_m;

fn configure_gpio() {
    cortex_m::interrupt::free(|_cs| {
        let dp = Peripherals::take().unwrap();
        
        // Enable GPIOA clock
        dp.RCC.ahb1enr.modify(|_, w| w.gpioaen().enabled());
        
        // Configure PA5 as output (LED on many development boards)
        dp.GPIOA.moder.modify(|_, w| w.moder5().output());
        dp.GPIOA.otyper.modify(|_, w| w.ot5().push_pull());
        dp.GPIOA.ospeedr.modify(|_, w| w.ospeedr5().medium_speed());
        
        // Set the pin high
        dp.GPIOA.odr.modify(|_, w| w.odr5().high());
    });
}

This code demonstrates several important concepts:

  • Critical sections: The interrupt::free function ensures atomic access to registers
  • Type safety: The compiler prevents invalid register configurations
  • Ownership: The take() method ensures only one instance of the peripherals exists
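The take() singleton pattern from the last bullet can be illustrated on the host. A one-shot atomic flag guarantees that at most one Peripherals value ever exists, so two parts of the program can never own the same hardware (this is a simplified sketch of the pattern, not the actual svd2rust output):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Stand-in for a PAC's Peripherals struct; the real one holds the
// device's register blocks
pub struct Peripherals;

static TAKEN: AtomicBool = AtomicBool::new(false);

impl Peripherals {
    /// Returns the singleton on the first call and `None` afterwards
    pub fn take() -> Option<Peripherals> {
        // swap returns the previous value: false only on the first call
        if TAKEN.swap(true, Ordering::SeqCst) {
            None
        } else {
            Some(Peripherals)
        }
    }
}

fn main() {
    let first = Peripherals::take();
    let second = Peripherals::take();
    assert!(first.is_some());  // first caller owns the peripherals
    assert!(second.is_none()); // everyone else is refused
    println!("singleton enforced");
}
```

Because ownership of the returned struct is then tracked by the borrow checker, "two drivers accidentally reconfiguring the same peripheral" becomes a compile-time error rather than a field bug.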

Implementing Custom Drivers

Building on the register access layer, you can create higher-level drivers that implement embedded-hal traits:

use embedded_hal::digital::v2::OutputPin;
use stm32f4xx_hal::pac::GPIOA;

pub struct Led {
    pin_mask: u32,
}

impl Led {
    pub fn new(pin: u8) -> Self {
        Self {
            pin_mask: 1 << pin,
        }
    }
}

impl OutputPin for Led {
    type Error = ();
    
    fn set_low(&mut self) -> Result<(), Self::Error> {
        // Writing the pin's bit to the upper half of BSRR clears it.
        // BSRR writes are atomic, so no read-modify-write race can
        // occur between the main thread and interrupt handlers.
        unsafe {
            (*GPIOA::ptr()).bsrr.write(|w| w.bits(self.pin_mask << 16));
        }
        Ok(())
    }
    
    fn set_high(&mut self) -> Result<(), Self::Error> {
        // Writing the pin's bit to the lower half of BSRR sets it
        unsafe {
            (*GPIOA::ptr()).bsrr.write(|w| w.bits(self.pin_mask));
        }
        Ok(())
    }
}

While this example uses unsafe code for direct register access, the safety is contained within the driver implementation. Users of the driver get a completely safe API that implements standard embedded HAL traits.

Concurrency and Real-Time Systems

Embedded systems often require handling multiple concurrent tasks, from processing sensor data to managing communication protocols. Rust embedded programming offers several approaches to concurrency, each with different trade-offs and use cases.

Interrupt-Driven Programming

The most fundamental form of concurrency in embedded systems involves interrupt handlers. Rust provides safe abstractions for interrupt handling while maintaining zero-cost abstractions:

use cortex_m_rt::entry;
use heapless::spsc::{Consumer, Producer, Queue};
use stm32f4xx_hal::pac::{interrupt, TIM2};

// Statically allocated single-producer, single-consumer queue. The
// producer half is only touched by the interrupt handler and the
// consumer half only by the main loop, so no lock is required.
static mut QUEUE: Queue<u16, 32> = Queue::new();
static mut PRODUCER: Option<Producer<'static, u16, 32>> = None;
static mut CONSUMER: Option<Consumer<'static, u16, 32>> = None;

#[interrupt]
fn TIM2() {
    // This runs in interrupt context
    if let Some(producer) = unsafe { PRODUCER.as_mut() } {
        // Read sensor value (simplified)
        let sensor_value = read_adc();
        let _ = producer.enqueue(sensor_value);
    }
    
    // Clear the timer's update interrupt flag
    unsafe {
        (*TIM2::ptr()).sr.modify(|_, w| w.uif().clear_bit());
    }
}

fn read_adc() -> u16 {
    // Simplified ADC reading
    0x3FF
}

#[entry]
fn main() -> ! {
    // Split the static queue into its producer and consumer endpoints
    let (producer, consumer) = unsafe { QUEUE.split() };
    unsafe {
        PRODUCER = Some(producer);
        CONSUMER = Some(consumer);
    }
    
    // Timer configuration and NVIC unmasking omitted for brevity
    
    loop {
        if let Some(consumer) = unsafe { CONSUMER.as_mut() } {
            if let Some(value) = consumer.dequeue() {
                // Process sensor reading in thread context
                process_sensor_data(value);
            }
        }
    }
}

fn process_sensor_data(_value: u16) {
    // Process the sensor data
}

fn process_sensor_data(value: u16) {
    // Process the sensor data
}

This example demonstrates lock-free communication between interrupt handlers and the main thread using the heapless crate's single-producer, single-consumer queue.
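The reason a single-producer, single-consumer queue needs no lock can be shown with a simplified host-side sketch (this is the idea behind heapless::spsc, not its actual implementation): the producer only ever writes the head index and the consumer only ever writes the tail index, and the two atomics order the slot accesses between them.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// One slot is kept empty to distinguish "full" from "empty"
const CAP: usize = 8;

struct Spsc {
    buf: [UnsafeCell<u16>; CAP],
    head: AtomicUsize, // written only by the producer
    tail: AtomicUsize, // written only by the consumer
}

// Safe to share: each slot is accessed by exactly one side at a time,
// with the atomic indices ordering those accesses
unsafe impl Sync for Spsc {}

impl Spsc {
    fn push(&self, value: u16) -> bool {
        let head = self.head.load(Ordering::Relaxed);
        let next = (head + 1) % CAP;
        if next == self.tail.load(Ordering::Acquire) {
            return false; // queue full
        }
        unsafe { *self.buf[head].get() = value };
        self.head.store(next, Ordering::Release); // publish the slot
        true
    }

    fn pop(&self) -> Option<u16> {
        let tail = self.tail.load(Ordering::Relaxed);
        if tail == self.head.load(Ordering::Acquire) {
            return None; // queue empty
        }
        let value = unsafe { *self.buf[tail].get() };
        self.tail.store((tail + 1) % CAP, Ordering::Release); // free the slot
        Some(value)
    }
}

const EMPTY: UnsafeCell<u16> = UnsafeCell::new(0);
static QUEUE: Spsc = Spsc {
    buf: [EMPTY; CAP],
    head: AtomicUsize::new(0),
    tail: AtomicUsize::new(0),
};

fn main() {
    // "Interrupt" side: produce 100 readings, spinning while full
    let producer = thread::spawn(|| {
        for v in 0..100u16 {
            while !QUEUE.push(v) {}
        }
    });

    // "Main loop" side: drain 100 readings, spinning while empty
    let mut sum: u32 = 0;
    for _ in 0..100 {
        loop {
            if let Some(v) = QUEUE.pop() {
                sum += u32::from(v);
                break;
            }
        }
    }
    producer.join().unwrap();
    assert_eq!(sum, 4950); // 0 + 1 + … + 99
    println!("sum = {sum}");
}
```

On a microcontroller the "producer thread" is the interrupt handler and the "consumer thread" is the main loop, but the index discipline is identical.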

Real-Time Operating System Integration

For more complex applications, you might need a real-time operating system. The rtic (Real-Time Interrupt-driven Concurrency) framework provides a Rust-native approach to building real-time systems:

#[rtic::app(device = stm32f4xx_hal::pac, dispatchers = [EXTI0, EXTI1])]
mod app {
    use systick_monotonic::{ExtU32, Systick};
    // LED driver from the earlier section (assumed to live at the crate root)
    use crate::Led;
    
    #[monotonic(binds = SysTick, default = true)]
    type MonotonicTimer = Systick<1000>; // 1kHz tick rate
    
    #[shared]
    struct Shared {
        counter: u32,
    }
    
    #[local]
    struct Local {
        led: Led,
    }
    
    #[init]
    fn init(ctx: init::Context) -> (Shared, Local, init::Monotonics) {
        let mono = Systick::new(ctx.core.SYST, 84_000_000);
        
        // Schedule the first execution of the periodic task
        periodic_task::spawn_after(1.secs()).ok();
        
        (
            Shared { counter: 0 },
            Local { led: Led::new(5) },
            init::Monotonics(mono),
        )
    }
    
    #[task(shared = [counter], local = [led])]
    fn periodic_task(mut ctx: periodic_task::Context) {
        ctx.shared.counter.lock(|c| *c += 1);
        
        // Toggle LED (assumes a `toggle` helper on the Led driver)
        ctx.local.led.toggle();
        
        // Schedule next execution
        periodic_task::spawn_after(1.secs()).ok();
    }
    
    #[task(shared = [counter], priority = 2)]
    fn high_priority_task(mut ctx: high_priority_task::Context) {
        ctx.shared.counter.lock(|c| {
            // High priority task can preempt periodic_task
            *c += 10;
        });
    }
}

RTIC automatically generates the necessary scheduling code and ensures that shared resources are accessed safely through priority-based locking.

Optimizing Performance in Resource-Constrained Environments

Performance optimization in embedded systems requires careful attention to both time and space efficiency. Rust embedded programming provides several tools and techniques for achieving optimal performance while maintaining safety guarantees.

Compile-Time Optimizations

Rust's ownership system and zero-cost abstractions mean that many high-level constructs compile down to the same assembly code as hand-optimized C. However, specific techniques can further improve performance:

// Use const generics for compile-time configuration
struct RingBuffer<T, const N: usize> {
    buffer: [T; N],
    head: usize,
    tail: usize,
}

impl<T: Copy, const N: usize> RingBuffer<T, N> {
    // `const fn` so the buffer can be initialized in a `static`;
    // `fill` only seeds the backing array and is never observed
    const fn new(fill: T) -> Self {
        Self {
            buffer: [fill; N],
            head: 0,
            tail: 0,
        }
    }
    
    #[inline(always)]
    fn push(&mut self, item: T) -> Result<(), T> {
        // One slot stays empty so "full" is distinguishable from "empty"
        let next_head = (self.head + 1) % N;
        if next_head == self.tail {
            Err(item) // Buffer full
        } else {
            self.buffer[self.head] = item;
            self.head = next_head;
            Ok(())
        }
    }
    
    #[inline(always)]
    fn pop(&mut self) -> Option<T> {
        if self.head == self.tail {
            None // Buffer empty
        } else {
            let item = self.buffer[self.tail];
            self.tail = (self.tail + 1) % N;
            Some(item)
        }
    }
}

// Usage with compile-time size specification
static mut UART_BUFFER: RingBuffer<u8, 256> = RingBuffer::new(0);

The #[inline(always)] attribute ensures that function calls are inlined, eliminating call overhead. Const generics allow the compiler to optimize for specific buffer sizes at compile time.

Memory Layout Optimization

Controlling memory layout can significantly impact performance, especially for frequently accessed data structures:

#[repr(C, packed)]
struct SensorReading {
    timestamp: u32,
    temperature: i16,
    humidity: u16,
    flags: u8,
} // Total size: 9 bytes (no padding)
// Caution: taking a reference to a field of a packed struct is
// undefined behavior; copy fields out by value instead

#[repr(C, align(4))]
struct AlignedBuffer {
    data: [u32; 64], // Aligned for efficient DMA transfers
}

// Place critical data in a specific memory section; the section name
// must also be defined in the linker script
#[link_section = ".ccmram"]
static mut FAST_BUFFER: [u32; 128] = [0; 128];

The #[repr(packed)] attribute minimizes memory usage by eliminating padding, while #[repr(align(n))] ensures proper alignment for DMA or cache efficiency. The #[link_section] attribute allows placing data in specific memory regions like tightly coupled memory (TCM) on ARM Cortex-M processors.
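The padding savings are easy to verify on the host with core::mem. This sketch compares the default C layout against the packed layout for the same four fields used above:

```rust
use core::mem::{align_of, size_of};

#[repr(C)]
struct Padded {
    timestamp: u32,
    temperature: i16,
    humidity: u16,
    flags: u8,
}

#[repr(C, packed)]
struct Packed {
    timestamp: u32,
    temperature: i16,
    humidity: u16,
    flags: u8,
}

fn main() {
    // The default C layout pads the struct to a multiple of its alignment
    assert_eq!(align_of::<Padded>(), 4);
    assert_eq!(size_of::<Padded>(), 12);

    // Packing removes all padding: 4 + 2 + 2 + 1 = 9 bytes
    assert_eq!(align_of::<Packed>(), 1);
    assert_eq!(size_of::<Packed>(), 9);

    println!("padded: {} bytes, packed: {} bytes",
             size_of::<Padded>(), size_of::<Packed>());
}
```

Three bytes per reading may sound trivial, but across a few-thousand-entry log buffer on a device with 96K of RAM it is the difference between fitting and not fitting.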

Key Takeaways

  • Memory Safety Without Runtime Cost: Rust's ownership system prevents memory safety bugs at compile time, eliminating the need for runtime checks or garbage collection in embedded systems
  • Zero-Cost Abstractions: High-level Rust constructs compile down to efficient assembly code, allowing expressive code without performance penalties
  • Hardware Abstraction: The embedded-hal traits provide portable interfaces across different microcontroller families, enabling code reuse and ecosystem development
  • Concurrency Models: From interrupt-driven programming to RTIC-based real-time systems, Rust offers safe concurrency patterns suitable for embedded applications
  • Toolchain Maturity: The embedded Rust ecosystem provides comprehensive tooling, from register access crates to real-time operating systems
  • Performance Optimization: Compile-time optimizations, memory layout control, and inline assembly provide fine-grained performance tuning capabilities

Building Your Embedded Rust Expertise

Rust embedded programming represents a paradigm shift in firmware development, combining the safety and expressiveness of modern language design with the performance requirements of embedded systems. As the ecosystem continues to mature, we're seeing adoption across industries from automotive to aerospace, where safety and reliability are paramount.

The journey from traditional embedded C/C++ to Rust requires learning new concepts like ownership and borrowing, but the investment pays dividends in reduced debugging time and increased confidence in code correctness. Whether you're building IoT devices, industrial control systems, or safety-critical applications, Rust's guarantees around memory safety and thread safety make it an increasingly attractive choice.

Ready to dive deeper into embedded systems programming? Explore our Systems Programming in Rust track for hands-on projects and advanced techniques. Practice your skills in our interactive playground, work through targeted exercises, or browse our comprehensive collection of systems programming articles to continue your embedded Rust journey. The future of safe, efficient firmware development starts here.
