10 Essential Rust Smart Pointer Techniques for Performance-Critical Systems

Discover 10 powerful Rust smart pointer techniques for precise memory management without runtime penalties. Learn custom reference counting, type erasure, and more to build high-performance applications.

Rust's smart pointers make precise memory management possible without runtime penalties. In my experience with performance-critical systems, these techniques have proven invaluable for building efficient applications. Let me share ten essential smart pointer approaches that have transformed how I handle memory in Rust.

Custom Reference Counting

When standard Rc and Arc don’t meet specific performance requirements, custom reference counting provides fine-grained control. This approach suits specialized use cases where every CPU cycle matters.

use std::sync::atomic::{fence, AtomicUsize, Ordering};

struct RefCounted<T> {
    data: *mut RefCountedInner<T>,
}

struct RefCountedInner<T> {
    count: AtomicUsize,
    value: T,
}

impl<T> RefCounted<T> {
    fn new(value: T) -> Self {
        let inner = Box::new(RefCountedInner {
            count: AtomicUsize::new(1),
            value,
        });
        RefCounted { data: Box::into_raw(inner) }
    }
    
    fn clone(&self) -> Self {
        unsafe {
            (*self.data).count.fetch_add(1, Ordering::Relaxed);
        }
        RefCounted { data: self.data }
    }
}

impl<T> Drop for RefCounted<T> {
    fn drop(&mut self) {
        unsafe {
            if (*self.data).count.fetch_sub(1, Ordering::Release) == 1 {
                // The Acquire fence synchronizes with the Release decrements of
                // other handles before the final owner frees the allocation.
                fence(Ordering::Acquire);
                drop(Box::from_raw(self.data));
            }
        }
    }
}

I’ve seen this technique significantly reduce overhead in applications processing millions of objects, where the standard library implementations added too much weight.
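For comparison, the standard Arc already provides the counting behavior the custom type reimplements; dropping to a hand-rolled counter only pays off when Arc's extras, weak references in particular, are dead weight. A quick check of the shared-count semantics:

```rust
use std::sync::Arc;

fn main() {
    let a = Arc::new([0u8; 64]);
    let b = Arc::clone(&a);               // bumps the shared count; no data copy
    assert_eq!(Arc::strong_count(&a), 2);
    drop(b);
    assert_eq!(Arc::strong_count(&a), 1); // the last drop will free the allocation
}
```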

Thin Pointers with Type Erasure

Object-oriented patterns often involve trait objects, but these carry size overhead. Thin pointers reduce this cost through manual type erasure, maintaining a fixed-size pointer regardless of the underlying type.

trait Drawable {
    fn draw(&self);
}

struct ThinVec {
    // Each entry pairs an erased pointer with the draw shim for its concrete
    // type; a full implementation would also store a drop shim to free items.
    data: Vec<(*mut (), fn(*mut ()))>,
}

impl ThinVec {
    fn push<T: Drawable>(&mut self, obj: T) {
        // Monomorphized shim that restores the concrete type before drawing.
        fn draw_erased<U: Drawable>(ptr: *mut ()) {
            unsafe { (*(ptr as *mut U)).draw() }
        }
        let ptr = Box::into_raw(Box::new(obj)) as *mut ();
        self.data.push((ptr, draw_erased::<T>));
    }
    
    fn draw_all(&self) {
        for &(ptr, draw_fn) in &self.data {
            draw_fn(ptr);
        }
    }
}

This technique proves particularly valuable when handling collections of polymorphic objects where memory footprint matters.
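The saving is easy to measure: a trait-object reference is a fat pointer carrying both a data pointer and a vtable pointer, while an erased pointer stays a single word. A minimal sketch:

```rust
use std::mem::size_of;

trait Drawable {
    fn draw(&self);
}

fn main() {
    // &dyn Drawable = data pointer + vtable pointer: two words.
    assert_eq!(size_of::<&dyn Drawable>(), 2 * size_of::<usize>());
    // The erased pointer is one word, halving per-item pointer storage.
    assert_eq!(size_of::<*mut ()>(), size_of::<usize>());
}
```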

Copy-on-Write Smart Pointers

For data that’s often read but rarely modified, copy-on-write pointers defer copying until a write operation occurs, optimizing memory usage.

use std::rc::Rc;

// An illustrative clone-on-write type; it shares a name with std::borrow::Cow
// but is independent of it.
struct Cow<T: Clone> {
    data: Rc<T>,
    modified: bool,
    local_copy: Option<T>,
}

impl<T: Clone> Cow<T> {
    fn new(data: T) -> Self {
        Self {
            data: Rc::new(data),
            modified: false,
            local_copy: None,
        }
    }
    
    fn get_mut(&mut self) -> &mut T {
        if !self.modified {
            self.local_copy = Some(self.data.as_ref().clone());
            self.modified = true;
        }
        self.local_copy.as_mut().unwrap()
    }
    
    fn get(&self) -> &T {
        if self.modified {
            self.local_copy.as_ref().unwrap()
        } else {
            self.data.as_ref()
        }
    }
}

I’ve used this pattern extensively in document processing systems where multiple views access the same data, with occasional edits.
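The standard library's std::borrow::Cow covers the borrowed-or-owned half of this pattern out of the box; a small function (strip_spaces is a name invented for illustration) shows how clean inputs never allocate:

```rust
use std::borrow::Cow;

// Returns a borrow when no change is needed, an owned String otherwise.
fn strip_spaces(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', ""))
    } else {
        Cow::Borrowed(input)
    }
}

fn main() {
    assert!(matches!(strip_spaces("clean"), Cow::Borrowed(_))); // no allocation
    assert_eq!(strip_spaces("a b c"), "abc");
}
```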

Intrusive Smart Pointers

For maximum efficiency, intrusive pointers embed reference counting directly within the data structure, eliminating separate allocation for control blocks.

use std::marker::PhantomData;
use std::sync::atomic::{fence, AtomicUsize, Ordering};

// Types managed by IntrusivePtr expose their embedded counter through a trait.
trait Intrusive {
    fn refs(&self) -> &AtomicUsize;
}

struct Node<T> {
    refs: AtomicUsize,
    next: Option<IntrusivePtr<Node<T>>>,
    data: T,
}

impl<T> Intrusive for Node<T> {
    fn refs(&self) -> &AtomicUsize {
        &self.refs
    }
}

struct IntrusivePtr<T: Intrusive> {
    ptr: *const T,
    _marker: PhantomData<T>,
}

impl<T: Intrusive> IntrusivePtr<T> {
    fn new(node: Box<T>) -> Self {
        // The node is expected to start with a count of zero; this handle
        // takes the first reference.
        let ptr = Box::into_raw(node);
        unsafe {
            (*ptr).refs().fetch_add(1, Ordering::Relaxed);
        }
        Self { ptr, _marker: PhantomData }
    }
}

impl<T: Intrusive> Drop for IntrusivePtr<T> {
    fn drop(&mut self) {
        unsafe {
            if (*self.ptr).refs().fetch_sub(1, Ordering::Release) == 1 {
                fence(Ordering::Acquire);
                drop(Box::from_raw(self.ptr as *mut T));
            }
        }
    }
}

This technique has proven especially effective for complex linked data structures where allocations must be minimized.

Generational Indices

Using indices with generation counters creates a safe alternative to raw pointers, preventing use-after-free and dangling pointer issues.

struct GenerationalArena<T> {
    // Each slot keeps its generation even while vacant, so stale indices
    // can be detected after the slot is reused.
    items: Vec<(Option<T>, u32)>,
    free: Vec<usize>,
}

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
struct GenerationalIndex {
    index: u32,
    generation: u32,
}

impl<T> GenerationalArena<T> {
    fn new() -> Self {
        Self { items: Vec::new(), free: Vec::new() }
    }
    
    fn insert(&mut self, value: T) -> GenerationalIndex {
        if let Some(index) = self.free.pop() {
            let generation = self.items[index].1 + 1;
            self.items[index] = (Some(value), generation);
            GenerationalIndex { index: index as u32, generation }
        } else {
            let index = self.items.len();
            self.items.push((Some(value), 0));
            GenerationalIndex { index: index as u32, generation: 0 }
        }
    }
    
    fn remove(&mut self, index: GenerationalIndex) -> Option<T> {
        let slot = self.items.get_mut(index.index as usize)?;
        if slot.1 != index.generation {
            return None;
        }
        let value = slot.0.take()?;
        self.free.push(index.index as usize);
        Some(value)
    }
    
    fn get(&self, index: GenerationalIndex) -> Option<&T> {
        let (value, generation) = self.items.get(index.index as usize)?;
        if *generation == index.generation {
            value.as_ref()
        } else {
            None
        }
    }
}

I’ve implemented this pattern in game engines and simulations where entities frequently come and go, and index validation provides crucial safety.
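The core of the idea fits in a few lines; a deliberately simplified sketch with a single slot (rather than the full arena) shows a stale handle being rejected instead of silently reading the slot's new occupant:

```rust
fn main() {
    // One slot: (current value, generation counter).
    let mut slot: (Option<&str>, u32) = (Some("player"), 0);
    let handle = (0u32, slot.1); // a caller holds (index, generation)

    // The entity is removed and the slot reused: the generation advances.
    slot = (Some("monster"), slot.1 + 1);

    // The stale handle fails the generation check.
    let lookup = if slot.1 == handle.1 { slot.0 } else { None };
    assert_eq!(lookup, None);
}
```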

Thread-Local Smart Pointers

For single-threaded contexts, thread-local pointers eliminate synchronization overhead while maintaining safety guarantees.

use std::cell::UnsafeCell;
use std::marker::PhantomData;

struct ThreadBox<T> {
    data: UnsafeCell<T>,
    _marker: PhantomData<*mut ()>, // raw pointer makes the type !Send and !Sync
}

impl<T> ThreadBox<T> {
    fn new(value: T) -> Self {
        Self { 
            data: UnsafeCell::new(value),
            _marker: PhantomData,
        }
    }
    
    /// Safety: the caller must ensure no other reference obtained from this
    /// ThreadBox is alive while the returned borrow exists.
    unsafe fn get_mut(&self) -> &mut T {
        &mut *self.data.get()
    }
    
    /// Safety: the caller must ensure no mutable reference is alive.
    unsafe fn get(&self) -> &T {
        &*self.data.get()
    }
}

This pattern has significantly boosted performance in single-threaded processing pipelines where I needed interior mutability without atomic operations.
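When the unsafe surface of a hand-rolled cell is unappealing, std::cell::RefCell provides the same single-threaded, atomics-free interior mutability, trading a small runtime borrow check for a safe API:

```rust
use std::cell::RefCell;

fn main() {
    // RefCell is !Sync, so like the ThreadBox sketch it never performs atomic
    // operations; borrow violations panic instead of being undefined behavior.
    let counter = RefCell::new(0);
    *counter.borrow_mut() += 1;
    *counter.borrow_mut() += 1;
    assert_eq!(*counter.borrow(), 2);
}
```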

Custom Smart Pointers with Inline Storage

Small string optimization represents a classic example of inline storage, avoiding heap allocations for small values.

// Assumes a 64-bit target: pointer, length, and capacity each fit in 8 bytes.
const INLINE_CAP: usize = 23;

struct SmallString {
    // Inline form: UTF-8 bytes. Heap form: (ptr, len, cap) of a leaked String.
    data: [u8; 24],
    len: u8,       // inline length; 0 for the heap form
    is_heap: bool,
}

impl SmallString {
    fn new(s: &str) -> Self {
        let len = s.len();
        if len <= INLINE_CAP {
            let mut data = [0u8; 24];
            data[..len].copy_from_slice(s.as_bytes());
            Self { data, len: len as u8, is_heap: false }
        } else {
            let string = String::from(s);
            let (ptr, len, cap) = (string.as_ptr() as usize, string.len(), string.capacity());
            std::mem::forget(string); // ownership moves into the raw parts below
            
            let mut data = [0u8; 24];
            data[..8].copy_from_slice(&ptr.to_ne_bytes());
            data[8..16].copy_from_slice(&len.to_ne_bytes());
            data[16..24].copy_from_slice(&cap.to_ne_bytes());
            Self { data, len: 0, is_heap: true }
        }
    }
}

impl Drop for SmallString {
    fn drop(&mut self) {
        if self.is_heap {
            let ptr = usize::from_ne_bytes(self.data[..8].try_into().unwrap()) as *mut u8;
            let len = usize::from_ne_bytes(self.data[8..16].try_into().unwrap());
            let cap = usize::from_ne_bytes(self.data[16..24].try_into().unwrap());
            // Rebuild the String so its heap buffer is freed normally.
            unsafe {
                drop(String::from_raw_parts(ptr, len, cap));
            }
        }
    }
}

I’ve applied this pattern to various data types, dramatically reducing allocation frequency in text processing applications.
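The 24-byte buffer is not arbitrary: it matches String's own three-word (pointer, length, capacity) representation, so the inline and heap forms can share the same storage. A quick verification:

```rust
use std::mem::size_of;

fn main() {
    // String is exactly pointer + length + capacity, one word each.
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());
}
```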

Pin Pointers for Self-Referential Structures

Rust’s Pin API enables safe creation of self-referential structures by guaranteeing stability of memory locations.

use std::marker::PhantomPinned;
use std::pin::Pin;

struct SelfReferential {
    data: String,
    slice: *const str,
    // Without PhantomPinned the type would be Unpin,
    // and pinning would not actually prevent moves.
    _pin: PhantomPinned,
}

impl SelfReferential {
    fn new(s: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(Self {
            data: s,
            slice: "" as *const str, // placeholder until the value is pinned
            _pin: PhantomPinned,
        });
        
        // Safe because the value is heap-pinned and never moved afterwards.
        unsafe {
            let inner = Pin::get_unchecked_mut(boxed.as_mut());
            inner.slice = &*inner.data as *const str;
        }
        
        boxed
    }
    
    fn get_slice(self: Pin<&Self>) -> &str {
        unsafe { &*self.slice }
    }
}

This technique has proven invaluable for implementing efficient parsers and state machines that maintain references to their own data.

Weak References for Cyclic Structures

Weak references break reference cycles in bidirectional object graphs: parents own their children through Rc, while children point back through non-owning Weak handles, so dropping the root still frees the whole tree.

use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    value: i32,
    parent: Option<Weak<RefCell<Node>>>, // Weak breaks the parent-child cycle
    children: Vec<Rc<RefCell<Node>>>,
}

impl Node {
    fn new(value: i32) -> Rc<RefCell<Self>> {
        Rc::new(RefCell::new(Self {
            value,
            parent: None,
            children: Vec::new(),
        }))
    }
    
    fn add_child(self: &Rc<RefCell<Self>>, value: i32) -> Rc<RefCell<Node>> {
        let child = Rc::new(RefCell::new(Node {
            value,
            parent: Some(Rc::downgrade(self)),
            children: Vec::new(),
        }));
        
        self.borrow_mut().children.push(Rc::clone(&child));
        child
    }
}

I’ve used this pattern extensively in tree structures and UI components where parent-child relationships are bidirectional.
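The defining property of Weak is that it observes an allocation's lifetime without extending it, which is exactly why the back-edges above don't leak. A standalone check:

```rust
use std::rc::Rc;

fn main() {
    let strong = Rc::new(5);
    let weak = Rc::downgrade(&strong); // does not increment the strong count

    // While an owner is alive, the weak handle can be upgraded.
    assert_eq!(weak.upgrade().as_deref(), Some(&5));

    // Once the last strong reference drops, upgrading fails cleanly.
    drop(strong);
    assert!(weak.upgrade().is_none());
}
```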

Tagged Pointers

For advanced memory optimization, tagged pointers store metadata in unused bits of aligned pointers.

use std::marker::PhantomData;

struct TaggedPtr<T> {
    // Uses the lower bits of the aligned pointer for tag data
    ptr_and_tag: usize,
    _marker: PhantomData<*mut T>,
}

impl<T> TaggedPtr<T> {
    fn new(ptr: *mut T, tag: u8) -> Self {
        assert!(tag < 4, "Tag must fit in 2 bits");
        let ptr_val = ptr as usize;
        // Ensure pointer is aligned
        assert_eq!(ptr_val & 0b11, 0, "Pointer must be aligned to 4 bytes");
        
        Self {
            ptr_and_tag: ptr_val | (tag as usize),
            _marker: PhantomData,
        }
    }
    
    fn tag(&self) -> u8 {
        (self.ptr_and_tag & 0b11) as u8
    }
    
    fn ptr(&self) -> *mut T {
        (self.ptr_and_tag & !0b11) as *mut T
    }
}

This bit-packing technique has proven highly effective in memory-constrained environments where every byte counts.
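The trick can be sanity-checked without any wrapper type: on a 64-bit target, a heap-allocated u64 is 8-byte aligned, leaving the low three address bits free for tags:

```rust
fn main() {
    let raw = Box::into_raw(Box::new(42u64));
    let addr = raw as usize;
    assert_eq!(addr & 0b111, 0);              // 8-byte aligned: low 3 bits are zero

    let tagged = addr | 0b10;                 // pack a 2-bit tag into the address
    let tag = (tagged & 0b111) as u8;
    let recovered = (tagged & !0b111) as *mut u64;
    assert_eq!(tag, 2);
    assert_eq!(recovered, raw);

    unsafe { drop(Box::from_raw(recovered)) } // free through the untagged pointer
}
```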

These ten smart pointer techniques demonstrate Rust’s capacity for zero-cost abstractions. By leveraging the type system and ownership model, we can create memory-safe code without performance penalties. I’ve progressively incorporated these patterns into my production systems, achieving both safety and efficiency.

The beauty of Rust lies in its ability to express these complex patterns while maintaining memory safety guarantees. As systems grow in complexity, these smart pointer techniques become increasingly valuable for managing resources efficiently while preventing memory-related bugs.



