Neural networks have always fascinated me, and I’ve been itching to dive deeper into their inner workings. So, I decided to take on the challenge of building one from scratch in Rust. Let me tell you, it’s been quite the journey!
First things first, let’s talk about why Rust. Well, it’s fast, memory-safe, and has a growing ecosystem for machine learning. Plus, I’ve been wanting to level up my Rust skills, so this seemed like the perfect opportunity.
Now, onto the nitty-gritty. A neural network is essentially a collection of interconnected nodes (neurons) organized in layers. Each neuron multiplies its inputs by weights, sums the products, adds a bias, and then passes the result through an activation function. Simple enough, right?
Let’s start by defining our basic neuron structure:
struct Neuron {
    weights: Vec<f64>,
    bias: f64,
}
Each neuron will have a vector of weights (one for each input) and a bias. Now, we need a way to calculate the output of a neuron:
impl Neuron {
    fn forward(&self, inputs: &[f64]) -> f64 {
        let sum: f64 = inputs.iter()
            .zip(self.weights.iter())
            .map(|(i, w)| i * w)
            .sum();
        (sum + self.bias).max(0.0) // ReLU activation
    }
}
Here, we’re using the ReLU (Rectified Linear Unit) activation function, which returns its input when that input is positive and zero otherwise. It’s simple and effective for many tasks.
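To make the forward pass concrete, here’s a small worked example. The weights, bias, and inputs are made-up values I picked so the arithmetic is easy to follow, and `demo_neuron` is just a hypothetical helper name, not part of the network itself.

```rust
struct Neuron {
    weights: Vec<f64>,
    bias: f64,
}

impl Neuron {
    fn forward(&self, inputs: &[f64]) -> f64 {
        // Weighted sum of the inputs.
        let sum: f64 = inputs.iter()
            .zip(self.weights.iter())
            .map(|(i, w)| i * w)
            .sum();
        // Add the bias, then clamp negatives to zero (ReLU).
        (sum + self.bias).max(0.0)
    }
}

// Hypothetical helper: one neuron with hand-picked values.
fn demo_neuron() -> Neuron {
    Neuron { weights: vec![0.5, -1.0], bias: 0.25 }
}
```

With inputs `[2.0, 1.0]` the neuron computes 2.0 * 0.5 + 1.0 * (-1.0) + 0.25 = 0.25, which ReLU leaves alone. With inputs `[0.0, 1.0]` the sum plus bias is -0.75, so ReLU clamps the output to 0.0.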
Now that we have our neurons, let’s create a layer:
struct Layer {
    neurons: Vec<Neuron>,
}
And here’s how we can implement the forward pass for a layer:
impl Layer {
    fn forward(&self, inputs: &[f64]) -> Vec<f64> {
        self.neurons.iter()
            .map(|neuron| neuron.forward(inputs))
            .collect()
    }
}
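One thing the structs above don’t cover is where the initial weights come from. Here’s a minimal sketch of one way to do it. The `Layer::new` constructor, the `next_weight` helper, the seed parameter, and the [-0.5, 0.5) range are all my own assumptions for illustration. The tiny generator is only there so the sketch needs no external crates; in a real project the `rand` crate would be the more usual choice.

```rust
struct Neuron {
    weights: Vec<f64>,
    bias: f64,
}

struct Layer {
    neurons: Vec<Neuron>,
}

// Hypothetical helper: a small linear congruential generator.
// Advances `state` and returns a value in [-0.5, 0.5).
fn next_weight(state: &mut u64) -> f64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    // Use the top 53 bits to build a float in [0, 1), then shift the range.
    ((*state >> 11) as f64) / ((1u64 << 53) as f64) - 0.5
}

impl Layer {
    // Hypothetical constructor: `n_neurons` neurons, each with `n_inputs`
    // pseudo-random weights and a bias of zero.
    fn new(n_inputs: usize, n_neurons: usize, seed: u64) -> Layer {
        let mut state = seed;
        let neurons = (0..n_neurons)
            .map(|_| Neuron {
                weights: (0..n_inputs).map(|_| next_weight(&mut state)).collect(),
                bias: 0.0,
            })
            .collect();
        Layer { neurons }
    }
}
```

Something like `Layer::new(3, 2, 42)` would give a layer of two neurons that each expect three inputs. Because the generator is seeded, the same seed always produces the same weights, which is handy when debugging.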
With our building blocks in place, we can now create our neural network:
struct NeuralNetwork {
    layers: Vec<Layer>,
}

impl NeuralNetwork {
    fn forward(&self, inputs: &[f64]) -> Vec<f64> {
        self.layers.iter().fold(inputs.to_vec(), |inputs, layer| {
            layer.forward(&inputs)
        })
    }
}
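To see the whole forward pass in action, here’s a sketch of a tiny two-layer network with hand-set weights. The numbers and the `demo_network` helper are hypothetical, chosen only so the result can be checked by hand.

```rust
struct Neuron {
    weights: Vec<f64>,
    bias: f64,
}

struct Layer {
    neurons: Vec<Neuron>,
}

struct NeuralNetwork {
    layers: Vec<Layer>,
}

impl Neuron {
    fn forward(&self, inputs: &[f64]) -> f64 {
        let sum: f64 = inputs.iter().zip(self.weights.iter()).map(|(i, w)| i * w).sum();
        (sum + self.bias).max(0.0) // ReLU activation
    }
}

impl Layer {
    fn forward(&self, inputs: &[f64]) -> Vec<f64> {
        self.neurons.iter().map(|neuron| neuron.forward(inputs)).collect()
    }
}

impl NeuralNetwork {
    fn forward(&self, inputs: &[f64]) -> Vec<f64> {
        // Each layer's output becomes the next layer's input.
        self.layers.iter().fold(inputs.to_vec(), |acc, layer| layer.forward(&acc))
    }
}

// Hypothetical helper: 2 inputs -> 2 hidden neurons -> 1 output neuron.
fn demo_network() -> NeuralNetwork {
    NeuralNetwork {
        layers: vec![
            Layer { neurons: vec![
                Neuron { weights: vec![1.0, 1.0], bias: 0.0 },
                Neuron { weights: vec![1.0, -1.0], bias: 0.0 },
            ]},
            Layer { neurons: vec![
                Neuron { weights: vec![0.5, 2.0], bias: -1.0 },
            ]},
        ],
    }
}
```

For the input `[3.0, 1.0]`, the hidden layer produces `[4.0, 2.0]` and the output neuron gives 0.5 * 4.0 + 2.0 * 2.0 - 1.0 = 5.0. For `[1.0, 3.0]`, the second hidden neuron would be -2.0 before ReLU, so it outputs 0.0, and the network returns 0.5 * 4.0 - 1.0 = 1.0.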
Great! We now have a basic neural network that can perform forward propagation. But how do we train it? This is where things get a bit more complex.
To train our network, we need to implement backpropagation. This is the process of calculating how much each weight and bias contributed to the error (the gradients), so that we can adjust them in the direction that reduces the error of our predictions. It’s the heart of how neural networks learn.
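If “calculating gradients” feels abstract, a gradient check on a single neuron can help. The sketch below assumes a loss of 0.5 * (target - output)^2 for one ReLU neuron and compares the chain-rule gradient with a finite-difference estimate. All three function names are hypothetical, and this is a sanity check, not part of the network.

```rust
// Squared error of a single ReLU neuron on one example: 0.5 * (t - o)^2.
fn loss(weights: &[f64], bias: f64, inputs: &[f64], target: f64) -> f64 {
    let z: f64 = inputs.iter().zip(weights).map(|(x, w)| x * w).sum::<f64>() + bias;
    let output = z.max(0.0);
    0.5 * (target - output).powi(2)
}

// Chain rule: dL/dw_j = -(t - o) * relu'(z) * x_j.
fn analytic_gradient(weights: &[f64], bias: f64, inputs: &[f64], target: f64) -> Vec<f64> {
    let z: f64 = inputs.iter().zip(weights).map(|(x, w)| x * w).sum::<f64>() + bias;
    let output = z.max(0.0);
    // Slope of ReLU: 1 where the neuron is active, 0 otherwise.
    let relu_slope = if z > 0.0 { 1.0 } else { 0.0 };
    inputs.iter().map(|x| -(target - output) * relu_slope * x).collect()
}

// Central finite difference: nudge each weight up and down by a small amount
// and measure how the loss changes.
fn numeric_gradient(weights: &[f64], bias: f64, inputs: &[f64], target: f64) -> Vec<f64> {
    let eps = 1e-6;
    (0..weights.len())
        .map(|j| {
            let mut plus = weights.to_vec();
            let mut minus = weights.to_vec();
            plus[j] += eps;
            minus[j] -= eps;
            (loss(&plus, bias, inputs, target) - loss(&minus, bias, inputs, target)) / (2.0 * eps)
        })
        .collect()
}
```

With weights `[0.5, -1.0]`, bias 0.25, inputs `[2.0, 0.5]`, and target 2.0, the neuron outputs 0.75, so the analytic gradient is -(2.0 - 0.75) times each input, roughly `[-2.5, -0.625]`, and the numeric estimate should agree closely. Gradient descent moves each weight against its gradient, which is why an update written in terms of `target - output` adds to the weight.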
Let’s add a function to calculate the error of our network:
fn mean_squared_error(predictions: &[f64], targets: &[f64]) -> f64 {
    predictions.iter()
        .zip(targets.iter())
        .map(|(p, t)| (p - t).powi(2))
        .sum::<f64>() / predictions.len() as f64
}
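Here’s a quick worked example of the error calculation, plus a guarded variant. The `checked_mse` wrapper is my own hypothetical addition: the plain function divides by the number of predictions, so an empty slice gives NaN, and mismatched lengths are silently truncated by `zip`.

```rust
fn mean_squared_error(predictions: &[f64], targets: &[f64]) -> f64 {
    predictions.iter()
        .zip(targets.iter())
        .map(|(p, t)| (p - t).powi(2))
        .sum::<f64>() / predictions.len() as f64
}

// Hypothetical helper: same calculation, but returns None when the slices
// are empty or their lengths differ.
fn checked_mse(predictions: &[f64], targets: &[f64]) -> Option<f64> {
    if predictions.is_empty() || predictions.len() != targets.len() {
        return None;
    }
    Some(mean_squared_error(predictions, targets))
}
```

For predictions `[0.5, 2.0]` and targets `[1.0, 4.0]`, the squared differences are 0.25 and 4.0, so the mean is 4.25 / 2 = 2.125.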
Now, we need to implement the backward pass. This is where things get a bit hairy, so bear with me:
impl NeuralNetwork {
    fn backward(&mut self, inputs: &[f64], targets: &[f64], learning_rate: f64) {
        // Forward pass: remember every layer's input for the backward pass.
        // activations[k] is the input to layer k; the last entry is the network output.
        let mut activations = vec![inputs.to_vec()];
        for layer in &self.layers {
            let output = layer.forward(activations.last().unwrap());
            activations.push(output);
        }
        // Backward pass: start from the error at the output (target - output).
        let mut error = targets.iter()
            .zip(activations.last().unwrap().iter())
            .map(|(t, o)| t - o)
            .collect::<Vec<f64>>();
        for (k, layer) in self.layers.iter_mut().enumerate().rev() {
            let layer_inputs = &activations[k];
            let layer_outputs = &activations[k + 1];
            // Error to hand to the previous layer, one entry per input.
            let mut prev_error = vec![0.0; layer_inputs.len()];
            for (i, neuron) in layer.neurons.iter_mut().enumerate() {
                // ReLU derivative: only neurons that fired pass the error through.
                let gradient = if layer_outputs[i] > 0.0 { error[i] } else { 0.0 };
                for (j, weight) in neuron.weights.iter_mut().enumerate() {
                    // Propagate using the weight as it was before this update.
                    prev_error[j] += gradient * *weight;
                    *weight += learning_rate * gradient * layer_inputs[j];
                }
                neuron.bias += learning_rate * gradient;
            }
            error = prev_error;
        }
    }
}
Phew! That was a lot to take in. Essentially, we’re propagating the error backward through the network, adjusting weights and biases along the way.
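If the full backward pass is too much to hold in your head at once, it can help to look at the update for one neuron in isolation. This is a sketch under my own assumptions: `update_neuron` is a hypothetical helper that applies the same rule as the network (error times input, scaled by the learning rate) to a single ReLU neuron.

```rust
// Hypothetical helper: one gradient-descent step for a single ReLU neuron.
// Returns the neuron's output as it was before the update.
fn update_neuron(
    weights: &mut [f64],
    bias: &mut f64,
    inputs: &[f64],
    target: f64,
    learning_rate: f64,
) -> f64 {
    let z: f64 = inputs.iter().zip(weights.iter()).map(|(x, w)| x * w).sum::<f64>() + *bias;
    let output = z.max(0.0);
    let error = target - output;
    // ReLU derivative: no update at all if the neuron did not fire.
    let gradient = if output > 0.0 { error } else { 0.0 };
    for (w, x) in weights.iter_mut().zip(inputs) {
        *w += learning_rate * gradient * x;
    }
    *bias += learning_rate * gradient;
    output
}
```

Take weights `[0.5, 2.0]`, bias -1.0, inputs `[4.0, 2.0]`, target 6.0, and a learning rate of 0.01. The neuron outputs 5.0, so the error is 1.0. The weights move to about `[0.54, 2.02]` and the bias to about -0.99, and on the same inputs the neuron now outputs roughly 5.21, a little closer to the target.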
Now that we have our training mechanism in place, let’s create a method to train our network:
impl NeuralNetwork {
    fn train(&mut self, inputs: &[Vec<f64>], targets: &[Vec<f64>], epochs: usize, learning_rate: f64) {
        for epoch in 0..epochs {
            let mut total_error = 0.0;
            for (input, target) in inputs.iter().zip(targets.iter()) {
                let prediction = self.forward(input);
                total_error += mean_squared_error(&prediction, target);
                self.backward(input, target, learning_rate);
            }
            println!("Epoch {}: Error = {}", epoch, total_error / inputs.len() as f64);
        }
    }
}
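To tie everything together, here’s a compact end-to-end sketch that trains a tiny network on a toy problem, fitting y = 2x + 1 for x in 0..4. Treat it as an illustration under assumptions rather than the definitive version: the starting weights, the learning rate of 0.01, the 200 epochs, and the `dataset_error` and `demo_training` names are all hypothetical choices of mine. The backward pass here is a variant that remembers each layer’s input during the forward pass.

```rust
struct Neuron { weights: Vec<f64>, bias: f64 }
struct Layer { neurons: Vec<Neuron> }
struct NeuralNetwork { layers: Vec<Layer> }

impl Neuron {
    fn forward(&self, inputs: &[f64]) -> f64 {
        let sum: f64 = inputs.iter().zip(&self.weights).map(|(i, w)| i * w).sum();
        (sum + self.bias).max(0.0) // ReLU activation
    }
}

impl Layer {
    fn forward(&self, inputs: &[f64]) -> Vec<f64> {
        self.neurons.iter().map(|n| n.forward(inputs)).collect()
    }
}

impl NeuralNetwork {
    fn forward(&self, inputs: &[f64]) -> Vec<f64> {
        self.layers.iter().fold(inputs.to_vec(), |acc, layer| layer.forward(&acc))
    }

    // One gradient-descent step on a single example.
    fn backward(&mut self, inputs: &[f64], targets: &[f64], learning_rate: f64) {
        // activations[k] is the input to layer k; the last entry is the output.
        let mut activations = vec![inputs.to_vec()];
        for layer in &self.layers {
            let output = layer.forward(activations.last().unwrap());
            activations.push(output);
        }
        let mut error: Vec<f64> = targets.iter()
            .zip(activations.last().unwrap())
            .map(|(t, o)| t - o)
            .collect();
        for (k, layer) in self.layers.iter_mut().enumerate().rev() {
            let mut prev_error = vec![0.0; activations[k].len()];
            for (i, neuron) in layer.neurons.iter_mut().enumerate() {
                // ReLU derivative: only neurons that fired pass error back.
                let gradient = if activations[k + 1][i] > 0.0 { error[i] } else { 0.0 };
                for (j, weight) in neuron.weights.iter_mut().enumerate() {
                    prev_error[j] += gradient * *weight; // uses the pre-update weight
                    *weight += learning_rate * gradient * activations[k][j];
                }
                neuron.bias += learning_rate * gradient;
            }
            error = prev_error;
        }
    }

    // Hypothetical helper: mean squared error averaged over a dataset.
    fn dataset_error(&self, inputs: &[Vec<f64>], targets: &[Vec<f64>]) -> f64 {
        let total: f64 = inputs.iter().zip(targets).map(|(x, t)| {
            let p = self.forward(x);
            p.iter().zip(t).map(|(pi, ti)| (pi - ti).powi(2)).sum::<f64>() / p.len() as f64
        }).sum();
        total / inputs.len() as f64
    }
}

// Hypothetical demo: returns (error before training, error after training).
fn demo_training() -> (f64, f64) {
    // 1 input -> 2 hidden neurons -> 1 output, with hand-picked starting weights.
    let mut net = NeuralNetwork { layers: vec![
        Layer { neurons: vec![
            Neuron { weights: vec![1.0], bias: 0.0 },
            Neuron { weights: vec![0.5], bias: 0.5 },
        ]},
        Layer { neurons: vec![Neuron { weights: vec![0.5, 0.5], bias: 0.0 }] },
    ]};
    let inputs: Vec<Vec<f64>> = (0..4).map(|x| vec![x as f64]).collect();
    let targets: Vec<Vec<f64>> = (0..4).map(|x| vec![2.0 * x as f64 + 1.0]).collect();
    let before = net.dataset_error(&inputs, &targets);
    for _epoch in 0..200 {
        for (input, target) in inputs.iter().zip(&targets) {
            net.backward(input, target, 0.01);
        }
    }
    (before, net.dataset_error(&inputs, &targets))
}
```

Calling `demo_training()` from `main` and printing the tuple should show the error after training sitting below the error before it. I kept the learning rate small on purpose: with plain gradient descent and unbounded ReLU outputs, a large step can easily overshoot.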
And there you have it! A working neural network implemented from scratch in Rust. Of course, this is a basic implementation, and there’s a lot more we could add: different activation functions (with ReLU on the output layer, the network can only predict non-negative values), regularization, optimization algorithms like Adam or RMSprop, and so on.
I’ve learned so much from this project. It’s one thing to use pre-built neural network libraries, but building one from scratch really deepens your understanding of how they work under the hood.
If you’re interested in machine learning and Rust, I highly recommend giving this a try. It’s challenging, but incredibly rewarding. And who knows? Maybe you’ll discover some optimizations or improvements along the way.
Remember, the key to mastering complex topics like neural networks is to break them down into smaller, manageable pieces. Start with a single neuron, then a layer, and build up from there. Before you know it, you’ll have a full network capable of learning and making predictions.
So, what’s next? Well, I’m thinking of expanding this project to include convolutional layers for image processing tasks. Or maybe I’ll dive into recurrent neural networks for sequence data. The possibilities are endless!
I hope this journey through building a neural network in Rust has been as exciting for you as it has been for me. Happy coding, and may your gradients always descend smoothly!