Welcome! This project reimplements the PyTorch framework in Rust, keeping the API as consistent as possible, in hopes of building a faster, statically typed AI framework. The project is currently in its alpha phase, so feel free to contribute or contact me by email! Because the project is in its early stages, documentation is sparse, but a quick overview of the development scope is provided below.
A major part of the timeline (MatMul) is completed, tested, and benchmarked. With this comes some primitive broadcasting for multiplication, but it is not yet recommended: users should still manually broadcast any constant terms to the shape of the other argument (a minimal sketch of this appears after the example below). Here is a sample of matmul and the gradient in action!
use std::sync::{Arc, RwLock};
use neuroxide::{ops::{add::AddOp, matmul::MatMulOp, mul::MulOp, pow::PowOp}, types::{device::Device, tensor::Tensor, tensordb::{DTypes, TensorDB}}};
use neuroxide::ops::op_generic::Operation;
fn main() {
    // Two batched operands: a is 2x2x2, b is 2x2x3.
    let a: Vec<f32> = vec![1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0];
    let b: Vec<f32> = vec![5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
    let a_shape = vec![2, 2, 2];
    let b_shape = vec![2, 2, 3];
    let tensor_db = Arc::new(RwLock::new(TensorDB::new(DTypes::F32)));
    // Constant terms, manually broadcast to the 2x2x3 output shape.
    let c1 = Tensor::new(&tensor_db, vec![6.0; 12], vec![2, 2, 3], Device::CUDA, true);
    let c2 = Tensor::new(&tensor_db, vec![5.0; 12], vec![2, 2, 3], Device::CUDA, true);
    let c3 = Tensor::new(&tensor_db, vec![2.0; 12], vec![2, 2, 3], Device::CUDA, true);
    let a = Tensor::new(&tensor_db, a, a_shape, Device::CUDA, true);
    let b = Tensor::new(&tensor_db, b, b_shape, Device::CUDA, false);
    // f = ((a @ b) * c1 + c2) ^ c3
    let c = MatMulOp::forward(&vec![&a, &b]);
    let d = MulOp::forward(&vec![&c, &c1]);
    let e = AddOp::forward(&vec![&d, &c2]);
    let f = PowOp::forward(&vec![&e, &c3]);
    println!("{}", f);
    // Backpropagate and read the gradient with respect to a.
    let grad = f.backward(None);
    println!("{}", grad.get(&a.id).unwrap());
}
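Since broadcasting is still primitive, the safest pattern for now is to materialize any constant at the full shape of the tensor it multiplies, exactly as c1, c2, and c3 are above. A minimal sketch of manual broadcasting (the 2x3 shape and values are illustrative, reusing the tensor_db handle from the example):

// Multiply a 2x3 tensor by the constant 4.0 by manually broadcasting it:
// repeat the value once per element and reuse the target shape.
let x = Tensor::new(&tensor_db, vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0], vec![2, 3], Device::CUDA, true);
let four = Tensor::new(&tensor_db, vec![4.0; 6], vec![2, 3], Device::CUDA, false);
let y = MulOp::forward(&vec![&x, &four]);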
Here is how a contributor/developer might use the project:
- Clone the repository:
git clone git@github.com:DragonflyRobotics/Neuroxide.git
- Modify src/bin.rs to contain your personal programs, then run them with cargo run.
- Compile the library with cargo build and copy the built artifact from the target folder; you can then link it into your own Rust projects. You can also try installing it directly (a Cargo.toml alternative is sketched after this list):
cargo install --git git@github.com:DragonflyRobotics/Neuroxide.git
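If you only want to consume the library, pointing a Cargo git dependency at the repository may be simpler. This snippet reflects standard Cargo usage rather than project-documented instructions, so treat it as an assumption:

[dependencies]
neuroxide = { git = "https://github.com/DragonflyRobotics/Neuroxide.git" }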
Here are some basic operations (we hope you see the similarity to PyTorch):
Forward Pass
let db = Arc::new(RwLock::new(TensorDB::new(DTypes::F64)));
let c1c = Tensor::new(&db, vec![15.0], vec![1], Device::CPU, false);
let c2c = Tensor::new(&db, vec![6.0], vec![1], Device::CPU, false);
let result = AddOp::forward(&vec![&c1c, &c2c]); // 15 + 6 = 21
Backward Pass
let db = Arc::new(RwLock::new(TensorDB::new(DTypes::F64)));
let x = Tensor::new(&db, vec![5.0], vec![1], Device::CPU, true);
let c1c = Tensor::new(&db, vec![15.0], vec![1], Device::CPU, false);
let c2c = Tensor::new(&db, vec![6.0], vec![1], Device::CPU, false);
let r1 = MulOp::forward(&vec![&x, &c1c]);
let r2 = MulOp::forward(&vec![&x, &c2c]);
let mut result = AddOp::forward(&vec![&r1, &r2]);
result = MulOp::forward(&vec![&result, &x]);
// result = (15x + 6x) * x = 21x^2 = 525 at x = 5
assert_eq!(result.data[0], 525.0);
let grad = result.backward(None);
// d/dx of 21x^2 is 42x = 210 at x = 5
println!("{}", grad.get(&x.id).unwrap().data[0]);
Forward Pass CUDA
let db = Arc::new(RwLock::new(TensorDB::new(DTypes::F32)));
let c1c = Tensor::new(&db, vec![15.0], vec![1], Device::CUDA, false);
let c2c = Tensor::new(&db, vec![6.0], vec![1], Device::CUDA, false);
let result = AddOp::forward(&vec![&c1c, &c2c]);
Partial Backward to Selective Leaves
let db = Arc::new(RwLock::new(TensorDB::new(DTypes::F64)));
let x1 = Tensor::new(&db, vec![5.0], vec![1], Device::CPU, true);
let x2 = Tensor::new(&db, vec![6.0], vec![1], Device::CPU, true);
let x3 = Tensor::new(&db, vec![7.0], vec![1], Device::CPU, true);
let x4 = Tensor::new(&db, vec![8.0], vec![1], Device::CPU, true);
let result = x1.clone() * (x2.clone() + x3) + x4;
// result = 5 * (6 + 7) + 8 = 73
println!("{}", result.data[0]);
// Only compute the gradient for the x2 leaf: d(result)/dx2 = x1 = 5
let grad = result.backward(Some(vec![x2.id.clone()]));
println!("{}", grad.get(&x2.id).unwrap().data[0]);
Note: You can avoid the clunky Op notation and simply operate on tensors using +, -, *, and /!
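For example, the CPU forward pass above collapses to a single operator expression (a sketch reusing the db handle from the earlier snippets):

let c1c = Tensor::new(&db, vec![15.0], vec![1], Device::CPU, false);
let c2c = Tensor::new(&db, vec![6.0], vec![1], Device::CPU, false);
let result = c1c + c2c; // equivalent to AddOp::forward(&vec![&c1c, &c2c])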
Simple Neural Network
use std::sync::{Arc, RwLock};
use neuroxide::types::{device::Device, tensor::Tensor, tensordb::{DTypes, TensorDB}};
use neuroxide::ops::op_generic::Operation;
use rand::Rng;
#[macro_use]
extern crate neuroxide;
fn main() {
    let db = Arc::new(RwLock::new(TensorDB::new(DTypes::F32)));
    // Two hand-rolled linear layers with all parameters initialized to 1.0.
    let mut layer_1_weights = Tensor::<f32>::new(&db, vec![1.0; 16], vec![1, 16], Device::CUDA, true);
    let mut layer_1_biases = Tensor::<f32>::new(&db, vec![1.0; 16*16], vec![16, 16], Device::CUDA, true);
    let mut layer_2_weights = Tensor::<f32>::new(&db, vec![1.0; 16], vec![16, 1], Device::CUDA, true);
    let mut layer_2_biases = Tensor::<f32>::new(&db, vec![1.0; 16], vec![16, 1], Device::CUDA, true);
    let pow_const = Tensor::<f32>::new(&db, vec![2.0; 16], vec![16, 1], Device::CUDA, false);
    let lr = Tensor::<f32>::new(&db, vec![0.0000001], vec![1], Device::CUDA, false);
    for iteration in 0..600 {
        // Learn the function y = 2x from randomly sampled points.
        let num: f32 = rand::thread_rng().gen_range(0..100) as f32;
        let input = Tensor::<f32>::new(&db, vec![num; 16], vec![16, 1], Device::CUDA, false);
        let output = Tensor::<f32>::new(&db, vec![num * 2.0; 16], vec![16, 1], Device::CUDA, false);
        // Forward pass through both layers, then a squared-error loss.
        let c = add!(matmul!(input, layer_1_weights), layer_1_biases);
        let c = add!(matmul!(c, layer_2_weights), layer_2_biases);
        let loss = pow!(c - output, pow_const);
        let grad = loss.backward(None);
        // Vanilla gradient descent: p = p - lr * dL/dp, clearing each graph afterward.
        layer_1_weights = layer_1_weights.clone() - grad.get(&layer_1_weights.id).unwrap().clone() * lr.clone();
        layer_1_weights.clear_graph();
        layer_1_biases = layer_1_biases.clone() - grad.get(&layer_1_biases.id).unwrap().clone() * lr.clone();
        layer_1_biases.clear_graph();
        layer_2_weights = layer_2_weights.clone() - grad.get(&layer_2_weights.id).unwrap().clone() * lr.clone();
        layer_2_weights.clear_graph();
        layer_2_biases = layer_2_biases.clone() - grad.get(&layer_2_biases.id).unwrap().clone() * lr.clone();
        layer_2_biases.clear_graph();
        println!("Epoch: {} Loss: {}", iteration, loss);
    }
    // Inference: the trained network should output roughly 8.0 for an input of 4.0.
    let input = Tensor::<f32>::new(&db, vec![4.0; 16], vec![16, 1], Device::CUDA, false);
    let c = add!(matmul!(input, layer_1_weights), layer_1_biases);
    let c = add!(matmul!(c, layer_2_weights), layer_2_biases);
    println!("{}", c);
}
Note: The layers are manually implemented here, but built-in neuroxide::nn::Linear functionality is coming soon!
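Until then, the repeated add!(matmul!(...)) pattern can be wrapped by hand. The DenseLayer struct below is purely hypothetical and not part of the Neuroxide API; it also assumes the add! and matmul! macros accept arbitrary expressions and take their arguments by value, as the clones in the training loop suggest:

// Hypothetical wrapper; NOT the upcoming neuroxide::nn::Linear API.
struct DenseLayer {
    weights: Tensor<f32>,
    biases: Tensor<f32>,
}

impl DenseLayer {
    // Mirrors the pattern above: add!(matmul!(x, W), b).
    fn forward(&self, x: &Tensor<f32>) -> Tensor<f32> {
        add!(matmul!(x.clone(), self.weights.clone()), self.biases.clone())
    }
}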
Python has many benefits, mainly its flexibility, which makes it a popular language for AI/ML. The tradeoff is a slow interpreter, constant alternation between Python and C++ bindings, and limited parallelism, which make it inefficient for many high-performance applications. This project attempts to maintain the comforts of the PyTorch syntax while leveraging a statically typed, efficient language to create a powerful AI engine for cutting-edge projects.
We appreciate any contributions to this project to help it grow and encompass the full functionality of an AI engine. Please refer to our contributing guidelines for details.
This project is distributed under a GNU license, which can be found in the repository.