```
cargo build --release
./target/release/atneural config.txt
```
Simple neural network written in Rust for Dr. Nelson's class. It's a vanilla multi-layer perceptron with a leaky ReLU activation on the hidden layers and a softmax activation on the output layer; there are no biases. The code for the network itself is in `src/`. Training uses the Adam optimizer and cross-entropy loss.
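A minimal sketch of the two activations named above (illustrative only, not the actual functions in `src/`; the 0.01 negative slope is a common leaky-ReLU default, not taken from this repo):

```rust
/// Leaky ReLU: identity for positive inputs, small negative slope otherwise.
fn leaky_relu(x: f64, slope: f64) -> f64 {
    if x > 0.0 { x } else { slope * x }
}

/// Numerically stable softmax: shift by the max logit before exponentiating.
fn softmax(logits: &[f64]) -> Vec<f64> {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|&z| (z - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

fn main() {
    // Toy forward step: leaky ReLU on a hidden pre-activation, softmax on top.
    let hidden = [-2.0, 3.0].map(|x| leaky_relu(x, 0.01));
    let probs = softmax(&hidden);
    println!("{:?} -> {:?}", hidden, probs);
}
```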
`image_rec_sm.net` is the final project: it classifies images of hands/digits into classes 1 through 5. `config.txt` is set up to evaluate this network on the validation set. Images are greyscaled and scaled to the range [0, 1]. The dataset, along with code for processing/viewing the images, is in `dataset/`.
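The preprocessing step described above might look like this (a sketch, not the actual code in `dataset/`; the Rec. 601 luma weights are an assumption, as the repo may use a plain channel average):

```rust
/// Greyscale an 8-bit RGB pixel and scale it into [0, 1].
/// The 0.299/0.587/0.114 weights are the standard Rec. 601 luma
/// coefficients -- an assumption, not taken from this repo.
fn to_grey_unit(r: u8, g: u8, b: u8) -> f64 {
    let grey = 0.299 * r as f64 + 0.587 * g as f64 + 0.114 * b as f64;
    grey / 255.0
}

fn main() {
    println!("{}", to_grey_unit(255, 255, 255)); // white, approx 1.0
    println!("{}", to_grey_unit(0, 0, 0));       // black, 0.0
}
```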
`vis/` contains visualizations of the network's weights: positive weights are green, negative are red. Generated by `netpares.py`.

- The `input###.png` files are the rows of the first hidden layer's weight matrix reinterpreted as images. They show which parts of the input images the network is paying attention to.
- The other `layer###_###x###.png` files are just the full weight matrices themselves.
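The green/red mapping above can be sketched as follows (illustrative Rust, not the actual `netpares.py` logic; scaling each row by its largest magnitude is an assumption):

```rust
/// Map one row of a weight matrix to RGB pixels: positive weights fill the
/// green channel, negative weights the red, scaled by the row's largest
/// magnitude so the strongest weight renders at full intensity.
fn weight_row_to_pixels(row: &[f64]) -> Vec<[u8; 3]> {
    // Floor at a tiny epsilon to avoid dividing by zero on an all-zero row.
    let max_mag = row.iter().fold(0.0_f64, |m, w| m.max(w.abs())).max(1e-12);
    row.iter()
        .map(|&w| {
            let v = ((w.abs() / max_mag) * 255.0).round() as u8;
            if w >= 0.0 { [0, v, 0] } else { [v, 0, 0] }
        })
        .collect()
}

fn main() {
    println!("{:?}", weight_row_to_pixels(&[0.5, -1.0, 0.0]));
}
```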
`chessnn/` holds experiments in training a transformer to generate audio or play chess. I also trained a mini 30M-parameter language model, with more or less the same architecture as LLaMA, in `chessnn/llm`. The PyTorch implementation of the transformer architecture is `chessnn/llm/custom_transform.py`: SwiGLU activation, rotary positional encoding, weight tying, RMS norm.
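For reference, RMS norm (one of the components listed) can be sketched in Rust; the actual implementation is PyTorch code in `custom_transform.py`, and the `gain`/`eps` names here are illustrative:

```rust
/// RMSNorm: rescale x by its root-mean-square, then apply a learned
/// per-element gain. Unlike LayerNorm, it does not subtract the mean.
fn rms_norm(x: &[f64], gain: &[f64], eps: f64) -> Vec<f64> {
    let mean_sq = x.iter().map(|v| v * v).sum::<f64>() / x.len() as f64;
    let rms = (mean_sq + eps).sqrt();
    x.iter().zip(gain).map(|(v, g)| g * v / rms).collect()
}

fn main() {
    println!("{:?}", rms_norm(&[3.0, 4.0], &[1.0, 1.0], 1e-8));
}
```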
Also included: the two research papers I've written, and misc C++ experiments in `jvm/` and `cpp/`.