```
cargo build --release
./target/release/atneural config.txt
```
Simple neural network written in Rust for Dr. Nelson's class. It's a vanilla multi-layer perceptron with a leaky ReLU activation on the hidden layers and a softmax activation on the output layer; there are no biases. The code for the network itself is in `src/`. Training uses the Adam optimizer and cross-entropy loss.
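A minimal sketch of the two activations named above (illustrative only, not the actual functions in `src/`; the 0.01 negative slope is a common leaky-ReLU default, not taken from this repo):

```rust
/// Leaky ReLU: identity for positive inputs, small negative slope otherwise.
fn leaky_relu(x: f64, slope: f64) -> f64 {
    if x > 0.0 { x } else { slope * x }
}

/// Numerically stable softmax: shift by the max logit before exponentiating.
fn softmax(logits: &[f64]) -> Vec<f64> {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|&z| (z - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

fn main() {
    // Toy forward step: leaky ReLU on a hidden pre-activation, softmax on top.
    let hidden = [-2.0, 3.0].map(|x| leaky_relu(x, 0.01));
    let probs = softmax(&hidden);
    println!("{:?} -> {:?}", hidden, probs);
}
```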
`image_rec_sm.net` is the final project: it classifies images of hands/digits into classes 1 through 5. `config.txt` is set up to evaluate this network on the validation set. Images are greyscaled and scaled to the range [0, 1]. The dataset, along with code for processing/viewing the images, is in `dataset/`.
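The preprocessing step described above might look like this (a sketch, not the actual code in `dataset/`; the Rec. 601 luma weights are an assumption, as the repo may use a plain channel average):

```rust
/// Greyscale an 8-bit RGB pixel and scale it into [0, 1].
/// The 0.299/0.587/0.114 weights are the standard Rec. 601 luma
/// coefficients -- an assumption, not taken from this repo.
fn to_grey_unit(r: u8, g: u8, b: u8) -> f64 {
    let grey = 0.299 * r as f64 + 0.587 * g as f64 + 0.114 * b as f64;
    grey / 255.0
}

fn main() {
    println!("{}", to_grey_unit(255, 255, 255)); // white, approx 1.0
    println!("{}", to_grey_unit(0, 0, 0));       // black, 0.0
}
```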
`vis/` contains visualizations of the network's weights: positive weights are green, negative are red. Generated by `netpares.py`.

- The `input###.png` files are the rows of the first hidden layer's weight matrix reinterpreted as images. They show which parts of the input images the network is paying attention to.
- The other `layer###_###x###.png` files are just the full weight matrices themselves.
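The green/red mapping above can be sketched as follows (illustrative Rust, not the actual `netpares.py` logic; scaling each row by its largest magnitude is an assumption):

```rust
/// Map one row of a weight matrix to RGB pixels: positive weights fill the
/// green channel, negative weights the red, scaled by the row's largest
/// magnitude so the strongest weight renders at full intensity.
fn weight_row_to_pixels(row: &[f64]) -> Vec<[u8; 3]> {
    // Floor at a tiny epsilon to avoid dividing by zero on an all-zero row.
    let max_mag = row.iter().fold(0.0_f64, |m, w| m.max(w.abs())).max(1e-12);
    row.iter()
        .map(|&w| {
            let v = ((w.abs() / max_mag) * 255.0).round() as u8;
            if w >= 0.0 { [0, v, 0] } else { [v, 0, 0] }
        })
        .collect()
}

fn main() {
    println!("{:?}", weight_row_to_pixels(&[0.5, -1.0, 0.0]));
}
```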
`chessnn/` holds experiments in training a transformer to generate audio or play chess. I also trained a mini 30M-parameter language model, with more or less the same architecture as LLaMA, in `chessnn/llm`. The PyTorch implementation of the transformer architecture is `chessnn/llm/custom_transform.py`: SwiGLU activation, rotary positional encoding, weight tying, RMS norm.
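For reference, RMS norm (one of the components listed) can be sketched in Rust; the actual implementation is PyTorch code in `custom_transform.py`, and the `gain`/`eps` names here are illustrative:

```rust
/// RMSNorm: rescale x by its root-mean-square, then apply a learned
/// per-element gain. Unlike LayerNorm, it does not subtract the mean.
fn rms_norm(x: &[f64], gain: &[f64], eps: f64) -> Vec<f64> {
    let mean_sq = x.iter().map(|v| v * v).sum::<f64>() / x.len() as f64;
    let rms = (mean_sq + eps).sqrt();
    x.iter().zip(gain).map(|(v, g)| g * v / rms).collect()
}

fn main() {
    println!("{:?}", rms_norm(&[3.0, 4.0], &[1.0, 1.0], 1e-8));
}
```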
Also included: the two research papers I've written, and misc C++ experiments in `jvm/` and `cpp/`.