This project is my educational journey into the world of neural networks and machine learning. Developed from scratch in C, it is a fundamental implementation of a neural network library, providing the essential components required to construct and train neural network models.
- Common layer types: The library supports dense layers, with plans to add more as my understanding expands.
- Activation functions: It includes a variety of activation functions such as ReLU, leaky ReLU, and softmax, with more to come (see the sketch after this list).
- Loss functions: It supports different loss functions such as categorical cross-entropy and mean squared error.
- Optimization algorithms: The library includes various optimizers such as stochastic gradient descent (SGD), AdaGrad, RMSProp, and Adam.
- Backpropagation: Gradients are computed via backpropagation.
- Data pre-processing and normalization: It provides utilities for pre-processing and normalizing input data.
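To make the above concrete, here is a minimal sketch of two of those building blocks in plain C. The function names are illustrative only and are not the library's actual API: a ReLU activation applied element-wise, and a plain SGD weight update.

```c
#include <stddef.h>

/* Illustrative only -- not the library's actual function names. */

/* ReLU activation applied element-wise: a[i] = max(0, z[i]). */
static void relu_forward(const double *z, double *a, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] = z[i] > 0.0 ? z[i] : 0.0;
}

/* Plain SGD update on a flat weight array: w <- w - lr * dL/dw. */
static void sgd_update(double *w, const double *grad, size_t n, double lr) {
    for (size_t i = 0; i < n; i++)
        w[i] -= lr * grad[i];
}
```

The other optimizers listed above replace the simple rule in `sgd_update` with per-parameter bookkeeping (running gradient statistics for AdaGrad, RMSProp, and Adam).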
The primary aim of this library is to serve as a "playground" for understanding the underlying mathematical concepts and algorithms that drive neural networks and deep learning. By building it from the ground up in C, one can learn and customize all aspects of the implementation.
While the library's performance and production-readiness are not the main focus, it serves as an educational tool for hands-on learning. Contributions and improvements are always welcome!
The version on the main branch does not include any performance optimizations; its priority is readability and understandability. If you want a more performant version, or want to see how neural networks can be optimized, take a look at the `optimized-branch` branch.
- It only supports batch processing (see the sketch after this list for what a batched forward pass looks like).
- It doesn't have the latest changes from the main branch; I will be migrating them gradually over time.
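As a rough illustration of what batch processing means here, the sketch below shows a batched dense-layer forward pass in plain C: every row of the input matrix is one sample, so the whole batch is pushed through the same weights in a single call. The names are hypothetical and not the library's actual API.

```c
#include <stddef.h>

/* Illustrative batched forward pass for a dense layer (hypothetical names):
 * input   is batch_size x n_in  (row-major),
 * weights is n_in x n_out       (row-major),
 * output  is batch_size x n_out.
 * Every sample in the batch reuses the same weights and biases. */
static void dense_forward_batch(const double *input, const double *weights,
                                const double *bias, double *output,
                                size_t batch_size, size_t n_in, size_t n_out) {
    for (size_t b = 0; b < batch_size; b++) {
        for (size_t o = 0; o < n_out; o++) {
            double acc = bias[o];
            for (size_t i = 0; i < n_in; i++)
                acc += input[b * n_in + i] * weights[i * n_out + o];
            output[b * n_out + o] = acc;
        }
    }
}
```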
It has the following optimizations:
- Optimized memory access.
- A refactored Matrix struct that allows for easier CUDA integration.
- A thread pool built on POSIX threads, which avoids the overhead of repeated thread creation (a minimal sketch follows this list).
- CUDA support for the operations that were causing the heaviest performance issues.
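The following is a minimal sketch of the thread-pool idea, assuming a linked-list task queue guarded by a mutex and a condition variable; the names are hypothetical and this is not the code on `optimized-branch`. Workers are created once and then pull tasks from the shared queue, so the per-task cost of `pthread_create` is avoided. Shutdown and joining are omitted for brevity.

```c
#include <pthread.h>
#include <stdlib.h>

/* A task is just a function pointer plus its argument. */
typedef struct task {
    void (*fn)(void *);
    void *arg;
    struct task *next;
} task_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    task_t *head, *tail;   /* linked-list task queue */
    pthread_t *workers;
    size_t n_workers;
    int shutdown;
} pool_t;

/* Each worker loops forever, popping and running tasks from the queue. */
static void *worker(void *p) {
    pool_t *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (!pool->head && !pool->shutdown)
            pthread_cond_wait(&pool->not_empty, &pool->lock);
        if (!pool->head && pool->shutdown) {
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        task_t *t = pool->head;
        pool->head = t->next;
        if (!pool->head) pool->tail = NULL;
        pthread_mutex_unlock(&pool->lock);
        t->fn(t->arg);   /* run the task outside the lock */
        free(t);
    }
}

/* Enqueue a task; no new thread is created per task. */
void pool_submit(pool_t *pool, void (*fn)(void *), void *arg) {
    task_t *t = malloc(sizeof *t);
    t->fn = fn; t->arg = arg; t->next = NULL;
    pthread_mutex_lock(&pool->lock);
    if (pool->tail) pool->tail->next = t; else pool->head = t;
    pool->tail = t;
    pthread_cond_signal(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
}

/* Spawn the worker threads once, up front. */
pool_t *pool_create(size_t n_workers) {
    pool_t *pool = calloc(1, sizeof *pool);
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->not_empty, NULL);
    pool->n_workers = n_workers;
    pool->workers = malloc(n_workers * sizeof *pool->workers);
    for (size_t i = 0; i < n_workers; i++)
        pthread_create(&pool->workers[i], NULL, worker, pool);
    return pool;
}
```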
Here is the average time it takes to run the wine-categorization model on different versions:
- Non-optimized sequential version: 39ms.
- Non-optimized batched version: 85ms.
- Optimized (thread pool and/or CUDA) versions take significantly longer, as the overhead introduced by parallelization outweighs the performance gains for such small operations.
Here is the average time it takes to run the MNIST model on different versions:
- Non-optimized sequential version: 70.9417 minutes.
- Optimized batched version: 18.245 minutes.
My recent refactoring efforts have significantly reduced memory leaks from around 11 million bytes to around 10k bytes during the execution of the Wine Recognition data model. However, the remaining leaks seem to stem from the logging library and a few other unknown sources.
For reference, I've included two example models (wine categorization and MNIST) under `src/example_networks`.
Please feel free to delve into these models to gain a better understanding of the project's workings.
I appreciate any feedback that could help improve my skills. Please don't hesitate to share your insights.
- Add Multithreading for matrix operations.
- CUDA support.
- Add an example model for regression.
- Implement Transformers.
Before you begin, ensure you have met the following requirements:
- You have installed the latest version of `gcc`.
- You have a `<Linux/Mac>` machine.
- TODO: Provide link to documentation.
To install, follow these steps:
- Install `libcsv`: `brew install libcsv`
- Install `gnuplot` on your system. The method for this varies depending on your operating system:
  - On Ubuntu, you can use `sudo apt-get install gnuplot`.
  - On macOS, you can use `brew install gnuplot`.
- Clone the repository: `git clone <repository_link>`
- Navigate to the project directory: `cd <repository_directory>`
- Compile the project: `make`