Welcome to the DeepLearningImplementation repository! This repository is dedicated to the implementation of various seminal deep learning architectures for computer vision. Whether you are a researcher, student, or practitioner, you'll find comprehensive implementations, training scripts, and documentation for some of the most influential models in the field.
The DeepLearningImplementation repository is built on a philosophy of simplicity and clarity. The primary goal is to offer implementations that prioritize readability and understandability over optimization and performance. This repository is designed to be a learning resource, helping researchers, students, and practitioners gain a deeper understanding of the inner workings of seminal deep learning architectures.
- Simplicity: Each implementation is crafted to be as straightforward as possible. The aim is to minimize complexity, making it easier for users to follow along and grasp the core concepts without being overwhelmed by intricate optimizations or advanced coding techniques.
- Readability: The code is written with a strong emphasis on readability. Clear variable names, concise comments, and structured organization are prioritized to ensure that anyone reading the code can easily understand the flow and purpose of each component.
- Learning-Oriented: The repository is meant to be a hands-on educational tool. By focusing on the fundamental mechanisms of each architecture, users can learn how these models work at a basic level, facilitating a deeper comprehension that can serve as a foundation for more advanced studies or applications.
- Minimal Dependencies: To keep things simple and focused, the project relies solely on PyTorch, one of the most widely used and accessible deep learning frameworks. This decision eliminates the need for additional external libraries, reducing setup complexity and ensuring that users can dive straight into learning.
Click on the checkmarks to go to project directories.
- AlexNet [2012] - ✅
- ZFNet [2013] - ✅
- GoogLeNet [2014] - ✅
- VGG16 [2015] - ✅
- ResNet [2015] - ✅
- Rethinked Inception [2015] - ✅
- DenseNet [2016] - ✅
- Xception [2016] - ✅
- SqueezeNet [2016] - ✅
- ResNeXt [2016] - ✅
- SENet [2017] - ✅
- MobileNet [2017] - ✅
- ShuffleNet [2017] - [ ]
- Residual Attention Network [2017] - [ ]
- EfficientNet [2019] - [ ]
- RegNet [2020] - [ ]
- ConvNet [2020] - [ ]
- VisionTransformer [2020] - [ ]
- SwinTransformer [2021] - [ ]
- MaxViT [2022] - [ ]
- VisionLSTM [2024] - [ ]
- FCN [2014] - [ ]
- SegNet [2015] - [ ]
- UNet [2015] - [ ]
- PSPNet [2016] - [ ]
- DeepLab [2016] - [ ]
- ENet [2016] - [ ]
- Mask R-CNN [2017] - [ ]
- DeepLabV3 [2017] - [ ]
- ICNet [2018] - [ ]
- HRNet [2019] - [ ]
- OCRNet [2019] - [ ]
- U-Net++ [2019] - [ ]
- SegFormer [2021] - [ ]
- Mask2Former [2022] - [ ]
- RCNN [2014] - [ ]
- Fast-RCNN [2015] - [ ]
- Faster-RCNN [2015] - [ ]
- YOLO [2015] - [ ]
- SSD [2016] - [ ]
- YOLO9000 [2016] - [ ]
- RetinaNet [2017] - [ ]
- YOLOv3 [2018] - [ ]
- YOLOv4 [2020] - [ ]
- GAN [2014] - [ ]
- DCGAN [2015] - [ ]
- InfoGAN [2016] - [ ]
- Pix2Pix [2016] - [ ]
- WGAN [2017] - [ ]
- CycleGAN [2017] - [ ]
- BigGAN [2018] - [ ]
- StyleGAN [2018] - [ ]
- StyleGAN2 [2019] - [ ]
- DDPM [2020] - [ ]
- PixelRNN [2016] - [ ]
- PixelCNN [2016] - [ ]
- PixelSNAIL [2017] - [ ]
- 3D-R2N2 [2016] - [ ]
- 3D-RecGAN [2017] - [ ]
- 3D-GAN [2017] - [ ]
- 3D-RecGAN++ [2018] - [ ]
- AtlasNet [2018] - [ ]
- Occupancy Networks [2018] - [ ]
- DeepSDF [2019] - [ ]
- NeRF [2020] - [ ]
The DeepLearningImplementation repository is structured into distinct phases to ensure a comprehensive and systematic approach to developing and refining deep learning models. Each phase builds upon the previous one, progressively enhancing the quality and utility of the repository.
The first phase is dedicated to implementing the deep learning architectures. The primary focus is on writing clear and understandable code for each model. Alongside each implementation, preliminary documentation explains the model's basic structure and functioning. This phase lays the foundation for further development and ensures that each model is accessible and easy to comprehend.
Current Status: We are currently in phase 1.
In the second phase, the focus shifts to training each implemented model on relevant datasets. This phase involves computing the performance metrics for each model and making comparisons to understand their strengths and weaknesses.
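As an illustration of the kind of comparison this phase involves, here is a minimal sketch of top-k accuracy computed in plain Python. It is not code from this repository: the function names `top_k_accuracy` and `compare_models`, the model names, and the toy scores are all hypothetical. It assumes that per-class scores and integer labels have already been collected into lists (for example by calling `.tolist()` on model outputs).

```python
from typing import Dict, Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int = 1) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


def compare_models(results: Dict[str, Sequence[Sequence[float]]],
                   labels: Sequence[int],
                   ks: Sequence[int] = (1, 2)) -> Dict[str, Dict[int, float]]:
    """Return {model_name: {k: top-k accuracy}} for every model in results."""
    return {
        name: {k: top_k_accuracy(scores, labels, k) for k in ks}
        for name, scores in results.items()
    }


# Hypothetical toy data: 4 samples, 3 classes, 2 models.
labels = [0, 1, 2, 1]
results = {
    "model_a": [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.5, 0.3], [0.5, 0.3, 0.2]],
    "model_b": [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7], [0.6, 0.1, 0.3]],
}

summary = compare_models(results, labels)
for name, accs in summary.items():
    print(name, accs)
```

On this toy data the two models rank differently depending on `k`, which is why reporting more than one metric can help when weighing strengths and weaknesses.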
The third and final phase involves refining the code implementations. This phase also includes enhancing the documentation to provide more detailed explanations, usage instructions, and best practices. The aim is to polish the repository, making it a robust and reliable resource for learning and experimentation.
Each directory contains the implementation of a specific architecture along with training scripts and detailed documentation. To get started with any architecture, navigate to the respective directory and follow the instructions in the README file.
Each architecture has its own set of dependencies listed in the requirements.txt file in its directory. You can install the required packages using:
pip install -r requirements.txt
Contributions are welcome! Please feel free to submit issues or pull requests to help improve the implementations and documentation.
For any questions, please open an issue or contact the repository maintainer.