# Pix2Pix Depth Estimation

A PyTorch implementation of the Pix2Pix architecture for image-to-image translation, tailored for depth estimation from dashboard camera images. The model consists of a U-Net-based generator that learns a mapping from RGB input images to their corresponding depth maps, and a PatchGAN discriminator that enforces local realism by evaluating image patches rather than the entire image. The system is trained with a combination of adversarial loss (to encourage realistic outputs) and L1 loss (to ensure pixel-wise similarity to the ground truth). Input-target image pairs are extracted from composite images where the input RGB image is on the right half and the target depth/thermal image is on the left.
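To make the pairing scheme concrete, here is a minimal sketch of splitting one composite frame into its input and target halves. It assumes the composites are single files readable by OpenCV; the file path is illustrative, not taken from this repo:

```python
import cv2

# Hypothetical composite frame: depth/thermal target on the left half,
# RGB dashboard image on the right half.
composite = cv2.imread("data/composite/frame_0001.jpg")  # illustrative path
h, w, _ = composite.shape

target_depth = composite[:, : w // 2]  # left half: target depth map
input_rgb = composite[:, w // 2 :]     # right half: RGB input image
```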
## Features

- Generator: U-Net architecture with skip connections
- Discriminator: PatchGAN that scores local N×N patches rather than the whole image, enforcing local realism
- Loss: adversarial loss combined with L1 loss for pixel-wise accuracy (see the sketch below)
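A minimal sketch of the combined generator objective described above. It assumes the raw logits of a conditional discriminator are passed in; the weight `LAMBDA_L1 = 100` follows the original Pix2Pix paper and may differ from this repo's setting:

```python
import torch
import torch.nn as nn

adversarial = nn.BCEWithLogitsLoss()
pixelwise = nn.L1Loss()
LAMBDA_L1 = 100  # L1 weight from the Pix2Pix paper; assumed, not read from this repo

def generator_loss(disc_logits_on_fake, generated_depth, real_depth):
    # Adversarial term: reward the generator when D labels its patches "real" (1).
    adv = adversarial(disc_logits_on_fake, torch.ones_like(disc_logits_on_fake))
    # Reconstruction term: L1 distance to the ground-truth depth map.
    return adv + LAMBDA_L1 * pixelwise(generated_depth, real_depth)
```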
## Project Structure

```
Pix2Pix/
├── data/
│   ├── loader.py            # Data loading utilities
│   └── __init__.py
├── model/
│   ├── generator.py         # U-Net generator
│   ├── discriminator.py     # PatchGAN discriminator
│   └── __init__.py
├── train/
│   └── train.py             # Training script
├── inference/
│   └── test.py              # Inference script
├── output/                  # Generated results
└── data.py                  # Data preprocessing
```
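For orientation, a sketch of the kind of PatchGAN that `model/discriminator.py` implements. The layer widths, depth, and six-channel input (RGB input concatenated with a 3-channel depth image) are assumptions, not read from the repo:

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """PatchGAN sketch: maps an (rgb, depth) pair to a grid of
    real/fake logits, one logit per receptive-field patch."""

    def __init__(self, in_channels=6):  # 3 RGB + 3 depth channels (assumed)
        super().__init__()

        def block(cin, cout, norm=True):
            layers = [nn.Conv2d(cin, cout, 4, stride=2, padding=1)]
            if norm:
                layers.append(nn.BatchNorm2d(cout))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.net = nn.Sequential(
            *block(in_channels, 64, norm=False),
            *block(64, 128),
            *block(128, 256),
            nn.Conv2d(256, 512, 4, stride=1, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(512, 1, 4, stride=1, padding=1),  # one logit per patch
        )

    def forward(self, rgb, depth):
        # Condition on the input by concatenating it with the (real or fake) depth map.
        return self.net(torch.cat([rgb, depth], dim=1))
```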
## Requirements

- PyTorch
- NumPy
- OpenCV
- Matplotlib
- imageio
## Usage

Train the model:

```bash
python train/train.py
```

Run inference on test images:

```bash
python inference/test.py
```
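Under the hood, inference amounts to running the trained generator on an RGB frame. A minimal sketch, assuming a `Generator` class in `model/generator.py`, a saved checkpoint, a 256×256 input size, and inputs normalized to [-1, 1]; all of these names and values are hypothetical:

```python
import cv2
import numpy as np
import torch
from model.generator import Generator  # assumed class name

device = "cuda" if torch.cuda.is_available() else "cpu"
gen = Generator().to(device)
gen.load_state_dict(torch.load("output/generator.pth", map_location=device))  # illustrative path
gen.eval()

rgb = cv2.resize(cv2.imread("example.jpg"), (256, 256))  # illustrative path and size
x = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 127.5 - 1.0  # HWC -> NCHW, [-1, 1]

with torch.no_grad():
    depth = gen(x.to(device))

# Map the generated depth back to an 8-bit image and save it.
depth_img = ((depth[0].permute(1, 2, 0).cpu().numpy() + 1.0) * 127.5).astype(np.uint8)
cv2.imwrite("output/depth.png", depth_img)
```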
## Results

The model generates depth maps from dashboard camera images. Here are some example results:

| Input → Output |
|---|
| ![]() |
| ![]() |
| ![]() |
## Dataset

The model is trained on the pix2pix-depth dataset, which contains paired RGB and depth images from dashboard cameras.

Dataset used: https://www.kaggle.com/datasets/greg115/pix2pix-depth
## License

This project is licensed under the Apache License - see the LICENSE file for details.


