/Image2Depth

A PyTorch implementation of the Pix2Pix architecture for image-to-image translation, tailored for depth estimation from dashboard camera images

Primary LanguagePythonApache License 2.0Apache-2.0

🎯 Pix2Pix for Depth Estimation

A PyTorch implementation of the Pix2Pix architecture for image-to-image translation, tailored for depth estimation from dashboard camera images. The model consists of a U-Net-based generator that learns a mapping from RGB input images to their corresponding depth maps, and a PatchGAN discriminator that enforces local realism by evaluating image patches rather than the entire image. The system is trained using a combination of adversarial loss (to encourage realistic outputs) and L1 loss (to ensure pixel-wise similarity to the ground truth). Input-target image pairs are extracted from composite images where the input RGB image is on the right half and the target depth/thermal image is on the left.

🏗️ Architecture

  • Generator: U-Net architecture with skip connections
  • Discriminator: PatchGAN discriminator for realistic image generation
  • Loss: Combined adversarial loss and L1 loss for better pixel-wise accuracy

📁 Project Structure

Pix2Pix/
├── data/
│   ├── loader.py          # Data loading utilities
│   └── __init__.py
├── model/
│   ├── generator.py       # U-Net Generator
│   ├── discriminator.py   # PatchGAN Discriminator
│   └── __init__.py
├── train/
│   └── train.py          # Training script
├── inference/
│   └── test.py           # Inference script
├── output/               # Generated results
└── data.py              # Data preprocessing

📋 Requirements

  • PyTorch
  • NumPy
  • OpenCV
  • Matplotlib
  • imageio

🚀 Usage

Training

python train/train.py

Inference

python inference/test.py

📸 Results

The model generates depth maps from dashboard camera images. Here are some example results:

Sample Outputs

Input → Output
Result 1
Result 2
Result 3

📊 Dataset

The model is trained on the pix2pix-depth dataset containing paired RGB and depth images from dashboard cameras.

Dataset used: https://www.kaggle.com/datasets/greg115/pix2pix-depth

📄 License

This project is licensed under the Apache License - see the LICENSE file for details.