An implementation of EMMA (End-to-End Multimodal Model for Autonomous Driving) built on the Claude API. It follows the EMMA paper but uses Claude, rather than the original Gemini model, for trajectory prediction and scene understanding.
- End-to-end autonomous driving trajectory prediction
- Integration with nuScenes dataset
- Real-time visualization tools for predictions
- Scene understanding and critical object detection
- Comprehensive evaluation metrics
- Command-line interface for different operations
- First, install system dependencies and set up the Python environment:

  ```bash
  # Make the setup script executable
  chmod +x scripts/setup.sh

  # Run the setup script
  ./scripts/setup.sh
  ```

- Copy the environment file and configure your credentials:

  ```bash
  cp .env.example .env
  ```

- Allow direnv:

  ```bash
  direnv allow
  ```
If you prefer to install dependencies manually:
- Install system dependencies (Ubuntu/Debian):

  ```bash
  sudo apt-get update
  sudo apt-get install -y \
      build-essential \
      python3-dev \
      gcc \
      pkg-config \
      libfreetype6-dev \
      libpng-dev \
      python3-matplotlib
  ```

- Create and activate a virtual environment:

  ```bash
  uv venv --python python3.10
  source .venv/bin/activate
  ```

- Install Python dependencies:

  ```bash
  uv pip install -r requirements.txt
  uv pip install -r requirements-dev.txt
  ```
Copy `.env.example` to `.env` and fill in your credentials:

```bash
cp .env.example .env
```
Required environment variables:

- `ANTHROPIC_API_KEY`: Your Claude API key
- `NUIMAGES_DATAROOT`: Path to your nuScenes dataset
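For reference, a filled-in `.env` using the variables above might look like this (both values are placeholders, not real credentials or paths):

```bash
# Example .env (placeholder values only; replace with your own)
ANTHROPIC_API_KEY=sk-ant-your-key-here
NUIMAGES_DATAROOT=/data/sets/nuimages
```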
This project uses `direnv` to automatically manage environment variables and virtual environments.
- Install direnv:

  ```bash
  # On macOS
  brew install direnv

  # On Ubuntu/Debian
  sudo apt-get install direnv

  # On Fedora
  sudo dnf install direnv
  ```
- Add the direnv hook to your shell configuration:

  For bash (`~/.bashrc`):

  ```bash
  eval "$(direnv hook bash)"
  ```

  For zsh (`~/.zshrc`):

  ```bash
  eval "$(direnv hook zsh)"
  ```

  For fish (`~/.config/fish/config.fish`):

  ```fish
  direnv hook fish | source
  ```

- Allow direnv in the project directory:

  ```bash
  direnv allow
  ```
The included `.envrc` will automatically:

- Create and activate a Python virtual environment using `uv`
- Set up the `PYTHONPATH`
- Load environment variables from `.env`
- Configure development paths
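For illustration, a minimal `.envrc` consistent with the list above might look like the sketch below. This is an assumption about its shape, not the file shipped in the repository; it relies on direnv's standard `dotenv` helper and a `uv`-managed `.venv`:

```bash
# Hypothetical .envrc sketch; the repository ships its own version

# Create the uv-managed virtual environment on first use, then activate it
if [ ! -d ".venv" ]; then
    uv venv --python python3.10
fi
source .venv/bin/activate

# Load credentials and dataset paths from .env
dotenv

# Make the project sources importable
export PYTHONPATH="${PWD}:${PYTHONPATH}"
```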
Note: Make sure to copy `.env.example` to `.env` and fill in your credentials:

```bash
cp .env.example .env
```
This project uses the nuScenes image data (nuImages) for autonomous driving predictions. The dataset comes in several versions; for development and testing, we recommend the mini dataset (~4GB).
After downloading, your data directory should look like this:
```
/data/sets/nuimages/
    samples/    - Sensor data for keyframes (annotated images)
    sweeps/     - Sensor data for intermediate frames (unannotated images)
    v1.0-mini/  - JSON tables with metadata and annotations
```
- Create an account at the nuScenes website and accept the Terms of Use.
- Download the following files for the mini set:
  - `v1.0-mini` (metadata and annotations)
  - `samples` (keyframe images)
  - `sweeps` (intermediate frame images)
- Extract the archives to your data directory without overwriting folders that occur in multiple archives.
- Update your `.env` file with the dataset path:

  ```bash
  NUIMAGES_DATAROOT={WORKSPACE_DIR}/data/sets/nuimages
  ```
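Once extracted, you can sanity-check the layout with a few lines of Python. This is a convenience snippet for this README, not part of the repository:

```python
import os
from pathlib import Path

# Falls back to the default location shown above if the variable is unset
dataroot = Path(os.environ.get("NUIMAGES_DATAROOT", "/data/sets/nuimages"))

for sub in ("samples", "sweeps", "v1.0-mini"):
    status = "ok" if (dataroot / sub).is_dir() else "MISSING"
    print(f"{dataroot / sub}: {status}")
```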
Install and test the nuscenes-devkit:
```bash
# Install devkit
uv pip install nuscenes-devkit
```

```python
# Verify setup (in a Python shell)
from nuimages import NuImages

nusc = NuImages(version='v1.0-mini', dataroot='{WORKSPACE_DIR}/data/sets/nuimages', verbose=True, lazy=True)
```
Note: While the full nuScenes dataset includes lidar, radar, and map data, this project focuses only on the image data for Claude-based predictions.
- Process a single sample:

  ```bash
  python -m src.scripts.cli predict sample_token_123 --output-dir outputs
  ```

- Run evaluation:

  ```bash
  python -m src.scripts.cli evaluate --num-samples 100 --output-dir eval_results
  ```

- Visualize predictions:

  ```bash
  python -m src.scripts.cli visualize sample_token_123 predictions/sample_123.json
  ```
```python
from src.model.emma import ClaudeEMMA
from src.data.nuimages_loader import NuImagesLoader
from src.visualization.visualizer import EMMAVisualizer

# Initialize components
emma = ClaudeEMMA(api_key="your-api-key")
nuim_loader = NuImagesLoader(dataroot="path/to/nuimages")
visualizer = EMMAVisualizer()

# Process a sample (any nuImages sample token works here)
sample_token = "sample_token_123"
sample_data = nuim_loader.get_sample_data(sample_token)
prediction = emma.predict_trajectory(
    camera_images=sample_data.images,
    ego_history=sample_data.ego_history,
    command=sample_data.command
)

# Visualize results
visualizer.visualize_prediction(
    front_image=sample_data.images['CAM_FRONT'],
    prediction=prediction,
    ground_truth=sample_data.ground_truth,
    save_path="prediction.png"
)
```
Run tests:

```bash
pytest tests/
```

Format, lint, and type-check the code:

```bash
black src/ tests/
ruff check src/ tests/
mypy src/
```
The implementation includes several metrics for evaluating prediction quality:
- Average Displacement Error (ADE)
- Final Displacement Error (FDE)
- Trajectory Smoothness
- Scene Understanding Accuracy
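For reference, ADE is the mean L2 distance between predicted and ground-truth waypoints over the prediction horizon, and FDE is the L2 distance at the final waypoint. A minimal NumPy sketch is shown below; it is not the repository's implementation, and the function name and the `(T, 2)` waypoint layout are assumptions:

```python
import numpy as np

def displacement_errors(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """Return (ADE, FDE) for trajectories of shape (T, 2) holding (x, y) waypoints."""
    dists = np.linalg.norm(pred - gt, axis=-1)  # L2 distance at each waypoint
    return float(dists.mean()), float(dists[-1])

# Example: a prediction that drifts slightly from the ground truth
gt = np.array([[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]])
pred = np.array([[0.1, 1.0], [0.2, 2.1], [0.3, 3.1]])
ade, fde = displacement_errors(pred, gt)
print(f"ADE = {ade:.3f} m, FDE = {fde:.3f} m")
```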
The visualization module provides:
- Camera View:
  - Object detections with distance annotations
  - Critical object highlighting
- Bird's Eye View:
  - Predicted trajectory
  - Ground truth trajectory (when available)
  - Critical objects with velocity vectors
  - Ego vehicle position and orientation
- Text Description:
  - Scene analysis
  - Critical object list
  - Reasoning explanation
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
MIT License - see LICENSE file for details.
If you use this implementation in your research, please cite both the original EMMA paper and this implementation:
```bibtex
@article{hwang2024emma,
  title={EMMA: End-to-End Multimodal Model for Autonomous Driving},
  author={Hwang, Jyh-Jing and others},
  journal={arXiv preprint arXiv:2410.23262},
  year={2024}
}
```