Author
- Atharv Kumar (atharvkumar43@gmail.com)
Faculty Guide:
Prof. Arnav Bhaskar
This project explores decoding EEG (electroencephalogram) signals to reconstruct the visual stimuli experienced by the brain, using text and image generation models. It leverages deep learning and multi-modal alignment to generate high-fidelity image reconstructions from brain signals.
This work opens new pathways in brain-computer interfaces, neuroscience, and thought-driven AI systems.
- EEG-based Textual Encoding: Extract meaningful embeddings from EEG data.
- Image Reconstruction: Use captions (via BLIP-2) and images (via Stable Diffusion) to reconstruct what the subject saw.
- Direct Thought-to-Image: Create an end-to-end pipeline from EEG → Text → Image.
- EEG Signals: 16,740 EEG samples (17 channels, 100 timepoints each).
- Images: 16,740 images, each shown to 10 subjects.
- Labels: For supervised and aligned training.
- A VAE trained on the DEAP dataset extracts EEG embeddings.
- Ensures compact, meaningful signal representation.
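
As a rough illustration of how a VAE yields a compact embedding, the sketch below shows the reparameterization step and the loss (reconstruction error plus KL divergence) in plain Python. It is not the project's training code: the function names, the squared-error reconstruction term, and the `beta` weight are assumptions made for illustration.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Squared-error reconstruction term plus a beta-weighted KL term.

    beta is a hypothetical knob; beta=1.0 gives the standard VAE objective.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * kl_to_standard_normal(mu, log_var)

# Toy encoder output for a 2-dimensional latent space (values are made up).
rng = random.Random(0)
mu, log_var = [0.0, 0.5], [0.0, -1.0]
z = reparameterize(mu, log_var, rng)  # sampled latent code, i.e. the embedding
```

In a setup like this, the encoder mean `mu` (or a sample `z`) would typically serve as the EEG embedding passed to later stages.
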
- BLIP-2 generates captions from the original image.
🧾 "A small armadillo walking on the dirt"
- Align EEG and text embeddings via CLIP.
- Trained to bring both into a common latent space.
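
The alignment objective can be sketched as a CLIP-style symmetric contrastive loss: matched EEG/text pairs sit on the diagonal of a cosine similarity matrix, and cross-entropy pushes the diagonal up in both directions. This is a minimal plain-Python sketch, not the project's code; the function names and the temperature of 0.07 are assumptions.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine_similarity_matrix(eeg_embs, text_embs):
    """Entry [i][j] is the cosine similarity of EEG i and caption j."""
    e = [l2_normalize(v) for v in eeg_embs]
    t = [l2_normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(ei, tj)) for tj in t] for ei in e]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_loss(eeg_embs, text_embs, temperature=0.07):
    """Average of the EEG-to-text and text-to-EEG cross-entropy terms."""
    sim = cosine_similarity_matrix(eeg_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # transpose
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(logits_t))

# Toy batch: two EEG embeddings and their two caption embeddings.
aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

With these toy inputs the aligned batch gives a loss near zero, while the shuffled batch gives a much larger one.
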
- GPT-2 decodes EEG → Text via autoregressive generation.
🧠 ➡️ GPT-2 ➡️ "A baby armadillo in its enclosure at the zoo"
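
Autoregressive generation means producing one token at a time, each conditioned on everything generated so far. The sketch below uses greedy decoding with a hypothetical bigram table standing in for GPT-2. The `<eeg>` prefix token, the score table, and the choice of greedy decoding over sampling or beam search are all assumptions, not details from the project.

```python
def greedy_decode(next_token_scores, prefix, eos_token, max_new_tokens=20):
    """Append the highest-scoring next token until EOS or the length cap."""
    tokens = list(prefix)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)
        if best == eos_token:
            break
        tokens.append(best)
    return tokens

# Hypothetical stand-in for a language model: a fixed bigram score table.
BIGRAMS = {
    "<eeg>": {"a": 0.9, "the": 0.1},
    "a": {"baby": 0.6, "small": 0.4},
    "baby": {"armadillo": 0.95, "<eos>": 0.05},
    "armadillo": {"<eos>": 0.8, "a": 0.2},
}

def toy_next_token_scores(tokens):
    """Score candidates from the last token only; unknown tokens end the text."""
    return BIGRAMS.get(tokens[-1], {"<eos>": 1.0})

caption = greedy_decode(toy_next_token_scores, ["<eeg>"], "<eos>")
print(" ".join(caption[1:]))  # drops the "<eeg>" placeholder; prints "a baby armadillo"
```

In a real pipeline, the placeholder prefix would be replaced by the aligned EEG embedding, and the score function by the language model's next-token distribution.
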
- Graph CNN captures spatial relations in EEG for image depth features.
- Prompt + Depth Map → Stable Diffusion (v2.1 base) to synthesize visual output.
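
To show how a graph convolution captures spatial relations between electrodes, here is a minimal single-layer sketch in plain Python. It assumes the common normalized-adjacency formulation, ReLU(D^-1/2 (A + I) D^-1/2 H W). The project's actual Graph CNN architecture, electrode graph, and weights are not specified in this README, so the toy graph and all names below are hypothetical.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, with self-loops added to the graph."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in out]

# Toy graph: 3 electrodes in a chain (0-1-2), 2 features per electrode.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the output is pure neighbour mixing
hidden = gcn_layer(adj, features, weights)
```

Each output row mixes an electrode's own features with those of its neighbours, which is how spatial structure enters the representation.
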
| EEG Caption (GPT-2) | BLIP Caption | ROUGE Score |
|---|---|---|
| "a man holding an accordion..." | "a person playing an accordion..." | 0.44 |
| "a floral air mattress..." | "an air mattress with a floral pattern..." | 0.52 |
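
For reference, the ROUGE-1 metric in the table's last column measures unigram overlap between two captions. The sketch below is a simplified version using whitespace tokenization and no stemming, so it may not reproduce the exact values of a full ROUGE package. The README also does not state whether the table reports precision, recall, or F1. The function name is hypothetical.

```python
from collections import Counter

def rouge1(candidate, reference):
    """Unigram-overlap precision, recall, and F1 with clipped counts."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    precision = overlap / len(cand) if cand else 0.0
    recall = overlap / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# The two armadillo captions quoted earlier in this README.
scores = rouge1("a baby armadillo in its enclosure at the zoo",
                "a small armadillo walking on the dirt")
# Shared unigrams: "a", "armadillo", "the" -> 3 of 9 candidate, 3 of 7 reference words.
```
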
- CLIP Loss: Dropped from 3.48 to 0.12 over 30 epochs.
- Cosine Similarity Matrix: Strong diagonals (high EEG-text alignment).
- ROUGE Scores: ROUGE-1 between 0.44 and 0.52.
- SSIM: Structural similarity to the original images remains low (~10–15%), but the reconstructions are semantically accurate.
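
SSIM compares luminance, contrast, and structure rather than meaning, which helps explain why a semantically correct reconstruction can still score low. The sketch below computes a simplified, single-window ("global") SSIM over flattened grayscale pixels. Standard implementations, such as scikit-image's `structural_similarity`, average SSIM over local sliding windows instead, so this is an illustration of the formula only, and the function name is hypothetical.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equal-length lists of pixel values."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and of equal length")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Toy 2x2 "images", flattened row by row.
original = [0.0, 255.0, 0.0, 255.0]
inverted = [255.0, 0.0, 255.0, 0.0]
same = global_ssim(original, original)      # identical images score 1
opposite = global_ssim(original, inverted)  # inverted structure scores below 0
```
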
- BLIP: Bootstrapping Language-Image Pre-training
- CLIP: Contrastive Language-Image Pre-training
- Stable Diffusion
- GPT-2 by OpenAI
- GCNNs for EEG
Special thanks to our guide, Prof. Arnav Bhaskar, for their constant support and insights.






