Portfolio

Reinforcement learning projects
Natural Language Processing projects
Deep learning projects: Unsupervised learning
Deep learning projects: Supervised learning

Reinforcement learning projects

Playing Space Invaders with an actor-critic PPO algorithm

I have trained an actor-critic agent with the Proximal Policy Optimization (PPO) algorithm to play the Atari 2600 game Space Invaders using the reinforcement learning library TF-Agents. The agent was trained in the OpenAI Gym environment, taking as input the RAM of the Atari machine, which consists of (only!) 128 bytes. In this environment, what the agent "sees" is not the rendered image showing the spaceships, projectiles, and shields, but just a sequence of 128 integers: the RAM contents that encode the game state. The agent learns to consistently dodge projectiles and is able to complete the first level of the game.
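
As a rough illustration of the objective behind PPO, the sketch below computes the clipped surrogate loss with NumPy and scales a 128-byte RAM observation to the range [0, 1]. It is a minimal sketch, not the TF-Agents implementation used in this project: the function names, the clipping value of 0.2, and the RAM scaling are assumptions made for the example.

```python
import numpy as np


def normalize_ram(ram):
    """Scale a 128-byte RAM observation to floats in [0, 1] (assumed preprocessing)."""
    ram = np.asarray(ram, dtype=np.uint8)
    return ram.astype(np.float32) / 255.0


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate loss of PPO (to be minimized).

    new_log_probs / old_log_probs: log-probabilities of the taken actions under
    the current policy and under the policy that collected the data.
    advantages: advantage estimates, e.g. derived from the critic's values.
    """
    new_log_probs = np.asarray(new_log_probs, dtype=np.float64)
    old_log_probs = np.asarray(old_log_probs, dtype=np.float64)
    advantages = np.asarray(advantages, dtype=np.float64)

    # Probability ratio between the new and the old policy.
    ratio = np.exp(new_log_probs - old_log_probs)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Taking the minimum removes the incentive to move the ratio outside the clip range.
    return -np.mean(np.minimum(unclipped, clipped))
```

The clipping keeps each policy update close to the policy that generated the data, which is the main reason PPO tends to train stably.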

An episode played by the trained agent

Playing Ms. Pac-Man with a categorical DQN

I have trained a Categorical Deep Q-Network to play the Atari 2600 game Ms. Pac-Man using the reinforcement learning library TF-Agents.
The agent was trained in the OpenAI Gym environment, taking as input the RAM of the Atari machine, which consists of (only!) 128 bytes. In this environment, what the agent "sees" is not the rendered image showing the maze, dots, and ghosts, but just a sequence of 128 integers: the RAM contents that encode the game state. The agent learns to consistently navigate the maze and to chase the ghosts after eating a power pellet.
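
A categorical DQN predicts, for each action, a probability distribution over a fixed set of return values ("atoms") instead of a single Q-value. The NumPy sketch below shows the two core computations for a single transition: reading a Q-value off the distribution, and projecting the Bellman-shifted distribution back onto the fixed atoms. The function names and the discount factor are assumptions made for the example, and TF-Agents handles this step internally.

```python
import numpy as np


def expected_q(probs, support):
    """Q-value as the mean of the predicted return distribution."""
    return np.asarray(probs) @ np.asarray(support)


def project_target(next_probs, reward, done, support, gamma=0.99):
    """Project the distribution of reward + gamma * Z onto the fixed atoms."""
    support = np.asarray(support, dtype=np.float64)
    next_probs = np.asarray(next_probs, dtype=np.float64)
    n_atoms = support.size
    v_min, v_max = support[0], support[-1]
    delta_z = (v_max - v_min) / (n_atoms - 1)

    # Shift and shrink every atom, then clip to the supported range.
    shifted = np.clip(reward + (1.0 - done) * gamma * support, v_min, v_max)
    # Fractional index of each shifted atom on the original grid.
    pos = np.clip((shifted - v_min) / delta_z, 0, n_atoms - 1)
    lower = np.floor(pos).astype(int)
    upper = np.ceil(pos).astype(int)

    target = np.zeros(n_atoms)
    for j in range(n_atoms):
        if lower[j] == upper[j]:
            # The shifted atom lands exactly on a grid point.
            target[lower[j]] += next_probs[j]
        else:
            # Split the mass between the two neighbouring atoms.
            target[lower[j]] += next_probs[j] * (upper[j] - pos[j])
            target[upper[j]] += next_probs[j] * (pos[j] - lower[j])
    return target
```

For example, with `support = np.linspace(-10.0, 10.0, 51)` (an illustrative choice of range and atom count), the projected target stays a valid probability distribution and is then used as the label in a cross-entropy loss against the network's prediction.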

An episode played by the trained agent

Natural Language Processing projects

News category classification fine-tuning

I have fine-tuned a pre-trained RoBERTa model using the Transformers library from Hugging Face on a Google Colab TPU to predict the category of a news item from its headline and a short description. The model has been fine-tuned on the News Category Dataset, which contains 200k news headlines from 2012 to 2018 taken from HuffPost.
Based on this project, I have written a tutorial on Towards Data Science.
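
To illustrate how the two text fields can be prepared for the model, the sketch below builds (headline, short description, label) examples in plain Python. The field names `headline`, `short_description`, and `category` are assumed from the dataset's JSON format, the helper names are hypothetical, and the two records are made up for the example. The text pairs would then be passed to a tokenizer as two segments.

```python
def build_label_map(records):
    """Map each category name to an integer id, in alphabetical order."""
    categories = sorted({r["category"] for r in records})
    return {name: idx for idx, name in enumerate(categories)}


def to_examples(records, label_map):
    """Turn raw records into (headline, description, label_id) triples."""
    examples = []
    for r in records:
        examples.append(
            (
                r["headline"].strip(),
                r["short_description"].strip(),
                label_map[r["category"]],
            )
        )
    return examples


# Two made-up records, only to show the expected structure.
toy_records = [
    {"headline": "Team wins final ", "short_description": "A late goal.", "category": "SPORTS"},
    {"headline": "Vote scheduled", "short_description": " Debate continues.", "category": "POLITICS"},
]
label_map = build_label_map(toy_records)
examples = to_examples(toy_records, label_map)
```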

Deep learning projects: Unsupervised learning

Hierarchical Vector Quantized Variational Autoencoder for image generation (VQ-VAE)

I have implemented a custom architecture of a hierarchical vector quantized variational autoencoder (VQ-VAE) following the concept introduced in the paper Generating Diverse High-Fidelity Images with VQ-VAE-2, together with custom implementations of the PixelCNN priors introduced in the paper Conditional Image Generation with PixelCNN Decoders. The architectures of the models were customized to retain good performance on high-resolution (512x512) images while remaining light enough to train on free Kaggle/Colab TPUs and GPUs. The model has been trained on the image data of the Kaggle competition Humpback Whale Identification, as this dataset offers a reasonable number of high-resolution images.
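
The quantization step at the heart of a VQ-VAE can be sketched in a few lines of NumPy: each encoder output vector is replaced by its nearest entry in a learned codebook. This is a minimal sketch under assumed shapes, not the code of this project; the function name is hypothetical, and in a hierarchical model the same lookup is applied separately at each level.

```python
import numpy as np


def quantize(latents, codebook):
    """Replace each latent vector with its nearest codebook vector.

    latents:  array of shape (N, D), the flattened encoder outputs.
    codebook: array of shape (K, D), the learned embedding vectors.
    Returns the chosen indices (N,) and the quantized vectors (N, D).
    """
    latents = np.asarray(latents, dtype=np.float64)
    codebook = np.asarray(codebook, dtype=np.float64)
    # Squared Euclidean distance between every latent and every code, shape (N, K).
    distances = (
        (latents ** 2).sum(axis=1, keepdims=True)
        - 2.0 * latents @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    indices = distances.argmin(axis=1)
    return indices, codebook[indices]
```

The integer indices returned here form the discrete grid that the PixelCNN priors are trained to model; sampling new indices from the priors and decoding them yields new images.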

An image generated by the model

Deep learning projects: Supervised learning

Vehicle motion prediction with 3D CNN

I have trained a 3D convolutional neural network to predict the future trajectories of vehicles as an entry for the Lyft Motion Prediction for Autonomous Vehicles competition on Kaggle. The model takes as input a few frames of bird's-eye view images containing the visual representation of all the vehicles in the scene and, through intermediate layers that include 3D convolutions (1 temporal + 2 spatial dimensions), predicts 3 possible future trajectories for the target vehicle and associates a probability with each of them. The model easily outperforms the baseline benchmark set by the competition after being trained on approximately 10% of the training set.
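
The multi-trajectory output described above can be sketched as follows: the flat output of the network's last layer is split into 3 candidate trajectories and 3 logits, and a softmax turns the logits into probabilities. This is a NumPy sketch under assumptions, not the project's code: the function name is hypothetical, and the horizon of 50 future (x, y) positions and the layout of the output vector are illustrative choices.

```python
import numpy as np


def split_outputs(raw, num_modes=3, horizon=50):
    """Split raw network outputs into trajectories and mode probabilities.

    raw: array of shape (batch, num_modes * horizon * 2 + num_modes).
    Returns trajectories of shape (batch, num_modes, horizon, 2)
    and probabilities of shape (batch, num_modes) that sum to 1 per row.
    """
    raw = np.asarray(raw, dtype=np.float64)
    n_coords = num_modes * horizon * 2
    trajectories = raw[:, :n_coords].reshape(-1, num_modes, horizon, 2)

    logits = raw[:, n_coords:]
    # Subtract the row maximum for a numerically stable softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    probabilities = exp / exp.sum(axis=1, keepdims=True)
    return trajectories, probabilities
```

Predicting several weighted trajectories lets the model express ambiguity, for example a vehicle that may either turn or continue straight at an intersection.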

Image segmentation to identify glomeruli in kidney tissue

I trained a U-Net-like architecture to predict segmentation masks that identify glomeruli, inspired by the task of the HuBMAP - Hacking the Kidney competition on Kaggle. The model adds an attention mechanism to the U-Net through the use of the Convolutional Block Attention Module. The model was trained using Kaggle's free TPU quota.
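
The Convolutional Block Attention Module applies channel attention followed by spatial attention. The NumPy sketch below shows only the channel part for a single feature map, to give an idea of the mechanism; it is not the project's code. The function name, the channels-last layout, and the weight shapes (a shared two-layer MLP with a reduced hidden size) are assumptions made for the example.

```python
import numpy as np


def channel_attention(features, w1, w2):
    """Rescale each channel of a feature map by a learned attention weight.

    features: array of shape (H, W, C).
    w1: array of shape (C, C // r), first layer of the shared MLP.
    w2: array of shape (C // r, C), second layer of the shared MLP.
    """
    features = np.asarray(features, dtype=np.float64)
    # Summarize every channel with its spatial average and spatial maximum.
    avg_pool = features.mean(axis=(0, 1))
    max_pool = features.max(axis=(0, 1))

    def shared_mlp(v):
        return np.maximum(v @ w1, 0.0) @ w2  # ReLU hidden layer

    # The same MLP processes both summaries; a sigmoid gives weights in (0, 1).
    scale = 1.0 / (1.0 + np.exp(-(shared_mlp(avg_pool) + shared_mlp(max_pool))))
    return features * scale
```

In a U-Net, a block like this would typically sit after a convolutional stage, so that informative channels are emphasized before the features are passed on.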

GabrieleSgroi/GabrieleSgroi.github.io