FIRE Capital One Machine Learning at the University of Maryland
FIRE Capital One Machine Learning is an undergraduate research program under the First-Year Innovation & Research Experience initiative at UMD.
College Park, Maryland
Pinned Repositories
2020-Image-Super-Resolution
Super-resolution is the process of recovering a high-resolution (HR) image or video from its low-resolution (LR) counterpart. Our model uses a series of convolutional layers to extract, or learn, information from the LR image, then combines that information to create the super-resolved (SR) image. In technical terms, this is a seven-layer Efficient Sub-Pixel Convolutional Neural Network that takes an LR image as input, extracts LR feature maps through a series of convolutional layers, then applies a sub-pixel convolution layer to assemble the LR feature maps into an HR image output. This project is written in TensorFlow and is based on the paper "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network".
2020-Medical-Object-Detection
This repository contains a neural network that detects melanoma, a serious type of skin cancer. Melanoma is easily confused with benign skin discoloration, so being able to tell the two apart is extremely important.
2020-Object-Tracking
Object tracking is a field within computer vision that aims to track objects as they move across a series of video frames. The goal is to train a model that estimates the location of the target object in the current scene from previous frames. We did this by taking a starting bounding box for each initial detection, assigning it a unique ID, and tracking each object as it moves while preserving its ID. The model we used is SiamFC with an AlexNet backbone. It takes in an instance image and a search image, uses the backbone to process each image into an embedding, then uses cross-correlation to find the search image in the instance image. This project uses the LaSOT dataset (http://vision.cs.stonybrook.edu/~lasot/).
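The cross-correlation step can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names and the tiny hand-written "embeddings" are hypothetical, and real embeddings would come from the AlexNet backbone and have many channels.

```python
def cross_correlate(feature_map, kernel):
    """Slide `kernel` over `feature_map` and return the grid of dot-product scores."""
    fh, fw = len(feature_map), len(feature_map[0])
    kh, kw = len(kernel), len(kernel[0])
    scores = []
    for top in range(fh - kh + 1):
        row = []
        for left in range(fw - kw + 1):
            row.append(sum(
                feature_map[top + i][left + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            ))
        scores.append(row)
    return scores


def best_match(scores):
    """Return the (row, col) offset with the highest correlation score."""
    positions = ((r, c) for r in range(len(scores)) for c in range(len(scores[0])))
    return max(positions, key=lambda rc: scores[rc[0]][rc[1]])


# Hypothetical single-channel embeddings: the small pattern sits at offset (2, 2).
instance_features = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, -1, 0],
    [0, 0, -1, 1, 0],
    [0, 0, 0, 0, 0],
]
target_features = [[1, -1], [-1, 1]]

response = cross_correlate(instance_features, target_features)
print(best_match(response))  # (2, 2)
```

The peak of the response map marks where the smaller embedding best matches the larger one, which is how the tracker would localize the object in each new frame.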
2020-Text-Generation
The aim of this project is to create a machine learning model that generates an output sentence from an input sentence. We trained a Recurrent Neural Network (RNN) on hundreds of news articles about New York v. Strauss-Kahn, a case relating to allegations of sexual assault against the former IMF director Dominique Strauss-Kahn. After training, a user can give the model an input sentence and it will automatically generate a sentence as output.
2021-Malware-Detection-Classification
Malware detection is an important process in modern computing that helps protect systems from infection. The goal of any project, program, or system that aims to detect malware is to prevent malicious software from running on a user’s computer. With our project, we aim to assist in the battle against malicious software by creating a model that can detect and label different types of programs as either malware or benign software. For this project, we used a Deep Neural Network (DNN) model. The architecture of our model consists of a dense layer with ReLU activation, a batch normalization layer, and finally a dropout layer. As shown in the architecture diagram, we use 10 of these layers. This project takes inspiration from the paper “Malware Analysis with Artificial Intelligence and a Particular Attention on Results Interpretability” by Benjamin Marais, Tony Quertier, and Christophe Chesneau.
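To make the dense, batch normalization, and dropout sequence concrete, here is a minimal pure-Python sketch of one such group of layers in training mode. It is an illustration under assumptions, not the project's model: the layer sizes, weights, and dropout rate are made up, and the batch normalization omits the learned scale and shift that a framework layer would normally include.

```python
import math
import random


def dense_relu(batch, weights, biases):
    """Dense layer followed by ReLU: max(0, x @ W + b) for every sample."""
    out = []
    for x in batch:
        row = []
        for j in range(len(biases)):
            z = sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
            row.append(max(0.0, z))
        out.append(row)
    return out


def batch_norm(batch, eps=1e-5):
    """Normalize each feature to zero mean and unit variance across the batch."""
    n, width = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(width)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(width)]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(width)]
        for row in batch
    ]


def dropout(batch, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in batch]


def dense_block(batch, weights, biases, rate, rng):
    """Dense + ReLU, then batch normalization, then dropout."""
    return dropout(batch_norm(dense_relu(batch, weights, biases)), rate, rng)


rng = random.Random(0)  # seeded so the sketch is repeatable
batch = [[0.5, 1.0, -0.2], [1.5, -0.3, 0.8], [0.1, 0.4, 0.9], [-0.7, 2.0, 0.3]]
weights = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.6]]  # hypothetical 3 -> 2 layer
biases = [0.1, 0.0]
hidden = dense_block(batch, weights, biases, rate=0.5, rng=rng)
print(len(hidden), len(hidden[0]))  # 4 2
```

A full model along these lines would chain several such groups and finish with an output layer that scores malware versus benign software.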
2021-Monocular-Depth-Estimation
When an RGB image is given to the model, it produces a depth map that displays the predicted depth of each pixel. This is similar to a person's ability to perceive perspective, distinguishing what is far away from what is nearby. The depth map conveys this through the darkness of each pixel: something closer is generally lighter and something farther away is generally darker. For the model architecture, we chose U-Net. U-Net was first proposed for biomedical image segmentation, where convolutional networks are used to localize abnormalities in medical images. As Manikandan [2021] explained, it is capable of pixel-level localization and can distinguish unique patterns. It also has a 'U'-shaped structure, with the first half being an encoder and the last half being a decoder. Purkayastha [2020] also described that "[t]his architecture consists of three sections: The contraction, The bottleneck, and the expansion section".
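The "closer is lighter, farther is darker" convention can be illustrated with a small sketch that turns predicted depths into grey levels. The function name and the sample depths are hypothetical, and the linear scaling is only one possible choice.

```python
def depth_to_brightness(depth_map):
    """Map depths to 0-255 grey levels so the nearest pixel is the lightest."""
    flat = [d for row in depth_map for d in row]
    near, far = min(flat), max(flat)
    span = (far - near) or 1.0  # avoid dividing by zero for a perfectly flat scene
    return [[round(255 * (far - d) / span) for d in row] for row in depth_map]


# Hypothetical predicted depths in metres for a 2x2 image.
depth = [[1.0, 2.0], [3.0, 5.0]]
image = depth_to_brightness(depth)
print(image[0][0], image[1][1])  # 255 0
```

The nearest pixel (1.0 m) becomes white and the farthest (5.0 m) becomes black, with everything else falling in between.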
2022-t1-convolutional
We have created a song recommendation system based on user history. Our product takes in a user’s playlist(s) and recommends songs based on them. The product uses the Spotify API to extract the features of a song (11 features in total); these range from danceability to tempo to instrumentalness. Using an aggregation function, the feature
2022-t2-transformer
A variety of machine learning projects based on the Transformer model, including: Song Recognition, MIDI Song Extender, Poem Generation, and Sentiment Analysis
MultiSeg
survey-orientation-representations
FIRE Capital One Machine Learning at the University of Maryland's Repositories
umd-fire-coml/survey-orientation-representations
umd-fire-coml/2021-Malware-Detection-Classification
umd-fire-coml/2022-t2-transformer
umd-fire-coml/2021-Monocular-Depth-Estimation
umd-fire-coml/2022-t1-convolutional
umd-fire-coml/2022-t4-generative-adversarial
Audio generation from a genre given as natural language input. The user enters a genre tag in the frontend. This tag is passed to the semantic similarity NLP model, which determines the nearest tag within the training space and implicitly coerces to (outputs) the found tag. This tag is passed to the audio generation model as input, which produces generated
umd-fire-coml/2020-3D-Object-Detection
The purpose of this research is to perform 3D object detection on a photo using machine learning. The goal is a working model that can detect multiple 3D objects in a photo and provide their dimensions and orientation. There are many ways to implement this, but this project uses center-based 3D object detection and tracking, where the model predicts the physical, three-dimensional center of each detected object and then regresses the object's dimensions and orientation from it.
umd-fire-coml/2020-Object-Detection-In-Aerial-Images
This repo detects rotated and cluttered objects in aerial images. This can be used to detect things like traffic on satellite maps, or to support disaster relief. The model itself is a convolutional neural network using several groups of convolutional/deconvolutional and max-pooling layers. We use rotation augmentation to further account for the various rotations in which objects may be found.
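As a rough illustration of rotation augmentation, the sketch below produces the four right-angle rotations of a tiny image. It is an assumption that quarter-turn rotations are enough for illustration; the repository may rotate by arbitrary angles and would also need to rotate the object annotations, which is not shown here.

```python
def rotate90(image):
    """Rotate a 2D grid of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_augment(image):
    """Return the image rotated by 0, 90, 180, and 270 degrees."""
    variants = [image]
    for _ in range(3):
        variants.append(rotate90(variants[-1]))
    return variants


tile = [[1, 2],
        [3, 4]]
for variant in rotation_augment(tile):
    print(variant)
```

Training on all four variants exposes the network to the same object in several orientations, which is the point of the augmentation.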
umd-fire-coml/2021-3D-Object-Detection
This product identifies and labels 3D objects, such as cars, trees, bikes, and pedestrians, in images of everyday settings. It makes use of a U-Net, which is a Convolutional Neural Network, to identify objects from voxel data. Our product first takes point cloud data from the SemanticKITTI dataset and converts it to voxels. For the sake of simplicity, a voxel can be described as a 3D pixel. We visualize these voxels as cubes, each cube containing spatial information in 3 dimensions.
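Converting a point cloud to voxels can be sketched as snapping each point to the grid cell that contains it. This is a minimal illustration: the points and the 0.5-unit voxel size are made up rather than taken from SemanticKITTI, and the project's actual conversion may store more than occupancy.

```python
import math


def voxelize(points, voxel_size):
    """Return the set of integer (i, j, k) voxel indices that contain at least one point."""
    return {
        tuple(math.floor(coord / voxel_size) for coord in point)
        for point in points
    }


# Hypothetical (x, y, z) points; the first two fall into the same voxel.
points = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (1.2, 0.1, -0.3)]
occupied = voxelize(points, voxel_size=0.5)
print(sorted(occupied))  # [(0, 0, 0), (2, 0, -1)]
```

Each index names one cube of the grid, so nearby points collapse into a single voxel and the result can be fed to a network as a regular 3D array.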
umd-fire-coml/2021-Image-Colorization
This image colorization model takes an input image, converts it to grayscale, then creates a realistic colorization of the image using the trained model. The model architecture uses the YCbCr color space to colorize the image: because Y is equivalent to grayscale, the model only has to predict the Cb and Cr channels. The model uses a series of convolutional layers to transform the 256x256 input into the 256x256x2 output. The model was trained using the Places365 dataset as ground-truth images for the colorization, and a grayscale version of the Places365 dataset as training images.
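The reason only two channels need predicting can be seen from the RGB to YCbCr conversion itself. The sketch below uses the full-range BT.601 coefficients common in JPEG; it is an assumption that the project uses this exact variant, and the helper names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601 coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def training_pair(rgb_pixel):
    """Split a pixel into the model input (Y) and the target it must predict (Cb, Cr)."""
    y, cb, cr = rgb_to_ycbcr(*rgb_pixel)
    return y, (cb, cr)


luma, chroma = training_pair((200, 120, 40))  # an orange-ish pixel
print(round(luma, 1), [round(c, 1) for c in chroma])
```

For a neutral gray pixel the chroma channels sit at 128, so all color information lives in Cb and Cr while Y carries the grayscale image the model receives.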
umd-fire-coml/2021-Text-to-Image-Generation
This project is our own implementation of text-to-image generation for birds. Based on a description provided by the user, it tries to create an original bird image. It runs in Python 3 and uses a target-aware generative adversarial model.
umd-fire-coml/2022-summer-speech-translation
umd-fire-coml/2022-t3-vector-quantization
The user uploads a soundtrack file. The uploaded track is run through Librosa functions to extract music features. The feature data is run through our Vector Quantization model to find the closest soundtrack in the dataset. The name of the most "similar" soundtrack is displayed.
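The "find the closest soundtrack" step can be sketched as a nearest-neighbor lookup, which is the core operation of vector quantization. The track names and three-number feature vectors below are hypothetical; real features would come from Librosa and have many more dimensions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def most_similar(query, dataset):
    """Return the name of the dataset entry whose features are closest to `query`."""
    return min(dataset, key=lambda name: squared_distance(query, dataset[name]))


# Hypothetical, already-normalized feature vectors.
dataset = {
    "track_a": [0.2, 0.8, 0.1],
    "track_b": [0.9, 0.1, 0.4],
    "track_c": [0.5, 0.5, 0.5],
}
query = [0.85, 0.2, 0.35]
print(most_similar(query, dataset))  # track_b
```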
umd-fire-coml/2021-Game-Playing
Our model learns to play any level of Super Mario Bros. Its architecture is based on the model in the DQN research paper; more specifically, it connects a reinforcement learning model to a deep neural network. As RGB input, we take a 256x240-pixel screen capture from our Super Mario Bros. emulator, and the model produces a simple action vector of Q-values for 7 possible movements for Mario (e.g., run right, jump, duck). The model then chooses the action with the highest Q-value, advances to the next frame, and continues through the level in this way. Each action's Q-value is calculated from the projected reward, where positive factors include getting further into the level, staying alive, and taking less time, and negative factors include dying or going the wrong way along the x-axis; hence, reinforcement learning. The model then trains on the runs that had the best rewards.
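Picking the highest-scoring action and shaping the reward can be sketched as follows. Everything specific here is an assumption for illustration: the action names, the Q-values, and the reward weights are made up and are not taken from the project.

```python
# Hypothetical names for the 7 movements; the project's action set may differ.
ACTIONS = ["noop", "right", "right+jump", "right+run", "right+run+jump", "jump", "left"]


def choose_action(q_values):
    """Greedy policy: return the index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def step_reward(x_before, x_after, died, time_cost=0.1, death_penalty=15.0):
    """Reward progress to the right, charge a small cost per step, and punish dying."""
    progress = x_after - x_before   # negative when Mario moves the wrong way
    reward = progress - time_cost   # each step costs a little, favouring faster runs
    if died:
        reward -= death_penalty
    return reward


q_values = [0.1, 0.9, 0.3, 0.2, 0.5, 0.0, -0.4]  # made-up network output for one frame
print(ACTIONS[choose_action(q_values)])  # right
```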
umd-fire-coml/2021-Music-Generation
Our project uses MIDI files as input to generate new music. We used a type of recurrent neural network called an LSTM, which stands for Long Short-Term Memory network. Essentially, this type of model can efficiently learn and recognize long-term patterns, which is incredibly useful for music generation.
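How an LSTM holds on to long-term patterns can be seen in a single cell step. The sketch below is a scalar LSTM cell using the standard gate equations, with hand-picked weights chosen only to show the memory being preserved; it is not the project's trained network.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; w[gate] = (input weight, recurrent weight, bias)."""
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old memory to keep
    i = sigmoid(pre("input"))        # how much new information to write
    o = sigmoid(pre("output"))       # how much of the memory to expose
    g = math.tanh(pre("candidate"))  # the new information itself
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


zero = {gate: (0.0, 0.0, 0.0) for gate in ("forget", "input", "output", "candidate")}
# Hypothetical weights: forget gate held open, input gate held shut.
remember = dict(zero, forget=(0.0, 0.0, 10.0), input=(0.0, 0.0, -10.0))

h, c = 0.0, 1.0
for x in [0.3, -0.8, 0.5, 0.1]:
    h, c = lstm_step(x, h, c, remember)
print(c)  # stays very close to the initial memory of 1.0
```

With the forget gate near 1 and the input gate near 0, the cell state passes through each step almost unchanged, which is what lets the network carry a musical motif across many notes.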
umd-fire-coml/2021-Speaker-Recognition
The purpose of this project is to assess whether a given audio recording has a male or female speaker, which is decided from the recording's frequency. This model could be used to determine whether a user is male or female; for example, call centers could use the speaker's gender to better sell their products. This Google Colab notebook contains all the code needed to run the project.
umd-fire-coml/2021-Facial-Age-Gender-Recognition
The product predicts the age and gender of a person from their face. Age and gender recognition has a myriad of applications, including security and advertising. Machine learning facial age and gender recognition uses a model trained with a dataset of images to predict one's age and gender. Our model uses a series of convolutional layers to extract and learn information for better predictions. By training the model on more and more data, we are able to make it more accurate at analyzing age and gender. Learn more about this project by watching our demonstration video for a high-level walkthrough.
umd-fire-coml/2021-Image-Super-Resolution
We take a low-resolution image as input. The image passes through a series of layers that extract features from it. At the very end, the image with its extracted features is passed to a special function called depth-to-space, which turns all those features into actual pixels. After this, we have our super-resolved image.
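The depth-to-space rearrangement can be written out in a few lines of plain Python. This sketch uses one common channel ordering (block row first, then block column, then output channel) on height x width x channel nested lists; the framework function used in the project may be configured differently.

```python
def depth_to_space(features, block):
    """Rearrange an H x W x (C*block*block) grid into an (H*block) x (W*block) x C image."""
    height, width = len(features), len(features[0])
    channels = len(features[0][0]) // (block * block)
    out = [[[0] * channels for _ in range(width * block)] for _ in range(height * block)]
    for h in range(height):
        for w in range(width):
            for i in range(block):          # row inside the upscaled block
                for j in range(block):      # column inside the upscaled block
                    for c in range(channels):
                        source = (i * block + j) * channels + c
                        out[h * block + i][w * block + j][c] = features[h][w][source]
    return out


# One low-resolution pixel with 4 feature channels becomes a 2x2 patch of pixels.
features = [[[1, 2, 3, 4]]]
print(depth_to_space(features, 2))  # [[[1], [2]], [[3], [4]]]
```

Each group of `block * block` feature values at one low-resolution location is spread out into a small square of output pixels, which is how the feature maps become a larger image.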
umd-fire-coml/2022-summer-video-face-recognition
umd-fire-coml/2022-t5-deep-q-learning
This project trains a DQN model to play a variety of Atari games, including Q*bert. It includes a random agent, which generates gameplay by taking random actions, as well as a trained model that attempts to take the actions needed to win the game. Reinforcement learning is an area of machine learning that is focused on training agent
umd-fire-coml/detr
End-to-End Object Detection with Transformers
umd-fire-coml/example-automated-tests
Example Automated Tests Repository
umd-fire-coml/example-keras-data-generator
umd-fire-coml/example-keras-model-builder
umd-fire-coml/example-keras-model-trainer
umd-fire-coml/example-scrum-project
umd-fire-coml/gym-duckietown
Self-driving car simulator for the Duckietown universe
umd-fire-coml/Rainbow
Rainbow: Combining Improvements in Deep Reinforcement Learning
umd-fire-coml/spring2021san
umd-fire-coml/umd-fire-coml.github.io
FIRE Capital One Machine Learning at the University of Maryland - Official Stream Website