Sardhendu.github.io: A repository from Sardhendu

Hi there, I am a Data Science grad student at Illinois Tech. I was first introduced to Data Science in late 2013, and I have been smitten by the field ever since. I love coding and am passionate about implementing state-of-the-art Data Science methods to solve real-world problems. I like exploring different domains in data science, such as computer vision, natural language processing, geospatial analysis, and raw data analysis. Here are some of my recent works. I hope you enjoy reading them.

Projects Ongoing

Object Detection: Semantic Segmentation github

This project aims to detect objects in an image. In deep learning, detection can be achieved in several ways. Some techniques, such as YOLO (You Only Look Once), run in real time, whereas others, such as Faster-RCNN and Mask-RCNN, run in near real time and can be extended to a wide variety of applications such as semantic segmentation. The project implements the Faster-RCNN and Mask-RCNN object detection techniques.

Tech Stack: Python, Scikit-learn, OpenCV, TensorFlow (GPU), Keras

Projects Accomplished

Property Classification github

This project aims to classify the type of a property (Land/House) given an "Address string" or a "real-estate property image". Data is collected and integrated from several sources: building boundaries come from OpenStreetMap (OSM), and satellite and street-side images come from Google Maps. Different deep learning models are employed and evaluated for different image types. Semantic segmentation and model ensembles are then used to output the best prediction given the available images.
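
The ensembling idea can be pictured with a minimal sketch. It assumes that each image type has its own model returning class probabilities, and that the ensemble averages the probabilities of whichever images are available for a property. The function name, the source names, the probabilities, and the averaging rule are all illustrative assumptions, not the project's actual code.

```python
from typing import Dict, List, Optional

CLASSES = ["land", "house"]  # labels assumed from the project description


def ensemble_predict(probs_by_source: Dict[str, Optional[List[float]]]) -> str:
    """Average class probabilities over the sources that produced a prediction.

    A source maps to None when its image is unavailable for the property.
    """
    available = [p for p in probs_by_source.values() if p is not None]
    if not available:
        raise ValueError("no image available for this property")
    # Element-wise mean across the available models.
    mean = [sum(p[i] for p in available) / len(available) for i in range(len(CLASSES))]
    # Pick the class with the highest averaged probability.
    return CLASSES[max(range(len(CLASSES)), key=mean.__getitem__)]


# Hypothetical model outputs: the street-side image is missing here.
prediction = ensemble_predict({
    "satellite": [0.30, 0.70],
    "street_side": None,
    "osm_overlay": [0.45, 0.55],
})
print(prediction)
```

Averaging is only one possible combination rule; weighted voting or a learned meta-model would fit the same interface.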

Tech Stack: Python, Scikit-learn, OpenCV, GeoPandas, Shapely, TensorFlow (GPU), Keras, Azure Cloud stack

Credit Card Fraud github

This project aims to identify credit card fraud. Pattern recognition techniques such as bagging and boosting with trees, deep neural nets, autoencoders, and Bayesian methods with MCMC sampling are evaluated. The dataset is collected from Kaggle. The model parameters are tuned for high performance in identifying fraud cases.
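
Since fraud cases are usually a small minority of transactions, "high performance in identifying fraud cases" is often measured with precision and recall on the fraud class rather than plain accuracy. The sketch below assumes a model that outputs a fraud score per transaction and shows how the decision threshold trades one metric against the other. The labels, scores, and thresholds are made up for illustration and are not results from this project.

```python
from typing import List, Tuple


def precision_recall(y_true: List[int], scores: List[float], threshold: float) -> Tuple[float, float]:
    """Precision and recall for the fraud class (label 1) at a score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels and model scores, invented for this example.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
scores = [0.05, 0.10, 0.20, 0.15, 0.40, 0.02, 0.35, 0.60, 0.90, 0.08]

strict = precision_recall(y_true, scores, threshold=0.5)
lenient = precision_recall(y_true, scores, threshold=0.3)
print(strict, lenient)
```

On this toy data, lowering the threshold catches the second fraud case at the cost of one extra false alarm.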

Tech Stack: Python, R, TensorFlow, XGBoost (Python API)

Deep Face Recognition github

This application leverages transfer learning with the GoogLeNet (NN4 small) architecture to recognize faces in an image. It employs a two-step approach: 1) extracting all the faces from an image using Haar cascades, and 2) labeling each face with the person's name. The application is developed from scratch using TensorFlow and implements ideas from the FaceNet paper.
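
The labeling step can be sketched as a nearest-neighbor lookup over face embeddings, in the spirit of FaceNet, where faces of the same person map to nearby vectors. The sketch assumes the network has already turned each cropped face into an embedding. The function names, the tiny 3-dimensional vectors, and the distance threshold are hypothetical; real embeddings have many more dimensions.

```python
import math
from typing import Dict, List, Optional


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def label_face(
    embedding: List[float],
    known: Dict[str, List[float]],
    max_distance: float = 0.8,  # assumed cut-off, would need tuning
) -> Optional[str]:
    """Return the name of the closest known face, or None if none is close enough."""
    best_name, best_dist = None, float("inf")
    for name, reference in known.items():
        dist = euclidean(embedding, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Hypothetical reference embeddings, one per enrolled person.
known_faces = {"alice": [0.1, 0.9, 0.2], "bob": [0.8, 0.1, 0.5]}

match = label_face([0.15, 0.85, 0.25], known_faces)
stranger = label_face([5.0, 5.0, 5.0], known_faces)
print(match, stranger)
```

Returning None for distant embeddings lets the application mark a face as unknown instead of forcing a wrong name.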

Tech Stack: Python, TensorFlow, OpenCV

CIFAR-10 Object Recognition github

This project evaluated a mix of deep learning, image processing, and machine learning models for classifying an object in an image. The data was collected from Kaggle. For simplicity, and to allow extensive evaluation of the algorithms, only two object classes were used for classification. Presentation Link

Tech Stack: Python, Scikit-learn, OpenCV, TensorFlow

Diabetic-Readmission Analysis github

This project analyzed the factors that led to the early re-admission of diabetic patients. In addition, a model was trained to predict whether a diabetic patient would be re-admitted within the next 30 days. The data set was collected from the UCI repository, and several data processing ideas were borrowed from the paper published by the data set's authors. The data analysis pipeline was built using the PySpark framework.

Tech Stack: Spark, PySpark (MLlib), Python

Crime Rate Prediction github

This project experimented with and evaluated several machine learning algorithms. The classification task was to determine whether a locality was a high-crime or low-crime zone. The regression task was to predict the crime rate (a continuous value).

Tech Stack: Python, Scikit-learn

License Plate Extraction github

This project aimed to detect and extract the license plates of vehicles (4-wheelers) from an image. The model was trained on manually cropped license plates. In addition, bootstrapping methods were used to gather more training images. License plate extraction took a two-step approach: 1) all contours (regions of interest) were extracted from a given image using image processing techniques, and 2) each extracted contour was then classified as a license plate or a non-license plate.
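
The two-step structure can be sketched in plain Python. This is a simplified stand-in, not the project's code: step 1 uses a flood fill over a small binary grid in place of OpenCV contour extraction, and step 2 uses a hypothetical aspect-ratio rule in place of the trained classifier. All names and the ratio bounds are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max), inclusive


def extract_regions(binary: List[List[int]]) -> List[Box]:
    """Step 1 stand-in: bounding boxes of 4-connected foreground regions."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Flood fill from this pixel, tracking the region's extent.
            stack = [(r, c)]
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while stack:
                y, x = stack.pop()
                r0, c0, r1, c1 = min(r0, y), min(c0, x), max(r1, y), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


def looks_like_plate(box: Box) -> bool:
    """Step 2 stand-in: a hypothetical rule in place of the trained classifier."""
    r0, c0, r1, c1 = box
    height, width = r1 - r0 + 1, c1 - c0 + 1
    return 2.0 <= width / height <= 6.0  # assumed bounds for a wide rectangle


def extract_plates(binary: List[List[int]], classify: Callable[[Box], bool] = looks_like_plate) -> List[Box]:
    """Run both steps: propose regions, then keep those classified as plates."""
    return [box for box in extract_regions(binary) if classify(box)]


# Toy binary image: one square blob and one wide, plate-like blob.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
plates = extract_plates(image)
print(plates)
```

Keeping the classifier as a pluggable function mirrors the description above: the region proposal step stays fixed while the plate/non-plate model can be swapped or retrained.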

Tech Stack: Python, Scikit-learn, OpenCV
