danielmachinelearning
Principal Data Scientist at Fidelity. Master's in Computer Engineering from Stevens Institute of Technology with a concentration in machine learning and NLP.
Fidelity, Boston, MA
Pinned Repositories
CNN_Image_Classification
coursera-Advanced-Machine-Learning-specialization
Repo for the Coursera specialization Advanced Machine Learning by the Higher School of Economics
Doc2Vec_CNN_RNN
This program also performs sentiment analysis on IMDB movie reviews, but the reviews are first preprocessed with gensim's Doc2Vec, which converts the words of each review to vectors. The vectorized words are then fed into a CNN to find invariant features, followed by an RNN to learn the states. Accuracy after ten iterations, with each word represented by a 128-length vector, was 90.09%.
GANs_for_Celebrity_Face_Generation
An example of Generative Adversarial Networks (GANs) for generating celebrity-like faces. A generator neural network learns to mimic celebrity faces while a discriminator neural network attempts to distinguish the real faces from the generated ones. Eventually, the generator network is able to generate realistic faces that can fool the discriminator.
HotelSpamDetection
The project determines not only whether a given hotel review on a website is positive or negative, but also whether the review is genuine or was written by a spammer or a troll/bot. The purpose is to prevent hotels from artificially inflating their own value and to prevent hotels from being spammed by trolls. The model is a CNN-LSTM network using feature sets from Doc2Vec, POS tags, dependency tags, and TF-IDF. Accuracy of the trained model on the validation set was 93.44%.
movie_reviews_sentiment_analysis_NN
Imports the movie reviews corpus from NLTK, computes TF-IDF features for the whole corpus, then applies TruncatedSVD to reduce the dimensionality of the resulting sparse matrix. Finally, it trains a multilayer NN in Keras. Accuracy on the test dataset was 88%.
NER_trainer
This takes a corpus from the Groningen Meaning Bank, which has prelabeled tags. A model is trained on it using NLTK's classifier, then tested on a novel sentence. The tags used for training were for geographical locations and time units.
Philosophy_blender
With ChatGPT and LLMs now becoming widely available, can such tools be used to create novel philosophical insights from heterogeneous philosophy books? You bet they can! My program here takes philosophy books as input, extracts the text from PDF format, and uses LLMs and ChatGPT to create novel philosophical insights as if from a philosopher.
Philosophy_summarizer
This project takes a philosophy e-book and creates summaries via the long-t5-tglobal-base-16384-book-summary transformer. The program creates topics, groups sentences by topic, generates layman summaries via GPT-3, and finally finds important phrases in the text.
SentimentAnalysis_CNN_RNN
An example of a sentiment analysis program, used on the IMDB movie review dataset. The program uses the Keras deep learning library. First, it imports the IMDB movie review set, then uses a Convolutional Neural Network (CNN) to extract invariant features that characterize a good or bad movie review. Finally, it passes them through a recurrent neural network (RNN) using LSTMs to learn the state transitions of the invariant features.
danielmachinelearning's Repositories
danielmachinelearning/Philosophy_summarizer
This project takes a philosophy e-book and creates summaries via the long-t5-tglobal-base-16384-book-summary transformer. The program creates topics, groups sentences by topic, generates layman summaries via GPT-3, and finally finds important phrases in the text.
danielmachinelearning/HotelSpamDetection
The project determines not only whether a given hotel review on a website is positive or negative, but also whether the review is genuine or was written by a spammer or a troll/bot. The purpose is to prevent hotels from artificially inflating their own value and to prevent hotels from being spammed by trolls. The model is a CNN-LSTM network using feature sets from Doc2Vec, POS tags, dependency tags, and TF-IDF. Accuracy of the trained model on the validation set was 93.44%.
danielmachinelearning/Philosophy_blender
With ChatGPT and LLMs now becoming widely available, can such tools be used to create novel philosophical insights from heterogeneous philosophy books? You bet they can! My program here takes philosophy books as input, extracts the text from PDF format, and uses LLMs and ChatGPT to create novel philosophical insights as if from a philosopher.
danielmachinelearning/coursera-Advanced-Machine-Learning-specialization
Repo for the Coursera specialization Advanced Machine Learning by the Higher School of Economics
danielmachinelearning/Dog_Breed_Classifier
Deep learning classification using CNN transfer learning to recognize more than 200 different dog breeds. Accuracy is ~87%.
danielmachinelearning/Philosopher_generated_GPT3
Ever wonder what Hegel, Wittgenstein and Heidegger would come up with if they were alive and together? Wonder no more. This program takes samples of writings from Heidegger, Wittgenstein and Hegel and creates unique philosophical sayings with GPT-3 text generation, using their writings as input.
danielmachinelearning/AssetGuardAI
A chatbot that lets you read the news for ALGORAND, check trend forecasts for the crypto coin, and then create smart contracts based on the upward or downward trend.
danielmachinelearning/awesome-nlp
A curated list of resources dedicated to Natural Language Processing (NLP)
danielmachinelearning/Boston-Housing-Prices
Regression to predict Boston Housing Prices
danielmachinelearning/Clustering-News-Articles-via-Unsupervised-Template-Extraction
Using unsupervised clustering to discover templates within news articles.
danielmachinelearning/Coupled-VAE-Improved-Robustness-and-Accuracy-of-a-Variational-Autoencoder
We present a coupled Variational Auto-Encoder (VAE) method that improves the accuracy and robustness of the probabilistic inferences on represented data. The new method models the dependency between input feature vectors (images) and weighs the outliers with a higher penalty by generalizing the original loss function to the coupled entropy function, using the principles of nonlinear statistical coupling. We evaluate the performance of the coupled VAE model using the MNIST dataset. Compared with the traditional VAE algorithm, the output images generated by the coupled VAE method are clearer and less blurry. The visualization of the input images embedded in 2D latent variable space provides a deeper insight into the structure of the new model with the coupled loss function: the latent variable has a smaller deviation and the output values are generated by a more compact latent space. We analyze the histograms of probabilities for the input images using the generalized mean metrics, in which an increased geometric mean illustrates that the average likelihood of the input data is improved. Increases in the -2/3 mean, which is sensitive to outliers, indicate improved robustness. The decisiveness, measured by the arithmetic mean of the likelihoods, is unchanged, and the -2/3 mean shows that the new model has better robustness.
danielmachinelearning/decisiontree
ID3-based implementation of the ML Decision Tree algorithm
danielmachinelearning/HMM_tagger
Demonstrating POS tagging with Hidden Markov Model
danielmachinelearning/How-to-build-own-text-summarizer-using-deep-learning
In this notebook, we will build an abstractive text summarizer from scratch in Python using deep learning with Keras
danielmachinelearning/lightning
Large-scale linear classification, regression and ranking in Python
danielmachinelearning/Machine_Translation
danielmachinelearning/Manning_RecSys
danielmachinelearning/Manning_Time_Series
danielmachinelearning/natural-language-processing
Resources for "Natural Language Processing" Coursera course.
danielmachinelearning/NER
Example of Named Entity Recognition
danielmachinelearning/NeuralRecWithTextInput
Collaborative Filtering with Neural Networks using Descriptions of Movies as enhancement.
danielmachinelearning/NLP_projects_CNTK
Projects I've implemented for NLP that run using Microsoft's CNTK library.
danielmachinelearning/Predict_Attendance_for_Meetups
Based on past attendance at a given Meetup group, along with the Meetup description, predict how many people will show up to a Meetup.
danielmachinelearning/Predicting_Crypto_Prices_With_CNN_LSTM_AttentionNetworks
danielmachinelearning/Predicting_Crypto_Prices_With_Encoder_Decoder_LSTM_With_Attention
This is a project to predict the next 4 months of prices using an Encoder Decoder LSTM with Attention.
danielmachinelearning/Predicting_NFL_Game_Spread
Will rename
danielmachinelearning/PyMC
PyMC using MCMC with Python
danielmachinelearning/Report
danielmachinelearning/Seq2Seq
Example of Sequence to Sequence algorithm
danielmachinelearning/Training-a-Smartcab
Program uses reinforcement learning to train a smartcab to drive.
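
The Coupled-VAE description above evaluates likelihoods with three generalized means: the arithmetic mean (decisiveness), the geometric mean (average likelihood) and the -2/3 mean (robustness). A minimal sketch of those metrics, assuming the standard power-mean definition, might look like the following. The function name `generalized_mean` and the sample likelihoods are hypothetical and are not taken from the repository.

```python
import math


def generalized_mean(probs, r):
    """Power mean of positive values: (mean(p**r)) ** (1/r).

    r = 1 gives the arithmetic mean, r = 0 is taken as the limiting
    case (the geometric mean), and r = -2/3 is the outlier-sensitive
    mean mentioned in the description. Hypothetical helper, not the
    repository's own code.
    """
    if not probs:
        raise ValueError("probs must not be empty")
    if any(p <= 0 for p in probs):
        raise ValueError("all values must be strictly positive")
    n = len(probs)
    if r == 0:
        # Limit of the power mean as r -> 0: exp(mean(log p)).
        return math.exp(sum(math.log(p) for p in probs) / n)
    return (sum(p ** r for p in probs) / n) ** (1.0 / r)


# Made-up likelihoods with one outlier, for illustration only.
likelihoods = [0.9, 0.8, 0.7, 0.01]

decisiveness = generalized_mean(likelihoods, 1)    # arithmetic mean
accuracy = generalized_mean(likelihoods, 0)        # geometric mean
robustness = generalized_mean(likelihoods, -2 / 3) # -2/3 mean

print(decisiveness, accuracy, robustness)
```

Because a power mean with a lower exponent gives more weight to small values, the single low likelihood pulls the -2/3 mean down far more than it pulls the arithmetic mean, which is why that metric is described as sensitive to outliers.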