Shikhar0605

Pinned Repositories

Applying-Regression-model-on-house-sales-data
Applied a Turicreate linear regression model to a house sales dataset. Examined the effect of feature selection on model accuracy.
Language: Jupyter Notebook
Classifier-model-for-Cancer-Detection-Malignant-or-Benign
Applied the scikit-learn K-Nearest Neighbors classification algorithm to develop a model for breast cancer diagnosis.
Language: Jupyter Notebook
Creating-and-manipulating-graph-using-Networkx
Language: Jupyter Notebook
Credit-card-fraud-detection
Applied SVC and logistic regression classifiers to a credit card transaction dataset to detect fraud.
Language: Jupyter Notebook
Custom-data-visualization-using-Matplotlib
Created a dynamic graph using Matplotlib to better judge probabilistic data generated from the election dataset. The generated graph changes its colour in response to changes in the y-axis values.
Language: Jupyter Notebook
Documents-similarity-prediction-using-Wikipedia-s-People-Dataset
Applied Turicreate's Nearest Neighbor Model to predict the similarity between any two documents taken from Wikipedia's People Dataset.
Language: Jupyter Notebook
Hypothesis-testing-using-T-test
Hypothesis: University towns have their mean housing prices less affected by recessions. Performed a t-test on the ratio of the mean house price in university towns in the quarter before the recession starts to the mean price at the recession bottom.
Language: Jupyter Notebook
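The price-ratio comparison described for this repository can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the function names `price_ratio` and `welch_t`, the choice of Welch's unequal-variance t statistic, the comparison group of non-university towns, and all prices below are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def price_ratio(price_before, price_bottom):
    """Price in the quarter before the recession divided by the price at the recession bottom."""
    return price_before / price_bottom


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Made-up illustration values: (price before recession, price at recession bottom)
university_towns = [(200.0, 195.0), (310.0, 300.0), (150.0, 148.0)]
other_towns = [(200.0, 170.0), (310.0, 280.0), (150.0, 120.0)]

uni_ratios = [price_ratio(before, bottom) for before, bottom in university_towns]
other_ratios = [price_ratio(before, bottom) for before, bottom in other_towns]

# A negative statistic means the university-town ratio is smaller on average,
# i.e. prices fell less between the two quarters in this toy data.
t_stat = welch_t(uni_ratios, other_ratios)
print(t_stat)
```

A ratio close to 1 means prices barely moved between the two quarters, so a smaller mean ratio for university towns would be consistent with the hypothesis. Turning the statistic into a p-value needs a t-distribution, which a statistics library would normally provide.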
Long-Term-Stock-Price-Growth-Prediction-using-NLP-on-10-K-Financial-Reports
A 10-K financial report is a comprehensive report on financial performance that every publicly traded company must file annually. These reports are filed with the US Securities and Exchange Commission (SEC) and are even more detailed than a company's annual report. 10-K documents contain information about the business's operations, risk factors, selected financial data, the management's discussion and analysis (MD&A), and financial statements and supplementary data. The task is to build an NLP pipeline that ingests the 10-K reports of various publicly traded companies, and a machine learning model that uncovers hidden signals in the 10-K documents to predict a company's long-term stock performance, using the ‘Loughran McDonald Master Dictionary’. The dictionary contains words specifically curated for the context of financial reports.
Language: HTML
Social-Media-Sentiment-Analysis
Pre-processed 50k tweets using text mining and natural language processing techniques. Visualized the impact of hashtags on tweet sentiment using Seaborn. Applied machine learning models, calculated their F1 scores, and used the best model for sentiment prediction.
Language: Jupyter Notebook
Temperature-analysis-using-NCEI-Dataset
Analyzed the temperature variations of "Ann Arbor, Michigan, U.S." over 2005-2014 using the NCEI dataset.
Language: Jupyter Notebook

Shikhar0605's Repositories

Shikhar0605/Long-Term-Stock-Price-Growth-Prediction-using-NLP-on-10-K-Financial-Reports
A 10-K financial report is a comprehensive report on financial performance that every publicly traded company must file annually. These reports are filed with the US Securities and Exchange Commission (SEC) and are even more detailed than a company's annual report. 10-K documents contain information about the business's operations, risk factors, selected financial data, the management's discussion and analysis (MD&A), and financial statements and supplementary data. The task is to build an NLP pipeline that ingests the 10-K reports of various publicly traded companies, and a machine learning model that uncovers hidden signals in the 10-K documents to predict a company's long-term stock performance, using the ‘Loughran McDonald Master Dictionary’. The dictionary contains words specifically curated for the context of financial reports.
Language: HTML
Shikhar0605/Applying-Regression-model-on-house-sales-data
Applied a Turicreate linear regression model to a house sales dataset. Examined the effect of feature selection on model accuracy.
Language: Jupyter Notebook
Shikhar0605/Classifier-model-for-Cancer-Detection-Malignant-or-Benign
Applied the scikit-learn K-Nearest Neighbors classification algorithm to develop a model for breast cancer diagnosis.
Language: Jupyter Notebook
Shikhar0605/Creating-and-manipulating-graph-using-Networkx
Language: Jupyter Notebook
Shikhar0605/Credit-card-fraud-detection
Applied SVC and logistic regression classifiers to a credit card transaction dataset to detect fraud.
Language: Jupyter Notebook
Shikhar0605/Custom-data-visualization-using-Matplotlib
Created a dynamic graph using Matplotlib to better judge probabilistic data generated from the election dataset. The generated graph changes its colour in response to changes in the y-axis values.
Language: Jupyter Notebook
Shikhar0605/Documents-similarity-prediction-using-Wikipedia-s-People-Dataset
Applied Turicreate's Nearest Neighbor Model to predict the similarity between any two documents taken from Wikipedia's People Dataset.
Language: Jupyter Notebook
Shikhar0605/Hypothesis-testing-using-T-test
Hypothesis: University towns have their mean housing prices less affected by recessions. Performed a t-test on the ratio of the mean house price in university towns in the quarter before the recession starts to the mean price at the recession bottom.
Language: Jupyter Notebook
Shikhar0605/Social-Media-Sentiment-Analysis
Pre-processed 50k tweets using text mining and natural language processing techniques. Visualized the impact of hashtags on tweet sentiment using Seaborn. Applied machine learning models, calculated their F1 scores, and used the best model for sentiment prediction.
Language: Jupyter Notebook
Shikhar0605/Temperature-analysis-using-NCEI-Dataset
Analyzed the temperature variations of "Ann Arbor, Michigan, U.S." over 2005-2014 using the NCEI dataset.
Language: Jupyter Notebook
Shikhar0605/Movie-Recommender-Engine
Created a movie recommender engine based on cosine similarity
Language: Jupyter Notebook
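As a rough illustration of a recommender based on cosine similarity, the sketch below scores movies by the angle between their feature vectors and returns the closest one. The function names, the movie titles, and the three-element feature vectors are hypothetical; the repository's actual features and code are not shown in this listing.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def most_similar(title, features):
    """Return the other title whose feature vector is closest to `title`."""
    target = features[title]
    scores = {
        other: cosine_similarity(target, vector)
        for other, vector in features.items()
        if other != title
    }
    return max(scores, key=scores.get)


# Hypothetical feature vectors (for example, genre indicator flags)
features = {
    "Movie A": [1, 1, 0],
    "Movie B": [1, 1, 1],
    "Movie C": [0, 0, 1],
}

print(most_similar("Movie A", features))
```

Cosine similarity ignores vector length and compares only direction, which is why it is a common choice when feature vectors come from word or tag counts of different sizes.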
Shikhar0605/Network-Connectivity
Importing and analyzing an internal email communication network between employees of a mid-sized manufacturing company. Each node represents an employee and each directed edge between two nodes represents an individual email. The left node represents the sender and the right node represents the recipient.
Language: Jupyter Notebook
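The sender/recipient edge structure described above can be sketched without any graph library. This is a plain-Python stand-in rather than the notebook's own approach; the whitespace-separated "sender recipient" line format, the function names, and the sample lines are assumptions for illustration.

```python
from collections import Counter


def parse_edges(lines):
    """Turn 'sender recipient' lines into (sender, recipient) tuples, one per email."""
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            edges.append((parts[0], parts[1]))
    return edges


def email_counts(edges):
    """Count emails sent and received per employee (out-degree and in-degree)."""
    sent = Counter(sender for sender, _ in edges)
    received = Counter(recipient for _, recipient in edges)
    return sent, received


# Hypothetical sample: left column is the sender, right column the recipient
sample_lines = ["1 2", "1 3", "2 1", "1 2"]
edges = parse_edges(sample_lines)
sent, received = email_counts(edges)
print(sent, received)
```

Because every email is its own edge, the same sender and recipient pair can appear more than once, so the edges are kept in a list instead of a set.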
Shikhar0605/NLP-on-10K-Documents
This project helps scrape and analyse the 10K and 10Q documents filed by publicly traded companies with the SEC.
Language: HTML
Shikhar0605/Pandas-for-Data-Science-Assignment-1
Basics of Pandas for Data Analysis on Census Data
Language: Jupyter Notebook
Shikhar0605/Predicting-Property-Maintenance-Fines-using-Logistic-Regression
Applied the scikit-learn logistic regression algorithm to predict whether a given blight ticket will be paid on time.
Language: Jupyter Notebook
Shikhar0605/ProgrammingAssignment2
Repository for Programming Assignment 2 for R Programming on Coursera
Language: R
Shikhar0605/Project-Report
This project's sole aim is to find out whether any relationship exists between the world university rankings and each country's expenditure on its education system.
Language: Jupyter Notebook
Shikhar0605/Regex-Assignment
The goal of this assignment is to correctly identify all of the different date variants encoded in this dataset and to properly normalize and sort the dates.
Language: Jupyter Notebook
Shikhar0605/Salary-and-new-connections-predictions-using-Networkx
Used NetworkX and ML algorithms to create a model that predicts whether employees in a given company are receiving a management-position salary. Also predicted future connections between the employees in the network.
Language: Jupyter Notebook
Shikhar0605/SEC-10K-item-1a-ML-Kmean-Clustering
This was my final project for the UOFM data analytics certificate program. It used ML to cluster text files and validated those clusters using stock market data.
Language: Jupyter Notebook
Shikhar0605/Sentiment-Analysis
Exploratory sentiment analysis of a firm's management discussion from 10K annual SEC filing
Language: Jupyter Notebook
Shikhar0605/Spelling-Recommender
Created three different spelling recommenders that each take a list of misspelled words and recommend a correctly spelled word for every word in the list. Each spelling recommender uses a different Jaccard distance metric. For every misspelled word, the recommender finds the word in the list of correct spellings that has the shortest distance and starts with the same letter as the misspelled word, and returns that word as a recommendation.
Language: Jupyter Notebook
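One such recommender can be sketched as follows: split each word into character n-grams, compute the Jaccard distance between the n-gram sets, and keep the closest correctly spelled word that shares the first letter. The function names, the n-gram size parameter `n`, and the tiny vocabulary are assumptions for illustration; the listing does not say which n-gram sizes or word list the repository uses.

```python
def ngrams(word, n):
    """Set of character n-grams of a word (empty if the word is shorter than n)."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1 - len(a & b) / len(union)


def recommend(misspelled, vocabulary, n=3):
    """Closest correctly spelled word that starts with the same letter, or None."""
    candidates = [w for w in vocabulary if w[:1] == misspelled[:1]]
    if not candidates:
        return None
    target = ngrams(misspelled, n)
    return min(candidates, key=lambda w: jaccard_distance(target, ngrams(w, n)))


# Hypothetical vocabulary of correct spellings
vocabulary = ["correct", "corner", "validate", "value"]

print([recommend(word, vocabulary, n=2) for word in ["corect", "valdate"]])
```

Changing `n` gives a different distance metric from the same code, which is one simple way to obtain several recommenders that differ only in how they measure closeness.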
