epinhoodceo
Epidemiologist, Advocate, Medic, Data Scientist, DBA, Husband, Father, Co-Founder and CEO of The Epinhood.
@The-Epinhood North Carolina
Pinned Repositories
123DR
ADR
Alpha Data Repo
Alphabet-Board
Ever wonder how Stephen Hawking, the dude from Breaking Bad, and other ALS / paralysis patients communicate while restricted to the movement of one finger? I built this alphabet board using only JavaScript / HTML and Google's undocumented TTS engine.
aws-lex-web-ui
Sample Amazon Lex chat bot web interface
open-semantic-search
Open Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
123DR
ADR
BDR
CDR
DDR
epinhoodceo's Repositories
epinhoodceo/Alphabet-Board
Ever wonder how Stephen Hawking, the dude from Breaking Bad, and other ALS / paralysis patients communicate while restricted to the movement of one finger? I built this alphabet board using only JavaScript / HTML and Google's undocumented TTS engine.
epinhoodceo/classifying-text
Classifying text with bag-of-words
epinhoodceo/epibio_data
epinhoodceo/hello-world
My first eddy on GitHub. This repository is a place where I can store ideas and resources, or share and discuss things with others.
epinhoodceo/latent_semantic_analysis
A Dockerized Command-Line Application for performing Semantic Search over Wikipedia Articles
epinhoodceo/Minimal-Bag-of-Visual-Words-Image-Classifier
Implementation of a content-based image classifier using the bag of visual words approach in Python, together with Lowe's SIFT and LIBSVM.
epinhoodceo/MorphemeLM
Neural language model using morphemes
epinhoodceo/NATO
NATO Phonetic Alphabet
epinhoodceo/Pandas-Cookbook
Pandas Cookbook, published by Packt
epinhoodceo/pca-dimension-reduction
Reducing High-Dimensional Data with Principal Component Analysis (PCA)
epinhoodceo/pdf-toaster
PDF compression utility for macOS
epinhoodceo/plsa
A probabilistic latent semantic analysis (PLSA) model in MATLAB
epinhoodceo/Porter-Stemmer
A JavaScript Implementation of the Porter Stemmer
epinhoodceo/porter-stemmer-1
Martin Porter's stemmer for node.js
epinhoodceo/PyROC
A simple Python tool for generating ROC curve charts
epinhoodceo/Python-Synopsis
Python Study Guide
epinhoodceo/quasi_dictionary
Data structure for indexing key-value pairs with a controlled false positive rate and low time and memory overhead
epinhoodceo/Relevance-Ranking-using-Latent-Semantic-Indexing--from-scratch-
Latent Semantic Analysis (LSA), an introduction: LSA is an information retrieval technique patented in 1988. When applied to information retrieval, it is sometimes called Latent Semantic Indexing (LSI). LSI lets a search engine determine what a page is about beyond literally matching the search query text. It looks at “themes” instead of “keywords”. Linear algebra techniques used in the project: Singular Value Decomposition, cosine similarity, matrix properties. Dataset: the “sci.space” newsgroup from the 20 Newsgroups dataset, available in the scikit-learn library. It contains 400 news articles related to space. SVD (Singular Value Decomposition): SVD is a matrix decomposition algorithm that factors a matrix A into three matrices, A = USVᵀ, each of which acts as a transformation: an orthogonal matrix U, a diagonal matrix S of singular values, and an orthogonal matrix V. It is often described as the best available factorization of a matrix. In this decomposition we look for an orthonormal basis v₁, v₂, ... of the row space that A maps onto scaled vectors of an orthonormal basis u₁, u₂, ... of the column space: Av₁ = σ₁u₁, Av₂ = σ₂u₂.
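A minimal sketch of the relation Av1 = σ1u1 mentioned in the description above. It is not taken from the repository: power iteration on AᵀA is swapped in for a full SVD routine so that the example needs only the Python standard library, and the toy term-document matrix and all function names are illustrative assumptions.

```python
import math


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]


def norm(x):
    """Euclidean length of vector x."""
    return math.sqrt(sum(v * v for v in x))


def top_singular_triplet(A, iterations=500):
    """Approximate (sigma1, u1, v1) by power iteration on A^T A.

    Assumes the largest singular value is strictly larger than the
    second one and that the all-ones start vector is not orthogonal to v1.
    """
    At = transpose(A)
    v = [1.0] * len(A[0])
    for _ in range(iterations):
        w = matvec(At, matvec(A, v))  # one step of (A^T A) v
        length = norm(w)
        v = [x / length for x in w]
    Av = matvec(A, v)
    sigma = norm(Av)              # sigma1 = |A v1|
    u = [x / sigma for x in Av]   # u1 = A v1 / sigma1
    return sigma, u, v


# Hypothetical term-document counts (rows: terms, columns: documents).
A = [
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 3.0, 1.0],
    [0.0, 1.0, 2.0],
]

sigma1, u1, v1 = top_singular_triplet(A)
print("sigma1:", sigma1)
print("v1 (row-space direction):", v1)
print("u1 (column-space direction):", u1)
```

Here u1 and v1 are unit vectors, and once the iteration has converged Aᵀu1 = σ1v1 holds alongside Av1 = σ1u1. A real project would more likely call a library SVD routine and keep the top k singular triplets.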
epinhoodceo/Resume-Job-Description-Matching
The purpose of this project was to defeat the Applicant Tracking Systems that most organizations use to filter out resumes. To achieve this, I had to come up with a universal score that helps the applicant understand how well a resume currently matches. The following steps were undertaken for this project: 1) Job descriptions were collected from the Glassdoor website using Selenium, as other scrapers failed. 2) PDF resumes were parsed using PDFMiner. 3) A vector representation of each job description was created: Word2Vec was used to embed words in a 300-dimensional vector space, with each document represented as a list of word vectors. 4) Each word was given a weight to counter the few words that are specific to a single job description: TF-IDF scores were used as the word weights. 5) Important skill-related words were given higher weights, and the overall mean vector of each job description was obtained from the product of each word vector and its TF-IDF score. 6) Cosine similarity was used to measure the similarity between the job description and the resume. 7) Various natural language processing techniques were identified to suggest improvements to the resume that could help increase the match score.
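Steps 3 to 6 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the 3-dimensional WORD_VECTORS table is a made-up stand-in for 300-dimensional Word2Vec embeddings, the smoothed IDF formula is one common variant, and all function names and token lists are hypothetical.

```python
import math
from collections import Counter

# Hypothetical 3-d vectors standing in for 300-d Word2Vec embeddings.
WORD_VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "sql": [0.7, 0.3, 0.1],
    "statistics": [0.2, 0.9, 0.1],
    "communication": [0.0, 0.2, 0.9],
    "team": [0.1, 0.1, 0.8],
}


def idf_weights(corpus):
    """Smoothed inverse document frequency for every word in the corpus."""
    n_docs = len(corpus)
    doc_freq = Counter(word for doc in corpus for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0
            for w, df in doc_freq.items()}


def weighted_mean_vector(tokens, idf):
    """TF-IDF-weighted mean of the word vectors of the known tokens."""
    dim = len(next(iter(WORD_VECTORS.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    counts = Counter(t for t in tokens if t in WORD_VECTORS)
    for word, count in counts.items():
        weight = (count / len(tokens)) * idf.get(word, 1.0)  # tf * idf
        total = [acc + weight * x for acc, x in zip(total, WORD_VECTORS[word])]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # no known words: zero vector
    return [x / weight_sum for x in total]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either is a zero vector."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Made-up token lists for one job description and two resumes.
job = ["python", "sql", "statistics", "team"]
resume_a = ["python", "sql", "statistics"]
resume_b = ["communication", "team"]

idf = idf_weights([job, resume_a, resume_b])
job_vec = weighted_mean_vector(job, idf)
score_a = cosine_similarity(job_vec, weighted_mean_vector(resume_a, idf))
score_b = cosine_similarity(job_vec, weighted_mean_vector(resume_b, idf))
print("resume_a match:", score_a)
print("resume_b match:", score_b)
```

With these toy inputs, the resume that shares the technical terms with the job description scores higher than the one that shares only "team". The extra weighting of skill-related words from step 5 is not modeled here.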
epinhoodceo/rsemantic
A document vector search with flexible matrix transforms. Currently supports latent semantic analysis (LSA) and term frequency-inverse document frequency (TF-IDF).
epinhoodceo/sas-to-r
Translate SAS to R
epinhoodceo/semanticpy
A collection of semantic functions for Python, including Latent Semantic Analysis (LSA)
epinhoodceo/TfIdf_Cosine_Jaccard
Python code for calculating TF-IDF vectors, Cosine Similarity and Jaccard Index using NumPy.