Pinned Repositories
CodeWars-HackerRankChallenges
** IF JUPYTER NOTEBOOK NOT RENDERING in GitHub, copy and paste the notebook's GitHub URL into https://nbviewer.jupyter.org/ ** Jupyter Notebooks containing my process and the code I used to beat Python challenges found on CodeWars and HackerRank.
InvoiceLog
An application that extracts material information (vendor, invoice number, invoice date, etc.) from an invoice and returns a .csv file containing the information. First, the application classifies which vendor is sending the bill (through predictive modeling/neural network). Next, a vendor's invoice setup is 'created' and stored. Finally, given which vendor sent the bill, the application extracts the information using OCR (pytesseract) and returns a .csv file containing the information.
Ames-Iowa-House-Sale-Price-Predicitons---GA-Project-2
Working with GLM to predict the final sale price of a home from its features, such as total square footage, number of bathrooms, and number of bedrooms. Using EDA to check that assumptions about the data are safe to make (a garage was listed as being built in 2070, for example) and making the appropriate corrections. Comparing how different models scored with respect to RMSE.
Predicting-Chronic-Kidney-Disease-and-Classification-Model-Evaluation-Lab---GA-Lab-4.02
Predicting Chronic Kidney Disease using multiple Classification models, including Decision Tree, Random Forest, and Logistic Regression. Used different methods to improve model performance, like Grid Search Cross Validation, Polynomial Features, SMOTE, and over/under-sampling methods (for imbalanced data). Worked with sklearn, imblearn, Pandas, NumPy, Matplotlib, and Seaborn.
Subreddit-NLP-Classification-Working-Reddit-s-API
Working with Reddit's API to scrape and store user posts from two similar yet distinct subreddits. Using Natural Language Processing to classify which subreddit each post came from. Using GridSearch to optimize model performance.
Baseball_Rookie_Webscrape
Python-for-Algorithms--Data-Structures--and-Interviews
Files for Udemy Course on Algorithms and Data Structures
Time-Series-Analysis---GA-Lab-7.04
Time Series Analysis using ARMA, ARIMA, and SARIMAX models. Understanding the Augmented Dickey-Fuller Test and its interpretation.
BeautifulSoup-and-Webscraping-to-DataFrame---GA-Lab-5.01
Working with BeautifulSoup to pull information from a website. Then, using a for loop to organize the data into a dictionary. Finally, using Pandas to create a DataFrame from the data and export it to a .csv file.
Building-Pokemon-Dictionary---Understanding-Data-Types---GA-Lab-1.01
Building a dictionary of Pokemon, almost like the Pokedex. The goal is to practice working with different data types, including how to add, remove, and edit entries.
KevinWAguiar's Repositories
KevinWAguiar/CodeWars-HackerRankChallenges
** IF JUPYTER NOTEBOOK NOT RENDERING in GitHub, copy and paste the notebook's GitHub URL into https://nbviewer.jupyter.org/ ** Jupyter Notebooks containing my process and the code I used to beat Python challenges found on CodeWars and HackerRank.
KevinWAguiar/Subreddit-NLP-Classification-Working-Reddit-s-API
Working with Reddit's API to scrape and store user posts from two similar yet distinct subreddits. Using Natural Language Processing to classify which subreddit each post came from. Using GridSearch to optimize model performance.
KevinWAguiar/Ames-Iowa-House-Sale-Price-Predicitons---GA-Project-2
Working with GLM to predict the final sale price of a home from its features, such as total square footage, number of bathrooms, and number of bedrooms. Using EDA to check that assumptions about the data are safe to make (a garage was listed as being built in 2070, for example) and making the appropriate corrections. Comparing how different models scored with respect to RMSE.
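As a rough illustration of the two ideas in this description, the sketch below flags implausible build years and scores predictions with RMSE. The field name `garage_yr_blt`, the cutoff year, and all sample values are assumptions made up for the example; none of it is taken from the repository.

```python
import math

def flag_future_years(rows, field, max_year):
    """Return the rows whose year in `field` is later than `max_year`."""
    return [row for row in rows if row[field] > max_year]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical records; one garage has an impossible build year.
homes = [
    {"id": 1, "garage_yr_blt": 1998},
    {"id": 2, "garage_yr_blt": 2070},
    {"id": 3, "garage_yr_blt": 2005},
]
suspicious = flag_future_years(homes, "garage_yr_blt", max_year=2010)

# Hypothetical sale prices and model predictions.
score = rmse([200_000, 150_000], [210_000, 140_000])
```

A lower RMSE means predictions sit closer to the actual prices, in the same units as the target (dollars here).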
KevinWAguiar/SAT-ACT-Participation-Trends---GA-Project-1
Analyzed 2017 and 2018 SAT and ACT participation trends to provide test makers with advice to grow participation rates. Used Pandas, NumPy, Matplotlib, Seaborn, and Plotly (Choropleth map). Reviewed the important difference between Descriptive and Inferential Statistics.
KevinWAguiar/Time-Series-Analysis---GA-Lab-7.04
Time Series Analysis using ARMA, ARIMA, and SARIMAX models. Understanding the Augmented Dickey-Fuller Test and its interpretation.
KevinWAguiar/Supervised-Learning-Model-Comparison---GA-Lab-6.01
Analyzing the effectiveness of different Supervised Learning Models, including Linear Regression, KNearestNeighbors, Decision Tree, Bagged Decision Tree, Random Forest, AdaBoost, and SVM.
KevinWAguiar/Natural-Language-Processing-and-Vectorizing---GA-Lab-5.02
Analyzing text through Natural Language Processing. Testing the effectiveness of sklearn's Hashing Vectorizer and TF-IDF Vectorizer.
KevinWAguiar/BeautifulSoup-and-Webscraping-to-DataFrame---GA-Lab-5.01
Working with BeautifulSoup to pull information from a website. Then, using a for loop to organize the data into a dictionary. Finally, using Pandas to create a DataFrame from the data and export it to a .csv file.
KevinWAguiar/Predicting-Chronic-Kidney-Disease-and-Classification-Model-Evaluation-Lab---GA-Lab-4.02
Predicting Chronic Kidney Disease using multiple Classification models, including Decision Tree, Random Forest, and Logistic Regression. Used different methods to improve model performance, like Grid Search Cross Validation, Polynomial Features, SMOTE, and over/under-sampling methods (for imbalanced data). Worked with sklearn, imblearn, Pandas, NumPy, Matplotlib, and Seaborn.
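To make the imbalanced-data idea concrete, here is a minimal stdlib-only sketch of random oversampling. It is a simpler stand-in for the imblearn samplers named above, not the code from the lab: SMOTE synthesizes new minority points by interpolating between neighbors, whereas this sketch only duplicates existing rows. The function name and the toy data are assumptions.

```python
import random
from collections import Counter

def random_oversample(features, labels, seed=0):
    """Duplicate minority-class rows at random until every class is the same size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, count in counts.items():
        # Rows belonging to this class, used as the pool to resample from.
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - count):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y

# Toy data: four rows of class 0 and one row of class 1.
X = [[1.0], [1.2], [0.9], [1.1], [5.0]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.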
KevinWAguiar/Data-Science-Process-KNeighbors-Classifier---GA-Lab-4.01
Walking through the best-practice Data Science Process: 1. Define the problem; 2. Obtain the data; 3. Explore the data; 4. Model the data; 5. Evaluate the model; 6. Answer the problem. Also, working with the KNeighbors Classifier.
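For readers unfamiliar with the classifier mentioned here, the following is a minimal plain-Python sketch of k-nearest-neighbors voting. It is not sklearn's KNeighborsClassifier and not the lab's code; the function name and the toy points are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Two well-separated toy clusters labeled "a" and "b".
train_x = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
train_y = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict(train_x, train_y, (2, 2))
near_b = knn_predict(train_x, train_y, (7, 9))
```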
KevinWAguiar/Train-Test-Split-Cross-Validation-GLM-Lab---GA-Lab-3.02
Part 1: Splitting the data with a train-test split and understanding the importance of Cross Validation and overfitting/underfitting. Part 2: Using Generalized Linear Models to predict shots made by Kobe Bryant.
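The splitting ideas from Part 1 can be sketched without any libraries. The version below is an illustration under assumed names and parameters, not sklearn's `train_test_split` or the lab's code: it holds out a test set first, then builds k folds over the remaining training rows so each row is used for validation exactly once.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them as a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def k_fold_indices(n_rows, k):
    """Yield (train_idx, val_idx) pairs; each row is validated exactly once."""
    indices = list(range(n_rows))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

train, test = train_test_split(range(20), test_fraction=0.25)
folds = list(k_fold_indices(len(train), k=5))
```

Averaging a model's score over the folds gives a steadier estimate than a single split, which helps reveal overfitting before the held-out test set is touched.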
KevinWAguiar/Sacremento-Real-Estate---Linear-Regression-on-Sacramento-s-Real-Estate-Data---GA-Lab-3.01
Working with Sacramento real estate data to build a Linear Regression model. Along the way, we interpret the values of the intercept and the slope. Also reviewed: pandas' get_dummies(), metrics used in Linear Regression, and the Bias-Variance Tradeoff.
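To show what interpreting the intercept and slope looks like, here is a small sketch of one-feature ordinary least squares in plain Python. The square footage and price figures are invented for the example and are not the Sacramento data; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical square footage and sale prices (price = 50,000 + 100 * sqft).
sqft = [1000, 1500, 2000, 2500]
price = [150_000, 200_000, 250_000, 300_000]
intercept, slope = fit_line(sqft, price)
```

With these made-up numbers, the slope says each additional square foot is associated with a predicted price about 100 dollars higher, and the intercept is the predicted price at zero square feet, which is an extrapolation rather than a realistic home.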
KevinWAguiar/Titanic-Data-Set---Working-with-Pandas-NumPy-Matplotlib-and-Seaborn---GA-Lab-2.01
Practice importing data, dealing with NaN values, Feature Extraction, Exploratory Analysis, and Visualizations using Pandas, NumPy, Matplotlib, and Seaborn.
KevinWAguiar/Inferential-Statistics---Understanding-Inf.-Stats.-and-Confidence-Intervals---GA-Lab-2.03
Further practice with Pandas while understanding Inferential Statistics, Confidence Intervals, and their interpretations.
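As an illustration of the confidence interval idea, the sketch below computes a normal-approximation interval for a mean using only the standard library. The sample values and function name are assumptions; for small samples like this one, a t-based interval would be the more appropriate choice.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err

# Hypothetical measurements.
sample = [52, 48, 50, 51, 49, 47, 53, 50, 50, 50]
low, high = mean_confidence_interval(sample)
```

The usual interpretation: if the sampling were repeated many times, about 95% of intervals built this way would contain the true population mean.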
KevinWAguiar/Pandas-Concatenation-Lab---Practice-working-with-Pandas---GA-Lab-2.02
Working with Pandas to manipulate/make changes to a DataFrame.
KevinWAguiar/Distributions---Understanding-different-Distributions-and-the-Central-Limit-Theorem---GA-Lab-1.02
Understanding some of the most prevalent distributions (Bernoulli, Binomial, Poisson, Exponential & Geometric), and understanding the Central Limit Theorem and its implications.
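A quick simulation can make the Central Limit Theorem tangible. This sketch, with assumed sample sizes and a fixed seed, draws repeated samples from a skewed Exponential distribution and looks at the distribution of their means; it is an illustration, not the lab's code.

```python
import random
import statistics

def sample_means(draw, sample_size, n_samples, seed=0):
    """Return the mean of each of `n_samples` samples drawn with `draw(rng)`."""
    rng = random.Random(seed)
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

# Exponential distribution with rate 1: mean 1, standard deviation 1, heavily skewed.
means = sample_means(lambda rng: rng.expovariate(1.0), sample_size=50, n_samples=2000)

center = statistics.fmean(means)   # should land near the population mean, 1
spread = statistics.stdev(means)   # should land near 1 / sqrt(50), about 0.14
```

Even though individual draws are skewed, the sample means cluster roughly symmetrically around the population mean, with a spread that shrinks as the sample size grows.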
KevinWAguiar/Building-Pokemon-Dictionary---Understanding-Data-Types---GA-Lab-1.01
Building a dictionary of Pokemon, almost like the Pokedex. The goal is to practice working with different data types, including how to add, remove, and edit entries.
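A minimal sketch of the kind of practice described, using a nested dictionary that mixes strings, integers, and lists. The structure and values are illustrative assumptions, not the lab's actual data.

```python
# Each entry maps a name (str) to a dict holding an int, a list, and another int.
pokedex = {
    "Bulbasaur": {"number": 1, "types": ["Grass", "Poison"], "hp": 45},
    "Charmander": {"number": 4, "types": ["Fire"], "hp": 39},
}

# Add a new entry.
pokedex["Squirtle"] = {"number": 7, "types": ["Water"], "hp": 44}

# Edit a value nested inside an existing entry.
pokedex["Charmander"]["hp"] = 40

# Remove an entry; pop() also returns the removed value.
removed = pokedex.pop("Bulbasaur")

names = sorted(pokedex)
```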
KevinWAguiar/PlotlyPractice
Practice with Plotly for Visualization Skill Building
KevinWAguiar/Python-for-Algorithms--Data-Structures--and-Interviews
Files for Udemy Course on Algorithms and Data Structures
KevinWAguiar/Twitter_webscrape_to_dataframe
Twitter webscrape-to-DataFrame Python script.
KevinWAguiar/InvoiceLog
An application that extracts material information (vendor, invoice number, invoice date, etc.) from an invoice and returns a .csv file containing the information. First, the application classifies which vendor is sending the bill (through predictive modeling/neural network). Next, a vendor's invoice setup is 'created' and stored. Finally, given which vendor sent the bill, the application extracts the information using OCR (pytesseract) and returns a .csv file containing the information.
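The final extraction step might look something like the sketch below. It assumes the OCR text is already available as a string (the pytesseract call and the vendor classifier are left out) and that a vendor's stored "setup" is a set of regular expressions. The vendor name, patterns, field names, and helper functions are all hypothetical, not taken from the application.

```python
import csv
import io
import re

# Hypothetical per-vendor "setup": one regex per field, keyed by vendor name.
VENDOR_SETUPS = {
    "Acme Supply": {
        "invoice_number": r"Invoice\s*#\s*(\S+)",
        "invoice_date": r"Date:\s*(\d{2}/\d{2}/\d{4})",
    },
}

def extract_fields(vendor, ocr_text):
    """Apply the vendor's stored patterns to OCR text; missing fields become ''."""
    record = {"vendor": vendor}
    for field, pattern in VENDOR_SETUPS[vendor].items():
        match = re.search(pattern, ocr_text)
        record[field] = match.group(1) if match else ""
    return record

def to_csv(records):
    """Serialize a non-empty list of records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

# Stand-in for text that OCR would produce from a scanned invoice.
ocr_text = "Acme Supply\nInvoice # A-1042\nDate: 03/15/2020\nTotal: $125.00"
record = extract_fields("Acme Supply", ocr_text)
csv_text = to_csv([record])
```

Keeping the patterns in a per-vendor table means supporting a new vendor is a data change rather than a code change, which matches the "setup is created and stored" step in the description.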
KevinWAguiar/MightyTakeHomeChallenge
KevinWAguiar/GA-Project-4
Material for GA Project 4.
KevinWAguiar/Baseball_Rookie_Webscrape
KevinWAguiar/MainRepo
KevinWAguiar/Instagram-API-python
Unofficial Instagram API that gives you access to all Instagram features (like, follow, upload photos and videos, etc.). Written in Python.
KevinWAguiar/invoice2data
Extract structured data from PDF invoices
KevinWAguiar/jekyll-now
Build a Jekyll blog in minutes, without touching the command line.