gzlupko
PhD student in Org Psych at Columbia University | Data Science intern at Autodesk
Columbia University
Pinned Repositories
5123_Linear_Models
Lab assignments and code repository for Group 1 in HUDM 5123 Linear Models & Experimental Design, Fall 2021
6122_multivariate
Code and assignments for HUDM 6122 Multivariate Analysis I. Methods covered include common data mining and dimensionality reduction techniques such as PCA, cluster analysis, factor analysis and multidimensional scaling.
An-Introduction-to-Statistical-Learning
This repository contains the exercises from the book "An Introduction to Statistical Learning" and their solutions in Python.
CART
This repo contains an R programming project that implements and compares decision tree models including C5.0, C4.5 and CART.
dl_timeSeries
Deep Learning for Time Series Classification
Employee_Attrition_Modeling
People analytics project in R that implements predictive modeling to identify employees most likely to leave a company. The discussion of implications for the sample firm and of proposed interventions draws on best practices in organizational development.
gr5073_ML
Personal repo for projects associated with GR 5073 Machine Learning for the Social Sciences
Network_Research
This R notebook applies machine learning classification methods in the context of organizational network analysis. The goal was to test the predictive accuracy of various supervised learning models on company employee network data.
nlp_projects
This repo contains Python and R code for Natural Language Processing (NLP) methods applied to open-entry text data collected in psychological and organizational research.
Recruiting_Prediction_Neural_Network
This project analyzes fictional recruiting data through two main approaches - prediction and explanation. A multilayer perceptron is used for prediction and logistic regression for explanation.
gzlupko's Repositories
gzlupko/Factor_Analysis
Factor analysis techniques in R, including EFA and CFA, applied to Martin & Doris's (2003) research on the development of a psychometric instrument measuring individual styles of humor.
gzlupko/gzlupko
gzlupko/hudm5199
Programming for Data Science
gzlupko/k-Nearest_Neighbors
gzlupko/Kaggle_riid_competition
gzlupko/natural-language-processing-1
This project explores methods in Natural Language Processing (NLP) including text mining and pre-processing, sentiment analysis, and Latent Dirichlet Allocation (LDA) topic modeling.
gzlupko/ORLA_6541_Applied_Data_Science
Work team repo for code and coursework associated with ORLA 6541 Applied Data Science in Organizations at Teachers College, Columbia University.
gzlupko/personal_website
Personal website
gzlupko/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
gzlupko/R-Shiny-Interactive-Visualization
This repository houses code for an R Shiny web application that allows users to share interactive data visualizations. I deployed the application via shinyapps.io, enabling users to access the application through a web browser.
gzlupko/raspberry_pi_python_projects
gzlupko/README-template.md
A README template for anyone to copy and use.
gzlupko/Reticulate
Running Python in R with the reticulate package
gzlupko/social_network_analysis
This repository utilizes social network analysis (SNA) to analyze multiple social networks. Clustering algorithms were used to explore sub-groups and QAP was implemented for non-parametric multiple regression analysis.
gzlupko/spark
Apache Spark - A unified analytics engine for large-scale data processing
gzlupko/Spotify_Data_Analysis
Performed data cleaning, visualization, and statistical testing in R on Spotify’s Global Top 50 songs. Implemented multiple regression to identify multivariate predictors of song popularity.
gzlupko/SQL_Database_in_AWS
Spun up an instance in Amazon Web Services (AWS) and built a database using SQL. Connected the AWS instance to RStudio in a local environment and ran SQL commands in RStudio using the DBI package.
gzlupko/statistics_multi
Repo for the course Applied Multivariate Statistics
gzlupko/transaction-fraud-detection
A data science project to predict whether a transaction is fraudulent.
gzlupko/Twitter_API
Used the Twitter API for data mining. Built an HTML-formatted data visualization of tweet activity for trending houseplants using RMarkdown.