Pinned Repositories
FindingEpics
This is the repo for an activity recommender web application based on VeloSpark
galvanize_hire_predictor
This is a data analysis and modeling project I did to predict the probability of a student being hired given a set of input variables
GdMacmillan.github.io
GitHub Pages blog built with Jekyll
kaggle-protein-classification
Human Protein Atlas Image Classification - Classify subcellular protein patterns in human cells
kickstarter_case_study
Analysis of a Kickstarter campaign dataset, with models that predict campaign success and staff picks
LensFlare
Python repository for deep learning using TensorFlow and NumPy
ml_flux_tutorial
A short tutorial to walk newbies or anyone interested in Julia through some basic code to start a computer vision project using Google Colab
spark_recommender_systems
This is an intro to recommender systems using the popular MovieLens dataset and Spark.ML
think_julia_book
Coursework for the ThinkJulia Book by Ben Lauwens: https://benlauwens.github.io/ThinkJulia.jl/latest/book.html
VeloSpark
The Colorado activity recommender. This is an application I built for my data science immersive capstone project with Galvanize.
GdMacmillan's Repositories
GdMacmillan/artificial-intelligence
GdMacmillan/awesome-public-datasets
An awesome list of (large-scale) public datasets on the Internet. (On-going collection)
GdMacmillan/courses
Course materials for the Data Science Specialization: https://www.coursera.org/specialization/jhudatascience/1
GdMacmillan/data-scientists-guide-apache-spark
Best practices for using Spark for practicing data scientists in the context of a data scientist’s standard workflow.
GdMacmillan/data_science
GdMacmillan/dataweek-workshop
Machine learning workshop using Python, pandas, and scikit-learn. The first half of the day covered supervised classification using logistic regression and how to use cross-validation to evaluate your models. The second half of the day covered unsupervised clustering with k-means as well as an overview of the data science process.
GdMacmillan/diploma-thesis
Diploma thesis "Concurrent Programming for Scalable Web Architectures" released under Creative Commons license
GdMacmillan/LearnDataScience
Open Content for self-directed learning in data science
GdMacmillan/LinRegVB
Code for my paper "Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression"
GdMacmillan/pipelines_and_featureunions
An in-depth tutorial on sklearn's Pipeline and FeatureUnion classes.
GdMacmillan/pyodbc
Python ODBC bridge
GdMacmillan/sounder
A grouping of Apache Pig examples.
GdMacmillan/spark-movie-lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
GdMacmillan/statlearning-notebooks
Python notebooks for exercises covered in Stanford statlearning class (where exercises were in R).