Pinned Repositories
building-spark-applications-live-lessons
Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable Spark applications for predictive analytics in the context of a data scientist's standard workflow.
dataweek-workshop
Machine learning workshop using Python, pandas, and scikit-learn. The first half of the day covered supervised classification using logistic regression and how to use cross-validation to evaluate your models. The second half covered unsupervised clustering with k-means as well as an overview of the data science process.
ml-workshop
Opinion-Mining-Project
Feature-Based Sentiment Analysis in Python
pipelines_and_featureunions
An in-depth tutorial on sklearn's Pipeline and FeatureUnion classes.
probabilistic-programming-intro
Introduction to probabilistic programming using PyMC3
python-anti-patterns
A presentation of commonly observed beginner mistakes.
self-study-resources
DSI Self Study Resources
spark-install
Installation guide for Apache Spark + Hadoop on Mac/Linux
zipfian-distribution
A self-contained environment for doing data science with {Python | Shell | R | Hadoop}. This is a Vagrant box built on Ubuntu 12.04 LTS.
Galvanize Data Science's Repositories
GalvanizeDataScience/DS-Glossary-RPT1
Student-led glossary of data science terms
GalvanizeDataScience/dataops-docs
Quickly get started with DevOps tools and best practices for building modern data solutions.
GalvanizeDataScience/ds-nyc-project-shell
A shell repository for data science capstone projects
GalvanizeDataScience/dsi-premium-prep
Repository for code and materials for the Galvanize DSI Premium Prep
GalvanizeDataScience/Practice
Capstone project: do colleges with greater diversity or higher tuition fees lead to higher-paying jobs?
GalvanizeDataScience/RFT4-Capstones
Directory of Final Capstones for Galvanize Data Science, Remote Full Time, Cohort 4
GalvanizeDataScience/Bootcamp-Project-1-Python
GalvanizeDataScience/course-dbt
Analytics engineering with dbt - projects and developer environment
GalvanizeDataScience/delete_me
GalvanizeDataScience/ds-precourse-axes
A repository for the challenges on modifying the axes of a matplotlib object.
GalvanizeDataScience/ds-precourse-data-viz
A testable project for data visualizations in DSI precourse
GalvanizeDataScience/Exoplanets
Using NASA data to determine whether an observation indicates the presence of an exoplanet
GalvanizeDataScience/express-mongoose-bookmarks-api
GalvanizeDataScience/express-mongoose-bookmarks-api-1
GalvanizeDataScience/fastbook
The fastai book, published as Jupyter Notebooks
GalvanizeDataScience/g120graphs
GalvanizeDataScience/git-primer-checkpoint
GalvanizeDataScience/github-permission-requests
The repository for requesting permissions
GalvanizeDataScience/Lectures-1
DS Lectures
GalvanizeDataScience/MiniDatabase
GalvanizeDataScience/py-numeric-computing-1-A
GalvanizeDataScience/Python
GalvanizeDataScience/Python-1-Assignment
GalvanizeDataScience/python-for-data-analysis-1
Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
GalvanizeDataScience/Python-I---Assignment
GalvanizeDataScience/PythonFullThrottle
Files for attendees of my PythonFullThrottle Safari Online Learning Live Training
GalvanizeDataScience/RPP-DSR-Reference
Repo built for continuous improvement in both the efficiency and effectiveness of the DSR role
GalvanizeDataScience/rpp_student_coding_sessions
GalvanizeDataScience/staged-recipes
A place to submit conda recipes before they become fully fledged conda-forge feedstocks
GalvanizeDataScience/stc-resources
Resources for STC partnership