This section is a guidebook for getting started with the technologies and tools used in data science.
- Blogs
- Analytics Vidhya
- Becoming a Data Scientist
- Data Mining Research
- Data Science Central
- Data Science Report
- Data Science Weekly
- Data Science @ Berkeley
- Dataconomy
- Datafloq
- DataTau
- DBMS2
- Domino Data Science Blog
- Edwin Chen’s Blog
- FastML
- FiveThirtyEight
- Flowing Data
- Hilary Mason’s Blog
- Insight
- KDnuggets
- Machine Learning (Theory)
- No Free Hunch
- O’Reilly on Our Radar
- R Bloggers
- Simply Statistics
- Smart Data Collective
- Statistical Modeling, Causal Inference, and Social Science
- Stats & Bots - Data stories on machine learning and analytics
- Steve Miller’s Blog at Information Management
- The Gradient Flow
- Walking Randomly
- What’s the Big Data?
- Books
- Concepts
- Databases
- Big Data Now: Current Perspectives from O'Reilly Radar
- Database Explorations
- Database Fundamentals
- Databases, Types, and The Relational Model: The Third Manifesto
- Foundations of Databases
- Readings in Database Systems, 5th Ed.
- Temporal Database Management by Christian S. Jensen
- The Theory of Relational Databases
- What is Database Design, Anyway?
- Data Mining
- A Programmer's Guide to Data Mining by Ron Zacharski
- Data Jujitsu: The Art of Turning Data into Product (email address requested, not required)
- Data Mining Algorithms In R - Wikibooks
- Internet Advertising: An Interplay among Advertisers, Online Publishers, Ad Exchanges and Web Users
- Introduction to Data Science by Jeffrey Stanton
- Mining of Massive Datasets
- School of Data Handbook
- Theory and Applications for Advanced Text Mining
- Information Retrieval
- Licensing
- Machine Learning
- A Brief Introduction to Machine Learning for Engineers by Osvaldo Simeone
- A Brief Introduction to Neural Networks
- A Course in Machine Learning
- A First Encounter with Machine Learning
- Algorithms for Reinforcement Learning by Csaba Szepesvári
- An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
- Bayesian Reasoning and Machine Learning by David Barber
- Computer Vision by Dana Ballard, Chris Brown
- Computer Vision: Algorithms and Applications by Richard Szeliski
- Computer Vision: Models, Learning, and Inference by Simon J.D. Prince
- Convex Optimization by Stephen Boyd and Lieven Vandenberghe
- Deep Learning by Ian Goodfellow and Yoshua Bengio and Aaron Courville
- Gaussian Processes for Machine Learning by Carl Edward Rasmussen and Christopher K. I. Williams
- Information Theory, Inference, and Learning Algorithms
- Introduction to Machine Learning by Amnon Shashua
- Learning Deep Architectures for AI
- Machine Learning
- Machine Learning, Neural and Statistical Classification
- Neural Networks and Deep Learning
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction - Second Edition, February 2009 by Trevor Hastie, Robert Tibshirani, and Jerome Friedman
- The LION Way: Machine Learning plus Intelligent Optimization
- Regular Expressions
- Spark
- Databases
- Languages
- Python
- 20 Python Libraries You Aren't Using (But Should)
- A Beginner's Python Tutorial
- A Guide to Python's Magic Methods by Rafe Kettler
- A Whirlwind Tour of Python by Jake VanderPlas
- Automate the Boring Stuff by Al Sweigart
- Building Machine Learning Systems with Python by Willi Richert & Luis Pedro Coelho, Packt
- Code Like a Pythonista: Idiomatic Python
- Data Structures and Algorithms in Python by B. R. Preiss
- From Python to NumPy
- Full Stack Python
- Functional Programming in Python
- Google's Python Style Guide
- Hadoop with Python
- High Performance Python
- How to Make Mistakes in Python by Mike Pirnat
- Intermediate Python by Muhammad Yasoob Ullah Khalid
- Kalman and Bayesian Filters in Python
- Learn Python, Break Python
- Learn Python in Y minutes
- Learn Pandas by Hernan Rojas
- Learn to Program Using Python by Cody Jackson
- Learn TensorFlow (IPython Notebooks)
- Learning Python by Fabrizio Romano, Packt
- Learning to Program
- Math for Programmers (Using Python)
- Mining the Social Web - 2nd Edition (IPython Notebooks)
- Modeling Creativity: Case Studies in Python by Tom D. De Smedt
- Picking a Python Version: A Manifesto
- Practical Programming in Python by Jeffrey Elkner
- Programming Computer Vision with Python by Jan Erik Solem
- Python Cookbook by David Beazley
- Python for Everybody: Exploring Data Using Python 3 by Charles Severance
- Python Practice Projects
- SciPy Lecture Notes
- Python Data Science Handbook (IPython Notebooks)
- The Hitchhiker’s Guide to Python!
- The Python Game Book
- Think Stats: Probability and Statistics for Programmers by Allen B. Downey
- R
- Advanced R Programming by Hadley Wickham
- An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
- Cookbook for R by Winston Chang
- Introduction to Probability and Statistics Using R by G. Jay Kerns
- Learning Statistics with R by Daniel Navarro
- Machine Learning with R by Brett Lantz, Packt
- ModernDive by Chester Ismay and Albert Y. Kim
- Practical Regression and Anova using R by Julian J. Faraway
- Probabilistic Models in the Study of Language (with R Code)
- R for Data Science by Garrett Grolemund and Hadley Wickham
- R for Spatial Analysis
- R Language for Programmers by John D. Cook
- R Packages by Hadley Wickham
- R Practicals
- R Programming
- R Programming for Data Science by Roger D. Peng
- R Succinctly, Syncfusion
- The caret Package by Max Kuhn
- The R Inferno by Patrick Burns
- The R Language
- The R Manuals
- Tidy Text Mining with R by Julia Silge and David Robinson
- SQL
- Python
- Mathematics
- Algebra
- A First Course in Linear Algebra by Robert A. Beezer
- Advanced Algebra by Anthony W. Knapp
- Basic Algebra by Anthony W. Knapp
- Basics of Algebra, Topology, and Differential Calculus
- Lecture Notes of Linear Algebra by Dr. P. Shunmugaraj, IIT Kanpur
- Linear Algebra by Dr. Arbind K Lal, IIT Kanpur
- Linear Algebra
- Linear Algebra by Jim Hefferon
- Calculus
- Misc.
- An Introduction to the Theory of Numbers by Leo Moser
- Book of Proof by Richard Hammack
- Category Theory for the Sciences
- Computational and Inferential Thinking: The Foundations of Data Science
- Computational Geometry
- Essentials of Metaheuristics by Sean Luke
- Graph Theory
- Introduction to Proofs by Jim Hefferon
- Knapsack Problems - Algorithms and Computer Implementations by Silvano Martello and Paolo Toth
- Mathematical Logic - an Introduction
- Mathematics, MTH101A by P. Shunmugaraj, IIT Kanpur
- Non-Uniform Random Variate Generation by Luc Devroye
- Number Theory by Holden Lee, MIT
- Power Programming with Mathematica by David B. Wagner
- Probability & Statistics
- Bayesian Methods for Hackers by Cameron Davidson-Pilon
- CK-12 Probability and Statistics - Advanced
- Collaborative Statistics
- Concepts & Applications of Inferential Statistics
- Forecasting: Principles and Practice by Rob J Hyndman and George Athanasopoulos
- HyperStat
- Introduction to Probability by Charles M. Grinstead and J. Laurie Snell
- Introduction to Probability and Statistics Spring 2014
- Introduction to Statistical Thought by Michael Lavine
- Multivariate Statistics: Concepts, Models, and Applications - 3rd Web Edition by David W. Stockburger
- OpenIntro Statistics
- Probability and Statistics Cookbook
- Probability and Statistics EBook
- Statistics Done Wrong by Alex Reinhart
- Statistical Learning with Sparsity: The Lasso and Generalizations by Trevor Hastie, Robert Tibshirani, and Martin Wainwright
- StatLect
- StatSoft
- The Little Handbook of Statistical Practice by Gerard E. Dallal, Ph.D
- Think Bayes: Bayesian Statistics Made Simple by Allen B. Downey
- Algebra
- Concepts
- Cheatsheets
- Databases
- Languages
- MATLAB
- Python
- Methodologies
- Packages
- R
- Methodologies
- Packages
- caret (Modeling and Machine Learning)
- cartography (Thematic Maps with Spatial Objects)
- data.table
- devtools (Package Development)
- dplyr (Data Transformation)
- eurostat (Eurostat Database)
- ggplot2 (Data Visualization)
- h2o (Big Data and Parallel Processing)
- leaflet (Interactive Maps)
- mlr
- mosaic
- quanteda (Quantitative Analysis of Textual Data)
- randomizr (Random Assignment and Sampling)
- shiny
- sparklyr
- survminer (Survival Plots)
- testthat
- Tidyverse
- xplain (Statistical functions for XML data)
- xts (Time Series)
- Math
- Misc.
- Competitions
- Courses
- Datasets
- APIs
- Data Lists
- Data Repositories
- Open Data Portals
- Country-specific / Government Affiliated
- Independent
- Frameworks
- Podcasts
- Data Science
- Data Visualization
- Machine Learning
- Python Programming
- R Programming
- Statistics
- Talks
- TED
- A Smarter, More Precise Way to Think About Public Health by Sue Desmond-Hellmann
- Big Data is Better Data by Kenneth Cukier
- How Data Will Transform Business by Philip Evans
- How We Found the Worst Place to Park in New York City — Using Big Data by Ben Wellington
- The Human Insights Missing From Big Data by Tricia Wang
- The Rise of Human-Computer Cooperation by Shyam Sankar
- What Do We Do With All This Big Data? by Susan Etlinger
- What Will a Future Without Secrets Look Like? by Alessandro Acquisti
- What’s the Next Window into Our Universe? by Andrew Connolly
- Why Smart Statistics are the Key to Fighting Crime by Anne Milgram
- TED
- Tools