Introduction to Data Science

A Python Approach to Concepts, Techniques and Applications

This repository accompanies the book "Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications":

http://www.springer.com/gp/book/9783319500164

About the Textbook:

This accessible and classroom-tested textbook/reference presents an introduction to the fundamentals of the emerging and interdisciplinary field of data science. The coverage spans key concepts adopted from statistics and machine learning, useful techniques for graph analysis and parallel programming, and the practical application of data science for such tasks as building recommender systems or performing sentiment analysis. Topics and features: provides numerous practical case studies using real-world data throughout the book; supports understanding through hands-on experience of solving data science problems using Python; describes techniques and tools for statistical analysis, machine learning, graph analysis, and parallel programming; reviews a range of applications of data science, including recommender systems and sentiment analysis of text data; provides supplementary code resources and data at an associated website.

About the authors:

Dr. Laura Igual is an Associate Professor at the Departament de Matemàtiques i Informàtica, Universitat de Barcelona, Spain. Dr. Santi Seguí is an Assistant Professor at the same institution. The book was co-written by Jordi Vitrià, Eloi Puertas, Petia Radeva, Oriol Pujol, Sergio Escalera, Francesc Dantí and Lluís Garrido.

Subject Area of the Book

In an era in which huge amounts of information from different fields are gathered and stored, analyzing that information and extracting value from it has become one of the most attractive tasks for companies and for society in general. Designing solutions to the new questions that emerge from data has required multidisciplinary teams. Computer scientists, statisticians, mathematicians, biologists, journalists, sociologists and many others are now working together to extract knowledge from data. This new interdisciplinary field is called Data Science. The pipeline of any data science project involves asking the right questions, gathering data, cleaning data, generating hypotheses, making inferences, visualizing data and assessing solutions.
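As a rough illustration of the pipeline described above, the following minimal sketch walks through a few of its steps (posing a question, cleaning data, making a simple descriptive inference) using only the Python standard library. It is not taken from the book: the toy data and the helper names `clean` and `group_means` are hypothetical and chosen only for this example.

```python
import statistics

# Hypothetical toy data: (group, measurement) pairs; None marks a missing value.
raw = [("a", 1.0), ("a", 2.0), ("a", None), ("b", 4.0), ("b", 6.0)]


def clean(records):
    """Cleaning step: drop records whose measurement is missing."""
    return [(group, value) for group, value in records if value is not None]


def group_means(records):
    """Descriptive inference step: mean measurement per group."""
    groups = {}
    for group, value in records:
        groups.setdefault(group, []).append(value)
    return {group: statistics.mean(values) for group, values in groups.items()}


cleaned = clean(raw)
means = group_means(cleaned)

# Question posed of the data: does group "b" have a higher mean than group "a"?
answer = means["b"] > means["a"]
print(means, answer)
```

A real project would add the remaining steps, such as gathering data from an actual source, visualizing it and assessing the solution; this sketch only shows how the stages chain together.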

Organization and Features of the Book

This book is an introduction to concepts, techniques and applications in Data Science. It focuses on the analysis of data, covering concepts from statistics to machine learning; techniques for graph analysis and parallel programming; and applications such as recommender systems and sentiment analysis. Every chapter introduces new concepts that are illustrated by practical cases using real data. Public databases such as Eurostat, several social networks and MovieLens are used. Specific questions about the data are posed in each chapter. The solutions to these questions are implemented in the Python programming language and presented in properly commented code boxes. This allows the reader to learn data science by solving problems whose solutions generalize to other problems. The book is not intended to cover the whole set of data science methods, nor to provide a complete collection of references. Data science is currently a growing, emerging field, so readers are encouraged to search online for specific methods and references using keywords.

Target audiences

This book is addressed to upper-level undergraduate and beginning graduate students from technical disciplines. It is also addressed to professionals taking continuing-education short courses and to researchers from diverse areas studying on their own. Basic skills in computer science, mathematics and statistics are required. Programming experience in Python is beneficial. However, being new to Python should not be a problem, since the Python basics can be acquired in a short period of time.

Previous Uses of the Materials

Parts of the presented materials have been used in the postgraduate course on Data Science and Big Data at the University of Barcelona. All contributing authors are involved in this course.