This is the code repository for Practical Data Science Cookbook - Second Edition, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.
As more data is generated each year, analyzing and operationalizing it matters more than ever. Companies that know what to do with their data have a competitive advantage over companies that don't, which drives higher demand for knowledgeable and competent data professionals.
Starting with the basics, this book covers how to set up your numerical programming environment, introduces you to the data science pipeline, and guides you through several data projects in a step-by-step format. By sequentially working through the steps in each chapter, you will quickly familiarize yourself with the process and learn how to apply it to a variety of situations with examples using the two most popular programming languages for data analysis—R and Python.
All of the code is organized into folders. Each folder starts with a number followed by the application name. For example, Chapter02.
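As a quick way to see which chapter folders are present in a local copy, the following minimal sketch lists them in chapter order. It assumes every chapter folder follows the `Chapter02` naming shown above (the word `Chapter` followed by digits); the function name `list_chapter_folders` is hypothetical and not part of the book's code.

```python
import re
from pathlib import Path

# Assumption: chapter folders are named like "Chapter02" (based on the example above).
CHAPTER_PATTERN = re.compile(r"^Chapter(\d+)$")


def list_chapter_folders(repo_root):
    """Return chapter folder names under repo_root, ordered by chapter number."""
    chapters = []
    for entry in Path(repo_root).iterdir():
        match = CHAPTER_PATTERN.match(entry.name)
        # Keep only directories whose names match the assumed pattern.
        if match and entry.is_dir():
            chapters.append((int(match.group(1)), entry.name))
    return [name for _, name in sorted(chapters)]


if __name__ == "__main__":
    # Run from the root of the cloned repository.
    for name in list_chapter_folders("."):
        print(name)
```

Sorting on the parsed chapter number rather than the raw name keeps the order correct even if a folder is not zero-padded.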
A block of code is set as follows:
<Context path="/jira" docBase="${catalina.home}/atlassian-jira" reloadable="false" useHttpOnly="true">
Any command-line input or output is written as follows:
mysql -u root -p
For this book, you will need a computer with access to the Internet and the ability to install the open source software needed for the projects. The primary software we will be using consists of the R and Python programming languages, with a myriad of freely available packages and libraries. Installation instructions are in the first chapter.
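Before starting the projects, it can help to confirm that both languages are reachable from the command line. The sketch below is only an illustration: the executable names `Rscript` and `python3` are assumptions that may differ on your system (for example, `python` on Windows), and the helper `find_missing_tools` is hypothetical. The first chapter remains the reference for installation.

```python
import shutil


def find_missing_tools(tools=("Rscript", "python3")):
    """Return the names in `tools` that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("R and Python both appear to be installed.")
```

A tool reported as missing may simply be installed under a different name or outside the PATH, so treat the output as a hint rather than a definitive check.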