Code and slides to accompany the online webinar series by Data For Science: https://data4sci.com/excel
Microsoft Excel has a long and proud history of bringing data analysis and processing to the masses with its intuitive interface and powerful functionality. However, it has important limitations that are becoming ever more apparent in the current age of Big Data, and that can only be overcome by up-skilling to a more programming-oriented workflow.
This lecture introduces the ways in which Python and Pandas can build on your Excel analyses to bring sophisticated Data Science and Machine Learning tools into your pipeline. We'll also cover how to read data from, and write results to, Excel spreadsheets.
- Default settings
- Worksheet sizes and cross references
- Formatting and styling
- Functions and cell evaluation
- Importing CSV files and Excel spreadsheets
- Data cleaning
- Subsetting
- DataFrame Manipulations
- Merge and Join
- Generating simple Excel spreadsheets
- Data smoothing
- Pivot tables
- Basic plotting
- Linear regression
- Curve fitting
- Adding sheets to a workbook
- Reading and formatting Excel files
- Inspecting arbitrary cells
- Modifying specific rows and columns
- Appending a DataFrame to an Excel sheet
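
As a taste of the topics above, here is a minimal sketch of a merge, a pivot table, and a round trip through an Excel workbook. It is not taken from the webinar notebooks: the `orders` and `prices` tables, their column names, and the sheet names `data` and `pivot` are all made up for illustration. It assumes `pandas` is installed together with an Excel engine such as `openpyxl`, and it writes to an in-memory buffer so that no file is left on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for two worksheets.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "region": ["East", "West", "East", "West"],
    "product": ["A", "A", "B", "B"],
    "units": [10, 5, 7, 3],
})
prices = pd.DataFrame({"product": ["A", "B"], "unit_price": [2.0, 4.0]})

# Merge: look up each order's unit price, much like a VLOOKUP would.
merged = orders.merge(prices, on="product", how="left")
merged["revenue"] = merged["units"] * merged["unit_price"]

# Pivot table: total revenue by region (rows) and product (columns).
pivot = merged.pivot_table(
    index="region",
    columns="product",
    values="revenue",
    aggfunc="sum",
    fill_value=0,
)

# Write both tables to separate sheets of one workbook (in memory here;
# a file path such as "report.xlsx" could be used instead).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer) as writer:
    merged.to_excel(writer, sheet_name="data", index=False)
    pivot.to_excel(writer, sheet_name="pivot")

# Read the first sheet back to confirm the round trip.
buffer.seek(0)
roundtrip = pd.read_excel(buffer, sheet_name="data")
print(pivot)
print(roundtrip.head())
```

Passing a file path instead of the buffer to `pd.ExcelWriter` and `pd.read_excel` works the same way and produces a workbook that opens directly in Excel.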