Data Analysis with Python and PySpark

This is the companion repository for the book Data Analysis with Python and PySpark (Manning, estimated publication date: mid-2021). It contains the source code and the data (or data download scripts, where applicable).

Mistakes or omissions

If you encounter mistakes in the book manuscript (including the printed source code), please use the Manning platform to provide feedback.