Azure Databricks - Advent of 2020 Blogposts

Microsoft Azure Databricks

This repository collects the blogposts from the Advent of Azure Databricks 2020 series, written to make onboarding with Azure Databricks easier.

Table of contents / Featured blogposts

  1. Dec 01 2020 - What is Azure Databricks (blogpost)
  2. Dec 02 2020 - How to get started with Azure Databricks (blogpost)
  3. Dec 03 2020 - Getting to know the workspace and Azure Databricks platform (blogpost)
  4. Dec 04 2020 - Creating your first Azure Databricks cluster (blogpost)
  5. Dec 05 2020 - Basics of the architecture of clusters, workers, DBFS storage and jobs (blogpost)
  6. Dec 06 2020 - Importing and storing data to Azure Databricks (blogpost)
  7. Dec 07 2020 - Starting with Databricks notebooks and loading data (blogpost)
  8. Dec 08 2020 - Using Databricks CLI and DBFS CLI for file upload (blogpost)
  9. Dec 09 2020 - Connect to Azure Blob storage using Notebooks in Azure Databricks (blogpost)
  10. Dec 10 2020 - Using Azure Databricks Notebooks with SQL for Data engineering tasks (blogpost)
  11. Dec 11 2020 - Using Azure Databricks Notebooks with R to do Data engineering and data analytics (blogpost)
  12. Dec 12 2020 - Using Azure Databricks Notebooks with Python to do Data engineering and data analytics (blogpost)
  13. Dec 13 2020 - Using Python Databricks Koalas with Azure Databricks (blogpost)
  14. Dec 14 2020 - From configuration to execution of Databricks jobs (blogpost)
  15. Dec 15 2020 - Databricks Spark UI, Event Logs, Driver logs and Metrics (blogpost)
  16. Dec 16 2020 - Databricks experiments, models and MLflow (blogpost)
  17. Dec 17 2020 - End-to-End Machine learning project in Azure Databricks (blogpost)
  18. Dec 18 2020 - Using Azure Data Factory with Azure Databricks (blogpost)
  19. Dec 19 2020 - Using Azure Data Factory with Azure Databricks for merging CSV files (blogpost)
  20. Dec 20 2020 - Orchestrating multiple notebooks with Azure Databricks (blogpost)
  21. Dec 21 2020 - Using Scala with Spark Core API in Azure Databricks (blogpost)
  22. Dec 22 2020 - Using Spark SQL and DataFrames in Azure Databricks (blogpost)
  23. Dec 23 2020 - Using Spark Streaming in Azure Databricks (blogpost)
  24. Dec 24 2020 - Using Spark MLlib for Machine Learning in Azure Databricks (blogpost)
  25. Dec 25 2020 - Using Spark GraphFrames in Azure Databricks (blogpost)
  26. Dec 26 2020 - Connecting Azure Machine Learning Services Workspace and Azure Databricks (blogpost)
  27. Dec 27 2020 - Connecting Azure Databricks with on-premises environment (blogpost)
  28. Dec 28 2020 - Infrastructure as Code and how to automate, script and deploy Azure Databricks with PowerShell (blogpost)
  29. Dec 29 2020 - Performance tuning for Apache Spark (blogpost)
  30. Dec 30 2020 - Monitoring and troubleshooting of Apache Spark (blogpost)
  31. Dec 31 2020 - Azure Databricks documentation, learning materials and additional resources (blogpost)

Additional Material

Additional material, a collection of demos from different sessions, is also available in this repository.

Blog

All posts were originally published on my blog and are copied here on GitHub. On GitHub it is extremely simple to clone the code, the Markdown files and all the materials.

Cloning the repository

Run the command below to clone the repository.

git clone -n https://github.com/tomaztk/Azure-Databricks.git
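
Note that `-n` is short for `--no-checkout`, so the command above fetches the history but leaves the working directory empty until a checkout is run. Here is a minimal sketch of one way to populate the files afterwards. The helper name `clone_and_checkout` and its optional target-directory argument are illustrative assumptions, not part of this repository:

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): clone without checkout,
# then populate the working tree from the HEAD commit.
clone_and_checkout() {
    url="$1"
    # Default target directory: the repository name without the .git suffix.
    dir="${2:-$(basename "$url" .git)}"

    # -n (--no-checkout) fetches the history but writes no files.
    git clone -n "$url" "$dir" || return 1

    # Restore every tracked file from HEAD into the index and working tree.
    git -C "$dir" checkout HEAD -- .
}

# Usage (requires network access):
#   clone_and_checkout https://github.com/tomaztk/Azure-Databricks.git
```

Dropping `-n` from the original command should give the same result in a single step.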

Contact

Get in contact:

Gmail

GitHub URL

Contributing

Do the usual GitHub fork and pull request dance. Add yourself to the contributors section if you want to, or I will add you.

Suggestions

Feel free to suggest any new topics that you would like to be covered.

License

MIT © Tomaž Kaštrun