/sparksnake

Improving the development of Spark applications deployed as jobs on AWS services like Glue and EMR

Primary language: Python. License: MIT.




What is the sparksnake library?

The sparksnake library provides an easy, fast, and efficient way to use Spark features inside analytics services on AWS. It ships classes, methods and functions built on pyspark that simplify, as much as possible, the journey of building Spark applications anywhere!

Note: The sparksnake library now has official documentation on readthedocs! Check it out for usage and technical details, hands-on demos and more!

Features

  • 🤖 Apply common Spark operations in just a few lines of code
  • 💻 Start developing your Spark applications anywhere using the "default" mode, or in any AWS service that uses Spark
  • ⏳ Stop spending time setting up the boring stuff of your Spark applications
  • 💡 Follow best practices by giving your application code a clear, consistent structure
  • 👁️‍🗨️ Improve your application's observability with detailed log messages on CloudWatch and exception handlers
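
To make the observability point more concrete, here is a minimal sketch of detailed log messages combined with an exception handler around a job step. It uses only the Python standard library and is not sparksnake's API: the names `get_logger`, `logged_step` and `count_rows` are hypothetical, picked just for this illustration. It also assumes the job runs on a service that forwards standard output to CloudWatch.

```python
import logging
import sys
from functools import wraps


def get_logger(name: str, level: int = logging.INFO) -> logging.Logger:
    """Hypothetical helper: build a logger that writes detailed messages to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(
            "%(levelname)s;%(asctime)s;%(filename)s;%(lineno)d;%(message)s"
        ))
        logger.addHandler(handler)
    return logger


def logged_step(logger: logging.Logger):
    """Hypothetical decorator: log the start, success and failure of a job step."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            logger.info("Starting step %s", func.__name__)
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Log the full traceback, then re-raise so the job still fails
                logger.exception("Step %s failed", func.__name__)
                raise
            logger.info("Step %s finished", func.__name__)
            return result
        return wrapper
    return decorator


logger = get_logger("my_job")


@logged_step(logger)
def count_rows(rows):
    """Stand-in for a real Spark step; here it just counts items in a list."""
    return len(rows)
```

Calling `count_rows([1, 2, 3])` logs a start and a finish message around the call. A step that raises gets its traceback logged before the exception propagates, so the failure shows up in the logs without being swallowed.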


