Apache Spark Examples
This project contains examples using Apache Spark and Scala. I'd suggest reading the book Agile Data Science, because these examples were inspired by it.
What do you need to use these examples?
First of all, you need to download your emails. I used the Python app from the Agile Data Science GitHub repository. In my case, I used these commands and options:
- pip install lepl
- pip install avro
- My Gmail is in Portuguese, so I used this value for the All Mail folder: '[Gmail]/Todo o correio'
(Optional) gmail.py command:
$ ./gmail.py -m automatic -u 'your.email@gmail.com' -p 'your_password' -s ./email.avro.schema -f '[Gmail]/Todo o correio' -o /tmp/test_mbox 2>&1 &
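If you prefer to keep the steps above in one place, here is a minimal sketch of a wrapper script. The variable names (GMAIL_USER, GMAIL_PASSWORD, GMAIL_FOLDER, OUT_DIR, RUN) and the print_gmail_cmd function are hypothetical and not part of gmail.py; the gmail.py flags are copied from the command above. By default it only prints the command so you can review it, and it runs the install and download only when RUN=1.

```shell
#!/bin/sh
# Hypothetical settings; override them in the environment before running.
GMAIL_USER="${GMAIL_USER:-your.email@gmail.com}"
GMAIL_FOLDER="${GMAIL_FOLDER:-[Gmail]/Todo o correio}"
OUT_DIR="${OUT_DIR:-/tmp/test_mbox}"

# Print the gmail.py invocation for review, with the password left as a placeholder.
print_gmail_cmd() {
    printf "./gmail.py -m automatic -u '%s' -p 'your_password' -s ./email.avro.schema -f '%s' -o %s\n" \
        "$GMAIL_USER" "$GMAIL_FOLDER" "$OUT_DIR"
}

if [ "${RUN:-0}" = "1" ]; then
    # Install the two Python dependencies listed above, then download the mail.
    pip install lepl avro
    # GMAIL_PASSWORD is an assumed environment variable holding your password.
    ./gmail.py -m automatic -u "$GMAIL_USER" -p "$GMAIL_PASSWORD" \
        -s ./email.avro.schema -f "$GMAIL_FOLDER" -o "$OUT_DIR"
else
    # Dry run: show the command without executing anything.
    print_gmail_cmd
fi
```

Reading the password from an environment variable keeps it out of the script file. If your Gmail uses another language, set GMAIL_FOLDER to the name of your All Mail folder.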