This homework is a multifaceted assignment that blends data analysis with algorithmic problem-solving on two large datasets containing information about books and their authors. A variety of tools were used, including Python, Pandas, command-line scripting, Apache Spark, Dask, and AWS EC2 instances.
Members of Group 17:
- Saif Ali (saif.dev03@gmail.com)
- Pietro Sciabbarrasi (sciabbarrasi.1970875@studenti.uniroma1.it)
- Simone Zagaria (nephrite28@gmail.com)
- Edo Fejzic (edo.fejzic@hotmail.com)
Here is an overview of the main files in this project repository:
- `algorithmic_question.ipynb`: Jupyter notebook containing the solution to the algorithmic question.
- `aws_question.ipynb`: Jupyter notebook detailing the AWS question analysis.
- `aws_script.py`: Python script for the AWS question.
- `commandline_LLM.sh`: Executable shell script for the command-line question, optimized with ChatGPT.
- `commandline_original.sh`: Original executable shell script for the command-line question.
- `commandline_question.ipynb`: Jupyter notebook that documents the command-line question.
- `.gitignore`: File specifying which files and directories Git should ignore.
- `README.md`: Markdown file with information about the project.
- `main.ipynb`: Jupyter notebook containing in-depth analysis for multiple research questions.