Analyze movie-related data using Neo4j and more!
This big data project uses the Neo4j graph database to analyze movie data from IMDb and provide insight into movies, actors, and customers.
We ask questions like:
- What is the max/min/avg of the MovieLens ratings for “Avatar”?
- How many directors have Christian Bale and Michael Caine worked with?
- Which actors or actresses is a specific customer likely to enjoy most, based on all the movies he or she has rated, so that we can recommend them to this customer in the future?
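To make the first and last questions concrete, here is a minimal Python sketch of the logic behind them. It is not the project's Cypher code: the sample ratings, the cast lists, and the function names `rating_stats` and `favorite_actors` are all hypothetical, and the real dataset's schema may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data for illustration only; not taken from the dataset.
# Each rating is (customer_id, movie_title, rating).
ratings = [
    (1, "Avatar", 4.0),
    (2, "Avatar", 3.0),
    (3, "Avatar", 5.0),
    (1, "The Dark Knight", 5.0),
    (1, "The Prestige", 4.5),
]

# Hypothetical movie -> cast mapping.
cast = {
    "Avatar": ["Sam Worthington", "Zoe Saldana"],
    "The Dark Knight": ["Christian Bale", "Michael Caine"],
    "The Prestige": ["Christian Bale", "Michael Caine", "Hugh Jackman"],
}


def rating_stats(title):
    """Return (max, min, avg) of all ratings for one movie title."""
    values = [rating for _, t, rating in ratings if t == title]
    return max(values), min(values), mean(values)


def favorite_actors(customer_id):
    """Rank actors by the customer's average rating of movies they appear in."""
    per_actor = defaultdict(list)
    for cid, title, rating in ratings:
        if cid != customer_id:
            continue
        for actor in cast.get(title, []):
            per_actor[actor].append(rating)
    averages = {actor: mean(vals) for actor, vals in per_actor.items()}
    # Highest average first; ties are broken alphabetically by name.
    return sorted(averages.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(rating_stats("Avatar"))
    print(favorite_actors(1))
```

Averaging a customer's ratings per actor is only one plausible way to score preference; it ignores how many movies back each average, so a production query would likely weight by count as well.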
Here's the project report and a visual representation of a small fraction of data in the database:
Use the following instructions to achieve the same database state.
- python3 is used for data cleansing and validation.
brew install python
- pipenv is used for modern dependency management.
brew install pipenv
- Download the required dataset.
- Cleanse and validate the dataset with the provided python3 code (or download the validated data directly from here).
- Copy the datasets into the Neo4j import folder and open Neo4j Browser.
- Follow the comments and run the Cypher code in the load-csv-v2.cypher file to import the data into the Neo4j database.
- Execute the Cypher queries in the business-questions-v2.cypher file, or write your own queries.
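As a rough illustration of what the cleansing and validation step might involve, the sketch below filters a ratings CSV before it is copied into the Neo4j import folder. It is not the project's actual script: the column names (`userId`, `movieId`, `rating`), the 0.5 to 5.0 rating range, and the function name `clean_ratings` are assumptions made for this example.

```python
import csv
import io

# Hypothetical raw input; column names are assumed, not taken from the project.
RAW = """userId,movieId,rating
1,10,4.0
1,10,4.0
2,,3.5
3,11,7.0
4,12,abc
5,13,2.5
"""


def clean_ratings(lines):
    """Keep rows with both ids present, a numeric in-range rating, and no duplicates."""
    seen = set()
    valid = []
    for row in csv.DictReader(lines):
        user = (row.get("userId") or "").strip()
        movie = (row.get("movieId") or "").strip()
        if not user or not movie:
            continue  # missing identifier
        try:
            rating = float(row.get("rating") or "")
        except ValueError:
            continue  # non-numeric or empty rating
        if not 0.5 <= rating <= 5.0:
            continue  # outside the assumed rating scale
        key = (user, movie)
        if key in seen:
            continue  # duplicate (user, movie) pair
        seen.add(key)
        valid.append({"userId": user, "movieId": movie, "rating": rating})
    return valid


if __name__ == "__main__":
    for row in clean_ratings(io.StringIO(RAW)):
        print(row)
```

Validating before import keeps malformed rows out of the graph, where they are harder to find and remove than in a flat file.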
This project is a collective effort of the following members:
- @Karashan
- @StrongWeiUMN
- @LuyaoZhang5380
- @zixuanzhang98
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Zixuan Zhang - zixuanzhang.x@gmail.com
Project Link: https://github.com/zixuanzhang98/neo4j-data-processing