This repository supports the Kaggle Cursed Movie challenge. The goal of the challenge is to predict whether a movie is cursed, based on a variety of graph features computed within Neo4j.
Email: clair.sullivan@neo4j.com
- ./cypher_queries/
  - create_rdf_data.cql: Cypher query used to create the RDF data for this challenge. You do not need to run it, since the data is already provided (see below); the script only shows how the data file was created.
  - graph_import.cql: Queries that populate a single graph database with two different datasets (see below). The first three queries pull in the RDF data. The final query pulls in the CSV data.
- ./data/
  - movies-small.nt: The starting movie list, collected from Wikidata and converted to RDF, in N-Triples format.
  - curse_data_mined.csv: Supplemental, hand-collected information about the movies above.
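
Before importing anything into Neo4j, it can help to look at the two data files locally. The following is a minimal sketch, not part of this repository: the helper names (`parse_ntriples`, `predicate_counts`, `csv_columns`) and the `example.org` IRIs are hypothetical, and the regular expression handles only simple N-Triples statements rather than the full grammar.

```python
import csv
import re
from collections import Counter

# Simplified N-Triples pattern: subject (IRI or blank node), predicate (IRI),
# object (anything), then the terminating " .". This is an assumption-laden
# shortcut, not a complete N-Triples parser.
TRIPLE_RE = re.compile(r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def parse_ntriples(lines):
    """Yield (subject, predicate, object) tuples, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = TRIPLE_RE.match(line)
        if match:
            yield match.groups()


def predicate_counts(lines):
    """Count how often each predicate occurs, to see which properties exist."""
    return Counter(predicate for _, predicate, _ in parse_ntriples(lines))


def csv_columns(lines):
    """Return the header row of a CSV source (an empty list if it has no rows)."""
    return next(csv.reader(lines), [])


# Made-up sample statements, used only to illustrate the format.
sample = [
    '<http://example.org/m1> <http://example.org/title> "Example Movie" .',
    '<http://example.org/m1> <http://example.org/director> <http://example.org/p1> .',
    "# a comment line",
    '<http://example.org/m2> <http://example.org/title> "Mr. X" .',
]

for predicate, count in predicate_counts(sample).most_common():
    print(predicate, count)
```

To run the same checks on the real files, pass an open file handle instead of the sample list, for example `predicate_counts(open("data/movies-small.nt", encoding="utf-8"))` and `csv_columns(open("data/curse_data_mined.csv", newline="", encoding="utf-8"))`.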
- Neo4j: The main page
- Neo4j Sandbox Creation: Create your own free database instance
- Graph Algorithms Book: Free book on all things graph data science
- Cypher Query Language: Docs on Cypher, the query language for Neo4j
- Graph Data Science Library: Docs on how to use the GDS library within Neo4j
- Neosemantics: Neo4j tool for working with RDF data
- Bite-Sized Neo4j for Data Scientists: YouTube playlist of short (~5 minutes) videos on a variety of topics to get data scientists up and running with Neo4j
- Neo4j Discord Server: Come chat about this challenge in the #hackathon channel!