Visit the Kaggle page for the full Game Dataset.
.
|-- src
|   |-- Combine_game_info.ipynb
|   |-- Get_game_id.ipynb
|   `-- Get_game_info.ipynb
|
`-- data
    |-- game_id.csv
    |-- game_info.csv
    |-- game_id
    |   |-- 1.json
    |   |-- 2.json
    |   `-- *.json
    |
    `-- game_info
        |-- 1.json
        |-- 2.json
        `-- *.json
pip3 install -r requirements.txt
- Run Get_game_id.ipynb. This requests every page of the game listing, starting at https://api.rawg.io/api/games?page=1, and saves one JSON file per page in ./data/game_id/*.json, where * is the page number. At the end, ./data/game_id.csv is created; it contains the name and id of each game, which are needed for Step 2.
- Run Get_game_info.ipynb. Using the ids from Step 1, this script requests https://api.rawg.io/api/games/ for each game and saves one JSON file per game in ./data/game_info/*.json, where * is the game id.
- Run Combine_game_info.ipynb. This combines the data in ./data/game_info/ and saves it as ./data/game_info.csv. game_info.csv contains the final dataset.
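The combining step can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the notebook's actual code: the names `flatten` and `combine_game_info` are hypothetical, and the sketch assumes that list-valued fields hold either plain values or objects with a "name" key.

```python
import csv
import json
from pathlib import Path

SEP = "||"  # multi-value separator used in game_info.csv


def flatten(value):
    """Turn one JSON value into one CSV cell (assumed layout, see note above)."""
    if isinstance(value, list):
        # Assumption: entries are plain values or dicts carrying a "name" key.
        names = [str(v.get("name", "")) if isinstance(v, dict) else str(v) for v in value]
        return SEP.join(names)
    if isinstance(value, dict):
        return str(value.get("name", ""))
    return "" if value is None else value


def combine_game_info(json_dir, csv_path, fields):
    """Write one CSV row per JSON file in json_dir; return the number of rows."""
    files = sorted(Path(json_dir).glob("*.json"))
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=fields)
        writer.writeheader()
        for path in files:
            game = json.loads(path.read_text(encoding="utf-8"))
            writer.writerow({f: flatten(game.get(f)) for f in fields})
    return len(files)
```

A call such as `combine_game_info("./data/game_info", "./data/game_info.csv", ["id", "name", "genres"])` would then produce a CSV with one row per downloaded game, limited to the listed fields.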
- File sizes:
  - ./data/game_id, with 25000 files, takes ~15 MB.
  - ./data/game_info, with 470000 files, takes ~230 MB.
- To speed up downloading from the RAWG API, Steps 1 and 2 issue requests concurrently. Execution time still depends heavily on internet connection speed.
  - Step 1 takes ~1 hour with 32 threads
  - Step 2 takes ~3-5 hours with 64 threads
  - Step 3 takes ~2 minutes
- The RAWG API has a limit of 500,000 page views a month. With the file counts above, Steps 1 and 2 already make roughly 25,000 + 470,000 = 495,000 requests, so as more games come out it will become hard to get all game information in one run without exceeding this limit.
- When a thread fails while requesting data, it skips to the next game/page automatically. To make sure you get all games from RAWG, you can run Step 1 and Step 2 multiple times. Files that are already downloaded are skipped automatically.
- To reduce the size of the downloaded files and the final CSV dataset, not all JSON information is downloaded. If you want more customization, you will need to change how the JSON is handled in Step 2.
- Although multithreading is applied, the whole process can take up to ~6 hours to finish because of the large amount of data.
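The download behaviour described in the notes above (a thread pool, skipping files that already exist, and moving on when a request fails) can be sketched as follows. This is an illustration under assumptions, not the notebooks' actual code: `fetch_json`, `download_one`, and `download_all` are hypothetical names, and any request parameters the API may require are left out.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from urllib.request import urlopen


def fetch_json(url):
    """Default fetcher: GET the URL and decode the JSON body."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def download_one(key, url, out_dir, fetch=fetch_json):
    """Save one response as <out_dir>/<key>.json; return 'skipped', 'saved' or 'failed'."""
    target = Path(out_dir) / f"{key}.json"
    if target.exists():
        return "skipped"  # already downloaded in an earlier run
    try:
        data = fetch(url)
    except Exception:
        return "failed"  # move on; a later run can retry this key
    target.write_text(json.dumps(data), encoding="utf-8")
    return "saved"


def download_all(urls_by_key, out_dir, workers=32, fetch=fetch_json):
    """Download every key -> url pair with a thread pool; return a status per key."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            key: pool.submit(download_one, key, url, out_dir, fetch)
            for key, url in urls_by_key.items()
        }
        return {key: future.result() for key, future in futures.items()}
```

Because existing files are skipped and failures are only reported, calling `download_all` again with the same arguments retries just the keys that failed, which mirrors the advice to run Steps 1 and 2 more than once.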
This dataset contains 474,417 video games on over 50 platforms, including mobile. All game information was obtained using Python and the RAWG API. The dataset was last updated on Dec 22nd, 2020. If you are interested in more recent games, visit the GitHub page for more information. I plan to update this dataset annually.
Each row contains information about one game. Several columns, such as platforms and genres, hold multiple values; in those columns the values are separated by double pipes (||).
- id: A unique ID identifying this game in the RAWG database
- slug: A unique slug identifying this game in the RAWG database
- name: Name of the game
- metacritic: Rating of the game on Metacritic
- released: The date the game was released
- tba: To be announced state
- updated: The date the game was last updated
- website: Game website
- rating: Rating given by RAWG users
- rating_top: Maximum rating
- playtime: Hours needed to complete the game
- achievements_count: Number of achievements in the game
- ratings_count: Number of RAWG users who rated the game
- suggestions_count: Number of RAWG users who suggested the game
- game_series_count: Number of games in the series
- reviews_count: Number of RAWG users who reviewed the game
- platforms: Platforms the game was released on. Separated by ||
- developers: Game developers. Separated by ||
- genres: Game genres. Separated by ||
- publishers: Game publishers. Separated by ||
- esrb_rating: ESRB rating
- added_status_yet: Number of RAWG users who had the game as "Not played"
- added_status_owned: Number of RAWG users who had the game as "Owned"
- added_status_beaten: Number of RAWG users who had the game as "Completed"
- added_status_toplay: Number of RAWG users who had the game as "To play"
- added_status_dropped: Number of RAWG users who had the game as "Played but not beaten"
- added_status_playing: Number of RAWG users who had the game as "Playing"
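Columns whose values are separated by || can be split back into lists when reading the CSV. The sketch below uses only the standard library; the helper names `split_multi` and `count_values` are hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io
from collections import Counter

SEP = "||"  # separator used by the multi-valued columns


def split_multi(cell, sep=SEP):
    """Split a multi-valued cell into a list; an empty cell gives an empty list."""
    return [part for part in cell.split(sep) if part]


def count_values(rows, column):
    """Count how often each individual value appears in a multi-valued column."""
    counts = Counter()
    for row in rows:
        counts.update(split_multi(row.get(column) or ""))
    return counts


# Made-up sample rows standing in for game_info.csv.
sample = io.StringIO(
    "id,name,genres\n"
    "1,Game A,Action||RPG\n"
    "2,Game B,Action\n"
    "3,Game C,\n"
)
genre_counts = count_values(csv.DictReader(sample), "genres")
print(genre_counts.most_common())
```

For the real file, the `io.StringIO` sample would be replaced by an open handle on game_info.csv; the same helper works for platforms, developers, and publishers.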
Thanks to RAWG for providing an easy-to-use and fast API.
Icon made by Good Ware from www.flaticon.com
With this data, one can build a game recommendation platform as well as draw insights about the gaming industry and gaming trends.