Environment setup
- Put the data in mdp/data inside the main directory
- Put the challenge set in the main directory
How to build recommendations:
- run build_structures.py
- run full_kernel_songs.py
- run create_Ku.py with parameters 1 and 0.5
- run create_pl2title.py
- run words_similarity_builder.py
- run calc_P.py
- run titles_similarity_0.py
- run user_based_MSD_1.py
- run item_based_MSD_5_10_25.py with parameters 0.7 0.4 5
- run item_based_MSD_5_10_25.py with parameters 0.7 0.4 10
- run item_based_MSD_5_10_25.py with parameters 0.7 0.4 25
- run selected_KOMD_100.py with parameters 100 50000
- run merge_csv.sh
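The steps above can also be chained in a small driver script. The following is only a sketch: the names STEPS, build_command and run_all are hypothetical and not part of this repository, and it assumes that the parameters are passed as positional command-line arguments, that every script is started from the main directory, and that merge_csv.sh is run with bash.

```python
import subprocess
import sys

# Steps in the order listed above. Each entry is the script name followed by
# its parameters (assumed here to be positional command-line arguments).
STEPS = [
    ["build_structures.py"],
    ["full_kernel_songs.py"],
    ["create_Ku.py", "1", "0.5"],
    ["create_pl2title.py"],
    ["words_similarity_builder.py"],
    ["calc_P.py"],
    ["titles_similarity_0.py"],
    ["user_based_MSD_1.py"],
    ["item_based_MSD_5_10_25.py", "0.7", "0.4", "5"],
    ["item_based_MSD_5_10_25.py", "0.7", "0.4", "10"],
    ["item_based_MSD_5_10_25.py", "0.7", "0.4", "25"],
    ["selected_KOMD_100.py", "100", "50000"],
    ["merge_csv.sh"],
]


def build_command(step):
    """Return the full command line for one step."""
    script = step[0]
    # Assumption: shell scripts run with bash, everything else with the
    # Python interpreter that is executing this driver.
    interpreter = "bash" if script.endswith(".sh") else sys.executable
    return [interpreter] + step


def run_all(steps=STEPS):
    """Run every step sequentially and stop at the first failure."""
    for step in steps:
        cmd = build_command(step)
        print("Running:", " ".join(cmd), flush=True)
        # check=True raises CalledProcessError if the step exits non-zero.
        subprocess.run(cmd, check=True)
```

Calling run_all() from the main directory runs the whole pipeline in sequence. Stopping at the first failure keeps a broken intermediate result from feeding the later steps.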
Estimated memory: 80 GB
Estimated time: 20 hours
Estimated disk space: 150 GB
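Since a full run is long, it may be worth checking the setup before starting. The sketch below uses a hypothetical helper named preflight (not part of this repository). It assumes the disk estimate above means gigabytes and that the output is written to the same file system as the main directory. It does not check memory.

```python
import os
import shutil


def preflight(root=".", required_free_gb=150):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    # The data is expected in mdp/data inside the main directory.
    data_dir = os.path.join(root, "mdp", "data")
    if not os.path.isdir(data_dir):
        problems.append("missing data directory: " + data_dir)

    # Free space on the file system that holds the main directory.
    free_gb = shutil.disk_usage(root).free / 1e9
    if free_gb < required_free_gb:
        problems.append(
            "only %.0f GB free, about %d GB needed" % (free_gb, required_free_gb)
        )

    return problems
```

For example, running `for p in preflight(): print(p)` from the main directory prints one line per problem and prints nothing when both checks pass.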
Most of the steps above can be parallelized, as shown in the following figure. The time estimate assumes sequential execution.