/level2_movierecommendation_recsys-level2-recsys-10

level2_movierecommendation_recsys-level3-recsys-10 created by GitHub Classroom


🎥 Movie Recommendation

1. Project Overview

1-1. Project Topic

Using MovieLens data restructured for the competition, the task is to predict which movies a user will prefer based on that user's viewing history. Because the setup assumes that some interactions are missing from each user sequence, the problem requires considering both the user's timestamp-ordered sequential history and implicit feedback.

1-2. Project Period

2022.12.12 ~ 2023.01.06 (4 weeks)

1-3. Equipment and Tools

  • Development environment : VS Code, PyTorch, Jupyter, Ubuntu 18.04.5 LTS, GPU Tesla V100-PCIE-32GB
  • Collaboration tools : GitHub, Notion
  • Visualization : WandB

1-4. Project Structure

  • (1) Sequence folder
    • BERT4Rec
    • FPMC
    • SASRec
    • S3Rec
  • (2) Encoder folder
    • EASE
    • MultiDAE
    • MultiVAE
    • RecVAE
  • (3) Ensemble
  • (4) EDA

1-5. Data Structure

train
├── Ml_item2attributes.json
├── directors.tsv
├── genres.tsv
├── titles.tsv
├── train_ratings.csv
├── writers.tsv
└── years.tsv
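
As a minimal sketch of how the interaction log could be turned into model inputs, the snippet below groups rows by user and orders each user's items by timestamp. The column names `user`, `item`, and `time` and the sample values are assumptions for illustration only; this README lists just the file name train_ratings.csv, so check the real header first.

```python
import csv
import io
from collections import defaultdict


def build_user_sequences(rows):
    """Group rows into per-user item lists ordered by timestamp.

    `rows` is any iterable of dicts with the ASSUMED keys
    "user", "item", and "time" (hypothetical column names).
    """
    events = defaultdict(list)
    for row in rows:
        events[row["user"]].append((int(row["time"]), row["item"]))
    # Sort each user's events by timestamp, then keep only the item ids.
    return {
        user: [item for _, item in sorted(pairs)]
        for user, pairs in events.items()
    }


# Tiny in-memory stand-in for train_ratings.csv (hypothetical values).
sample_csv = io.StringIO(
    "user,item,time\n"
    "11,4643,1230782529\n"
    "11,170,1230782534\n"
    "14,8961,1225308746\n"
    "11,531,1230782539\n"
)
sequences = build_user_sequences(csv.DictReader(sample_csv))
print(sequences)
```

To read the real file, the same function could be fed `csv.DictReader(open(path, newline=""))`, provided the columns are renamed to match.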

2. Team Composition and Roles

  • 구혜인 : MultiVAE/DAE model implementation and optimization
  • 권은채 : EASE model implementation and optimization
  • 박건영 : SASRec and S3Rec model implementation and optimization
  • 장현우 : FPMC model implementation and optimization
  • 정현호 : RecVAE model implementation and optimization
  • 허유진 : EDA, BERT4Rec model implementation and optimization

3. Project Progress

3-1. Preliminary Planning

  • 22.12.12 (Mon) : Git branch strategy meeting
  • Model exploration
    • 22.12.16 (Fri) : Seminar on models from the hands-on exercises
    • 22.12.20 (Tue) : Seminar on models from papers
  • Writing baseline code and sharing experiment results
    • 22.12.23 (Fri) : Baseline seminar

3-2. Project Execution

After the two seminars, we judged that Sequence models and Encoder models suit the MovieLens data, and we split into two sub-teams (the Sequence team and the Encoder team) to carry out the project. Each sub-team wrote its own baseline code and then presented it in a shared seminar. After that, experiments were run on top of the baseline code.


4. Project Results

4-1. Model Performance and Results

■ Results (top 4) : 4th place on both the Public and Private leaderboards 🏅

| SASRec | BERT4Rec | FPMC | EASE | MultiVAE | MultiDAE | RecVAE |
| --- | --- | --- | --- | --- | --- | --- |
| 0.1294 | 0.0479 | 0.1278 | 0.1600 | 0.1356 | 0.1376 | 0.1505 |

| Final selection | Model (ensemble ratio) | Public Recall@10 | Private Recall@10 |
| --- | --- | --- | --- |
| X | EASE and SASRec blended at 7:3 | 0.1755 | 0.1655 |
| O | EASE (1), RecVAE (0.9), MultiDAE (0.8), MultiVAE (0.7), SASRec (1) | 0.1726 | 0.1651 |
| X | EASE, RecVAE, MultiDAE, MultiVAE, SASRec with Recall@10 rank weights and model weights | 0.1630 | 0.1623 |
| O | EASE and SASRec blended at 5:5 | 0.1758 | 0.1615 |
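
For reference, Recall@10 as reported in the results above could be computed along the following lines. This is a sketch of one common definition (hits in the top K divided by the number of held-out items); the competition's exact formula, for example whether the denominator is capped at K, is not stated in this README, so `recall_at_k` and `mean_recall_at_k` are illustrative helpers and not the official scorer.

```python
def recall_at_k(recommended, relevant, k=10):
    """Share of a user's held-out items that appear in the top-k list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    # Use a set so a duplicated recommendation is not counted twice.
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(recs_by_user, truth_by_user, k=10):
    """Average recall_at_k over every user that has held-out items."""
    users = list(truth_by_user)
    if not users:
        return 0.0
    total = sum(
        recall_at_k(recs_by_user.get(user, []), truth_by_user[user], k)
        for user in users
    )
    return total / len(users)


# Hypothetical toy data: u1 recovers 1 of 2 items, u2 recovers 1 of 1.
recs = {"u1": ["a", "b", "c"], "u2": ["x", "y"]}
truth = {"u1": ["a", "d"], "u2": ["y"]}
print(mean_recall_at_k(recs, truth, k=10))
```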

4-2. Model Overview

    1. Sequence-family models
      1. SASRec
      2. BERT4Rec
      3. FPMC
    2. Encoder-family models
      1. EASE
      2. MultiVAE/DAE
      3. RecVAE

4-3. Model Selection

  • Baseline code
    • SASRec & S3Rec
      • Experiment on the performance difference with and without S3Rec
        • Larger max_seq_len and hidden_dim gave a meaningful performance gain, but ran into GPU capacity limits
          • The max_seq_len (448) and hidden_dim (240) found by optimizing SASRec alone caused a CUDA error during S3Rec pretraining
          • For S3Rec, the search range was narrowed to max_seq_len (150~250) and hidden_dim (30, 60, 120, 240) for optimization
        • With S3Rec, a relatively short max_seq_len reached the same Recall@10, which is a meaningful result, but because of the heavy computation and long run time we decided to optimize SASRec only
        • With S3Rec pretraining (sweep applied) Recall@10 : 0.1294
        • Without S3Rec pretraining (sweep applied) Recall@10 : 0.1294
        • Comparison of compute time
          • With S3Rec : about 13 hours
          • Without S3Rec : about 3 hours
  • Additional model selection
    • Sequence models
      • FPMC
        • Movie Recommendation is a project that recommends the next item from users' implicit feedback, so FPMC, which applies Markov chains to an MF (matrix factorization) model, was judged suitable
      • BERT4Rec
        • The Cloze masking used by BERT4Rec was judged to resemble the missing viewing history described in the overview
    • Encoder models
      • Movie Recommendation is a Top-K ranking project, so Encoder models, which predict with an encoder and a decoder, were judged suitable; because the recommendation uses static data, MultiVAE, MultiDAE, RecVAE, and EASE were used among the Encoder models
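
The resemblance between Cloze masking and missing viewing history noted above can be illustrated with a small sketch: random positions in a user's sequence are hidden, and the model is trained to recover them from the items on both sides. The names `MASK_TOKEN` and `cloze_mask` and the default `mask_prob` are assumptions made for this example, not values taken from the project's BERT4Rec code.

```python
import random

MASK_TOKEN = "[MASK]"  # ASSUMED placeholder; real code would reserve an item id


def cloze_mask(sequence, mask_prob=0.15, rng=None):
    """Hide random items in a sequence, Cloze style.

    Returns (inputs, labels): inputs has MASK_TOKEN at the hidden positions,
    labels holds the original item there and None everywhere else, so a loss
    would only be computed at the hidden positions.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for item in sequence:
        if rng.random() < mask_prob:
            inputs.append(MASK_TOKEN)
            labels.append(item)
        else:
            inputs.append(item)
            labels.append(None)
    return inputs, labels


# Hypothetical watch history; a fixed seed keeps the run repeatable.
history = [10, 42, 7, 99, 3]
masked_inputs, masked_labels = cloze_mask(history, mask_prob=0.5, rng=random.Random(0))
print(masked_inputs)
print(masked_labels)
```

At inference time a sequence model of this kind is typically asked to fill a mask placed at the position to predict, which is why the training objective lines up with the "some interactions are missing" setup of the task.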

4-4. Methods for Improving Model Performance

  • Hyperparameter Tuning (WandB, Sweep)
    • Sweep
  • Ensemble
    • Top-K Counting
      • Extract each ensemble model's top 20/15/10 recommendations (the Recall@20/15/10 lists) and count how often each item is recommended
      • Tested with per-rank weights and per-model weights applied
    • Top N per model
      • Take the top N items from each ensemble model to build the 10 recommended items
      • N is set by weights assigned according to each model's performance
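
The two ensemble strategies above could be sketched roughly as follows for a single user. The function names, the reciprocal-rank decay, and the toy rankings are assumptions for illustration; the README does not specify the exact rank weighting or how N was derived from model performance.

```python
from collections import defaultdict


def reciprocal_rank(rank):
    """ASSUMED rank weight: 1 for the first item, 1/2 for the second, ..."""
    return 1.0 / (rank + 1)


def weighted_vote(rankings, model_weights, k=10, rank_weight=reciprocal_rank):
    """Top-K counting: sum model_weight * rank_weight for every listed item.

    rankings:      {model_name: [item, ...]} with the best item first
    model_weights: {model_name: float}
    """
    scores = defaultdict(float)
    for model, items in rankings.items():
        for rank, item in enumerate(items):
            scores[item] += model_weights[model] * rank_weight(rank)
    # Highest score first; ties broken by item id to keep output deterministic.
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return ordered[:k]


def take_top_n_per_model(rankings, quota, k=10):
    """Top N per model: take quota[model] unseen items from each model in turn."""
    merged = []
    for model, n in quota.items():
        taken = 0
        for item in rankings[model]:
            if taken == n or len(merged) == k:
                break
            if item not in merged:
                merged.append(item)
                taken += 1
    return merged


# Hypothetical per-user rankings from two models.
rankings = {"EASE": [1, 2, 3, 4], "SASRec": [3, 5, 1, 6]}
print(weighted_vote(rankings, {"EASE": 0.7, "SASRec": 0.3}, k=3))
print(take_top_n_per_model(rankings, {"EASE": 2, "SASRec": 1}, k=3))
```

With k=10, a quota such as `{"EASE": 7, "SASRec": 3}` would correspond to the 7:3 blend listed in the results, assuming that ratio was applied as item counts.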

5. WrapUp Report

Level_2_MovieRecommendation_랩업리포트