level2-dkt-level2-recsys-06 created by GitHub Classroom

📕 DKT


DKT (Deep Knowledge Tracing) still relies on students taking tests, but it goes beyond simply reporting that we scored 80 points in math. It measures how well we understand the subject, and it uses that understanding to predict whether we will answer future, not-yet-solved problems correctly. With DKT we can recommend to each student which problems to solve in order to improve their understanding of math and overcome their weak areas, so DKT plays a very important role in providing personalized education.

โ— ์ฃผ์ œ ์„ค๋ช…

  • ํ•™์ƒ ๊ฐœ๊ฐœ์ธ์˜ ์ดํ•ด๋„๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” ์ง€์‹ ์ƒํƒœ๋ฅผ ์˜ˆ์ธกํ•˜๋Š” ์ผ๋ณด๋‹ค๋Š” ๊ฐ ํ•™์ƒ์ด ํ‘ผ ๋ฌธ์ œ ๋ฆฌ์ŠคํŠธ์™€ ์ •๋‹ต ์—ฌ๋ถ€๊ฐ€ ๋‹ด๊ธด ๋ฐ์ดํ„ฐ๋ฅผ ๋ฐ›์•„ ์ตœ์ข… ๋ฌธ์ œ๋ฅผ ๋งž์ถœ์ง€ ํ‹€๋ฆด์ง€ ์˜ˆ์ธกํ•œ๋‹ค.

👋 Team members

강신구 김백준 김혜지 이상연 전인혁

🔨 Tools

Python 3.8.5
PyTorch 1.10.2
Scikit-Learn 1.0.2
Wandb 0.12.15

๐Ÿข Structure

โ”œโ”€โ”€ EDA
โ”‚   โ”œโ”€โ”€ hyeji_EDA.ipynb
โ”‚   โ””โ”€โ”€ inhyeok_EDA.ipynb
โ”œโ”€โ”€ Ensemble
โ”‚   โ””โ”€โ”€ ensemble.ipynb
โ”œโ”€โ”€ NMF
โ”‚   โ”œโ”€โ”€ NMF.ipynb
โ”‚   โ””โ”€โ”€ readme.md
โ”œโ”€โ”€ README.md
โ””โ”€โ”€ SaintPlus
    โ”œโ”€โ”€ README.md
    โ”œโ”€โ”€ args.py
    โ”œโ”€โ”€ data_generator.py
    โ”œโ”€โ”€ elapsed.png
    โ”œโ”€โ”€ model.py
    โ”œโ”€โ”€ pre_process.py
    โ”œโ”€โ”€ structure.png
    โ”œโ”€โ”€ submission.py
    โ”œโ”€โ”€ sweep.yaml
    โ”œโ”€โ”€ train.py
    โ””โ”€โ”€ utils.py

👩‍🔬 Research process

🔎 EDA

userID : unique user number

  • train : 6,698 unique users
  • test : 744 unique users

assessmentItemID : unique item (question) number

  • 9,454 unique items

testID : unique test (exam sheet) number

  • 1,537 unique tests

answerCode : whether the answer was correct

  • 0 if wrong, 1 if correct

Timestamp : time at which the user started solving the problem

  • train : 2019-12-31 15:08:01 ~ 2020-12-29 16:46:21
  • test : 2019-12-31 23:43:18 ~ 2020-12-29 16:44:10

KnowledgeTag : mid-level category tag of the problem

  • 912 tags

โ— EDA ๊ฒฐ๊ณผ

  • ์‚ฌ์šฉ์ž ๋ณ„ ์ •๋‹ต๋ฅ 
    image

  • ์‚ฌ์šฉ์ž ๋ณ„ ์ •๋‹ต๋ฅ  ํ‰๊ท ์€ 0.628

  • ๊ฐ€์žฅ ๋‚ฎ์€ ์ •๋‹ต๋ฅ ์€ 0.000000

  • ๊ฐ€์žฅ ๋†’์€ ์ •๋‹ต๋ฅ ์€ 1.000000

  • ๋ฌธํ•ญ ๋ณ„ ์ •๋‹ต๋ฅ 
    image

    • ๋ฌธํ–ฅ๋ณ„ ์ •๋‹ต๋ฅ  ํ‰๊ท ์€ 0.6542
    • ๊ฐ€์žฅ ๋‚ฎ์€ ์ •๋‹ต๋ฅ ์€ 0.04943
    • ๊ฐ€์žฅ ๋†’์€ ์ •๋‹ต๋ฅ ์€ 0.99631
  • ์‹œํ—˜์ง€ ๋ณ„ ์ •๋‹ต๋ฅ 
    image

    • ์‹œํ—˜์ง€ ๋ณ„ ํ‰๊ท ์€ 0.667982
    • ๊ฐ€์žฅ ๋‚ฎ์€ ์ •๋‹ต๋ฅ ์€ 0.327186
    • ๊ฐ€์žฅ ๋†’์€ ์ •๋‹ต๋ฅ ์€ 0.955474
  • ํƒœ๊ทธ ๋ณ„ ์ •๋‹ต๋ฅ 
    image

    • ํƒœ๊ทธ ๋ณ„ ์ •๋‹ต๋ฅ  ํ‰๊ท ์€ 0.615524
    • ๊ฐ€์žฅ ๋‚ฎ์€ ์ •๋‹ต๋ฅ ์€ 0.188940
    • ๊ฐ€์žฅ ๋†’์€ ์ •๋‹ต๋ฅ ์€ 0.977778
  • ์ •๋‹ต๋ฅ ๊ณผ ๋ฌธ์ œ๋ฅผ ํ‘ผ ๊ฐœ์ˆ˜ ์‚ฌ์ด ์ธ๊ณผ๊ด€๊ณ„ : 0.168
    image

    • ํ‰๊ท ๋ณด๋‹ค ๋ฌธํ•ญ์„ ๋งŽ์ด ํ‘ผ ํ•™์ƒ๋“ค์ด ๋‚ฎ์€ ํ•™์ƒ๋“ค ๋ณด๋‹ค ๋†’์€ ์ •๋‹ต๋ฅ ์„ ๋ณด์ด๋Š” ๊ฒฝํ–ฅ์ด ์žˆ๋‹ค.
  • ํƒœ๊ทธ๋ฅผ ํ’€์—ˆ๋˜ ์‚ฌ์šฉ์ž์˜ ์ˆ˜์™€ ์ •๋‹ต๋ฅ  ์‚ฌ์ด ์ƒ๊ด€๊ด€๊ณ„ : 0.376
    image

    • ํ‰๊ท ๋ณด๋‹ค ๋งŽ์ด ๋…ธ์ถœ๋œ ํƒœ๊ทธ๊ฐ€ ๋†’์€ ์ •๋‹ต๋ฅ ์„ ๋ณด์ด๋Š” ๊ฒฝํ–ฅ์ด ์žˆ๋‹ค.
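Statistics like the per-user accuracy and the correlation with the number of problems solved can be computed along the following lines. This is a minimal, dependency-free sketch, not the code from the EDA notebooks: the helper names `per_user_stats` and `pearson` and the toy rows are assumptions made for illustration, and only the `userID` / `answerCode` semantics come from the column descriptions above.

```python
from collections import defaultdict
from math import sqrt


def per_user_stats(rows):
    """Return {userID: (n_solved, accuracy)} from (userID, answerCode) pairs."""
    solved = defaultdict(int)
    correct = defaultdict(int)
    for user_id, answer_code in rows:
        solved[user_id] += 1
        correct[user_id] += answer_code  # answerCode is 0 (wrong) or 1 (correct)
    return {u: (solved[u], correct[u] / solved[u]) for u in solved}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy interaction log (made-up values, not the competition data).
toy_rows = [
    (1, 1), (1, 1), (1, 0), (1, 1),
    (2, 0), (2, 1),
    (3, 0),
]

stats = per_user_stats(toy_rows)
counts = [n for n, _ in stats.values()]
accuracies = [acc for _, acc in stats.values()]
correlation = pearson(counts, accuracies)
```

In a notebook the same aggregation is usually written as a pandas `groupby` on `userID`; the plain-Python version is used here only to keep the example self-contained. Grouping by `assessmentItemID`, `testID`, or `KnowledgeTag` instead gives the per-item, per-test, and per-tag figures.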

๐Ÿข Models

1๏ธโƒฃ Model

  • ๋งˆ์ง€๋ง‰ ๋ฌธ์ œ์˜ ์ •๋‹ต์—ฌ๋ถ€๋ฅผ ๋งž์ถ”๋Š” ๊ฒƒ์ด๊ธฐ์— ์•ž์˜ ๋ฌธ์ œ ํ’€์ด ์ด๋ ฅ๋“ค์ด ์˜ํ–ฅ์„ ๋ผ์น  ๊ฒƒ์œผ๋กœ ์˜ˆ์ธก๋˜์–ด sequential ๋ฌธ์ œ๋ฅผ ํ’€๊ธฐ์œ„ํ•œ ๋ชจ๋ธ์ธ BERT ์™€ LSTM์„ ์ ์šฉ
  • ๋ฐ์ดํ„ฐ์˜ ์ˆ˜๊ฐ€ ์ ์–ด ๋ฐ์ดํ„ฐ์˜ ์ˆ˜๊ฐ€ ๋งŽ์ด ํ•„์š”ํ•œ ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ๋ณด๋‹ค๋Š” ๋”ฅ๋Ÿฌ๋‹์ด ์•„๋‹Œ ๋จธ์‹ ๋Ÿฌ๋‹์˜ ๋ชจ๋ธ๋“ค์ด ๋” ์ข‹์„ ๊ฒƒ์ด๋ผ ์˜ˆ์ธก๋˜์–ด LGBM , Catboost, XGBoost, HistGradeintBoosting ์ ์šฉ
  • ๋‹จ์ˆœ ํ–‰๋ ฌ ๋ถ„ํ•ด๋ฅผ ํ†ตํ•ด ํŠน์„ฑ์„ ๊ตฌํ•˜๋Š” ๊ฒƒ๋„ ์ข‹์€ ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์˜ฌ ๊ฒƒ์ด๋ผ ์˜ˆ์ธก๋˜์–ด SVD, NMF ์‚ฌ์šฉ
  • ์œ ์ €์˜ ์ˆ˜์ค€๊ณผ ๋ฌธ์ œ ๋‚œ์ด๋„๋ฅผ ๊ณ ๋ คํ•˜๋Š” Feature๋ฅผ ์ถ”๊ฐ€.
  • DKT๋ฅผ ์œ„ํ•œ ๋ชจ๋ธ์ธ SAINT+ (riiid) , CL4KT(upstage) ****๋ฅผ ์ ์šฉ
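One way to build features for user skill level and problem difficulty is sketched below. It is an assumption about how such features could look, not the team's actual feature code: the function name `add_level_and_difficulty` and the output keys `user_past_accuracy` and `item_accuracy` are hypothetical, and the toy log is made up. The user feature uses only earlier interactions so that the current answer does not leak into its own feature.

```python
from collections import defaultdict


def add_level_and_difficulty(interactions):
    """Add skill-level and difficulty proxies to a time-ordered interaction log.

    `interactions` is a list of dicts with the keys userID, assessmentItemID
    and answerCode, ordered by time within each user.
    """
    # Difficulty proxy: share of correct answers per item over the whole log.
    item_total = defaultdict(int)
    item_correct = defaultdict(int)
    for row in interactions:
        item_total[row["assessmentItemID"]] += 1
        item_correct[row["assessmentItemID"]] += row["answerCode"]

    user_total = defaultdict(int)
    user_correct = defaultdict(int)
    enriched = []
    for row in interactions:
        user, item = row["userID"], row["assessmentItemID"]
        # Skill proxy: accuracy on the user's earlier interactions only
        # (None when the user has no history yet).
        past = user_total[user]
        user_level = user_correct[user] / past if past else None
        enriched.append({
            **row,
            "user_past_accuracy": user_level,
            "item_accuracy": item_correct[item] / item_total[item],
        })
        user_total[user] += 1
        user_correct[user] += row["answerCode"]
    return enriched


# Toy log with made-up IDs.
toy_log = [
    {"userID": 1, "assessmentItemID": "item_a", "answerCode": 1},
    {"userID": 1, "assessmentItemID": "item_b", "answerCode": 0},
    {"userID": 2, "assessmentItemID": "item_a", "answerCode": 0},
    {"userID": 1, "assessmentItemID": "item_a", "answerCode": 1},
]
features = add_level_and_difficulty(toy_log)
```

Columns like these can be passed directly to a gradient-boosting model, or bucketed and embedded for a sequence model.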

2๏ธโƒฃ Ensemble

  • ์„œ๋กœ ๋‹ค๋ฅธ ๋ฐฉ์‹์˜ ๋ชจ๋ธ๋“ค ์œ„์ฃผ๋กœ ์•™์ƒ๋ธ”ํ•˜์˜€์Œ. ( SAINT+NMF , SAINT+NMF+Boost, ...)

✨ Final Model : ensemble of SAINT+ and NMF

  • Used the average of the SAINT+ and NMF predictions.
  • NMF generates its predictions by inferring correctness from the factor matrices produced by the decomposition, while SAINT+ uses attention in its encoder and decoder.
  • The time spent solving a problem matters for whether the answer is correct, and SAINT+ embeds this time information, which we believe is why it produced good results.
  • Correctness is encoded as 0 / 1 and the latent features should not be negative either, which we believe is why NMF gave better results than SVD.
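As a rough illustration of this ensemble, the sketch below factorizes a toy user × item correctness matrix with a masked NMF (multiplicative updates written with NumPy) and averages the reconstruction with a second set of probabilities. The function `masked_nmf`, the masking of unobserved entries, the toy matrix, and the `saint_pred` values are all assumptions made for the example; the actual NMF.ipynb and SaintPlus code may work differently.

```python
import numpy as np


def masked_nmf(R, mask, rank=2, n_iter=500, eps=1e-9, seed=0):
    """Factor R ~ W @ H with W, H >= 0, fitting only entries where mask == 1."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    W = rng.random((n_users, rank)) + 0.1
    H = rng.random((rank, n_items)) + 0.1
    observed = mask * R
    for _ in range(n_iter):
        # Multiplicative updates for the masked squared error; every factor
        # in the ratio is non-negative, so W and H stay non-negative.
        W *= (observed @ H.T) / ((mask * (W @ H)) @ H.T + eps)
        H *= (W.T @ observed) / (W.T @ (mask * (W @ H)) + eps)
    return W, H


# Toy correctness matrix: rows are users, columns are items (made-up data).
R = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [1, 1, 0]], dtype=float)
# 1 = interaction observed, 0 = unknown (to be predicted).
mask = np.array([[1, 1, 1],
                 [1, 1, 0],
                 [1, 1, 1],
                 [1, 0, 1]], dtype=float)

W, H = masked_nmf(R, mask)
nmf_pred = np.clip(W @ H, 0.0, 1.0)

# Hypothetical probabilities standing in for the output of a sequence model.
saint_pred = np.array([[0.9, 0.8, 0.2],
                       [0.7, 0.3, 0.4],
                       [0.2, 0.6, 0.9],
                       [0.8, 0.7, 0.1]])

# Simple average of the two prediction sets.
ensemble = (saint_pred + nmf_pred) / 2
```

Averaging works best when the two models make different kinds of errors, which matches the stated preference for combining models that take different approaches.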

๐Ÿ† ์ตœ์ข… ๊ฒฐ๊ณผ

Model Final Rank Final score
Saint Plus + NMF 6 0.8494

📒 Report

📜 References