/Bus-Arrival-Time-Prediction

๐ŸšŒ CLOVA AI RUSH 2022 @navermaps

Primary LanguagePython

CLOVA AI RUSH 2022: Bus Arrival Time Prediction

4th place solution to CLOVA AI RUSH 2022: Bus arrival time prediction

๐ŸŽฏ Result

RMSE error 46.36 sec / baseline RMSE error 75.51 sec

๐Ÿ“Œ Dataset features

์‹ค์‹œ๊ฐ„ ๋กœ๊ทธ๋กœ ๊ตฌ์„ฑ๋œ ๋ฐ์ดํ„ฐ๋กœ, ๋…ธ์ด์ฆˆ๊ฐ€ ์กด์žฌํ•˜์—ฌ ์‹ค์‹œ๊ฐ„ ๋กœ๊ทธ ์ค‘ k๋ฒˆ์งธ ํ•ด๋‹น๋˜๋Š” ์‹ค์‹œ๊ฐ„ ๋กœ๊ทธ๊ฐ€ ์กด์žฌํ•˜์ง€ ์•Š์„ ์ˆ˜ ์žˆ๊ณ , ๋”ฐ๋ผ์„œ ์ •๋ฅ˜์žฅ ์‹œํ€€์Šค๊ฐ€ ์ˆœ์ฐจ์ ์ด์ง€ ์•Š์„ ์ˆ˜ ์žˆ๋Š” ํŠน์ง•์ด ์กด์žฌ.

โœ”๏ธ Method

  • ๋ฐ์ดํ„ฐ๊ฐ€ ๋ฐœ์ƒํ•˜๋Š” ๊ณผ์ •๊ณผ ์ˆ˜์ง‘๋˜๋Š” ๊ณผ์ •์„ ํŒŒ์•… ํ›„, ์‹œํ€€์Šค๊ฐ€ ์ค‘๋ณต๋˜๊ฑฐ๋‚˜ ์ •๊ฑฐ์žฅ ๊ฐ„์˜ ์ด๋™์‹œ๊ฐ„์ด 5์ดˆ ๋ฏธ๋งŒ์ด๊ฑฐ๋‚˜ ํŠน์ • ์‹œ๊ฐ„์„ ์ดˆ๊ณผํ•˜๋Š” ๋“ฑ์˜ ๋…ธ์ด์ฆˆ ํŒŒ์•…

  • ๋ฒ„์Šค ๋กœ๊ทธ ๋ฐ์ดํ„ฐ์™€ ๋‹ค๋ฅธ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ๊ฐ€ ๊ฐ€์ง€๋Š” ์ฐจ์ด์ ์„ ํŒŒ์•…. ๊ธฐ์ ๊ณผ ์ข…์  ์‚ฌ์ด๋ฅผ ์šดํ–‰ํ•˜๋‹ˆ ์ฒ˜์Œ๊ณผ ๋์ด ์ •ํ•ด์ ธ ์žˆ๊ณ , ๊ฐ™์€ ์ฃผํ–‰ ๋…ธ์„ ์—์„œ ๋‹ค๋ฅธ ์š”์ผ, ๋‹ค๋ฅธ ์‹œ๊ฐ„๋Œ€์— ์šดํ–‰๋œ ๊ธฐ๋ก ๋“ฑ์ด ์กด์žฌํ•˜๋Š” ์ ์œผ๋กœ๋ถ€ํ„ฐ ๋‹ค๋ฅธ ์ฃผํ–‰ ๋กœ๊ทธ๋ฅผ ์ด์šฉํ•˜์—ฌ ํ‰๊ท ๊ณผ ์ค‘์•™๊ฐ’์œผ๋กœ ์ ์ ˆํžˆ imputation ์ˆ˜ํ–‰

  • ๋™์ผํ•œ ์ฃผํ–‰ ๋…ธ์„ ์— ๋Œ€ํ•ด์„œ ๋‹ค๋ฅธ ์š”์ผ, ๋‹ค๋ฅธ ์‹œ๊ฐ„์ด ๋ชจ๋‘ ๋…ธ์ด์ฆˆ๋กœ ์œ ์‹ค๋œ ๊ฒฝ์šฐ, ๋ฒ„์Šค๋Š” ํšŒ์ฐจ ์ง€์ ์œผ๋กœ๋ถ€ํ„ฐ ๋ฐฉํ–ฅ์ด ๋ฐ”๋€ ์ฑ„๋กœ ๋‹ค์‹œ ๋ฐ˜๋Œ€ ๋ฐฉํ–ฅ์œผ๋กœ ์šดํ–‰ํ•˜๋Š” ์ ์„ ํ™œ์šฉ. ๋ฐ˜๋Œ€ํŽธ์—์„œ ๊ฑธ๋ฆฐ ์†Œ์š”์‹œ๊ฐ„์˜ ํ‰๊ท ์œผ๋กœ missing value๋ฅผ ์ฒ˜๋ฆฌ.

  • ํŠน์ • ๋ฒ„์Šค์˜ $k$ ~ $k+1$ ๊ฑฐ๋ฆฌ๊ฐ€ ๋ชจ๋“  ์ •๊ฑฐ์žฅ ๊ฐ„์˜ ๊ฑฐ๋ฆฌ ์ค‘ ์ œ์ผ ๋†’์€ ๊ฒฝ์šฐ, ํ•ด๋‹น ๊ฑฐ๋ฆฌ ์†Œ์š”์‹œ๊ฐ„๋ณด๋‹ค ์˜ค๋ž˜๊ฑธ๋ฆฌ๋Š” ์†Œ์š”์‹œ๊ฐ„์€ outlier๋กœ ์ฒ˜๋ฆฌํ•˜๊ณ  ํ‰๊ท ๊ณผ ์ค‘์•™๊ฐ’์„ ํ™œ์šฉํ•˜์—ฌ ์ฒ˜๋ฆฌ.

  • Informer๋ฅผ ํ™œ์šฉํ•˜์—ฌ ๊ฐ€๋ณ€์ ์ธ ์‹œํ€€์Šค ๊ธธ์ด์˜ ์†Œ์š”์‹œ๊ฐ„๋“ค์„ ์˜ˆ์ธก.

โญ Final solution

n-seed ensemble(bagging)