Business_Analytics_ch5

[Ch.5] Semi-supervised Learning

VAT(Virtual Adversarial Training)

📂 Contents


  • Background
  • Dataset
  • Experiments
  • Result
  • Analysis

📌 Background

Virtual Adversarial Training(VAT)

- Paper: Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning [paper](https://arxiv.org/abs/1704.03976)

Conventional adversarial training treats the direction in which a small change to the input alters the model's prediction the most as the adversarial direction, and trains on samples generated along that direction to smooth the model's decision boundary.

  • Applies an adversarial transformation to the input data rather than a simple one

  • virtual adversarial loss: measures the robustness of the conditional label distribution around each input data point

  • adversarial: the input is perturbed in the direction that degrades the loss the most (measured with KL divergence)

  • virtual adversarial training: uses no label information, so it can be applied to semi-supervised learning

  • Acts as a regularization technique: prevents overfitting and helps the model generalize to unseen examples

  • Difference from adversarial training: adversarial training uses the label to generate the adversarial perturbation, whereas VAT relies only on the model's own predictions

  • Notation: x is the input data, y is the true label, and x* stands for the input data as a whole (labeled and unlabeled); the smoothness term is written LDS(x^(n), \theta)

  • Procedure

  1. Start from an input data point x.
  2. Perturb x with a small perturbation r; the transformed data point is T(x) = x + r.
  3. The perturbation r must point in the adversarial direction: the output for the perturbed input should differ from the output for the unperturbed input. The KL divergence between the two output distributions should be maximized while the L2 norm of r stays small.
  4. Once the adversarial perturbation and the transformed input are found, update the model weights in the direction that minimizes the KL divergence, which makes the model robust to different perturbations.
  • random perturbation training (RPT): a degraded version of VAT that skips the power iteration method and perturbs in a random direction instead

  • VAT assigns the label only to the data point in the virtual adversarial direction, whereas RPT assigns the same label to every data point in the neighborhood, which makes it less efficient
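The procedure above can be sketched for a toy linear softmax classifier. This is a minimal illustration, not the code used for the experiments in this repository: the model, the function names (`vat_perturbation`, `lds`), and the default values of `eps`, `xi`, and `n_power` are assumptions chosen for readability. For this particular model the gradient of the KL divergence with respect to r has the closed form W^T (q - p), so no autodiff library is needed.

```python
import math
import random


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(W, b, x):
    """Class probabilities p(y | x) of a linear softmax model (toy assumption)."""
    logits = [sum(w * a for w, a in zip(row, x)) + bias for row, bias in zip(W, b)]
    return softmax(logits)


def kl(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def unit(v):
    """Scale v to unit L2 norm (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v] if n > 0 else v


def vat_perturbation(W, b, x, eps=1.0, xi=1e-3, n_power=1, rng=random):
    """Approximate the virtual adversarial perturbation with power iteration."""
    p = predict(W, b, x)                      # prediction at x, held fixed
    d = unit([rng.gauss(0, 1) for _ in x])    # random starting direction
    for _ in range(n_power):
        q = predict(W, b, [a + xi * c for a, c in zip(x, d)])
        # Closed-form gradient of KL(p || q) w.r.t. r for this model: W^T (q - p)
        g = [sum(W[k][j] * (q[k] - p[k]) for k in range(len(W)))
             for j in range(len(x))]
        d = unit(g)
    return [eps * c for c in d]               # L2 norm of r equals eps


def lds(W, b, x, r):
    """KL divergence between the predictions at x and at x + r."""
    return kl(predict(W, b, x), predict(W, b, [a + c for a, c in zip(x, r)]))


if __name__ == "__main__":
    W = [[1.0, -0.5, 0.2], [-0.3, 0.8, 0.1], [0.0, 0.4, -0.9]]
    b = [0.0, 0.1, -0.1]
    x = [0.5, -0.3, 0.7]
    r_vadv = vat_perturbation(W, b, x, eps=0.5, rng=random.Random(0))
    print("LDS at x + r_vadv:", lds(W, b, x, r_vadv))
```

In this sketch, calling `vat_perturbation` with `n_power=0` leaves the random direction untouched, which corresponds to the random perturbation training variant described above. In a training loop, the value returned by `lds` would be added to the supervised loss as the regularization term and minimized with respect to the model weights.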

[Tutorial]

📂 Dataset


  • Street View House Numbers (SVHN) download

    • 10 classes (one class per digit)
  • Cifar10 download

    • 10 classes
    • 60000 images of size 32 x 32

Experiments


  • SVHN dataset: experiments run with varying epsilon (the norm of the perturbation)

    • epsilon = 2.0, 2.5, 3.0
  • Cifar10 dataset: experiments run with varying numbers of labels

    • labels = 1000, 2000, 4000
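Varying the number of labels means splitting the training set into a small labeled part and a larger unlabeled part. The sketch below shows one way to do that. The source does not say how the labeled subset was chosen, so the class-balanced sampling, the function name `split_labeled_unlabeled`, and the `seed` parameter are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_labeled_unlabeled(labels, n_labeled, seed=0):
    """Return (labeled_idx, unlabeled_idx) with a class-balanced labeled subset.

    Class balance is an assumption of this sketch, not something the
    experiments above are known to use.
    """
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    per_class, remainder = divmod(n_labeled, len(by_class))
    if remainder:
        raise ValueError("n_labeled must be divisible by the number of classes")

    rng = random.Random(seed)
    labeled = []
    for y in sorted(by_class):
        # Draw the same number of labeled examples from every class
        labeled.extend(rng.sample(by_class[y], per_class))

    labeled_set = set(labeled)
    unlabeled = [i for i in range(len(labels)) if i not in labeled_set]
    return sorted(labeled), unlabeled


if __name__ == "__main__":
    toy_labels = [i % 10 for i in range(200)]   # stand-in for real class labels
    for n in (10, 20, 40):
        lab, unl = split_labeled_unlabeled(toy_labels, n)
        print(n, len(lab), len(unl))
```

The labeled indices feed the supervised loss, while every index (labeled or not) can contribute to the VAT regularization term, since that term needs no labels.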

📊 Result & Analysis


  • SVHN dataset

| epsilon | 2.0 | 2.5 | 3.0 |
|:--:|:--:|:--:|:--:|
| accuracy | 0.8770 | 0.8635 | 0.8883 |

  • Cifar10 dataset

| labels | 1000 | 2000 | 4000 |
|:--:|:--:|:--:|:--:|
| accuracy | 0.5148 | 0.5456 | 0.5745 |

Conclusion


📂 References