subinium/Deep-Papers

Self-Attention Generative Adversarial Networks

subinium opened this issue 3 years ago · 3 comments

subinium commented 3 years ago

https://arxiv.org/abs/1805.08318
SAGAN

subinium commented 3 years ago

Personally, I think this visualization could be used for an interesting project; worth thinking about.

Problems with existing GANs

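Existing GANs do better on texture than on geometry
They fail to capture geometric or structural patterns for some classes
The cause is the local receptive field of convolution.
- Relating distant regions of an image requires many layers.
- Enlarging the kernel instead gives up the computational and statistical efficiency of local convolution
- Self-attention addresses this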

Self-Attention Generative Adversarial Network

The equations are hard to follow, but the figure makes it easy to understand.

The key, query, and value of standard self-attention map as follows
- key : f
- query : g
- value : h
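The attention layer output `o` is multiplied by a scale parameter (gamma, γ in the paper) and then added to the feature map
- The scale parameter is initialized to 0 and grows as the network learns non-local evidence (training is unstable if it does not start at 0)
- It reportedly does not work at all without the residual connection
In the implementation, take care with the 1x1 convs that match the channel counts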
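A minimal NumPy sketch of the attention block described above, to make the shapes concrete. It is not the authors' implementation: each 1x1 conv is written as a plain matrix applied at every pixel, the weights are random stand-ins, and the `C // 8` reduced channel count is how I remember the paper's setting, so treat it as an assumption. All function and variable names are made up for illustration.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along `axis`."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_f, w_g, w_h, w_v, gamma):
    """Self-attention over a (C, H, W) feature map.

    Each 1x1 conv is a matrix applied to every pixel:
    w_f, w_g, w_h have shape (C_bar, C); w_v has shape (C, C_bar).
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)        # (C, N) with N = H * W positions
    f = w_f @ flat                    # key   f(x): (C_bar, N)
    g = w_g @ flat                    # query g(x): (C_bar, N)
    v = w_h @ flat                    # value h(x): (C_bar, N)
    s = f.T @ g                       # s[i, j] = f(x_i) . g(x_j)
    beta = softmax(s, axis=0)         # for each output position j, weights over i sum to 1
    o = w_v @ (v @ beta)              # last 1x1 conv restores C channels
    y = gamma * o + flat              # residual: scaled attention output + input
    return y.reshape(c, h, w), beta

# Hypothetical sizes; random weights stand in for learned ones.
rng = np.random.default_rng(0)
C, H, W = 16, 4, 4
C_bar = C // 8
x = rng.standard_normal((C, H, W))
w_f, w_g, w_h = (rng.standard_normal((C_bar, C)) for _ in range(3))
w_v = rng.standard_normal((C, C_bar))

y0, beta = self_attention(x, w_f, w_g, w_h, w_v, gamma=0.0)
print(np.allclose(y0, x))   # gamma = 0 leaves the feature map unchanged
print(beta.shape)           # one attention weight per pair of positions
```

With gamma at 0 the block is an identity mapping, which is presumably why the zero initialization is safe: the network starts from the plain convolutional features and mixes in non-local evidence only as gamma grows.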
Uses hinge loss (not sure why this was chosen.)
- Yunje (윤제) suggested looking at Geometric GAN; to read later. (BCELoss + R1 regularization reportedly works well too)
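For reference, the hinge version of the adversarial loss can be sketched in a few lines of plain Python. The inputs are raw discriminator scores (no sigmoid); the function names and the example scores are made up for illustration.

```python
def d_hinge_loss(d_real, d_fake):
    """Discriminator hinge loss: push real scores above +1 and fake scores below -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in d_real) / len(d_real)
    fake_term = sum(max(0.0, 1.0 + s) for s in d_fake) / len(d_fake)
    return real_term + fake_term

def g_hinge_loss(d_fake):
    """Generator loss: raise the discriminator's score on generated samples."""
    return -sum(d_fake) / len(d_fake)

# Hypothetical discriminator scores for two real and two generated samples.
d_real = [2.0, 0.5]
d_fake = [-2.0, 0.0]
print(d_hinge_loss(d_real, d_fake))
print(g_hinge_loss(d_fake))
```

Samples already on the correct side of the margin (real score of at least 1, fake score of at most -1) contribute nothing to the discriminator loss, so it stops pushing on examples it already separates.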

Techniques to Stabilize the Training of GANs

Spectral normalization for both G & D

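Constrains how much the parameters can vary by bounding the spectral norm of each layer's weights
Works well without extra hyperparameter tuning
Computational cost is relatively low too
Also helps training stability
- According to Junho Kim's (김준호) I2I Translation video and Minguk Kang's (강민국) GAN implementation tips, Spectral Norm works quite well.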
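A rough NumPy sketch of what spectral normalization does to one weight tensor: estimate the largest singular value by power iteration and divide the weights by it. This is an illustration under my own assumptions (function name, flattening a conv kernel to `(C_out, -1)`, the random example weight), not the code from the paper or from any framework.

```python
import numpy as np

def spectral_normalize(w, u=None, n_iters=1, eps=1e-12):
    """Divide `w` by a power-iteration estimate of its largest singular value."""
    w_mat = w.reshape(w.shape[0], -1)            # conv kernel flattened to (C_out, C_in * k * k)
    if u is None:
        u = np.random.default_rng(0).standard_normal(w_mat.shape[0])
    for _ in range(n_iters):
        v = w_mat.T @ u
        v = v / (np.linalg.norm(v) + eps)
        u = w_mat @ v
        u = u / (np.linalg.norm(u) + eps)
    sigma = u @ w_mat @ v                        # estimated spectral norm
    return w / sigma, sigma, u                   # `u` can be reused on the next call

w = np.random.default_rng(1).standard_normal((8, 4, 3, 3))   # hypothetical conv weight
w_sn, sigma, u = spectral_normalize(w, n_iters=50)
print(sigma, np.linalg.norm(w.reshape(8, -1), 2))   # estimate vs. exact largest singular value
```

As far as I understand, training code usually runs a single power iteration per step and carries `u` over between steps, which is why the extra cost stays small.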

Imbalanced learning rate for G & D updates

A regularized discriminator learns slowly
Uses TTUR (two time-scale update rule for training GANs): separate learning rates for G and D
More efficient in terms of wall-clock time

Details

conditional batch normalization in G and Projection in D
Adam Optimizer
- b1=0, b2=0.9
- default lr d = 0.0004 / g = 0.0001
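To see what these settings mean in practice, here is a textbook Adam update in plain Python using the values from the notes (b1=0, b2=0.9, lr 0.0004 for D and 0.0001 for G). It is a sketch on a single scalar parameter with a made-up gradient, not the training loop from the paper; `adam_step` and the state dictionary are hypothetical helpers.

```python
import math

LR_D, LR_G = 4e-4, 1e-4          # TTUR: the discriminator gets the larger learning rate
BETA1, BETA2 = 0.0, 0.9

def adam_step(param, grad, state, lr, beta1=BETA1, beta2=BETA2, eps=1e-8):
    """One Adam update on a scalar parameter, with bias correction."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

def new_state():
    return {"t": 0, "m": 0.0, "v": 0.0}

# Same starting value and same (made-up) gradient for one D and one G parameter.
d_param = adam_step(1.0, 2.0, new_state(), LR_D)
g_param = adam_step(1.0, 2.0, new_state(), LR_G)
print(1.0 - d_param, 1.0 - g_param)   # D moves roughly four times as far as G
```

With b1=0 there is no momentum, so each step follows the current gradient only, scaled by the running second-moment estimate.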

subinium commented 3 years ago

On 1x1 Conv

Adjusts the number of channels
Reduces computation (Efficient)
Adds non-linearity (Non-linearity), through the activation that follows it
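The three points above can be checked with a small NumPy sketch. A 1x1 conv is the same linear map applied at every pixel, so it is written here as an `einsum`; the channel sizes (256 and 64) and the bottleneck layout are hypothetical numbers picked for illustration.

```python
import numpy as np

def conv1x1(x, weight):
    """1x1 convolution: the same (C_out, C_in) linear map applied at every pixel."""
    return np.einsum("oc,chw->ohw", weight, x)

def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution, ignoring bias."""
    return c_in * c_out * k * k

rng = np.random.default_rng(0)
x = rng.standard_normal((256, 8, 8))             # hypothetical 256-channel feature map
w_reduce = rng.standard_normal((64, 256))
y = np.maximum(conv1x1(x, w_reduce), 0.0)        # channel reduction followed by ReLU
print(y.shape)

# A direct 3x3 conv at 256 channels vs. reduce (1x1), 3x3 at 64 channels, expand (1x1).
direct = conv_params(256, 256, 3)
bottleneck = conv_params(256, 64, 1) + conv_params(64, 64, 3) + conv_params(64, 256, 1)
print(direct, bottleneck)
```

The weight counts stand in for computation here, since each weight is used once per output pixel.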

Read FCN (Fully Convolutional Network) in more depth.

Reference

subinium commented 3 years ago

Reference