Noisy-Label-References Abstract
General Info
- Learning with Label Noise GitHub page
- Class Noise vs. Attribute Noise: A Quantitative Study of Their Impacts (Zhu, AI Review 2004)
- Classification in the Presence of Label Noise: A Survey (Frenay, IEEE 2014)
Taxonomy of Label Noise
According to the classification of label noise in Classification in the Presence of Label Noise: A Survey (Frenay, IEEE 2014) and A Comprehensive Introduction to Label Noise (Frenay, 2014):
1. NCAR - Noisy Completely at Random Model
- the occurrence of an error E is independent of the other random variables, including the true class itself
- the error occurs like a biased coin flip with the noise rate as its bias; when it occurs, the wrong label is chosen uniformly, like rolling a fair die over the remaining classes
- uniform label noise
2. NAR - Noisy at Random Model
- probability of error depends on the true class Y, but still independent of X
- allows modeling asymmetric label noise, where instances of certain classes are more prone to be mislabeled
- NCAR label noise is a special case of NAR label noise. Examples of NAR noise: arbitrary labeling (transition) matrices and pairwise label noise
- pairwise label noise: two classes c1 and c2 are selected; each instance of class c1 has some probability of being incorrectly labeled as c2, and vice versa. For this label noise, only two nondiagonal entries of the labeling matrix are nonzero.
NCAR and NAR assume that label noise affects all instances alike, with no distinction between them. -> not realistic
Samples are more likely to be mislabeled when they are similar to instances of another class.
More difficult samples, or samples from low-density regions (rarely encountered cases), may have a higher chance of being mislabeled.
3. NNAR - Noisy Not at Random Model
- the occurrence of an error E depends on both variables X and Y (mislabeling is more probable for certain classes and in certain regions of the X space)
- The most general case of label noise.
- Label noise can thus be divided into the feature-dependent case (NNAR) and the feature-independent cases (NCAR and NAR); a small generation sketch for the feature-independent cases follows below.
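To make the distinction concrete, here is a minimal NumPy sketch (the helper names, the 10-class setting, and the noise rate are illustrative assumptions) of how NCAR and NAR noise could be injected into a label vector; NNAR noise would additionally condition the flip probability on the features x.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes = 10
noise_rate = 0.2                                      # illustrative noise level

def ncar_noise(y):
    """NCAR: every instance is flipped with the same probability (biased coin),
    and the wrong label is drawn uniformly from the other classes (fair die)."""
    y = y.copy()
    flip = rng.random(len(y)) < noise_rate
    wrong = rng.integers(0, num_classes - 1, len(y))
    wrong = np.where(wrong >= y, wrong + 1, wrong)    # skip the true class
    y[flip] = wrong[flip]
    return y

def nar_noise(y, T):
    """NAR: the flip distribution depends only on the true class, via a
    row-stochastic transition matrix T with T[i, j] = P(noisy = j | true = i)."""
    return np.array([rng.choice(num_classes, p=T[c]) for c in y])
```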
Sources of Mislabelling
- Restricted information
- Restricted description language
- Mistakes (even made by experts)
- Subjective bias
Noisy Data Generation Methods
Objective: mimic the structure of noise in real life:
- mistakes for similar classes
- mistakes for unknown classes
Learning with Biased Complementary Labels (Yu, ECCV 2018)
- Details
Where Y and Ȳ are the true and complementary labels, previous methods implicitly assume that P(Ȳ = i | Y = j), ∀i ≠ j, are identical, which is not true in practice because annotators are biased toward their own experience (someone who has only ever seen leopards will label a cheetah as a leopard). Therefore the transition probabilities should be allowed to differ.
Uses complementary labels, which specify a class that an object does not belong to. Complementary labels are sometimes easy to obtain, especially when the class set is relatively large: given an observation in multi-class classification, identifying a class label that is incorrect for the observation is often much easier than identifying the true label.
(Picking one label that is definitely wrong is a much easier labeling task than picking the single correct label; the method attempts to learn from this cheaper form of supervision.)
Method: excluding the one class that is definitely wrong (the complementary label), probabilities are assigned over the remaining 9 classes (see the sketch after this entry):
- uniform probabilities
- without zeros (the 9 classes are split into 3 groups with probabilities 0.2, 0.1, and 0.033 so that the row sums to 1)
- with zeros (only 3 labels are selected and given probabilities summing to 1)
Related to Learning from Complementary Labels (Ishida, NIPS 2017). Label noise often arises from differences in each annotator's experience; complementary labels are proposed as a way to cope with this. -> NAR, since the noise depends on the true class.
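A minimal sketch (the helper name and grouping are illustrative, not the paper's exact code) of how one row of the biased complementary-label transition matrix P(Ȳ = i | Y = j) could be built under the three schemes listed above:

```python
import numpy as np

num_classes = 10

def complementary_row(true_class, scheme, rng=np.random.default_rng(0)):
    """One row of P(Ybar | Y = true_class): a distribution over the 9 classes
    other than the true class, under the three schemes described above."""
    others = [c for c in range(num_classes) if c != true_class]
    row = np.zeros(num_classes)
    if scheme == "uniform":                # every wrong class equally likely
        row[others] = 1.0 / len(others)
    elif scheme == "without_zero":         # three groups of three: 0.2 / 0.1 / ~0.033
        probs = [0.2] * 3 + [0.1] * 3 + [1.0 / 30] * 3
        row[rng.permutation(others)] = probs
    elif scheme == "with_zero":            # only three wrong classes are possible
        chosen = rng.choice(others, size=3, replace=False)
        row[chosen] = 1.0 / 3
    return row
```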
NLNL: Negative Learning for Noisy Labels (Kim et al, ICCV 2019)
- Adopts complementary labels to filter out noisy labels. Negative Learning is used to select clean labels, and the model is then trained on them selectively with Positive Learning. The model trained with PL is in turn used to relabel the data that had been classified as noisy, which helps produce a cleaner dataset. -> SelNLPL (Selective Negative Learning and Positive Learning)
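The core of the negative-learning step is a loss that pushes down the predicted probability of a complementary label instead of pulling up the probability of the (possibly noisy) given label. A minimal PyTorch-style sketch, assuming `complementary_labels` is a LongTensor of classes drawn at random from those other than the given label:

```python
import torch
import torch.nn.functional as F

def negative_learning_loss(logits, complementary_labels):
    """Minimize -log(1 - p_{y_bar}) for the complementary label y_bar,
    rather than the usual -log(p_y) of positive learning."""
    probs = F.softmax(logits, dim=1)
    p_bar = probs.gather(1, complementary_labels.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_bar + 1e-7).mean()
```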
Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels (Han, NIPS 2018)
- Details
Uses pair flipping and symmetric flipping. Pair flipping models the case where a label is flipped to one specific other label because the classes are similar (although the implementation does not actually pair classes by similarity). Symmetric flipping models the case where the true class cannot be identified, so the instance is given any other label uniformly at random.
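For reference, a sketch of the two transition matrices as they are commonly implemented (flipping class i to its circular neighbour i+1 is an assumption about the pairing, which is essentially arbitrary in this scheme):

```python
import numpy as np

def pair_flip_matrix(num_classes, eps):
    """Pair flipping: class i keeps its label with probability 1 - eps and is
    flipped to one fixed other class (here (i + 1) mod C) with probability eps."""
    P = np.eye(num_classes) * (1.0 - eps)
    for i in range(num_classes):
        P[i, (i + 1) % num_classes] = eps
    return P

def symmetric_flip_matrix(num_classes, eps):
    """Symmetric flipping: the label is kept with probability 1 - eps and
    otherwise replaced by any other class uniformly at random."""
    P = np.full((num_classes, num_classes), eps / (num_classes - 1))
    np.fill_diagonal(P, 1.0 - eps)
    return P
```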
Pair flipping as defined in this paper differs from the pairwise label noise described in the survey. The 'Co-teaching pair flipping' scheme is not realistic in that the two labels are matched at random rather than according to how similar the classes look, which is how humans actually make mistakes; this is a possible point of improvement. Many techniques for handling label noise are evaluated on noise generated by NCAR or NAR schemes, which can differ substantially from human-generated noise, and this kind of noise model is proposed to improve on that.
followed the noise generation method used in:
Making Deep Neural Networks Robust to Label Noise: a Loss Correction Approach (Patrini, CVPR 2017) - asymmetric, class-conditional noise, where each label y in the training set is flipped to ỹ while feature vectors are untouched. The noise transition matrix is row-stochastic and not necessarily symmetric across the classes. GitHub code:
- def noisify_mnist_asymmetric() builds the noise transition matrix P roughly as follows (only the construction of P is shown):

import numpy as np

def noisify_mnist_asymmetric(n):
    # n is the noise rate; P is the row-stochastic noise transition matrix
    P = np.eye(10)
    # The digit flips below are the MNIST version; the analogous CIFAR-10 flips
    # in the same paper are automobile <- truck, bird -> airplane,
    # deer -> horse, cat <-> dog
    # 1 <- 7 (some 7s are mistaken for 1s)
    P[7, 7], P[7, 1] = 1. - n, n
    # 2 -> 7
    P[2, 2], P[2, 7] = 1. - n, n
    # 5 <-> 6
    P[5, 5], P[5, 6] = 1. - n, n
    P[6, 6], P[6, 5] = 1. - n, n
    # 3 -> 8
    P[3, 3], P[3, 8] = 1. - n, n
    return P
Noise transition matrix P (rows = true label, columns = noisy label, blank = 0). This corresponds to applying Co-teaching's pair flipping to only some of the labels.

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | | | | | | | | | |
| 1 | | 1 | | | | | | | | |
| 2 | | | 1-n | | | | | n | | |
| 3 | | | | 1-n | | | | | n | |
| 4 | | | | | 1 | | | | | |
| 5 | | | | | | 1-n | n | | | |
| 6 | | | | | | n | 1-n | | | |
| 7 | | n | | | | | | 1-n | | |
| 8 | | | | | | | | | 1 | |
| 9 | | | | | | | | | | 1 |
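Given such a row-stochastic P, the noisy labels are obtained by drawing, for every training example, a new label from the row of P corresponding to its true class (e.g. with a per-sample categorical draw, as in the `nar_noise` sketch in the taxonomy section above).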
For CIFAR-100, a slightly different method is used, based on the superclasses.
Assuming that noise is more likely to occur toward another class within the same superclass, labels are shuffled only among the classes of each superclass. NAR: class pairs are formed and mapped to each other with probability n, i.e. pair flipping.
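A sketch of this superclass-restricted flipping (the circular shift within each superclass is an assumption; the point is only that flips stay inside the superclass):

```python
import numpy as np

def superclass_flip_matrix(superclasses, eps):
    """CIFAR-100-style noise: each class is flipped only to another class in
    its own superclass with probability eps. `superclasses` is a list of lists
    of class indices (for CIFAR-100, 20 groups of 5)."""
    num_classes = sum(len(group) for group in superclasses)
    P = np.eye(num_classes) * (1.0 - eps)
    for group in superclasses:
        for i, c in enumerate(group):
            P[c, group[(i + 1) % len(group)]] = eps   # next class in the superclass
    return P
```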
Training Deep Neural Networks on Noisy Labels with Bootstrapping (Reed, ICLR 2015)
Section 4.1, MNIST with Noisy Labels: "Specifically, we used a fixed random permutation of the labels", as visualized in Figure 2 of the paper; a value in a column is mapped to the value in the corresponding row with some probability (the CIFAR datasets were not used).
# 0 -> 2
# 1 -> 5
# 2 -> 4
# ...
(original GitHub code not available)
-> If the labels are sorted appropriately before applying the fixed permutation, this is equivalent to Co-teaching's pair flipping.
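A minimal sketch of this permutation-based noise (illustrative helper; note that a plain random permutation may map some classes to themselves, which the fixed mapping in the paper's Figure 2 avoids):

```python
import numpy as np

def permutation_noise(y, noise_rate, rng=np.random.default_rng(0)):
    """Replace each label by its image under one fixed random permutation of
    the classes, with probability noise_rate."""
    num_classes = int(y.max()) + 1
    perm = rng.permutation(num_classes)       # fixed class -> class mapping
    flip = rng.random(len(y)) < noise_rate
    return np.where(flip, perm[y], y)
```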
NAR
Learning with Symmetric Label Noise: The Importance of Being Unhinged (van Rooyen, NIPS 2015)
Symmetric label noise: the learner observes samples from a distribution D̄, which is a corruption of D where labels are flipped with some constant probability. (Original GitHub code not available.) NCAR
Training deep neural-networks using a noise adaptation layer (Goldberger, ICLR 2017)
- Case 1: noisy labels depend only on the correct labels
- Case 2: noisy labels depend on the features in addition to the correct labels
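A minimal PyTorch-style sketch of the case-1 idea (names and initialization are assumptions, not the paper's exact architecture): the base classifier's softmax output is multiplied by a learned row-stochastic matrix, and the combined output is trained with cross-entropy against the noisy labels, so the base softmax can move toward the clean label distribution.

```python
import torch
import torch.nn as nn

class NoiseAdaptationLayer(nn.Module):
    """Learned transition matrix P(noisy label | true label) applied on top of
    the base classifier's softmax output (feature-independent, case 1)."""
    def __init__(self, num_classes):
        super().__init__()
        # initialize near the identity so training starts as if noise-free
        self.theta = nn.Parameter(torch.eye(num_classes) * 5.0)

    def forward(self, clean_probs):
        T = torch.softmax(self.theta, dim=1)   # row-stochastic transition matrix
        return clean_probs @ T                 # distribution over noisy labels
```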
[MentorNet]
Genre-based Decomposition of email class noise (Kolcz, ACM SIGKDD 2009)
- Studies of data cleaning techniques often assume a uniform label noise model, which, however, is seldom realized in practice. "... class noise can have substantial content-specific bias. We also demonstrate that noise detection techniques based on classifier confidence tend to identify instances that human assessors are likely to label in error." NNAR noise is hard to come by in the literature; this paper proposes an approach for dealing with NNAR.
Other papers that may simply serve as inspiration
Identifying Mislabeled Training Data (Brodley & Friedl, JAIR 1999)
- Uses two kinds of filtering methods: consensus filters and majority vote filters. Consensus filters are conservative about throwing away good data, at the expense of retaining bad data; majority vote filters are better at detecting bad data, at the expense of throwing away good data.
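A sketch of the two filters, assuming scikit-learn-style base classifiers and k-fold cross-validation (the helper name and details are illustrative):

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

def filter_mislabeled(models, X, y, mode="majority", n_splits=5):
    """Each base model is trained on k-1 folds and votes on the held-out fold;
    an instance is flagged if a majority ("majority") or all ("consensus")
    of the models misclassify it."""
    wrong_votes = np.zeros(len(y), dtype=int)
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, test_idx in kfold.split(X):
        for model in models:
            fitted = clone(model).fit(X[train_idx], y[train_idx])
            wrong_votes[test_idx] += (fitted.predict(X[test_idx]) != y[test_idx])
    threshold = len(models) if mode == "consensus" else len(models) / 2
    return wrong_votes >= threshold            # mask of instances to remove
```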
Consequences of Label Noise on Learning
- Theoretical and empirical evidence of the impact of label noise on classification performance
- Increases the necessary number of samples and complexity for learning
- Distortion of observed frequencies
- Deterioration of feature selection
Approaches to Handle Label Noise
- Label Noise-Robust Model
- Label Cleansing Methods for Noisy Datasets
- Label Noise-Tolerant Learning Model
Evaluation Measure
- Accuracy
- Label Precision (the fraction of truly clean labels among the instances selected as clean)
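For example, with a boolean mask marking which training labels are actually clean (known when the noise is synthetic), the label precision of a set of instances selected as clean can be computed as:

```python
import numpy as np

def label_precision(selected_idx, is_clean):
    """Fraction of truly clean labels among the instances selected as clean."""
    return float(np.mean(is_clean[selected_idx]))
```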