2020/10/17 21:00- MTG agenda

Question

Closed this issue 4 years ago · 5 comments

takapy0210 commented 4 years ago

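This issue collects agenda items for the next MTG.
Before the MTG starts, I'd like to try having everyone post any topics on the following as comments on this issue:
- What you did this week (totally fine if there is nothing!)
- Things you want to share or discuss
Links are very welcome, so please paste them freely!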

Answer 1 · 2020-10-13T00:52:26.000Z

[WIP] What takapy did

sub
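- Added basic statistics as features: #6 (comment)
- Added basic statistics + k-means cluster features: #6 (comment)
- The sub just before this was no good at all
  - Tried transfer learning and label smoothing, but the score did not improve
    - https://www.kaggle.com/takanobu0210/pytorch-mlp-transfer-labelsmoothing?scriptVersionId=44904035
Estimating the number of clusters
- #23 (comment)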

Things I'd like to discuss

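How about collecting must-have snippets (processing we should always include) in the repository?
- For example, the feature generation part
Is hanging every submission off a single issue too cumbersome?
- We could add a submission label and create one issue per sub instead. What do you think?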

Answer 2 · 2020-10-14T10:39:26.000Z

[WIP] What Shinchiro did

sub:
Implemented label smoothing in PyTorch
#6 (comment)
#6 (comment)
#6 (comment)

Things I'd like to discuss

- https://www.kaggle.com/demetrypascal/2-heads-deep-nets-residuals-pipeline-smoothing
I'd like to understand the model structure in this kernel, and why it scores so high.

- About label smoothing
https://www.kaggle.com/sinchir0/understanding-label-smoothing
In the notebook above I visualize what label smoothing (as performed inside TensorFlow) actually does.
It smooths the target values (0 or 1), not the pred values, to something like (0.005, 0.995).
Applying it narrows the range of the loss.
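
To make the point about the narrower loss range concrete, here is a minimal sketch. It assumes the common binary smoothing formula y * (1 - eps) + 0.5 * eps, with eps = 0.01 chosen only because it reproduces the (0.005, 0.995) targets mentioned above. The function names and the prediction grid are made up for illustration and are not taken from the notebook.

```python
import math

def smooth_target(y, eps=0.01):
    """Move a hard 0/1 target toward 0.5 by eps (assumed smoothing formula)."""
    return y * (1.0 - eps) + 0.5 * eps

def bce(y, p):
    """Binary cross-entropy of prediction p against a (possibly soft) target y."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

# Illustrative grid of predictions from 0.001 to 0.999 (avoids log(0)).
grid = [i / 1000 for i in range(1, 1000)]

# Loss for a positive sample, with the hard target and with the smoothed target.
hard = [bce(1.0, p) for p in grid]
soft = [bce(smooth_target(1.0), p) for p in grid]

hard_range = max(hard) - min(hard)
soft_range = max(soft) - min(soft)

print("smoothed targets:", smooth_target(0.0), smooth_target(1.0))
print("hard target loss range:", min(hard), max(hard))
print("soft target loss range:", min(soft), max(soft))
```

On this grid the smoothed loss has a higher minimum and a slightly lower maximum than the hard-target loss, which is one way to read "the range of the loss becomes narrower".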

I can't quite picture the cases where this would be effective, so I'd really like to discuss it!

There is also a comment like this:

My guess is that it's because of the metric - the competition information says they clip at 1E-15 and 1-1E-15 in the evaluation of the score. Which means that if the model makes a confident but incorrect prediction, that sample's contribution to the logloss is -log(1E-15) = 15. On the other hand, if I clip at 0.001, it reduces the contribution of such points to 3, at the cost of adding log(0.999) = 0.0004 to every correct prediction. So the threshold has to find the right tradeoff between decreasing the heavy penalty of false positives/negatives while not increasing the cost on the correct predictions too much. Experimentally, I found that 0.001/0.999 works best
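
The arithmetic in the quoted comment can be checked with a small sketch. Note that the quoted figures (15, 3, 0.0004) correspond to base-10 logarithms; log loss is usually computed with the natural logarithm, in which case the same contributions are roughly 34.5, 6.9, and 0.001. The helper names below are hypothetical and exist only for this illustration.

```python
import math

def wrong_cost(clip, log=math.log):
    """Loss of a fully confident wrong prediction after clipping to [clip, 1 - clip]."""
    return -log(clip)

def right_cost(clip, log=math.log):
    """Loss of a fully confident correct prediction after the same clipping."""
    return -log(1.0 - clip)

for clip in (1e-15, 1e-3):
    print(
        f"clip={clip:g}",
        f"wrong(log10)={wrong_cost(clip, math.log10):.4f}",
        f"right(log10)={right_cost(clip, math.log10):.6f}",
        f"wrong(ln)={wrong_cost(clip):.4f}",
        f"right(ln)={right_cost(clip):.6f}",
    )
```

Whichever base is used, the tradeoff described in the comment is the same: a larger clip value caps the penalty for confident mistakes but adds a small fixed cost to every confident correct prediction.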

Answer 3 · 2020-10-17T11:06:21.000Z

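[WIP] What Masuda did

Things I'd like to discuss

Shinchiro already tried label smoothing right away, so there are no particular issues there.
As for the TabNet paper, I understand it at an overview level, but how each individual component contributes hasn't really clicked for me, so I'd like to go through it together if there is time.
- It is said to need "almost no hyperparameter tuning", and I doubt a model can really be that convenient.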

Answer 4 · 2020-10-17T12:19:24.000Z

[WIP] What Kashiwagi did

#17

Things I'd like to discuss

Network structure (layers/units)
- For example, whether to feed the feature outputs into the next input
- https://www.kaggle.com/demetrypascal/2heads-deep-resnets-pipeline-smoothing

Answer 5 · 2020-10-17T12:51:11.000Z

Is hanging every submission off a single issue too cumbersome?
We could add a submission label and create one issue per sub instead. What do you think?

Decision: manage it as 1 sub = 1 issue!