amzn/pecos

XLinearModel preprocessing and metrics

kizhonorium opened this issue · 2 comments

Hello, I like this library and would like to understand some aspects of it more deeply. I have a few questions about the tutorial https://github.com/amzn/pecos/blob/mainline/tutorials/kdd22/Session%202%20Extreme%20Multi-label%20Classification%20with%20PECOS.ipynb

  1. (Section 4.2.1) Is it necessary to clean up the text further (lemmatization, etc.), or do the standard tools already handle this?
  2. Are there any tools for building the Y_tst used to calculate metrics, as in smat_util.Metrics.generate(Y_tst, Y_pred_cost)?
  3. Are there any recommendations for setting parameters (XLinearModel and Preprocessor) for data with 10K labels?

I would be glad of any answer. Thank you.

@jiong-zhang Can you please help?

help please)