usage from Python
Hi,
Thanks for releasing this interesting platform.
I'm interested in how ULTRA could be used from another Python library - e.g. say I wanted to use an ULTRA model to re-rank results in PyTerrier.
With sklearn, xgboost, fastrank, etc., I can pass in an array of feature values for a given document and get back a score - see https://pyterrier.readthedocs.io/en/latest/ltr.html#learning for our integration.
Does ULTRA have a similar API?
Thanks for your kind comment!
Yes, I do think it's possible to use a ranking model built with ULTRA in PyTerrier. While the APIs are not exactly the same, it should be fairly easy to build an adapter to connect them. For example, the classes in ULTRA (i.e., ultra.ranking_models) have a function named "build", which takes a list of documents (feature vectors) as input and outputs a list of ranking scores for all of them together. We designed the API this way to allow building multivariate ranking functions such as DLCM and GSF. @anhtran1010 may know more details about it.
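To make the adapter idea concrete, here is a minimal sketch of re-ranking with a list-wise scorer. It assumes only what is described above: a callable that takes the list of feature vectors for one query and returns one score per document. `ListwiseScorer`, `rerank` and `toy_scorer` are hypothetical names for illustration and are not part of ULTRA or PyTerrier; a real adapter would call the trained model where `toy_scorer` is used.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for a list-wise ranking function: it receives all
# candidate feature vectors for one query and returns one score per document.
ListwiseScorer = Callable[[List[Sequence[float]]], List[float]]

Row = Tuple[str, str, Sequence[float]]  # (qid, docno, features)


def rerank(rows: List[Row], scorer: ListwiseScorer) -> Dict[str, List[Tuple[str, float]]]:
    """Group rows by query, score each group in one call, sort by score."""
    by_query = defaultdict(list)
    for qid, docno, features in rows:
        by_query[qid].append((docno, features))

    ranked = {}
    for qid, docs in by_query.items():
        # The whole candidate list is scored together, as a list-wise model needs.
        scores = scorer([features for _, features in docs])
        if len(scores) != len(docs):
            raise ValueError("scorer must return one score per document")
        pairs = [(docno, score) for (docno, _), score in zip(docs, scores)]
        ranked[qid] = sorted(pairs, key=lambda p: p[1], reverse=True)
    return ranked


def toy_scorer(doc_features):
    # Toy model (assumption): each score depends on the whole list,
    # here the feature sum minus the mean feature sum of the list.
    totals = [sum(f) for f in doc_features]
    mean = sum(totals) / len(totals)
    return [t - mean for t in totals]


rows = [
    ("q1", "d1", [0.1, 0.2]),
    ("q1", "d2", [0.9, 0.4]),
    ("q2", "d3", [0.5, 0.5]),
    ("q1", "d3", [0.3, 0.3]),
]
result = rerank(rows, toy_scorer)
```

Grouping by query before scoring is the main difference from a point-wise `predict()` call, since a multivariate model needs to see the full candidate list at once.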
"which takes a list of documents (feature vectors) as inputs and output a list of ranking scores together"
Yes, this is sufficient.
What about training - does your API look like xgboost/lightgbm?
PyTerrier is essentially a set of wrappers around Pandas dataframes. We munge the features and qrels into the form expected by, e.g., sklearn or LightGBM .fit() methods - see https://github.com/terrier-org/pyterrier/blob/master/pyterrier/ltr.py#L129
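As a rough illustration of that munging step (not PyTerrier's actual code), the sketch below turns labelled rows into the three inputs a typical learning-to-rank `.fit()` call consumes: a feature matrix `X`, a label list `y`, and per-query group sizes, which is the shape LightGBM's ranker accepts via its `group` argument. The function name `to_training_arrays` and the row layout are assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter


def to_training_arrays(rows):
    """rows: iterable of (qid, features, label).

    Returns X (feature vectors), y (labels) and group (documents per query),
    with all rows of a query stored contiguously.
    """
    # sorted() is stable, so the original order within each query is preserved.
    ordered = sorted(rows, key=itemgetter(0))
    X = [list(features) for _, features, _ in ordered]
    y = [label for _, _, label in ordered]
    group = [len(list(items)) for _, items in groupby(ordered, key=itemgetter(0))]
    return X, y, group


# Toy data (made up for illustration): two queries, two documents each.
rows = [
    ("q2", [0.5, 0.1], 0),
    ("q1", [0.2, 0.7], 1),
    ("q1", [0.9, 0.3], 0),
    ("q2", [0.4, 0.8], 2),
]
X, y, group = to_training_arrays(rows)
```

A training API that accepted inputs in roughly this shape would be straightforward to call from a dataframe-based pipeline.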
No, the current API is different from xgboost/lightgbm. However, I don't think it would be difficult to revise it to match xgboost/lightgbm. We are considering adding support for LightGBM, but haven't done anything in this direction yet. We will definitely put it on our development agenda!
Perhaps we can discuss how to put together a demonstration Colab notebook. An initial version would train using your command-line scripts, then re-rank using the learned model's build() function.
This notebook demonstrates LTR for the TREC Covid test collection. It also shows John Foley's Fastrank in use. Perhaps it can be used as a starting point.
Sounds like a great plan! I will discuss it with @anhtran1010 and @Taosheng-ty to see how we can make it happen. Thanks a lot for the suggestion!