google-research-datasets/swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 languages, generated using PaLM 2 and summarize-then-ask prompting.
Stargazers
- akontra
- cheungdaven (Amazon Web Services)
- cin-hubert (Da Nang, Vietnam)
- ColdFusion2001
- cyrta (Metamedia Technologies)
- d61h6k4 (Rasa)
- danielcer (Google, UC Berkeley)
- din0s (Zeta Alpha)
- elad619
- emptymalei (@spikinglabs)
- eugenesiow (Singapore)
- fcb1899
- fly51fly (PRIS)
- gbennett71
- ggsonic
- Hansimov (Shanghai Jiaotong University)
- HGMT96
- izhx (PhD student @ HITsz)
- jazzbearz
- jupyterjazz (@jina-ai)
- jwieting (Chicago)
- luyaojie (ICIP, ISCAS)
- mancevd
- monatis (@qdrant)
- mrdrozdov (New York, NY)
- numb3r3 (@jina-ai)
- paschembri (Advanced Stack)
- sangkilpark-kidmam
- sarisel
- seshurajup (@dolcera)
- supercoderhawk (Patsnap)
- SushantDaga
- thakur-nandan (University of Waterloo)
- tomsherborne (Informatics, University of Edinburgh)
- Trangle
- unverciftci (Math & AI Institute)