This Java project downloads Wikipedia dump files and processes them into a compact file of Wikipedia article titles and redirects for a given set of languages.
Simply run the class LanguagePairsFinder, e.g., as follows:
java -jar LanguagePairsFinder.jar data_folder 20201020 langlinks.tsv en,fr,ru,de
Parameters:
- Path to the data folder
- Date of the dump in YYYYMMDD format, e.g., 20201020 (see here)
- Name of the output file
- Comma-separated list of languages
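As an illustration of how the four parameters above could be read, here is a minimal sketch. The class name FinderArgs and its fields are hypothetical and are not taken from the project's source; it also assumes the dump date is always written as yyyyMMdd, as in the example command.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for the four parameters; not the project's actual code.
final class FinderArgs {
    final Path dataFolder;
    final LocalDate dumpDate;
    final String outputFileName;
    final List<String> languages;

    private FinderArgs(Path dataFolder, LocalDate dumpDate,
                       String outputFileName, List<String> languages) {
        this.dataFolder = dataFolder;
        this.dumpDate = dumpDate;
        this.outputFileName = outputFileName;
        this.languages = languages;
    }

    // Expects: <data_folder> <dump_date> <output_file> <languages>
    static FinderArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "Usage: <data_folder> <dump_date> <output_file> <languages>");
        }
        // Assumption: the dump date is written as yyyyMMdd, e.g. 20201020.
        LocalDate dumpDate = LocalDate.parse(args[1], DateTimeFormatter.BASIC_ISO_DATE);
        // Split "en,fr,ru,de" into individual language codes.
        List<String> languages = Arrays.asList(args[3].split(","));
        return new FinderArgs(Paths.get(args[0]), dumpDate, args[2], languages);
    }
}
```

Calling FinderArgs.parse with the arguments from the example command would give a data folder of data_folder, a dump date of 2020-10-20, an output file name of langlinks.tsv, and the four language codes. A malformed date is rejected with a DateTimeParseException rather than being passed on silently.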
Articles that are only available in one language are not added to the file!
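The rule above can be sketched as a simple filter. This is only an illustration: it assumes each article is held as a map from language code to title, which may not match the project's internal representation, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the project's own data structures may differ.
final class LanguageFilter {

    // Keeps only articles that have a title in at least two languages.
    // Each map goes from a language code (e.g. "en") to the article title.
    static List<Map<String, String>> keepMultilingual(List<Map<String, String>> articles) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> titlesByLanguage : articles) {
            if (titlesByLanguage.size() >= 2) {
                kept.add(titlesByLanguage);
            }
        }
        return kept;
    }
}
```

Under this sketch, an article with only an en title is dropped, while one with en and fr titles is kept.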
en fr ru de en_redirects fr_redirects ru_redirects de_redirects
Musa_(genus) Bananier Банан_(род) Bananen Cold_hardy_bananas Callimusa Musa_spp. Ingentimusa Musa_sect._Musa Musa_(Musaceae) Australimusa Musa_sect._Callimusa Figuier_d\'Adam Bananière Musa_(genre) Musa_(plante) Bacove Банан_(растение) Musa Musa_(Pflanzengattung)
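The header line of the sample above lists one title column per language, followed by one redirects column per language. A minimal sketch of building such a header from the language list follows; the class name is hypothetical and the column separator is passed in explicitly, because the separator actually used by the project is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the project's source.
final class HeaderBuilder {

    // Builds "<lang>..." followed by "<lang>_redirects...", joined by the separator.
    static String buildHeader(List<String> languages, String separator) {
        List<String> columns = new ArrayList<>(languages);
        for (String language : languages) {
            columns.add(language + "_redirects");
        }
        return String.join(separator, columns);
    }
}
```

For the languages en, fr, ru, de and a single space as separator, this produces the header line shown in the sample.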