wikidict-dsl-it - Wikidata Bilingual DSL Dictionaries (Italian)

This repository makes available a collection of bilingual Italian dictionaries in DSL format derived from interwiki links (links between article titles in different languages) in Wikipedia. The data has been extracted from Wikidata.

Format

ABBYY Lingvo DSL is a flexible dictionary format that can be read by dictionary applications such as GoldenDict and converted to other formats using tools such as PyGlossary. The dsl-tools project also provides a number of tools for creating DSL format dictionaries.

DSL files must be saved as UTF-16 to be usable by dictionary programs. The raw source files in this repository are saved as UTF-8 instead, which makes them significantly smaller and keeps them readable (and diffable) by git. Fully encoded and compressed .dsl.dz dictionaries, ready for use, are available in the Releases section.

You can also use the rezip_dsl.rb and unzip_dsl.rb scripts provided by the dsl-tools repository to encode and compress, or decompress and decode, the dictionaries either individually or as a group.
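If you only need the encoding step and do not want to use the Ruby scripts, the re-encoding can be sketched in a few lines of Python. This is a minimal illustration, not the method used by dsl-tools: the function name `utf8_to_utf16le` and the file names are hypothetical, and the choice of little-endian byte order with a byte order mark is an assumption about what dictionary programs expect. Compression to .dsl.dz is a separate step that is not covered here.

```python
from pathlib import Path


def utf8_to_utf16le(src: Path, dst: Path) -> None:
    """Re-encode a UTF-8 DSL source file as UTF-16 LE with a byte order mark.

    Hypothetical helper for illustration; it is not part of dsl-tools.
    """
    # "utf-8-sig" reads plain UTF-8 and also strips a leading BOM if one is present.
    text = src.read_text(encoding="utf-8-sig")
    # Write the BOM explicitly so the byte order does not depend on the platform.
    dst.write_bytes(b"\xff\xfe" + text.encode("utf-16-le"))


if __name__ == "__main__":
    # Example file names only; adjust to the dictionary you want to convert.
    source = Path("data/en-it_wikidict.dsl")
    if source.exists():
        utf8_to_utf16le(source, Path("en-it_wikidict.utf16.dsl"))
```

The output is written to a separate file so that the UTF-8 source tracked by git is left untouched.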

Data

The data directory contains the bilingual dictionaries in pairs according to ISO language code.

The basic filename pattern is [ISO]-it_wikidict.dsl, with [ISO] being the source language ISO code. A list of all language pairs is below.
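The filename pattern can be expressed as a small sketch for scripts that need to locate or identify dictionaries. The helper names `dictionary_filename` and `source_language` are hypothetical, and the regular expression assumes that source codes consist only of lowercase letters and underscores, as in the codes listed in this README.

```python
import re
from typing import Optional

# Matches names such as "en-it_wikidict.dsl" or "zh_min_nan-it_wikidict.dsl".
# Assumption: source codes use only lowercase letters and underscores.
_FILENAME_RE = re.compile(r"^(?P<iso>[a-z_]+)-it_wikidict\.dsl$")


def dictionary_filename(iso: str) -> str:
    """Build the expected filename for a source language code."""
    return f"{iso}-it_wikidict.dsl"


def source_language(filename: str) -> Optional[str]:
    """Return the source language code, or None if the name does not match."""
    match = _FILENAME_RE.match(filename)
    return match.group("iso") if match else None


if __name__ == "__main__":
    print(dictionary_filename("en"))
    print(source_language("zh_min_nan-it_wikidict.dsl"))
```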

Available language pairs

Language codes Language names
af-it Afrikaans => Italian
am-it Amharic => Italian
ang-it Anglo-Saxon => Italian
ar-it Arabic => Italian
arc-it Aramaic => Italian
bg-it Bulgarian => Italian
bi-it Bislama => Italian
bn-it Bengali => Italian
bo-it Tibetan => Italian
br-it Breton => Italian
bs-it Bosnian => Italian
ca-it Catalan => Italian
cdo-it Min Dong => Italian
chr-it Cherokee => Italian
chy-it Cheyenne => Italian
cr-it Cree => Italian
cs-it Czech => Italian
cy-it Welsh => Italian
da-it Danish => Italian
de-it German => Italian
el-it Greek => Italian
en-it English => Italian
eo-it Esperanto => Italian
es-it Spanish => Italian
et-it Estonian => Italian
eu-it Basque => Italian
fa-it Persian => Italian
ff-it Fula => Italian
fi-it Finnish => Italian
fr-it French => Italian
ga-it Irish => Italian
gan-it Gan => Italian
gd-it Scottish Gaelic => Italian
gu-it Gujarati => Italian
gv-it Manx => Italian
ha-it Hausa => Italian
hak-it Hakka => Italian
haw-it Hawaiian => Italian
he-it Hebrew => Italian
hi-it Hindi => Italian
hr-it Croatian => Italian
ht-it Haitian => Italian
hu-it Hungarian => Italian
hy-it Armenian => Italian
id-it Indonesian => Italian
ig-it Igbo => Italian
is-it Icelandic => Italian
iu-it Inuktitut => Italian
ja-it Japanese => Italian
jbo-it Lojban => Italian
jv-it Javanese => Italian
ka-it Georgian => Italian
kg-it Kongo => Italian
ki-it Kikuyu => Italian
kl-it Greenlandic => Italian
km-it Khmer => Italian
ko-it Korean => Italian
la-it Latin => Italian
lg-it Luganda => Italian
lo-it Lao => Italian
lt-it Lithuanian => Italian
lv-it Latvian => Italian
mg-it Malagasy => Italian
mi-it Maori => Italian
mn-it Mongolian => Italian
ms-it Malay => Italian
mt-it Maltese => Italian
nah-it Nahuatl => Italian
ne-it Nepali => Italian
nl-it Dutch => Italian
nn-it Norwegian (Nynorsk) => Italian
no-it Norwegian => Italian
nv-it Navajo => Italian
ny-it Chichewa => Italian
oc-it Occitan => Italian
pa-it Punjabi => Italian
pi-it Pali => Italian
pl-it Polish => Italian
ps-it Pashto => Italian
pt-it Portuguese => Italian
qu-it Quechua => Italian
ro-it Romanian => Italian
ru-it Russian => Italian
sa-it Sanskrit => Italian
se-it Northern Sami => Italian
sh-it Serbo-Croatian => Italian
sk-it Slovak => Italian
sl-it Slovenian => Italian
sn-it Shona => Italian
so-it Somali => Italian
sq-it Albanian => Italian
sr-it Serbian => Italian
sv-it Swedish => Italian
sw-it Kiswahili => Italian
ta-it Tamil => Italian
te-it Telugu => Italian
th-it Thai => Italian
tl-it Tagalog => Italian
tpi-it Tok Pisin => Italian
tr-it Turkish => Italian
ug-it Uyghur => Italian
uk-it Ukrainian => Italian
ur-it Urdu => Italian
vi-it Vietnamese => Italian
wo-it Wolof => Italian
wuu-it Wu => Italian
xh-it Xhosa => Italian
yi-it Yiddish => Italian
yo-it Yoruba => Italian
za-it Zhuang => Italian
zh-it Chinese (Mandarin) => Italian
zh_classical-it Classical Chinese => Italian
zh_min_nan-it Min Nan => Italian
zh_yue-it Cantonese => Italian
zu-it Zulu => Italian

Statistics

Dictionary size

Language pair # of entries
af-it 22617
am-it 5571
ang-it 2333
ar-it 128085
arc-it 1267
bg-it 99813
bi-it 443
bn-it 19754
bo-it 2447
br-it 35629
bs-it 26142
ca-it 224769
cdo-it 1861
chr-it 445
chy-it 531
cr-it 85
cs-it 154575
cy-it 25567
da-it 97169
de-it 476592
el-it 58443
en-it 796075
eo-it 130542
es-it 425237
et-it 58427
eu-it 112157
fa-it 175965
ff-it 190
fi-it 175958
fr-it 563341
ga-it 20495
gan-it 4362
gd-it 10731
gu-it 3809
gv-it 3926
ha-it 378
hak-it 2360
haw-it 1815
he-it 91345
hi-it 23807
hr-it 68594
ht-it 17847
hu-it 143786
hy-it 52214
id-it 106133
ig-it 712
is-it 21728
iu-it 345
ja-it 240737
jbo-it 1135
jv-it 14339
ka-it 48964
kg-it 780
ki-it 296
kl-it 1535
km-it 1509
ko-it 128523
la-it 93498
lg-it 148
lo-it 1033
lt-it 69053
lv-it 36430
mg-it 51798
mi-it 2110
mn-it 10016
ms-it 129580
mt-it 2431
nah-it 6570
ne-it 6569
nl-it 397359
nn-it 53478
no-it 168336
nv-it 1501
ny-it 99
oc-it 76565
pa-it 7741
pi-it 2202
pl-it 414393
ps-it 2461
pt-it 382253
qu-it 11878
ro-it 157353
ru-it 383527
sa-it 4695
se-it 5464
sh-it 121351
sk-it 129693
sl-it 58469
sn-it 1428
so-it 2319
sq-it 27965
sr-it 139878
sv-it 272864
sw-it 17345
ta-it 21548
te-it 7500
th-it 48226
tl-it 36947
tpi-it 1278
tr-it 108088
ug-it 2055
uk-it 233653
ur-it 32046
vi-it 194655
wo-it 884
wuu-it 1956
xh-it 245
yi-it 6776
yo-it 26333
za-it 617
zh-it 268936
zh_classical-it 2058
zh_min_nan-it 9730
zh_yue-it 16436
zu-it 559
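To check a figure like those above against a source file, one rough approach is to count headword lines. The sketch below assumes the usual DSL layout, in which header lines start with `#`, headwords start in the first column, and card bodies are indented with a space or tab. How the published counts were actually computed is not stated here, so the result may differ from the table; the function name `count_headwords` is hypothetical.

```python
from typing import Iterable


def count_headwords(lines: Iterable[str]) -> int:
    """Count lines that look like DSL headwords.

    Assumption: a headword is any non-blank line that does not start
    with '#' (header) or with whitespace (card body).
    """
    count = 0
    for line in lines:
        if not line.strip():
            continue  # blank separator line
        if line.startswith("#"):
            continue  # header such as #NAME or #INDEX_LANGUAGE
        if line[0] in " \t":
            continue  # indented card body
        count += 1
    return count


if __name__ == "__main__":
    sample = [
        '#NAME "Sample"\n',
        "\n",
        "house\n",
        "\tcasa\n",
        "dog\n",
        "\tcane\n",
    ]
    print(count_headwords(sample))
```

When reading a real file, opening it with `encoding="utf-8-sig"` avoids a leading byte order mark being mistaken for part of the first line.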

Top ten dictionaries by number of entries

Language pair # of entries
en-it 796075
fr-it 563341
de-it 476592
es-it 425237
pl-it 414393
nl-it 397359
ru-it 383527
pt-it 382253
sv-it 272864
zh-it 268936
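A ranking like the one above can be reproduced from the per-dictionary counts with a simple sort. The sketch below uses only a handful of the figures from the dictionary size table as sample input; the function name `top_pairs` is hypothetical.

```python
from typing import Dict, List, Tuple


def top_pairs(counts: Dict[str, int], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n language pairs with the most entries, largest first."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    # A small subset of the dictionary size table, for illustration only.
    sample_counts = {
        "cr-it": 85,
        "de-it": 476592,
        "en-it": 796075,
        "fr-it": 563341,
        "ca-it": 224769,
    }
    for pair, entries in top_pairs(sample_counts, n=3):
        print(pair, entries)
```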

License

According to the Wikidata website:

"All structured data from the main and property namespace is available under the Creative Commons CC0 License"

The data in this repository is therefore made available under the same Creative Commons CC0 License as that used by the Wikidata project. All of the data has been derived from the Wikidata JSON format database dumps.