📚 Compact dictionaries in English that automatically update weekly

License: GNU General Public License v3.0 (GPL-3.0)

Compact Dictionaries

Copyright © 2023 Teal Dulcet

Preprocessed free dictionaries/thesauruses in JSON Lines format that are automatically updated weekly. The dictionaries include the part of speech, definitions (senses), forms of the word, synonyms, antonyms, pronunciation and more information.

All dictionaries are provided uncompressed and in a compact JSON format with minimal single-character keys and no whitespace. They include much more information than would be found in a traditional paper dictionary or thesaurus, including the full list of meanings for each word. While the definitions are currently in English, they are available with words in over 100 languages. The dictionaries are designed so that applications can directly download them, without developers needing to release an entire software update. This allows users to enjoy much more frequent updates and thus more accurate information.
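
As a rough illustration of what "compact" means here, the following Python sketch serializes one made-up entry with and without whitespace. The entry contents are invented for the example, and the use of `ensure_ascii=False` is an assumption; the repository's own scripts may serialize differently.

```python
import json

# Made-up entry using the single-character keys described in the
# "JSON format" section ("" = word, "p" = parts of speech, "d" = definitions).
entry = {"": "example", "p": ["noun"], "d": ["Something representative of a group."]}

# Compact form: no spaces after "," or ":", one object per line.
compact = json.dumps(entry, separators=(",", ":"), ensure_ascii=False)

# Conventional pretty-printed form, shown only for a size comparison.
pretty = json.dumps(entry, indent=2, ensure_ascii=False)

print(compact)
print(len(compact.encode("utf-8")), "bytes compact vs", len(pretty.encode("utf-8")), "bytes pretty")
```

The saving per entry is small, but it is repeated for every line of a dictionary.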

❤️ Please visit tealdulcet.com to support this project and my other software development.

The dictionaries are hosted on GitLab. Although GitLab now has a 100 MiB file size limit for regular files, it has no maximum file size for Git Large File Storage (LFS) files, only a 10 GiB repository size limit. In contrast, GitHub has a 100 MiB file size limit and strict bandwidth limits on Git LFS files. Commits older than one month (previously one year) are automatically squashed to keep the repository size under the 10 GiB limit. Please see the CHANGELOG for the full history. The dictionaries are now updated on GitHub, which has no limit on CI minutes for public repositories, whereas GitLab has a limit of 400 CI minutes per month.

Dictionary comparison

Dictionary: Wiktionary
License: 🅭🅯🄎 CC BY-SA 3.0, GFDL
Updated: Weekly
Download (one file per language):
Acholi (ach)
⬇️ dictionary-ach.json
823B – 20 words
Checksums (click to show)
MD5: 6ea00ace5c9e36e9cdd533239d0d34a6
SHA1: a713fa54019f32a132f77ae61a5a4269f6e25faa
SHA256: ee2ee6afe5eeb79889d05f54344b6bf7b3b9bd0e3307ee7ee92f70f1579ec428
Afrikaans (af)
⬇️ dictionary-af.json
960.7KiB (983.7KB) – 8,387 words
Checksums (click to show)
MD5: 9ef07141d61eb527befedd7a90170faa
SHA1: d70a41631c7e718825a5c7fbbe1f9a8e07ec8132
SHA256: 2d11aa27e2f9b49e6817e49779016e56aced7252ad48f4240cb9b7d1e7a44d5f
Aragonese (an)
⬇️ dictionary-an.json
103.3KiB (105.7KB) – 1,030 words
Checksums (click to show)
MD5: 88d5ef890ed487d1c97f9ef6253a367b
SHA1: 28fd0f3f89d11725dec3c5e8138647f25ce821f7
SHA256: 37ca11b33714b922e6179193d0b888ed4c4e9499ed9e67871127be87cf62cda7
Arabic (ar)
⬇️ dictionary-ar.json
36.54MiB (38.31MB) – 53,798 words
Checksums (click to show)
MD5: 8e2e393cefe8395eb3f5b10a61ba1c24
SHA1: 30598566902b93099629bdc44e5f791ceaedcc5e
SHA256: 5625f2acc112b0766e094f3d303107ee4cb96f5594a19d33db08793fcf179559
Asturian (ast)
⬇️ dictionary-ast.json
2.978MiB (3.122MB) – 30,786 words
Checksums (click to show)
MD5: e93ee6f7aaf41a376a1207c5e43335d4
SHA1: d92ad1f0bb80d399c77142ad1023af5bb7975943
SHA256: 95826949af011f7866947f5c65516a449ce36a71b8612d4fb5203c57b782cc73
Azerbaijani (az)
⬇️ dictionary-az.json
13.81MiB (14.48MB) – 12,037 words
Checksums (click to show)
MD5: 9d6913a9f5e672714d95dbb51bc52c1a
SHA1: d40348a4bebf4e1b5589b1069ff478669bdd0750
SHA256: 1253859f9a1ef000a7aad5d4c682b7d863add4068c6d172c34e1205598a41da6
Belarusian (be)
⬇️ dictionary-be.json
1.804MiB (1.892MB) – 6,408 words
Checksums (click to show)
MD5: 47f4f270f98ef93c90b4429284ac7c05
SHA1: d8e76fa179802461acfd76c62a7a966fecbfadcd
SHA256: 202f443bd3c6a3e6e5699b53b2ee160f3e514d98952ce7aed96265a0c29ad375
Bulgarian (bg)
⬇️ dictionary-bg.json
12.38MiB (12.98MB) – 45,596 words
Checksums (click to show)
MD5: 02c7253d15be78b2436e0b9d872222df
SHA1: 2e55e821492426481b21afd81f7a4de83e4c588a
SHA256: 729f7c0bb32649f645bb06a7c4d62e15d4327730597bb3e2c7eb782549844351
Bengali (bn)
⬇️ dictionary-bn.json
2.161MiB (2.266MB) – 7,442 words
Checksums (click to show)
MD5: da86ed22a8c1b9b1685e4048ccf497cb
SHA1: d716d0acaa38fe76de1d1db783d108eadc804f4a
SHA256: 047075c75accf4f2bf8709617a28726d283500e4c670ef7e476329c34e532a8e
Tibetan (bo)
⬇️ dictionary-bo.json
298.4KiB (305.5KB) – 3,077 words
Checksums (click to show)
MD5: dc71f47f7a039728908de2afd31c8789
SHA1: 661d5ea0c719a651e8b6a6793df666c5ddb9d732
SHA256: 55b47992065c59f7f285d371f7f385af056340037e45544fca68065c92b23281
Breton (br)
⬇️ dictionary-br.json
182.5KiB (186.9KB) – 2,012 words
Checksums (click to show)
MD5: 536f7e5f6d0b34b152cb6a3f0bf371ec
SHA1: a1f1d0c6087a102848b7c3ab2dfabaeaa359883e
SHA256: c80625e7a297aca795c450a563eb33fb0004215f1635f66da4948bf927946ad2
Bodo (India) (brx)
⬇️ dictionary-brx.json
4.883KiB (5KB) – 68 words
Checksums (click to show)
MD5: de4b6952d5697a6d235200da5d7af4bc
SHA1: a870e48006a1f54268c5b64d1031a203c7c46fbf
SHA256: b177ecfc0b801d632b7a43e817637fb363ab3bf7ab6a6820c2eba11b855d96f8
Catalan (ca)
⬇️ dictionary-ca.json
18.29MiB (19.18MB) – 183,035 words
Checksums (click to show)
MD5: fceaa633ed2fa471374e1cb3881245b3
SHA1: f5e6057ac26de57b40f150c2351aba665be8e3e6
SHA256: 0051c7fa02a36a1d6c75fd7012acfb3189cfd075caed006a2f1045bf555cea4c
Kaqchikel (cak)
⬇️ dictionary-cak.json
5.978KiB (6.121KB) – 111 words
Checksums (click to show)
MD5: 366e03af649ef1b06017a86d62b3f4aa
SHA1: 9b7a65f9103349514f71f8d1147d969d8f6f9618
SHA256: df70103780ece8e34a08f33bb25375dab3805fc2e0488827ebd732cd56e9f398
Central Kurdish (ckb)
⬇️ dictionary-ckb.json
98.25KiB (100.6KB) – 831 words
Checksums (click to show)
MD5: f9a11b7d73e4dd92313d414d3460e108
SHA1: 7f3b52c9988f2a9f2512696cbf322a2b900f38b3
SHA256: d8d7396dd868c0f644a288866bd99a7154041a978ffa25f277a8e52f2dec6398
Mandarin (cmn)
⬇️ dictionary-cmn.json
6.357MiB (6.666MB) – 63,440 words
Checksums (click to show)
MD5: 64260b4b32b07cfd37ea1393c86a625b
SHA1: d93d15c876eea27a9ce1f605654f5635e5f37a99
SHA256: b45415ed87ea4cf9babd8162435fd5e6c54e1a62d8561e4765d3610e0115fbd8
Czech (cs)
⬇️ dictionary-cs.json
9.820MiB (10.30MB) – 52,978 words
Checksums (click to show)
MD5: c41915a7e79cb1f245e463f730df88a3
SHA1: 422af7f17def473425a092eb8cee1f685f708090
SHA256: 8ca05e54c9299b1393cf0b6d43e400e079c511b7873cfff228f3580eabd4d222
Welsh (cy)
⬇️ dictionary-cy.json
2.482MiB (2.603MB) – 16,076 words
Checksums (click to show)
MD5: 57798179f4a93218f89591af518bdb2b
SHA1: 7ba419aef709d9f66a0a3b04fd4ad76dd6e5b851
SHA256: 3c3e1c10ee0616621ef352e387e228e93fb717096f045e4fad7c28d3a33e7e3b
Danish (da)
⬇️ dictionary-da.json
4.454MiB (4.670MB) – 45,180 words
Checksums (click to show)
MD5: 5e0af421ea64bdf52602a15208b83aca
SHA1: d7b1b30975ad98b4286776616d11121ec0b029ab
SHA256: 1bce1e9c880cfa44de4aa96ff8b9720e54531f1420ac39b3a5975b860c7132c2
German (de)
⬇️ dictionary-de.json
39.67MiB (41.60MB) – 306,896 words
Checksums (click to show)
MD5: 20de47ac9aaf24283bdbb7534c87c4d1
SHA1: 079189886a1d554a59c18dce95a9e5130d9a2b9d
SHA256: 599877efbeb723d09d86710cf7cdf206e8ce48d1f51b69cf6ce02e7c8d382e9f
Lower Sorbian (dsb)
⬇️ dictionary-dsb.json
468.0KiB (479.2KB) – 3,354 words
Checksums (click to show)
MD5: b50ed60c632ba9704e04eda1b9d7c250
SHA1: c75591bb1e5755cfbb8b8da8c8c13f78d9ae8e6d
SHA256: 3bba474f40eae8fcccc8edb3a3c34b94fb6b51ba7cd3b5e83ddd334fbf451b7c
Greek (el)
⬇️ dictionary-el.json
15.82MiB (16.59MB) – 74,611 words
Checksums (click to show)
MD5: 656ba4d31ec161b1aac90e015d176992
SHA1: befbdc3bc5ae4a91c2a81172bf2982363e126ecc
SHA256: efc73272c6cdde5ca28c4bf71625d08a43e9a379afef0556ecdefc2ccf2dea12
English (en)
⬇️ dictionary-en.json
104.8MiB (109.9MB) – 969,030 words
Checksums (click to show)
MD5: e068b5536d55b6abfec2e1e586566dc4
SHA1: a27f0083d4bbc508ca180559a0760e4054c3899f
SHA256: 8e6b2f7ef6327e14283537a1d795f1ceab3af7133f6d96a151742ec548d2491d
Esperanto (eo)
⬇️ dictionary-eo.json
13.54MiB (14.20MB) – 127,322 words
Checksums (click to show)
MD5: cb933ee68bb7bcfc6a67a6db7b1eeeb1
SHA1: 2bf456e6d224d8a0d7abd095e2ccf20929fc819e
SHA256: b1320fbc49692bd38a19bac9713952dc41c811b55693ac931cf8d80e00fed1aa
Spanish (es)
⬇️ dictionary-es.json
75.89MiB (79.57MB) – 730,289 words
Checksums (click to show)
MD5: e927c8e8f6e2376fa8e2380a253c18b7
SHA1: e653c108bcde4a37aff56163ca7f75d5fa967c20
SHA256: eed4d07cdb52eee572e717479bae21cba020e3b5a0198af43dd24d071a3796ee
Estonian (et)
⬇️ dictionary-et.json
2.929MiB (3.071MB) – 11,154 words
Checksums (click to show)
MD5: abc04fef3d2fadec813086a05481b405
SHA1: 2e50c7d5afd667e8a2b370289e3642af62e40c87
SHA256: 78037684199358a4a1b2f0580ac45df298003b866918c41dde81302f21d14afe
Basque (eu)
⬇️ dictionary-eu.json
2.927MiB (3.070MB) – 8,333 words
Checksums (click to show)
MD5: cf66c5a29541f3cdfbdf8c31ecaec9ef
SHA1: 44a10c1f7be932e32fc8a81e9b6340a7a72d3503
SHA256: 7b9da23d3a1ceaaea63586758c176e38f1c6ffe81da74024d3d529ced56158f1
Persian (fa)
⬇️ dictionary-fa.json
2.226MiB (2.334MB) – 13,062 words
Checksums (click to show)
MD5: 098d8a94f8f4c9a15e4f831994f7488a
SHA1: f7af221ab9dd84d181d974dd14d2ec2f6672a8c0
SHA256: 4e855c324170080759dc206d383a00a65020c59226967c9c1d64a9e52032fb64
Fula (ff)
⬇️ dictionary-ff.json
116.2KiB (119.0KB) – 1,632 words
Checksums (click to show)
MD5: a0bdfe1f1e104916f4066c78c567967a
SHA1: ef4e38b56c5dffd5960e29873f2463880dcad68c
SHA256: d3bba83faaa047f8da6788eea90ada1fa7ce9282edea0d07afb12592b413d67d
Finnish (fi)
⬇️ dictionary-fi.json
472.8MiB (495.7MB) – 232,521 words
Checksums (click to show)
MD5: 3b1991d863bbf53ccf75ada6a04efb2f
SHA1: 71f752fdd3ff4126e9f8369291852b02bfea57cc
SHA256: b11f00367633488d980b3f8574403275515a7d4c5be333b07f1b1c910e12bcbb
French (fr)
⬇️ dictionary-fr.json
40.49MiB (42.45MB) – 362,788 words
Checksums (click to show)
MD5: 1eeac1bf1f953fb26e802c29d7697997
SHA1: 4d010db795210129944af2276c170372c4de44f2
SHA256: 4c2fbba46e1a89ec3acac0338078940e418de4d8612744e9638d146d905dac50
Friulian (fur)
⬇️ dictionary-fur.json
146.0KiB (149.5KB) – 1,962 words
Checksums (click to show)
MD5: fe18db19106a6b9dc1c49a3d7460d03a
SHA1: ab1370f55afaec930c1290f7d4058e82c75e04f2
SHA256: f67543d53942d45e08c45b482e00f1a7a1bfc7481d3242adf53c107efab18664
West Frisian (fy)
⬇️ dictionary-fy.json
265.3KiB (271.7KB) – 3,169 words
Checksums (click to show)
MD5: 6aa536a69b9c0c66414173a180cdfaa6
SHA1: c0e2da3647dee7543f715f6428408a6b09bd49a1
SHA256: 9f18c723c773556a1a67c16daa1cc25bcd4982e277b63c540f9509938a6af8a9
Irish (ga)
⬇️ dictionary-ga.json
4.606MiB (4.830MB) – 27,213 words
Checksums (click to show)
MD5: 5a3a1771ca0319bd15dead517ba7e9b8
SHA1: a4efe8dc9d2643e1cff0828f87b9c7b4dc98164c
SHA256: 879f92dc33c4eba27ae37c797ede640b04a5f59084c21754ccd8af9bec087615
Scottish Gaelic (gd)
⬇️ dictionary-gd.json
1.343MiB (1.408MB) – 14,354 words
Checksums (click to show)
MD5: 0f9134bf160aeca4b12771995bb1bcc5
SHA1: ce42991e9346ca32a09b0690ab2ed9ab75cc4d86
SHA256: fc83fae88ba93d963f9f15c0ba3cccccc6c5d6d96a384adcc52f3826d76e9d88
Galician (gl)
⬇️ dictionary-gl.json
21.08MiB (22.10MB) – 197,363 words
Checksums (click to show)
MD5: 68512fce6199e4abb4684b1e13ed603f
SHA1: 804ffbc2b23888527d2b0695458eb7e060f0f514
SHA256: 1b4a59fb5b5d2561bc150a765dac056b693a63958541598faa8e02cfae6b679a
Guaraní (gn)
⬇️ dictionary-gn.json
83.72KiB (85.73KB) – 739 words
Checksums (click to show)
MD5: 66b2a1ba9fece93afd8bbd07bead14d6
SHA1: 52a5dfb05c91a8a85139ef3eefe8b200d1a0adec
SHA256: 1a6d305b7338bb8a9057edf6c6e4c29cc86aebcacc43e76e75d373551dcd5eb6
Gujarati (gu)
⬇️ dictionary-gu.json
689.2KiB (705.8KB) – 5,078 words
Checksums (click to show)
MD5: e972e41b59a5f56cdd0da857e307c0d2
SHA1: 489240260d0cba00fce482f862ee5e698ee3c55f
SHA256: e6e93c19c424cad30b09ead349f128cc5c836d4f5a73e8cdccc23ce261ceff37
Hebrew (he)
⬇️ dictionary-he.json
3.096MiB (3.247MB) – 10,686 words
Checksums (click to show)
MD5: 14b18331dc6ed18b3234e0d5054f9669
SHA1: b786c0c5da3f18fd416a73dd5abc521c00a0c6e7
SHA256: 040daa21bd3555e6b5bdd615b91a79cb3ffe1fca30927ee60f98e79168b66484
Hindi (hi)
⬇️ dictionary-hi.json
5.150MiB (5.400MB) – 26,575 words
Checksums (click to show)
MD5: 5e2fc63800e7b81f9066588b95d96ab1
SHA1: ab7328138aecfed0ea59b7bd7382ee5abd212706
SHA256: 32d5817712ed25b3a1a1e8f7c1ba23bcc59a44aa1f1c90c6c20417380e0a3293
Upper Sorbian (hsb)
⬇️ dictionary-hsb.json
222.4KiB (227.7KB) – 1,168 words
Checksums (click to show)
MD5: bd9f9810463de0bded43818477376101
SHA1: d2f993894b7c58b9b3f05ddfb1258d2ca8ecff2e
SHA256: e357cd8c29de78bd4a9f8be8a6382412c598656715cda077dddccb9794435dac
Hungarian (hu)
⬇️ dictionary-hu.json
37.31MiB (39.13MB) – 68,011 words
Checksums (click to show)
MD5: a2adbdb3aef9562877dfea47e1506e22
SHA1: a2315860c62269baab9b559f4ee6efc05f6f6b3e
SHA256: fee078c40a56347ec86f3b0ea458159d4032bb7f0fde1c0d16cf50cdcd63d498
Armenian (hy)
⬇️ dictionary-hy.json
16.75MiB (17.56MB) – 17,869 words
Checksums (click to show)
MD5: a9f2665e304239d796630c77e23797bd
SHA1: 6c778fa4268559f439e4bf585d68de60a83085af
SHA256: 105eaf9a86a3f42c0edb4120a780d468217a1fa8facd73e96f4c6efcb0f364e0
Interlingua (ia)
⬇️ dictionary-ia.json
233.8KiB (239.4KB) – 3,428 words
Checksums (click to show)
MD5: ab9e3523a0df48578dba6b500f876ce6
SHA1: 46776149ba7028f87d5ccd7616f3661eb63869dd
SHA256: b6f4b6fe959bd971503390a56d02e2404c72280089d0c150e978f357920a4c07
Indonesian (id)
⬇️ dictionary-id.json
2.262MiB (2.372MB) – 16,114 words
Checksums (click to show)
MD5: 4a3cd360b618674844fa4c2edede8ab4
SHA1: e51ac209f90b5305c6ca4c8968c3cefe2d9ae4ad
SHA256: 8480953dd766bd079257073e76e4f59e1892ea634c20d6356e085b8926f7feeb
Icelandic (is)
⬇️ dictionary-is.json
4.381MiB (4.594MB) – 21,300 words
Checksums (click to show)
MD5: ab90b6100eb495bfafa2c6a90c0262c1
SHA1: 5085ede392aad3749119b683b9fdb92a41b209bf
SHA256: a9047fcb61ad8f2d22c695284e3aabfdd43f20570cb6a163093d4ad765cb6a37
Italian (it)
⬇️ dictionary-it.json
55.33MiB (58.02MB) – 568,638 words
Checksums (click to show)
MD5: de9fde6136adf2fc7ab82665a8601784
SHA1: 77c54356e833699027517123c181cc5db03d4b6b
SHA256: 3c2924cec6fdc8051d4417ef49b771a42a9527d2350e2a607987f61e8eafc9a0
Japanese (ja)
⬇️ dictionary-ja.json
21.55MiB (22.60MB) – 104,423 words
Checksums (click to show)
MD5: a1da6740d23daadf49792b1d049ad55e
SHA1: 30beaea403b9593c3c40bafa16d4b0dded9cdbbf
SHA256: 0e10a52c715e64ffffa005bdd71845e3991ce184c417ba4877bdf7a3f512d005
Kabyle (kab)
⬇️ dictionary-kab.json
17.53KiB (17.95KB) – 250 words
Checksums (click to show)
MD5: bb38db22b474c05011bac588de6d06c1
SHA1: fd6f8b4013468292555fbd70a07ce85d810c9cf0
SHA256: a9391e041ce7b2af5827b6473191a98debbd50dfa1bb7b0c35fffab07396c84e
Georgian (ka)
⬇️ dictionary-ka.json
17.64MiB (18.50MB) – 18,850 words
Checksums (click to show)
MD5: 319fed7c1b72ab8585938a1c196da3b8
SHA1: 19df589eeb1bd854aa72a2ff8e2376987a8cfa9d
SHA256: 8d79c8ca557e098d8d884d7c5ed396d0d30df06a9e99e0631e4b33f7d2f74dbd
Kazakh (kk)
⬇️ dictionary-kk.json
2.168MiB (2.274MB) – 9,307 words
Checksums (click to show)
MD5: 89d4c4e7602ee10d9158adf5c91c36cc
SHA1: d6e0042898d2d4fc0c2756532206147296b8adea
SHA256: 6344b95d60be27cb60dc9bca0e7a2b29e68e7ca57fbed419743f1915ac641a44
Khmer (km)
⬇️ dictionary-km.json
1.014MiB (1.064MB) – 8,809 words
Checksums (click to show)
MD5: 8fe17bdba543450efdea94675a2ccacb
SHA1: c65240dc0be16e7cab6ba418cc8c78a9fd23abd1
SHA256: 65bc2507089cee385ead59a91c8ae82129bd6d4dacdf0b89bfd48eaf3bbebaf3
Kannada (kn)
⬇️ dictionary-kn.json
588.9KiB (603.0KB) – 1,958 words
Checksums (click to show)
MD5: a2a192a8381447fe95191a0d82c9c3b8
SHA1: 2ab808df4ad281c0918199abc886bd2f0506191e
SHA256: 788a71047bd6ff3831e0886838789bbc1c803cb14749d975a77dd4306cc2cc9f
Korean (ko)
⬇️ dictionary-ko.json
8.958MiB (9.394MB) – 36,646 words
Checksums (click to show)
MD5: 2b3ac5e19aa44b659946488ad63679c5
SHA1: 7aaa6d752ecfcb39e6093ef0cddcbf4982e7b7aa
SHA256: f8dd4f048e4e0e8289f6a07169d828332e7658b9f43eadf33c1a6b80ab732ee0
Latin (la)
⬇️ dictionary-la.json
104.7MiB (109.7MB) – 822,854 words
Checksums (click to show)
MD5: 0463f01c01d522a2d03f78fcf62cf89c
SHA1: e5ae1b612ce41870ba0a465c82df43032139a086
SHA256: 9e94158127420adc4ed919030cbcfb62bde96f9441005c1d0678971d830322af
Ligurian (lij)
⬇️ dictionary-lij.json
162.5KiB (166.4KB) – 1,619 words
Checksums (click to show)
MD5: ecca03744b8d7be6f1b6d961e8231019
SHA1: 0c25acb03f63054222fa36c67f2c0f5071e3719d
SHA256: 6cf0b93ec5e3134322468b2b13e26e748209980bde8d7e18e0a0b981f181ad12
Lao (lo)
⬇️ dictionary-lo.json
248.4KiB (254.4KB) – 2,164 words
Checksums (click to show)
MD5: 4d6f431ddeb7e89be54bbd347900aaaa
SHA1: 9376a9bc744c71b241978796ac4c993170cc0c85
SHA256: 64de62b7fdbd1c03772bf2d3839344108846f0f426feff5cfab81f0757011b0b
Latgalian (ltg)
⬇️ dictionary-ltg.json
95.12KiB (97.41KB) – 530 words
Checksums (click to show)
MD5: 5ca4ce94e2595f87772f145ec0370d4d
SHA1: 4e1bd85be29b273eec418ac54e0e1df11318c4d5
SHA256: 5bff1cc482be14dc59b786ebba79580c782c5f3fff76839665e383adeb831508
Lithuanian (lt)
⬇️ dictionary-lt.json
5.175MiB (5.426MB) – 25,162 words
Checksums (click to show)
MD5: 0f2b5b90fb8f83105b0082090d6144a2
SHA1: dbe1424b9bea1851198b5b546f3ae4cbf80a8459
SHA256: e9b2c9b705dbb89dbd3c79ad12023d3264360bd71a9e2a84da24b9da2f27ffa9
Latvian (lv)
⬇️ dictionary-lv.json
13.22MiB (13.86MB) – 121,236 words
Checksums (click to show)
MD5: 10ae751095fd28ad1f678e19819ef962
SHA1: eb8bbcd38fed44e2a8ae71433d351b95d5939823
SHA256: bd6e78c1dd4d925133c1b774fe2ecf703f8e60dcd64d8697128ff55b60821b60
Macedonian (mk)
⬇️ dictionary-mk.json
24.41MiB (25.60MB) – 62,689 words
Checksums (click to show)
MD5: 1ae400ac0e24d3dbd1ee81f066a4683e
SHA1: f88a70a30727c3a6a3d873fbdc68a4234f706d3b
SHA256: 7b2adfbb1a6bef5a0063c6b5c452301e40ed95d8a01f2f405d138acf61751a5c
Malayalam (ml)
⬇️ dictionary-ml.json
1.550MiB (1.625MB) – 9,477 words
Checksums (click to show)
MD5: a0498fc406264691ad37c5b33823c3c3
SHA1: 63de35d4c98f3302903c1579c0792c71a3c5c7c9
SHA256: 4f365b2c8ccffe9189f7652080101ad127f78feef3b2048d03207e8a05aa6d91
Marathi (mr)
⬇️ dictionary-mr.json
824.1KiB (843.9KB) – 3,104 words
Checksums (click to show)
MD5: e089e8a6e3afebe431fd225a4c603185
SHA1: 0fe9e1ffcf4315bbf787b21898c08b28bddfa7a1
SHA256: 0d568dbef2cd0644ad42e0f7a2dd6d2e1809083f20b749dd5f7550344d314744
Malay (ms)
⬇️ dictionary-ms.json
1.112MiB (1.166MB) – 8,468 words
Checksums (click to show)
MD5: a23ab0c69a2b7a957fd8d8921b9ccc45
SHA1: 5578c2c8449fc9fe1fcefae9547a2fd762c5a9af
SHA256: 3105099f30592a3402dd08360ed4d037a2ca17afe42181598dff7227db60df6b
Translingual (mul)
⬇️ dictionary-mul.json
2.226MiB (2.334MB) – 18,904 words
Checksums (click to show)
MD5: 3da7ac6ba0bd70a8b52fa784e7d8c36d
SHA1: cf0b49530076001158fc3f6c9a0531a890af9832
SHA256: 52f7018bbf01b7231f971262d8d8eabb59a30436361827cdd0da7ab7a18103e5
Burmese (my)
⬇️ dictionary-my.json
795.4KiB (814.5KB) – 6,151 words
Checksums (click to show)
MD5: 32483309289172c51d81eba8250c8fd8
SHA1: f391f2fe157f5d68c08c20436d7d2a8999b1e73c
SHA256: 9e82b8771a9edc324ca307a09c3c0e99e31d1d40e88910f534f0d231377dc6b3
Norwegian Bokmål (nb)
⬇️ dictionary-nb.json
5.902MiB (6.189MB) – 68,961 words
Checksums (click to show)
MD5: 310f8c25779aed72195783f017a19ddb
SHA1: 21ba9ea2048feb77d356221a5b1d1ad400fe02b8
SHA256: c5f0a314266a37661753a457cfb6f44a5454a682a836740acc395fb5cae3b4d1
Nepali (ne)
⬇️ dictionary-ne.json
2.012MiB (2.109MB) – 1,939 words
Checksums (click to show)
MD5: 1f95957b0664d2643e84f8da0711b40c
SHA1: 583b196bd8eb52726afbb070b5b7aa444adc46b3
SHA256: ca5067a428cb90759552ceb50be7ef790373e92d5a57f15d764f41604e821c7f
Dutch (nl)
⬇️ dictionary-nl.json
14.75MiB (15.47MB) – 117,108 words
Checksums (click to show)
MD5: 98cf6db79d8141c93fc163bc6a3b6c50
SHA1: ca23ef8cfd200476d142f3afe65e344f3999a1f1
SHA256: e632acc2cf8db62a71e4deae0effaa70cb5fa27fb27af0355104df994f158e77
Norwegian Nynorsk (nn)
⬇️ dictionary-nn.json
5.046MiB (5.291MB) – 55,817 words
Checksums (click to show)
MD5: d09c3e1a838c2dfc34efd78505125597
SHA1: c2e5d5bc9f0268ea75985c3e00d907abfd6468f0
SHA256: d9b7b766052d6c596b247419b00f63518aa1d70a5f1afccde22a2fd97dfe6da1
Occitan (oc)
⬇️ dictionary-oc.json
1.386MiB (1.454MB) – 6,338 words
Checksums (click to show)
MD5: c445bcda38494a181e644dac6e837c59
SHA1: d0a7484cae0cf3c533b21a7114e3e55913f811aa
SHA256: 254a6cd9a375ba2e1f7bcda78fc790b7a732b91f362696e14dd83aac99dfb50f
Punjabi (pa)
⬇️ dictionary-pa.json
927.1KiB (949.4KB) – 4,956 words
Checksums (click to show)
MD5: 5c979254cdaf37bb2418b8798e4e1d1b
SHA1: e21b49c4e7912b7e8209756cc26ff8cc487ad430
SHA256: f625c3fcc223711f4d9e462c14b400aacf4585e5f9b12fc0c80526bad4a54fcd
Polish (pl)
⬇️ dictionary-pl.json
32.09MiB (33.64MB) – 134,676 words
Checksums (click to show)
MD5: 6f72024798a95b7bc73ae85f694031bc
SHA1: bfcd47ec12e2a48941c1bd20db3cd08757580c85
SHA256: 3a4d47e62f39bb74b0bf30c07bcb80a4f14af6055a03037deae90644d6cf7cd6
Portuguese (pt)
⬇️ dictionary-pt.json
35.98MiB (37.73MB) – 358,190 words
Checksums (click to show)
MD5: 22f1125e33bd0520300de88081f50a70
SHA1: 303c26210437567e9f620ec92f8a176fb74ef630
SHA256: 95941a69d81b790b2dfd809d3c8769db6b4b024c536439b143eef0a056af5c80
Romansch (rm)
⬇️ dictionary-rm.json
193.4KiB (198.0KB) – 2,135 words
Checksums (click to show)
MD5: 0d23b50de8cbce8be474ceab33635dc1
SHA1: 72e54222c9a80a73ef012024d3f2ed6bc9a9c1e9
SHA256: ce022ffbf6d8e0eab81ed0c30d51920b0fae20660fe0508c7aa087098ad5a3f0
Romanian (ro)
⬇️ dictionary-ro.json
17.86MiB (18.73MB) – 105,877 words
Checksums (click to show)
MD5: 8dc0305b4026a866687e03d2a407a5a1
SHA1: 4f2249698272ffa1b337449f26a7aca41ca9334e
SHA256: 7ed7087a2bd77912431f2778f5310549a3cf9e000eb14b66a54f37cbf723c8cb
Russian (ru)
⬇️ dictionary-ru.json
94.72MiB (99.32MB) – 402,190 words
Checksums (click to show)
MD5: 4e9592ae0542e059f06b390b83cb2ee3
SHA1: 63283b59d27d0f52146630644ff0af52142eb4df
SHA256: 29ba7d436c7f66907cf81c5f6a067e8a952f6694527aaedd36d070d08c13fbc8
Santali (sat)
⬇️ dictionary-sat.json
155.3KiB (159.1KB) – 708 words
Checksums (click to show)
MD5: 6d11a7a27e70fb4ad1b465dfbb72869c
SHA1: 3080374c535784b66e21fa2efe0a4e1c712c827e
SHA256: 48ed6dcf25d0c6a25d7b4752b408394969717b8da8b05e069ab7ed5401869458
Sardinian (sc)
⬇️ dictionary-sc.json
121.1KiB (124.0KB) – 1,244 words
Checksums (click to show)
MD5: 151d009b2ff98b0c5d78ac4ac4a1408d
SHA1: 149bbeaeb260c0f3e8b6047cacc9157f24a25e8d
SHA256: 9753f024ecd0b50ba96b1d4d433816a91e4e87ca25d255629220a02cb671fd8c
Sicilian (scn)
⬇️ dictionary-scn.json
284.5KiB (291.3KB) – 2,663 words
Checksums (click to show)
MD5: 56e1cbd6368c54eb2d9bfd630be9dd46
SHA1: 0e18a7b17601a938fed0080d45e500ee66f0f61b
SHA256: f0a665acbd7c06f335ff1fdcf9e147fe799010fbbc92d4505213089dff5cbce9
Scots (sco)
⬇️ dictionary-sco.json
395.5KiB (405.0KB) – 4,518 words
Checksums (click to show)
MD5: a805d7df23c291a3b0326bb565cbf540
SHA1: c547bb450309f1201a923b423f6f14af387d0a67
SHA256: 3a5399e08258b836c0e43cb18aca0f067bbf040b7028b90d629ce4ea407195bc
Serbo-Croatian (sh)
⬇️ dictionary-sh.json
18.10MiB (18.98MB) – 60,561 words
Checksums (click to show)
MD5: 691c49173ca413c4389cca777180be1d
SHA1: 76d984a1d71f01b4e8fcc96d8a4554b3e5ac0447
SHA256: 6f32f8caddbac02a2e0c6e580bee9ed222897457ac463a966bd84e627fefdcb1
Sinhalese (si)
⬇️ dictionary-si.json
74.72KiB (76.52KB) – 805 words
Checksums (click to show)
MD5: 1e6b4132be29f14adc468a7765c260f2
SHA1: fc03fbfe96b361491df88b78071efb06e2bc09c1
SHA256: ce43342187a1be9ce115457a03393f0e9cfde79ccdf69d5098cb8fb9c7010439
Slovak (sk)
⬇️ dictionary-sk.json
1.402MiB (1.470MB) – 10,121 words
Checksums (click to show)
MD5: 6229338e284241b423dbb8417f005eba
SHA1: ea5af1d3bbc8b0eabac43c158a77c7e58973fafc
SHA256: ed3f7d9e5c0c4dd20b34c39148e30fca6079aaa5f208411a3e10dea683c49a25
Saraiki (skr)
⬇️ dictionary-skr.json
33.24KiB (34.04KB) – 292 words
Checksums (click to show)
MD5: a0451913b771e62c18cbd44538f0ff35
SHA1: 14835218b7113253ef763752318c09b4376c34c8
SHA256: a937d9c836e4f2c8f7b50abf574cf2004108933138ec5229100c93919de77904
Slovenian (sl)
⬇️ dictionary-sl.json
1.050MiB (1.101MB) – 6,359 words
Checksums (click to show)
MD5: 3e678fdb9a3d102ffa33110abb547b9c
SHA1: de127959abc1fe3549aefbe97c71c4fa8d2f6b74
SHA256: c86ac5279ff2af5aae936f8184691c6d0daf62d7561a59b7b5287cd2482ce613
Albanian (sq)
⬇️ dictionary-sq.json
1.313MiB (1.377MB) – 9,331 words
Checksums (click to show)
MD5: 91aac6a93debd1f347ac0219f0597a43
SHA1: ebfb3874e5fb71e90c5825b24fe67b923ec49378
SHA256: be3497e64e8d1cd8e7eb64f8cd8d5878ddfb61753f286ae33bc90d486675e451
Swedish (sv)
⬇️ dictionary-sv.json
24.44MiB (25.63MB) – 258,424 words
Checksums (click to show)
MD5: aadc04911c91248329715017e84d74b1
SHA1: 68442a96367ec4fdc6f2e176948aef65f624766e
SHA256: 3a885301802ed0a2d141bd120056d094eed79c7e00cf2385682753a28f17098e
Silesian (szl)
⬇️ dictionary-szl.json
372.8KiB (381.7KB) – 2,012 words
Checksums (click to show)
MD5: dfcf09dd713a20389529ed34ad97161b
SHA1: 22f225cc54ec6335dbeff6ca1a8b9eba19c4a4ca
SHA256: 3a5f4f5913d458185ca2f470bcd46af914bdcb98e15941bf2e7c11dc5e1f978a
Tamil (ta)
⬇️ dictionary-ta.json
5.838MiB (6.122MB) – 7,368 words
Checksums (click to show)
MD5: ef6ccb92fd0dd5748da7dee63d11928a
SHA1: bf557c7269777e68bb820dbda7a8b08078a6e343
SHA256: 0c16c50499ed4a2dcc90c2518ce6fdc55b659f318722c28f304c9351bc3ea9c1
Telugu (te)
⬇️ dictionary-te.json
2.647MiB (2.775MB) – 19,087 words
Checksums (click to show)
MD5: 69df786f6df16294525d9d559548d293
SHA1: deca5824d3b4621b86f2f93e2194d4992e047b3e
SHA256: 958243bbf784f090832a8b3610da4d8dd80d0d9520005dd273683160f528f6e6
Tajik (tg)
⬇️ dictionary-tg.json
666.8KiB (682.8KB) – 1,713 words
Checksums (click to show)
MD5: 2a6ef55d038e447fe8b92642cae0e7a7
SHA1: f0ec84b399037d7056dc34c493a64cf611b4c1e6
SHA256: 177581afd24f3e41d2a8ef3cffa2d86afc77c103ad8bfb009a80e75a3a38376e
Thai (th)
⬇️ dictionary-th.json
2.740MiB (2.873MB) – 15,749 words
Checksums (click to show)
MD5: c221e124004b58bf4dbc5f886ed362e7
SHA1: aa1541dd117681b63981fa4368119b12165d8570
SHA256: b80b7b76aac81663d4ef8b3be10b09bb10629ad4ced0e6b155115a035a871122
Tagalog (tl)
⬇️ dictionary-tl.json
4.383MiB (4.596MB) – 21,784 words
Checksums (click to show)
MD5: 50aaa0e1547625f4e874fb872f2c6610
SHA1: c27f3f1cc9c951a60ca397c1d38053e0d101a5ea
SHA256: f790124c4470504c1d81c46a9e9c19d2c68e52b779e449157b1a4cc9b4e51fa5
Turkish (tr)
⬇️ dictionary-tr.json
13.57MiB (14.23MB) – 29,648 words
Checksums (click to show)
MD5: ca7ed82b95cb744133ef804419a04815
SHA1: 1800f40ed5ad7102f552c1bed784d3c91756a4c6
SHA256: f5c06e23774ab9eb860dcd066c607f6bc59b3071b24a3d57cef25888a0974354
Chicahuaxtla Triqui (trs)
⬇️ dictionary-trs.json
1.456KiB (1.491KB) – 26 words
Checksums (click to show)
MD5: e2c9c453d9e7a6ac668fc6beea4d9bd5
SHA1: ba7a7d40e88b3111812fb7b1a9014d0620021b8d
SHA256: ffbd01e39bdcf549196c779e4d2103f4d14dbfa51c756993ce928c232c563310
Ukrainian (uk)
⬇️ dictionary-uk.json
14.84MiB (15.56MB) – 42,549 words
Checksums (click to show)
MD5: 99d313d29d5d3bfd6c7b56a576c37ccc
SHA1: 8b17739aef94fb3be3e0821f2effc0f9c6043cdf
SHA256: 75fe049309aa0db8c4e9cebafdebb2c75f462dd1af3a83b6c075bff27b852d34
Urdu (ur)
⬇️ dictionary-ur.json
992.0KiB (1.016MB) – 5,818 words
Checksums (click to show)
MD5: 4e7a89b2dfc1c8e8664e2f5ee073bf85
SHA1: 01d21cc1573522efe82bc47561ed399987a6dd12
SHA256: ed369dbe6737916cade674f75394f51b9066f1459c3aaa5ac945682b6d39b00a
Uzbek (uz)
⬇️ dictionary-uz.json
1.394MiB (1.462MB) – 3,273 words
Checksums (click to show)
MD5: 31b106309d8f27972974cf0c4d8ad3ee
SHA1: 821b98177eaabbb92fd90bcf35ba0f85fd8807ff
SHA256: 83e958d01c69274df464b5a7ea2ad1cce9702c550bf3154666e464a91ce6e4bd
Vietnamese (vi)
⬇️ dictionary-vi.json
1.387MiB (1.454MB) – 10,818 words
Checksums (click to show)
MD5: 30819a8f34ee4174aba6fbe3430dd604
SHA1: 7f2b309405f9682ca770e426dad2b986c1e69e11
SHA256: d79a559d9fa10e082b9b919fb2b05239611fe0a4f0dc6bd80c6108bd1c46a35d
Wolof (wo)
⬇️ dictionary-wo.json
40.42KiB (41.39KB) – 663 words
Checksums (click to show)
MD5: 7b6176e14beeb41ba87d69538dde3857
SHA1: 950914008028d8d4581b7824b61a8223a7d2ca2e
SHA256: 97062f638b66dd8f1ca4b9a086bcc5657ea9f0c368bc4b1210cb3d45ab36761e
Xhosa (xh)
⬇️ dictionary-xh.json
374.4KiB (383.4KB) – 3,197 words
Checksums (click to show)
MD5: 5ad169b7541f5b26c2552eed2d187ced
SHA1: cc40a525bc25e5e9058f7630a7585255dc4bce80
SHA256: 62f7a979b35c49fbf7436744ddaaa667152ac86a0f0f076e74da83e49b76a471
Central Mahuatlán Zapoteco (zam)
⬇️ dictionary-zam.json
38B – 1 word
Checksums (click to show)
MD5: 9a51d9bb061770d2bd703c129ea2ada7
SHA1: df4d4cf9f7feb21743c349628b2d582dcd7f10f7
SHA256: 128f4ee65779ef8b2eab83bafba6d87b595cc8b6d24aff01d5b2f751b69a2f63
Chinese (zh)
⬇️ dictionary-zh.json
23.51MiB (24.65MB) – 144,922 words
Checksums (click to show)
MD5: ae85778a2de20f884ae6ecb240f64823
SHA1: b2d43b40422feed0a7e79140b3d74e676dc55b34
SHA256: a12cbf67ee3dc8d526a61240abec3662524826120e34acd8e3b01a8cb4bf1d6a
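
The published checksums can be used to confirm that a download is intact. The following is a minimal Python sketch using only the standard library; the function names `file_digest` and `verify` are hypothetical helpers for this example, not part of the project.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)  # accepts "md5", "sha1", "sha256", ...
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with a published checksum, ignoring case."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `verify("dictionary-ach.json", "ee2ee6afe5eeb79889d05f54344b6bf7b3b9bd0e3307ee7ee92f70f1579ec428")` would check the Acholi file against the SHA256 value listed above. Because the files are regenerated weekly, a checksum only matches the file published alongside it.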

Dictionaries

Wiktionary

Uses the English Wiktionary dictionary data. It is created from the Wiktionary dumps, which are converted to a JSON Lines format by kaikki.org using their open source Wiktextract tool. See the Wiktextract paper for more information. The resulting file for all languages combined, which is over 15 GiB, is then preprocessed to create a minimal dictionary for each language using the scripts in this repository. The English Wiktionary currently includes words in over 4,400 languages, so the scripts automatically select the roughly 100 languages that are supported by Mozilla (Firefox and/or Thunderbird) or that have 50,000 words or more. This includes most modern languages, as well as Latin. The underlying Wiktionary dump files are updated monthly, but kaikki.org updates the extracted JSON files weekly to incorporate improvements made to their Wiktextract tool. If users notice any errors in the data, they should correct them by directly editing Wiktionary, and the corrections will automatically be included in the next monthly update.
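
The language selection rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the repository's actual script: the function name, the language codes and the word counts are all made up for the example.

```python
def select_languages(word_counts, mozilla_codes, min_words=50_000):
    """Keep languages that Mozilla supports or that have at least min_words words."""
    return sorted(
        code
        for code, count in word_counts.items()
        if code in mozilla_codes or count >= min_words
    )


# Hypothetical inputs: word counts per language code and a set of supported codes.
counts = {"aa": 10, "bb": 60_000, "cc": 49_999, "dd": 50_000}
supported = {"aa"}

print(select_languages(counts, supported))  # ['aa', 'bb', 'dd']
```

A small language is kept when it is in the supported set, and a large one is kept on word count alone, which matches the description of Latin being included.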

Licensed under both the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0) and the GNU Free Documentation License (GFDL), so users must attribute it to Wiktionary.

JSON format

Uses the JSON Lines format, where each line is a JSON object for a word in the dictionary. Each JSON object may have the following keys:

  • "" (empty string) - String with the word (required)
  • "p" - Array of strings with the parts of speech (POS) (required)
  • "d" - Array of strings with the definitions of the word (required)
  • "f" - Array of strings with the forms of the word (optional)
  • "s" - Array of strings with the synonyms of the word (optional)
  • "n" - Array of strings with the antonyms of the word (optional)
  • "i" - String with the International Phonetic Alphabet (IPA) pronunciation (optional)
  • "a" - String with the filename for the pronunciation audio file in OGG Vorbis format (.ogg); add the https://upload.wikimedia.org/wikipedia/commons/ prefix to get the full URL (optional)
  • "w" - Array of strings with the titles of the Wikipedia pages about the word, possibly prefixed with a language ID (optional)

Words, forms, synonyms and antonyms with any whitespace characters are excluded, as well as some POS categories that are not words, such as "character", "symbol", "prefix" and "suffix".
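
The key list above can be mapped to more descriptive names when reading a file. The following Python sketch shows one way to do that; `parse_entry` and `read_dictionary` are hypothetical helper names, the sample line is invented, and reading the files as UTF-8 is an assumption.

```python
import json

# Prefix given above for building the full audio URL from the "a" key.
AUDIO_PREFIX = "https://upload.wikimedia.org/wikipedia/commons/"


def parse_entry(line):
    """Expand one JSON Lines record into a dict with descriptive names."""
    raw = json.loads(line)
    audio = raw.get("a")
    return {
        "word": raw[""],                  # required
        "pos": raw["p"],                  # required
        "definitions": raw["d"],          # required
        "forms": raw.get("f", []),        # optional keys fall back to empty values
        "synonyms": raw.get("s", []),
        "antonyms": raw.get("n", []),
        "ipa": raw.get("i"),
        "audio_url": AUDIO_PREFIX + audio if audio else None,
        "wikipedia": raw.get("w", []),
    }


def read_dictionary(path):
    """Yield parsed entries one line at a time, so large files are never fully loaded."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_entry(line)


# Invented sample line, for illustration only.
sample = '{"":"cat","p":["noun"],"d":["A domesticated feline."],"a":"En-cat.ogg"}'
print(parse_entry(sample)["audio_url"])
```

Reading line by line matters for the larger files, some of which exceed 100 MiB.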

JSON format

See above for the specific format of each dictionary.

Contributing

Merge requests welcome! Ideas for contributions:

  • Improve the performance of the update scripts.
  • Reduce the size of the dictionaries.
  • Provide localized versions of the dictionaries.
  • Add more dictionaries.