Hunspell dictionaries in UTF-8

dictionaries

Collection of normalized and installable hunspell dictionaries.

What is this?

This monorepo is a collection of scripts that crawl dictionaries from several sources, normalize them, and pack them so that each can be installed and used in the same way. Dictionaries are not maintained here, but they are usable from here.

When should I use this?

The packages here are mainly useful to programmers who integrate with spell-checking tools (such as nodehun or nspell) or who build such tools.

Install

In Node.js (version 12.20+, 14.14+, or 16.0+), install with npm:

npm install dictionary-en

👉 Note: replace en with the language code you want.

⚠️ Important: this project itself is MIT, but each index.dic and index.aff file still has its original license!

Use

import dictionaryEn from 'dictionary-en'

dictionaryEn(function (error, en) {
  if (error) throw error
  console.log(en)
  // To do: use `en` somehow
})

Yields:

{dic: <Buffer>, aff: <Buffer>}
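
If you prefer promises over callbacks, the loader can be wrapped. The following is a minimal sketch, assuming the loader follows the Node.js callback convention shown above. `loadDictionary` is a hypothetical helper name and is not part of these packages.

```javascript
// Hypothetical helper (not part of the dictionary packages): wraps a
// callback-style loader such as `dictionaryEn` in a promise.
function loadDictionary(load) {
  return new Promise(function (resolve, reject) {
    load(function (error, result) {
      if (error) reject(error)
      else resolve(result)
    })
  })
}

// Usage (assumes `dictionaryEn` was imported as in the example above):
// const en = await loadDictionary(dictionaryEn)
// console.log(en.dic, en.aff)
```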

List of dictionaries

👉 Note: preferred BCP-47 codes are used (according to Unicode CLDR). To illustrate, as American English and Brazilian Portuguese are the most common types of English and Portuguese respectively, they get the codes en and pt.

In total 92 dictionaries are provided.

Name Description License
dictionary-bg Bulgarian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-br Breton (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ca Catalan (GPL-2.0 OR LGPL-2.1)
dictionary-ca-valencia Catalan (Valencian) (GPL-2.0 OR LGPL-2.1)
dictionary-cs Czech GPL-2.0
dictionary-cy Welsh LGPL-3.0
dictionary-da Danish (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-de German (GPL-2.0 OR GPL-3.0)
dictionary-de-at German (Austria) (GPL-2.0 OR GPL-3.0)
dictionary-de-ch German (Switzerland) (GPL-2.0 OR GPL-3.0)
dictionary-el Modern Greek (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-el-polyton Modern Greek (Polytonic Greek) GPL-3.0
dictionary-en English (MIT AND BSD)
dictionary-en-au English (Australia) (MIT AND BSD)
dictionary-en-ca English (Canada) (MIT AND BSD)
dictionary-en-gb English (United Kingdom) (MIT AND BSD)
dictionary-en-za English (South Africa) LGPL-2.1
dictionary-eo Esperanto GPL-2.0
dictionary-es Spanish (or Castilian) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ar Spanish (or Castilian; Argentina) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-bo Spanish (or Castilian; Bolivia) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-cl Spanish (or Castilian; Chile) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-co Spanish (or Castilian; Colombia) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-cr Spanish (or Castilian; Costa Rica) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-cu Spanish (or Castilian; Cuba) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-do Spanish (or Castilian; Dominican Republic) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ec Spanish (or Castilian; Ecuador) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-gt Spanish (or Castilian; Guatemala) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-hn Spanish (or Castilian; Honduras) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-mx Spanish (or Castilian; Mexico) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ni Spanish (or Castilian; Nicaragua) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-pa Spanish (or Castilian; Panama) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-pe Spanish (or Castilian; Peru) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ph Spanish (or Castilian; Philippines) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-pr Spanish (or Castilian; Puerto Rico) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-py Spanish (or Castilian; Paraguay) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-sv Spanish (or Castilian; El Salvador) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-us Spanish (or Castilian; United States) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-uy Spanish (or Castilian; Uruguay) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ve Spanish (or Castilian; Venezuela) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-et Estonian LGPL-2.1
dictionary-eu Basque GPL-2.0
dictionary-fa Persian Apache-2.0
dictionary-fo Faroese (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-fr French MPL-2.0
dictionary-fur Friulian GPL-2.0
dictionary-fy Western Frisian GPL-3.0
dictionary-ga Irish GPL-2.0
dictionary-gd Scottish Gaelic (or Gaelic) GPL-3.0
dictionary-gl Galician GPL-3.0
dictionary-he Hebrew AGPL-3.0
dictionary-hr Croatian (LGPL-2.1 OR SISSL)
dictionary-hu Hungarian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-hy Armenian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-hyw Western Armenian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ia Interlingua GPL-3.0
dictionary-ie Interlingue (or Occidental) Apache-2.0
dictionary-is Icelandic CC-BY-SA-3.0
dictionary-it Italian GPL-3.0
dictionary-ka Georgian MIT
dictionary-ko Korean (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-la Latin GPL-2.0
dictionary-lb Luxembourgish (or Letzeburgesch) EUPL-1.1
dictionary-lt Lithuanian BSD-3-Clause
dictionary-ltg Latgalian LGPL-2.1
dictionary-lv Latvian LGPL-2.1
dictionary-mk Macedonian GPL-3.0
dictionary-mn Mongolian LPPL-1.3c
dictionary-nb Norwegian Bokmål GPL-2.0
dictionary-nds Low German (or Low Saxon) GPL-3.0
dictionary-ne Nepali LGPL-2.1
dictionary-nl Dutch (or Flemish) (BSD-3-Clause OR CC-BY-3.0)
dictionary-nn Norwegian Nynorsk GPL-2.0
dictionary-oc Occitan (post 1500) GPL-2.0
dictionary-pl Polish (GPL-3.0 OR LGPL-3.0 OR MPL-2.0)
dictionary-pt Portuguese (LGPL-3.0 OR MPL-2.0)
dictionary-pt-pt Portuguese (Portugal) (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ro Romanian (or Moldavian; or Moldovan) (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ru Russian LGPL-3.0
dictionary-rw Kinyarwanda GPL-3.0
dictionary-sk Slovak (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-sl Slovenian (GPL-3.0 OR LGPL-2.1)
dictionary-sr Serbian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1 OR CC-BY-SA-3.0)
dictionary-sr-latn Serbian (Latin script) (GPL-2.0 OR LGPL-2.1 OR MPL-1.1 OR CC-BY-SA-3.0)
dictionary-sv Swedish LGPL-3.0
dictionary-sv-fi Swedish (Finland) LGPL-3.0
dictionary-tk Turkmen Apache-2.0
dictionary-tlh Klingon (or tlhIngan Hol) Apache-2.0
dictionary-tlh-latn Klingon (or tlhIngan Hol; Latin script) Apache-2.0
dictionary-tr Turkish MIT
dictionary-uk Ukrainian GPL-3.0
dictionary-vi Vietnamese GPL-2.0

Examples

Example: use with nspell

This example uses dictionary-en in combination with nspell.

Install command for this example:

npm install dictionary-en nspell

import dictionaryEn from 'dictionary-en'
import nspell from 'nspell'

dictionaryEn(function (error, en) {
  if (error) throw error
  const spell = nspell(en)
  console.log(spell.correct('color'))
  console.log(spell.correct('colour'))
})

Yields:

true
false
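
Building on the example above, a list of words can be checked in one go. This is a sketch that assumes `spell` is an nspell instance (it only uses its `correct` and `suggest` methods). `findMisspellings` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns each misspelled word with its suggestions.
// `spell` is assumed to expose `correct(word)` and `suggest(word)`,
// as an nspell instance does.
function findMisspellings(spell, words) {
  return words
    .filter(function (word) {
      return !spell.correct(word)
    })
    .map(function (word) {
      return {word, suggestions: spell.suggest(word)}
    })
}

// Usage (inside the callback of the example above):
// console.log(findMisspellings(spell, ['color', 'colour']))
```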

Example: load files from ESM

This example loads the index.dic and index.aff files located in dictionary-hyw (Western Armenian) from a Node.js JavaScript module (ESM).

It uses a ponyfill (import-meta-resolve) for an experimental Node API.

Install command for this example:

npm install dictionary-hyw import-meta-resolve

import fs from 'node:fs/promises'
import {resolve} from 'import-meta-resolve'

main()

async function main() {
  const base = await resolve('dictionary-hyw', import.meta.url)
  const dic = await fs.readFile(new URL('index.dic', base))
  const aff = await fs.readFile(new URL('index.aff', base))
  console.log(dic, aff)
}
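
The `new URL('index.dic', base)` calls above work because `base` points at the package's entry file, and resolving a relative URL against it replaces the last path segment. Here is a small sketch of that step in isolation. `dictionaryFileUrls` and the example path are hypothetical.

```javascript
// Hypothetical helper: given the resolved URL of a dictionary package's
// entry file, returns the URLs of the sibling `index.dic` and `index.aff`.
function dictionaryFileUrls(base) {
  return {
    dic: new URL('index.dic', base),
    aff: new URL('index.aff', base)
  }
}

// The path below is made up for illustration.
const urls = dictionaryFileUrls(
  'file:///project/node_modules/dictionary-hyw/index.js'
)
console.log(urls.dic.href)
// file:///project/node_modules/dictionary-hyw/index.dic
```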

Example: load files from CommonJS

This example loads the index.dic and index.aff files located in dictionary-tlh (Klingon) from a Node.js CommonJS script (CJS).

Install command for this example:

npm install dictionary-tlh

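const fs = require('node:fs/promises')
const path = require('node:path')

main()

async function main() {
  // `require.resolve` returns the path to the package's entry file,
  // so take its folder to find the dictionary files next to it.
  const base = path.dirname(require.resolve('dictionary-tlh'))
  const dic = await fs.readFile(path.join(base, 'index.dic'))
  const aff = await fs.readFile(path.join(base, 'index.aff'))
  console.log(dic, aff)
}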

Example: use with macOS

Follow these steps to use a dictionary on macOS:

  1. Navigate to the dictionary you want on GitHub, such as dictionaries/$code (replace $code with the language code you want)
  2. Download the index.aff and index.dic files (i.e., open them, right-click “Raw”, and “download linked files”)
  3. Rename the downloaded files to $code.aff and $code.dic
  4. Move $code.aff and $code.dic into the folder ~/Library/Spelling/
  5. Go to System Preferences > Keyboard > Text > Spelling and select your added language (it should come with the (Library) suffix and is situated at the bottom)
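
Steps 3 and 4 can also be scripted. The following is a sketch only, assuming the two files were already downloaded into one folder. `installDictionary` and the folder names in the usage comment are hypothetical.

```javascript
// Sketch of steps 3 and 4: copies `index.aff` and `index.dic` from
// `sourceDir` to `targetDir`, renamed to `$code.aff` and `$code.dic`.
// `installDictionary` is a hypothetical helper.
async function installDictionary(code, sourceDir, targetDir) {
  // Dynamic imports work in both ESM and CommonJS files.
  const fs = await import('node:fs/promises')
  const path = await import('node:path')

  // Create the target folder if it does not exist yet.
  await fs.mkdir(targetDir, {recursive: true})

  for (const extension of ['aff', 'dic']) {
    await fs.copyFile(
      path.join(sourceDir, 'index.' + extension),
      path.join(targetDir, code + '.' + extension)
    )
  }
}

// Usage (assumes the files were downloaded to `~/Downloads`):
// const os = await import('node:os')
// const home = os.homedir()
// await installDictionary('nl', home + '/Downloads', home + '/Library/Spelling')
```

Selecting the language in System Preferences (step 5) still has to be done by hand.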

Types

The dictionaries are typed with TypeScript.

Contribute

Yes please! See How to Contribute to Open Source.

Build

To build this project on macOS, you need to install at least:

  • wget: brew install wget (crawling)
  • hunspell: brew install hunspell (many dictionaries)
  • sed: brew install gnu-sed (crawling, many dictionaries)
  • coreutils: brew install coreutils (many dictionaries)
  • ispell: brew install ispell (German)

👉 Note: GNU sed and the other GNU replacements should be set up in PATH so that they override the macOS defaults.

Updating a dictionary

Dictionaries are not maintained here. Report problems upstream.

Adding a new dictionary

Dictionaries are not maintained here. Most languages have a small community or institute that maintains a dictionary, and they often do so on GitHub or similar. Please open an issue to request that such a dictionary be included here.

👉 Note: acceptable dictionaries must:

  • have a significant affix file (not just a .dic file)
  • have an open source license
  • have recent contributions

License

MIT © Titus Wormer

See license files in each dictionary for the licensing of index.dic and index.aff files.