This project is intended to be a modern, from-the-ground-up, maintainable alternative to GoldenDict(-ng), developed with Flask and React.
You can access the live demo here (the button to delete dictionaries is removed). It lives inside a free Okteto container, which sleeps after 24 hours of inactivity, so please bear with its slowness and refresh the page a few times if you are seeing a 404 error, and remember that it may be (terribly) out of sync with the latest code changes.
The dark theme is not built in, but rendered with the Dark Reader Firefox extension.
- The wildcard characters are `^` and `+` (instead of `%` and `_` of SQL or the more traditional `*` and `?`) for technical reasons. Hint: imagine `%` and `_` are shifted one key to the right on an American keyboard.
- This project creates a back-up of DSL dictionaries, overhauls[^1] them and silently overwrites the original files. So after adding a DSL dictionary to SilverDict, it may no longer work with GoldenDict.
- During the indexing process of DSL dictionaries, the memory usage could reach as high as 1.5 GiB (tested with the largest DSL ever seen, the Encyclopædia Britannica), and even after that the memory used remains at around 500 MiB. Restart the server process and the memory usage will drop to a few MiB.
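
As an illustration of the wildcard note above, here is a minimal sketch of how `^` and `+` could be mapped onto SQL's `%` and `_`. It assumes the query ends up in an SQLite `LIKE` clause, which this README does not confirm; the function `to_like_pattern`, the helper `search` and the table `entries` are hypothetical names, not part of SilverDict.

```python
import sqlite3


def to_like_pattern(query: str, escape: str = "\\") -> str:
    """Translate '^' and '+' into the SQL LIKE wildcards '%' and '_'."""
    out = []
    for ch in query:
        if ch == "^":
            out.append("%")  # any run of characters (SQL '%')
        elif ch == "+":
            out.append("_")  # exactly one character (SQL '_')
        elif ch in ("%", "_", escape):
            out.append(escape + ch)  # keep these characters literal
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical in-memory table standing in for a headword index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (headword TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?)",
    [("walk",), ("walks",), ("talk",), ("100%",)],
)


def search(query: str) -> list[str]:
    """Run a wildcard query against the hypothetical 'entries' table."""
    rows = conn.execute(
        "SELECT headword FROM entries "
        "WHERE headword LIKE ? ESCAPE '\\' ORDER BY headword",
        (to_like_pattern(query),),
    )
    return [row[0] for row in rows]
```

With this sketch, `search("walk^")` matches `walk` and `walks`, while `search("+alk")` matches `talk` and `walk`. Escaping literal `%` and `_` keeps a headword such as `100%` searchable.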
- Python-powered[^2]
- Cleaner code
- Deployable both locally and on a self-hosted server
- Fast enough
- Minimalist web interface
- Separable client and server components
- Linux: RPM/Deb packaging (will do when the project is more mature)
- Windows: package everything into a single click-to-run executable (will do when the project is more mature)
- Add support for Babylon BGL glossary format
- Add support for StarDict format
- Add support for ABBYY Lingvo DSL format[^3]
- Reduce the memory footprint of the MDict Reader
- Inline styles to prevent them from being applied to the whole page (the commented-out implementation in `mdict_reader.py` breaks richly-formatted dictionaries)[^4]
- Reorganise APIs (to facilitate dictionary groups)
- Ignore diacritics when searching (testing still wanted from speakers of Turkish, the Semitic languages and Asian languages other than CJK)
- Ignore case when searching
- GoldenDict-like morphology-awareness (walks -> walk) and spelling check (fuzzy-search, that is, malarky -> malady, Malaya, malarkey, Malay, Mala, Maalox, Malcolm)
- Transliteration for the Cyrillic, Greek, Arabic, Hebrew and Devanagari scripts
- Add the ability to set sources for automatic indexing, i.e. dictionaries put into the specified directories will be automatically added
- Recursive source scanning
- Multithreaded article extraction
- Improve the performance of suggestions matching (partially done, 'contains' search is still slow)
- Make the suggestion size customisable
- Allow configuring the suggestion matching mode, listening address, running mode, etc. via a configuration file, without modifying code
- Offer readily built static files for users unfamiliar with the front-end development process (Artefacts built with GitHub Actions are only visible to me and the URL is not permanent)
- Allow zooming in/out of the definition area
- Make the strings translatable
- Beautify the dialogues (help wanted!)
- GoldenDict-like dictionary group support
- A mobile-friendly interface (retouch needed)
- A real mobile app
- Malformed DSL tags
- Make the dialogues children of the root element (How can I do this with nested dialogues?)
- (Possibly) pango's colour tags
- Only display dictionaries containing the headword searched for in the right pane (requires API change)
- Button to clear query. Better idea: select the query on focus
- ?? Button to search in page (see https://stackoverflow.com/questions/8080217/use-browser-search-ctrlf-through-a-button-in-website)
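
The roadmap items on ignoring diacritics and case can be sketched with the standard library alone. This is only one possible approach, not necessarily the one used in the code base; the function name `simplify` is hypothetical, and language-specific rules (Turkish, for example) are deliberately not handled.

```python
import unicodedata


def simplify(text: str) -> str:
    """Return a search key with diacritics removed and case folded."""
    # NFKD splits a letter such as 'é' into 'e' plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep the base characters.
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # casefold() is a more aggressive lower() intended for comparisons.
    return stripped.casefold()
```

Both the stored headwords and the user's query would be passed through the same function, so that `Café`, `cafe` and `CAFE` all compare equal.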
This project utilises some Python 3.10 features, such as the match syntax, and a minimal set of dependencies:
PyYAML # for better efficiency, please install libyaml
Flask
Flask-Cors
waitress
lxml
The simplest method to use this app is to run it locally. I would recommend running the custom HTTP server in the `http_server` sub-directory, which forwards requests under `/api` to the back-end, and serves static files in `./build/`.
cd client
yarn install
yarn build
mv build ../http_server/
And then:
pip3.10 install -r http_server/requirements.txt # or install with your system package manager
python3.10 http_server/http_server.py # working-directory-agnostic
pip3.10 install -r server/requirements.txt
python3.10 server/server.py # working-directory-agnostic
Then access it at localhost:8081.
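
For readers curious about what such a front server does, the following is a rough standard-library sketch and not the actual `http_server.py`. It serves files from `build/` and forwards `GET` requests under `/api` to the back-end. The back-end address `http://127.0.0.1:2628` is an assumption, as are the names `is_api_request`, `Handler` and `make_server`; other HTTP methods are left out for brevity.

```python
import urllib.error
import urllib.request
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

BACKEND = "http://127.0.0.1:2628"  # assumed back-end address
STATIC_DIR = "build"  # directory produced by `yarn build`


def is_api_request(path: str) -> bool:
    """True if the request path should be forwarded to the back-end."""
    return path == "/api" or path.startswith("/api/")


class Handler(SimpleHTTPRequestHandler):
    def do_GET(self):
        if is_api_request(self.path):
            self.forward()
        else:
            super().do_GET()  # serve a static file from STATIC_DIR

    def forward(self):
        """Relay a GET request to the back-end and copy its response."""
        try:
            with urllib.request.urlopen(BACKEND + self.path) as resp:
                body, status = resp.read(), resp.status
                ctype = resp.headers.get("Content-Type", "text/plain")
        except urllib.error.HTTPError as err:
            body, status = err.read(), err.code
            ctype = err.headers.get("Content-Type", "text/plain")
        except urllib.error.URLError:
            body, status, ctype = b"Back-end unreachable", 502, "text/plain"
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8081) -> ThreadingHTTPServer:
    """Build (but do not start) the front server on localhost."""
    handler = partial(Handler, directory=STATIC_DIR)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# To run it: make_server().serve_forever()
```

The prefix check treats `/api` and anything below `/api/` as API traffic, so a static file such as `/apiary.html` would still be served from disk.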
Alternatively, you could use dedicated HTTP servers such as nginx to serve the static files and proxy API requests. Check out the sample config for more information.
I recommend nginx if you plan to deploy SilverDict to a server. Before building the static files, be sure to modify `API_PREFIX` in `config.js`, and then place them into whatever directory where nginx looks for static files. Remember to reverse-proxy all API requests and permit methods other than `GET` and `POST`.
Assuming your distribution uses systemd, you can refer to the provided sample systemd config and run the script as a service.
NB: currently the server is memory-inefficient: running the server with eight mid- to large-sized MDict dictionaries consumes ~200 MiB of memory, which is much higher than GoldenDict.[^5] If you want an MDict server with low memory footprint, take a look at xiaoyifang/goldendict-ng#229 and subscribe to its RSS feed. A possible work-around: ditch MDict. Convert to other formats with pyglossary (might not work). There are no such issues with StarDict or DSL.
Check out my guide.
[Horribly outdated. Will work on this soon.]
The favicon is the icon for 'Dictionary' from the Papirus icon theme, licensed under GPLv3.
This project uses or has adapted code from the following projects:
Name | Developer | Licence |
---|---|---|
mdict-analysis | Xiaoqiang Wang | GPLv3 |
python-stardict | Su Xing | GPLv3 |
dictionary-db | Jean-François Dockes | GPL 2.1 |
idzip | Ivo Danihelka | |
pyglossary | Saeed Rasooli | GPLv3 |
I would also express my gratitude to Jiang Qian for his suggestions, encouragement and great help.
Footnotes
[^1]: What it does: (1) decompress the dictionary file if compressed; (2) remove the BOM, non-printing characters and strange symbols (only `{·}` currently) from the text; (3) normalize the initial whitespace characters of definition lines; (4) overwrite the `.dsl` file with UTF-8 encoding and re-compress with dictzip. After this process the file is smaller and easier to work with.

[^2]: A note about type hinting in the code: I know for proper type hinting I should use the module `typing`, but the current way is a little easier to write and can be understood by VS Code.

[^3]: I tested with an extremely ill-formed DSL dictionary, and before such devilry my cleaning code is powerless. I will look into how GoldenDict handles this.

[^4]: The use of a custom styling manager such as Dark Reader is recommended until I fix this, as styles for different dictionaries meddle with each other.

[^5]: I grabbed a profiler and found the root of the cause: the MDict library stores many things in memory, so it is impossible for me to fix this without rewriting the library. Besides, I cannot instantiate `MDX` lazily, or the waiting time would easily get well beyond half a second.
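
Steps (2) and (3) of the DSL clean-up described in the first footnote can be sketched on an in-memory string as follows. This is a guess at the general shape, not the project's actual cleaning code: the function name `clean_dsl_text` is hypothetical, "non-printing characters" is interpreted here as Unicode control characters other than tab, and normalising the leading whitespace to a single tab is an assumption. Decompression and re-compression with dictzip are left out.

```python
import unicodedata


def clean_dsl_text(text: str) -> str:
    """Clean DSL text that has already been decoded to a Python string."""
    text = text.removeprefix("\ufeff")  # drop the BOM, if present
    text = text.replace("{·}", "")  # drop the one 'strange symbol'
    cleaned = []
    for line in text.split("\n"):
        # Remove control characters (assumed meaning of 'non-printing'),
        # but keep tabs, which DSL uses for indentation.
        line = "".join(
            ch for ch in line
            if ch == "\t" or unicodedata.category(ch) != "Cc"
        )
        body = line.lstrip(" \t")
        if body != line:
            # Indented line, i.e. a definition line: use a single tab
            # (assumed convention); whitespace-only lines become empty.
            line = "\t" + body if body else ""
        cleaned.append(line)
    return "\n".join(cleaned)
```

Headword lines start in the first column and are left alone apart from the character filtering, whereas indented definition lines come out with exactly one leading tab.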