mdict (*.mdd, *.mdx) file reader based on jeka-kiselyov/mdict.
Many thanks to fengdh and jeka-kiselyov.
- upgrade xmldom to v0.7.5
- support rangeKey API
- add rangeKeyBlock interface
- enhance performance of readKeyBlock
- fix bug where associate could not find keys containing special characters
- fix parse_definition bug so that the record start offset is accurate
- fix bug where findList returning undefined would crash the associate and prefix methods
- fix babel-runtime dependencies issue
- fix UpperCase key-sensitive option logic; see #41 for details
- fix 1.2 mdx keyblock read bug
- correct some Header properties (StripKey..)
Many thanks to @songxiaocheng.
- fix typings declarations and reformat the codebase
- fix some .mdd file reading issues; to search an .mdd file, use the lookup method, which returns base64 data
- rewrite typings/mdict.d.ts
- rename typings/Mdict.d.ts to typings/mdict.d.ts
- fix bug where uppercase words were missed during comparison
- fix out-of-index error when the word offset cannot be located
- if the word's key block cannot be found, return undefined
- rename Mdict.js to mdict.js, rename MdictBase.js to mdict-base.js, fix import error on Ubuntu
- support searching words by prefix with associate (the phrase is treated as the words' prefix, unlike the prefix function, which uses the phrase's own prefixes as search tokens)
- some security updates
Many thanks to @Danjame.
- ES6 implementation
- rewrite the decoding code for a more readable decode API
- browsers are NOT SUPPORTED currently
- add fuzzy_search method, which supports fuzzy word search
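The changelog above notes that lookup on an .mdd file returns base64 data. The following is a minimal sketch of turning such a payload back into raw bytes with Node's built-in Buffer. The result object shape (`keyText`, `definition`) and the sample key are assumptions for illustration, and the payload is a stand-in string, not real dictionary data.

```javascript
// Hypothetical lookup result for an .mdd resource. The field names mirror the
// .mdx lookup result and are an assumption; the payload is a stand-in.
const result = {
  keyText: '\\sample.txt',
  definition: Buffer.from('hello').toString('base64'),
};

// Decode the base64 payload back into raw bytes.
function decodeResource(entry) {
  return Buffer.from(entry.definition, 'base64');
}

const bytes = decodeResource(result);
console.log(bytes.length, bytes.toString('utf8'));
```

The decoded Buffer can then be written to disk or served as an image, stylesheet, or audio resource, depending on what the key refers to.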
Browsers are not supported yet.
npm install js-mdict
import Mdict from 'js-mdict';
// Note: *.mdd files only support the lookup method.
// Load the dictionary.
const dict = new Mdict('mdx/testdict/oale8.mdx');
// Other available calls:
// console.log(dict.lookup('interactive'));
// console.log(dict.bsearch('interactive'));
// console.log(dict.fuzzy_search('interactive', 5));
// console.log(dict.prefix('interactive'));
console.log(dict.lookup('hello'));
/*
{ keyText: "hello",
definition: "你好",
}
*/
console.log(dict.prefix('hello'));
/*
[
{ key: 'he', rofset: 64744840 },
{ key: 'hell', rofset: 65513175 },
{ key: 'hello', rofset: 65552694 }
]
*/
let word = 'informations';
dict.suggest(word).then((sw) => {
// eslint-disable-next-line
console.log(sw);
/*
[ 'information', "information's" ]
*/
});
word = 'hitch';
const fws = dict.fuzzy_search(word, 20, 5);
console.log(fws);
/*
[
{ key: 'history', rofset: 66627131, ed: 4 },
{ key: 'hit', rofset: 66648124, ed: 2 },
{ key: 'hit back', rofset: 66697464, ed: 4 },
{ key: 'hit back', rofset: 66697464, ed: 4 },
{ key: 'hit big', rofset: 66698789, ed: 4 },
{ key: 'hitch', rofset: 66698812, ed: 0 },
{ key: 'hitched', rofset: 66706586, ed: 2 },
{ key: 'hitcher', rofset: 66706602, ed: 2 },
{ key: 'hitches', rofset: 66706623, ed: 2 },
{ key: 'hitchhike', rofset: 66706639, ed: 4 },
{ key: 'hitchhiker', rofset: 66710697, ed: 5 },
{ key: 'hitching', rofset: 66712273, ed: 3 },
{ key: 'hi-tech', rofset: 66712289, ed: 2 },
{ key: 'hit for', rofset: 66713795, ed: 4 }
]
*/
console.log(dict.parse_defination(fws[0].key, fws[0].rofset));
/*
{
keyText: 'history',
definition: `<link rel="stylesheet" type="text/css" href="oalecd8e.css"><script src="jquery.js" charset="utf-8" type="text/javascript" language="javascript"></script><script src="oalecd8e.js" charset="utf-8" type="text/javascript" language="javascript"></script><span id="history_e" name="history" idm_id="000017272" class="entry"><span class="h-g"><span class="top-g"><span...
}
*/
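The `ed` field in the fuzzy_search results above appears to be an edit distance between the query and each key. As a rough illustration of how such a value can be computed, here is a standalone Levenshtein sketch. It is not the library's own implementation, and the helper names `levenshtein` and `withinDistance` are hypothetical.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping two rows at a time.
function levenshtein(a, b) {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;

  // prev[j] holds the distance between a[0..i-1] and b[0..j].
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: keep only candidates within maxDistance of the query.
function withinDistance(query, candidates, maxDistance) {
  return candidates
    .map((key) => ({ key, ed: levenshtein(query, key) }))
    .filter((item) => item.ed <= maxDistance);
}

console.log(withinDistance('hitch', ['hit', 'hitch', 'hitched', 'zebra'], 2));
```

In this sketch, a smaller `maxDistance` gives a stricter match, which is consistent with how the edit-distance argument of fuzzy_search is described in this README.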
Mdict#loading time: 0 sec
Mdict#lookup x 34.17 ops/sec ±0.52% (59 runs sampled)
wooorm#levenshtein x 173,386 ops/sec ±1.51% (88 runs sampled)
Mdict#prefix x 26.98 ops/sec ±5.89% (47 runs sampled)
Mdict#fuzzy_search x 6.89 ops/sec ±8.83% (22 runs sampled)
Mdict#associate x 16.59 ops/sec ±2.94% (44 runs sampled)
Fastest is Mdict#lookup
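The benchmark lists both prefix and associate, which differ in how they treat the search phrase. The sketch below illustrates that difference on a hypothetical in-memory key list. The function names `prefixStyle` and `associateStyle` are made up for this example, and the real methods operate on the dictionary's key blocks rather than a plain array.

```javascript
// Hypothetical key list standing in for a dictionary's key index.
const keys = ['he', 'hell', 'hello', 'hello world', 'helloworld', 'help'];

// prefix-style matching: keep keys that are themselves a prefix of the phrase.
function prefixStyle(phrase, keyList) {
  return keyList.filter((key) => phrase.startsWith(key));
}

// associate-style matching: keep keys that start with the phrase.
function associateStyle(phrase, keyList) {
  return keyList.filter((key) => key.startsWith(phrase));
}

console.log(prefixStyle('hello', keys));
console.log(associateStyle('hello', keys));
```

With this key list, the prefix-style call yields 'he', 'hell', and 'hello', matching the shape of the prefix output shown earlier, while the associate-style call yields the keys that extend 'hello'.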
Deprecated: if you use js-mdict@3.x, please upgrade to js-mdict@4.0.5+. js-mdict@3.x loaded the whole dictionary data to build the index, which took a lot of time.
The API has also changed, so please do not use that version.
The API of js-mdict@3.x:
import Mdict from 'js-mdict';
const mdict = new Mdict('mdx/oale8.mdx');
console.log(mdict.lookup('hello'));
console.log(mdict.prefix('hello'));
// get fuzzy words
const fuzzy_words = mdict.fuzzy_search(
  'wrapper',
  5, /* fuzzy words size */
  5 /* edit_distance */
);
/*
example output:
[ { ed: 0, idx: 108605, key: 'wrapper' },
{ ed: 1, idx: 108603, key: 'wrapped' },
{ ed: 1, idx: 108606, key: 'wrappers' },
{ ed: 3, idx: 108593, key: 'wrangler' },
{ ed: 3, idx: 108598, key: 'wrap' },
{ ed: 3, idx: 108607, key: 'wrapping' },
{ ed: 4, idx: 108594, key: 'wranglers' },
{ ed: 4, idx: 108595, key: 'wrangles' },
{ ed: 4, idx: 108609, key: 'wrappings' } ]
*/
// get definition
console.log(mdict.parse_defination(fuzzy_words[0].idx));
Deprecated: if you use js-mdict@2.0.3, you can use the API shown below.
Note: 2.0.3 does not support mdd files or files whose record info is encrypted.
import path from 'path';
import Mdict from 'js-mdict';
const dictPath = path.join(__dirname, '../resource/Collins.mdx');
const mdict = new Mdict(dictPath);
mdict
.build()
.then((_mdict) => {
console.log('hello', _mdict.lookup('hello'));
console.log('world', _mdict.lookup('world'));
console.log(_mdict.attr());
})
.catch((err) => {
console.error(err);
});
this picture is from @ikey4u/wikit
code by terasum with ❤️