N-gram search is a module for Omeka S that enables CJK-ready full-text search using MySQL's n-gram tokenizer.
The default full-text search in Omeka S is not CJK (Chinese, Japanese, Korean) ready because it does not use an appropriate tokenizer. This module activates MySQL's n-gram tokenizer by modifying the schema of a table that Omeka S uses internally.
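To illustrate why an n-gram tokenizer helps with CJK text, which has no spaces between words, here is a minimal Python sketch of bigram tokenization. It is a simplified illustration only, not MySQL's actual implementation: the function name `ngram_tokens` is hypothetical, and the sketch ignores whitespace and stopword handling. The window size of 2 matches MySQL's default `ngram_token_size`.

```python
def ngram_tokens(text, n=2):
    """Return every contiguous run of n characters in text (a sliding window)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# A phrase with no word separators is split into overlapping bigrams,
# so a query for any two-character substring can match an indexed token.
tokens = ngram_tokens("全文検索")
print(tokens)            # ['全文', '文検', '検索']
print("検索" in tokens)  # True
```

A whitespace-based tokenizer would treat the whole phrase as one token, so a query for only part of it would not match.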
First of all, back up the database. This module modifies the table schema, which may cause unrecoverable failure.
This module requires MySQL 5.6 or later. MariaDB currently does not provide an n-gram tokenizer. If you want to enable CJK-ready search with MariaDB, try Mroonga search instead.
See the release page and download the latest NgramSearch.zip from the list. Then unzip it in the modules directory of Omeka S and enable the module from the admin dashboard. Read the user manual of Omeka S for further information.
Please do not forget to rename the directory from Omeka-S-ngram-search to NgramSearch in the modules directory.
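The unzip and rename steps above can be sketched in Python as follows. This is a hedged sketch, not an official installer: the function name `install_ngram_module` and the paths in the usage note are hypothetical, and it assumes the archive unpacks to a top-level directory named either Omeka-S-ngram-search or NgramSearch.

```python
import zipfile
from pathlib import Path


def install_ngram_module(zip_path, modules_dir):
    """Extract the archive into modules_dir and make sure the folder is named NgramSearch."""
    modules_dir = Path(modules_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(modules_dir)

    extracted = modules_dir / "Omeka-S-ngram-search"
    target = modules_dir / "NgramSearch"
    # Rename only when the archive used the repository-style directory name.
    if extracted.is_dir() and not target.exists():
        extracted.rename(target)
    return target
```

For example, `install_ngram_module("NgramSearch.zip", "/path/to/omeka-s/modules")` would leave the module in `/path/to/omeka-s/modules/NgramSearch`. The module still has to be enabled from the admin dashboard afterwards.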
This module depends heavily on the database structure of Omeka S 2.x. If you are upgrading Omeka S from 2.x to 3.x or later, we strongly recommend uninstalling this module before upgrading.
We have not yet tested MySQL's n-gram tokenizer extensively with large datasets. For advanced full-text search, we recommend the Solr module.
Copyright (c) 2020 Kentaro Fukuchi
This module is released under the MIT License. See the LICENSE file for details.