travis-r6s/gridsome-plugin-flexsearch

About Chinese Support

chenhongen opened this issue · 6 comments

What happened

When I search with English keywords that appear in the title or content, it works well. But when I try Chinese keywords, it doesn't work.

Expected result

Support Chinese

Environment

gridsome: 0.7.12
gridsome-plugin-flexsearch: 0.1.12

@chenhongen I should think this is more a flexsearch issue - have you tried checking out their docs?
https://github.com/nextapps-de/flexsearch#add-language-specific-stemmer-andor-filter

I found an issue in flexsearch about this, and changed my config to:

{
  use: 'gridsome-plugin-flexsearch',
  options: {
    collections: [
      {
        typeName: 'ChePost',
        indexName: 'ChePost',
        fields: ['title', 'content', 'description']
      }
    ],
    searchFields: ['title', 'content', 'description'],
    flexsearch: {
      encode: false,
      tokenize: function (str) {
        return str.replace(/[\x00-\x7F]/g, "").split("");
      }
    }
  }
},

But now it works in neither Chinese nor English. Is this the correct way to specify optional flexsearch configuration, under a flexsearch key like this?

@chenhongen Sorry for the delay on this, haven't had much spare time recently.
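However, I think I know what the issue is here. The flexsearch options are passed to the client as JSON, so you can't pass a function in that object: JSON has no way to represent functions, and the tokenize property is dropped when the options are serialized.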
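To illustrate the JSON point, here is a minimal sketch using only standard JavaScript. The `options` object simply mirrors the flexsearch key from the config above; how the plugin actually serializes its options internally is not shown here.

```javascript
// Options as they would be written in gridsome.config.js
const options = {
  encode: false,
  tokenize: function (str) {
    return str.replace(/[\x00-\x7F]/g, '').split('')
  }
}

// Serializing to JSON silently omits function-valued properties
const serialized = JSON.stringify(options)
console.log(serialized) // {"encode":false}

// What the client would get back after parsing: no tokenizer at all
const received = JSON.parse(serialized)
console.log(typeof received.tokenize) // undefined
```

So the custom tokenizer never reaches the browser, which would explain why the setting had no useful effect.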

I'm currently looking into a solution for this.

@chenhongen I have just added support for a custom tokenize/encoder function, but it requires a bit of manual setup. Would you be able to test the below, and let me know if it works?

Install v0.1.17

Add this configuration in gridsome.config.js

{
  use: 'gridsome-plugin-flexsearch',
  options: {
    // Turn off automatic setup so the search instance can be created manually
    autoSetup: false,
    collections: [
      {
        typeName: 'ChePost',
        indexName: 'ChePost',
        fields: ['title', 'content', 'description']
      }
    ],
    searchFields: ['title', 'content', 'description'],
    flexsearch: {
      encode: false,
      tokenize: function (str) {
        return str.replace(/[\x00-\x7F]/g, "").split("");
      }
    }
  }
},

Then in your header/nav/search component, manually set up the flexsearch instance:

<script>
import FlexSearch from 'flexsearch'
export default {
  data: () => ({
    searchTerm: '',
    search: null
  }),
  computed: {
    searchResults () {
      const searchTerm = this.searchTerm
      // Return nothing until the instance exists and the query is long enough
      if (!this.search || searchTerm.length < 3) return []
      return this.search.search({ query: searchTerm, limit: 5, suggest: true })
    }
  },
  async mounted () {
    // Some flexsearch options, and helper functions
    const { flexsearch, loadIndex } = this.$flexsearch
    // Create a flexsearch instance, and load our config options, plus our custom tokenizer function
    const search = new FlexSearch({
      ...flexsearch,
      tokenize: function (str) {
        return str.replace(/[\x00-\x7F]/g, '').split('')
      }
    })
    // Make search available on this
    this.search = search
    // Load our index data into flexsearch
    await loadIndex(search)
  }
}
</script>

Good job! I followed your steps and it works well. Thanks!

To augment this (very helpful) thread, I found a snippet of code that makes flexsearch support both the Latin alphabet (English) and CJK (Chinese) in the same search box:

alex-shpak/hugo-book#80 (comment)

Apply it in both gridsome.config.js and the search component, in place of return str.replace(/[\x00-\x7F]/g, "").split("")
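For readers who can't follow the link, here is a rough sketch of the general idea, not necessarily identical to the linked snippet. The name `tokenizeMixed` is hypothetical. It splits ASCII text into whole words and non-ASCII text into single characters, then returns both lists. Lowercasing the words is my own addition and assumes the same function is used for both indexing and queries.

```javascript
// Hypothetical helper: tokenize mixed Latin/CJK text for use as a custom tokenizer
function tokenizeMixed (str) {
  // Non-ASCII characters (e.g. Chinese) become one token per character.
  // Array.from splits by code point, so surrogate pairs stay intact.
  const cjkTokens = Array.from(str.replace(/[\x00-\x7F]/g, ''))

  // ASCII text is split into words. Non-ASCII characters are replaced with a
  // space first so that words on either side of them are not glued together.
  const latinTokens = str
    .replace(/[^\x00-\x7F]/g, ' ')
    .toLowerCase()
    .split(/\W+/)
    .filter(Boolean)

  return latinTokens.concat(cjkTokens)
}

console.log(tokenizeMixed('Hello 世界 world')) // [ 'hello', 'world', '世', '界' ]
```

It would be passed as the tokenize option when creating the FlexSearch instance in the component, in the same place as the function shown earlier in this thread.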