Fulltext search is not working
haristariqmage4 opened this issue · 8 comments
Preconditions
Magento Version : 2.4.0
ElasticSuite Version : 2.1.0
Environment : Developer
Third party modules :
Amasty_Base
Amasty_CronScheduleList
Amasty_Customform
Amasty_InvisibleCaptcha
Amasty_RequestQuote
Amasty_QuoteAttributesManagement
Amasty_RequestAQuoteProSubscriptionPackage
Amasty_QuoteAttributes
Amazon_Core
Amazon_Login
Amazon_Payment
Clarion_CustomerAttribute
Codazon_AjaxCartPro
Codazon_AjaxLayeredNav
Codazon_AjaxLayeredNavPro
Codazon_Core
Codazon_GoogleAmpManager
Codazon_ImproveBundle
Codazon_Lookbookpro
Codazon_MegaMenu
Codazon_OneStepCheckout
Codazon_ProductFilter
Codazon_ThemeOptions
Codazon_QuickShop
Codazon_ShippingCostCalculator
Codazon_Shopbybrandpro
Codazon_Slideshow
Codazon_ProductLabel
Codazon_Utility
Dotdigitalgroup_Email
Dotdigitalgroup_Chat
Harrigo_EverCrumbs
Klarna_Core
Klarna_Ordermanagement
Klarna_Onsitemessaging
Klarna_Kp
Klaviyo_Reclaim
MageMe_HidePrice
MageWorx_SearchSuiteAutocomplete
Magefan_Community
Magefan_Blog
Magefan_WysiwygAdvanced
Magemonkeys_CategoryFilter
Magemonkeys_CompanyName
Magemonkeys_Customerinfo
Magemonkeys_FeaturedProduct
Magemonkeys_HideMyOrders
Magemonkeys_Ordermail
Magemonkeys_Product
Magemonkeys_Quote
Magemonkeys_RemoveQuoteCartPrice
Magemonkeys_RepresentativeAttr
Magemonkeys_RestrictCategory
Magemonkeys_WelcomeEmailCc
Mageplaza_Core
Mageplaza_BannerSlider
Mageplaza_BackendReindex
Mageplaza_MassProductActions
Mageplaza_Smtp
Magestat_SplitOrder
OlegKoval_RegenerateUrlRewrites
PayPal_Braintree
PayPal_BraintreeGraphQl
RapideWeb_ProductListTable
Smile_ElasticsuiteCore
Smile_ElasticsuiteCatalog
Smile_ElasticsuiteCatalogGraphQl
Smile_ElasticsuiteCatalogRule
Smile_ElasticsuiteCatalogOptimizer
Smile_ElasticsuiteTracker
Smile_ElasticsuiteThesaurus
Smile_ElasticsuiteSwatches
Smile_ElasticsuiteIndices
Smile_ElasticsuiteAnalytics
Smile_ElasticsuiteVirtualCategory
Temando_ShippingRemover
Ulmod_Ordernotes
Vertex_Tax
Vertex_AddressValidation
WebShopApps_MatrixRate
Yotpo_Yotpo
Zero1_Patches
How do we make results for "dextrose 5% water" show the same as results for "d5w"? Since one is multiple words and the other is technically just one?
How can we make sure that items like Sharps Container 26 1/4 ° 20 w * 14 3/4 D Inch 19 BD Gallon are not included in the search results for 'D5W'?
Expected result
More narrow product search that will only allow for exact terms to be fetched
Searches like 'D5W' should not have results that include hits for 'D' '5' 'W'
Actual result
Hello @haristariqmage4,
This is probably due to the "word_delimiter" token filter of the "standard" (text) analyzer, which transforms your product names before indexing them.
This "word_delimiter" component does split words like "D5W" when switching from a letter to a digit and vice versa, so you are correct in assuming that we search for "D", "5" and "W" when searching for "D5W".
The issue is then that you have other product names with those isolated letters (for example coming from a product name string like "3/5 H X 10 7/10 W X 6 D").
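To make the overlap concrete, here is a minimal Python sketch of the splitting behaviour described above. It is only a simplified model and not the actual Elasticsearch "word_delimiter" implementation: the function names are made up for illustration, and it ignores positions, catenation and the other filter options.

```python
import re

def split_on_numerics(token: str) -> list[str]:
    """Split a token at every letter/digit boundary (lowercased)."""
    return re.findall(r"[a-z]+|[0-9]+", token.lower())

def index_terms(name: str) -> set[str]:
    """Rough model: cut on non-alphanumerics, keep the original token,
    then add the parts produced by the letter/digit split."""
    terms = set()
    for word in re.findall(r"[A-Za-z0-9]+", name):
        terms.add(word.lower())              # original token is kept
        terms.update(split_on_numerics(word))
    return terms

query_terms = index_terms("D5W")
unrelated = index_terms("3/5 H X 10 7/10 W X 6 D")
shared = sorted(query_terms & unrelated)
print(shared)  # ['5', 'd', 'w']
```

The unrelated product name shares all three isolated parts with the query, which is why it can show up in the results for "D5W".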
You can check what's happening on the analyzer side of things from the admin interface, under Elasticsuite > System > Analysis.
If you can't have "simpler" product names, I would recommend trying to change the configuration of the "word_delimiter" token filter by disabling "split_on_numerics" in elasticsuite_analysis.xml (through a composer patch or a re-definition of the XML in a custom module).
Then, for the original issue, a thesaurus entry associating "D5W" or "d5w" with "Dextrose 5% Water" should do the trick.
Regards,
Hello @haristariqmage4,
Indeed, I've just seen:
Magento Version : 2.4.0
ElasticSuite Version : 2.1.0 (I guess it's 2.10.0)
That's ... old :)
So you will not have that screen, which was introduced in 2.10.13 and is therefore available for Magento 2.4.1 and above only.
You can install cerebro locally and reproduce what that screen does from cerebro's "analysis" screen.
Or call the _analyze endpoint of your Elasticsearch directly from the CLI:
/var/www/html $ curl -H "Content-Type: application/json" -XPOST "http://opensearch:9200/magento2_fr_fr_catalog_product/_analyze?pretty" -d '{"analyzer":"standard","text":"d5w"}'
{
"tokens" : [
{
"token" : "d5w",
"start_offset" : 0,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "d",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "d5w",
"start_offset" : 0,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "5",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<ALPHANUM>",
"position" : 1
},
{
"token" : "w",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<ALPHANUM>",
"position" : 2
}
]
}
(Replace http://opensearch:9200/magento2_fr_fr_catalog_product/ by http://[your_elasticsearch_server_address_or_hostname]/[your_catalog_product_index_name])
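If you prefer scripting the check, the same call can be sketched in Python with the standard library only. The helper names below are hypothetical, the host and index name are the ones from the command above and must be replaced by yours, and the sample payload is a trimmed copy of the response shown above.

```python
import json
import urllib.request

def build_analyze_request(base_url: str, index: str, text: str,
                          analyzer: str = "standard") -> urllib.request.Request:
    """Build (without sending) a POST request to the index _analyze endpoint."""
    url = f"{base_url.rstrip('/')}/{index}/_analyze?pretty"
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )

def tokens_from_response(payload: dict) -> list[str]:
    """Extract the token strings from an _analyze response body."""
    return [t["token"] for t in payload["tokens"]]

req = build_analyze_request(
    "http://opensearch:9200", "magento2_fr_fr_catalog_product", "d5w"
)

# Sending the request needs a reachable server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(tokens_from_response(json.load(resp)))

# Trimmed copy of the response shown above, used here as sample data.
sample = {"tokens": [{"token": "d5w"}, {"token": "d"}, {"token": "d5w"},
                     {"token": "5"}, {"token": "w"}]}
print(tokens_from_response(sample))  # ['d5w', 'd', 'd5w', '5', 'w']
```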
Regards,
Hi @rbayet ,
Here is my analysis. Now what should I do?
Hello @haristariqmage4,
So now that your thesaurus is in place, you have two options (which could actually be combined):
- reducing the score penalty for products matching a synonym
- altering the "word_delimiter" token filter in the way I described
1. reducing the score penalty for products matching a synonym
When searching for "d5w" you will now also search for "dextrose 5% water", but by default products matching only "dextrose 5% water" suffer a score penalty: they receive a tenth of their expected score.
You can change that by lowering the setting available at Elasticsuite > Search Relevance > Thesaurus Configuration > Synonyms Configuration > Synonyms Weight Divider (down to 1, i.e. "no penalty").
The products matching "D", "5" and "W" individually will still be present in the search results list, but perhaps far enough down for you to be satisfied.
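As a rough illustration of the arithmetic, assuming the divider is simply applied to the score a synonym match would otherwise get (which is how it is described above, not a statement about Elasticsuite internals), and assuming a default of 10 as implied by "a tenth":

```python
def synonym_score(expected_score: float, weight_divider: float = 10.0) -> float:
    """Score of a product that matches only through a synonym.

    weight_divider models the "Synonyms Weight Divider" setting;
    1 means "no penalty". The function name and formula are illustrative.
    """
    if weight_divider < 1:
        raise ValueError("weight_divider is expected to be >= 1")
    return expected_score / weight_divider

print(synonym_score(8.0))      # 0.8 with the assumed default divider of 10
print(synonym_score(8.0, 1))   # 8.0 when the divider is reduced to 1
```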
2. altering the "word_delimiter" token filter in the way I described
If you're not satisfied, or as an alternative, you can redefine or fine-tune the word_delimiter token filter, which is responsible for splitting "D5W" into "D", "5" and "W".
You probably only need to change the "split_on_numerics" from "true" to "false".
You can do that either with a composer patch on the distributed elasticsuite_analysis.xml file, or by creating a custom module in app/code with a local elasticsuite_analysis.xml which contains just the re-defined word_delimiter token filter.
In both cases, this will require clearing the Magento cache and performing a full reindex.
Please be aware that this approach could have adverse side effects, for instance preventing products with "L48B" in their name or SKU from being found when searching for "L 48 B".
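The trade-off can be sketched with a small Python model. This is a simplification under stated assumptions and not the real analyzer: a "hit" here just means that the query and the product name share at least one term, the function names are invented, and "Widget L48B" is a hypothetical product name.

```python
import re

def analyze(text: str, split_on_numerics: bool) -> set[str]:
    """Cut on non-alphanumerics, keep each token, and optionally add
    the parts obtained by splitting at letter/digit boundaries."""
    terms = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        terms.add(word)
        if split_on_numerics:
            terms.update(re.findall(r"[a-z]+|[0-9]+", word))
    return terms

def shares_term(query: str, name: str, split_on_numerics: bool) -> bool:
    """True when the query and the product name have a term in common."""
    return bool(analyze(query, split_on_numerics)
                & analyze(name, split_on_numerics))

for split in (True, False):
    noise = shares_term("D5W", "3/5 H X 10 7/10 W X 6 D", split)
    spaced = shares_term("L 48 B", "Widget L48B", split)
    print(split, noise, spaced)
# True True True    -> unwanted hit for "D5W", but "L 48 B" finds "L48B"
# False False False -> unwanted hit is gone, and so is the "L 48 B" match
```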
Regards,
@rbayet
Should I also set generate_word_parts to false in elasticsuite_analysis.xml?
Also, what would be the solution to avoid this side effect:
Please be aware that this approach could have adverse side effects, for instance preventing finding products with a "L48B" in their name or their SKU by searching for "L 48 B" for instance.
@rbayet ?