Product attributes such as brand name, size, weight, and dimensions are critical in e-commerce because they help customers find and select the right product for their needs. However, obtaining, adding, and maintaining these values is extremely labour-intensive, especially on larger sites. We therefore set out to (1) use state-of-the-art transformer-based models, such as Google's BERT and Facebook's RoBERTa, for Product Attribute Extraction (PAE), and (2) train a question-answering model to predict the brand name from the product description. For PAE, we used a Bi-LSTM model to chunk the proper nouns in each description and related every chunk to its associated non-noun phrases. Using this part-of-speech tagging, we identified clusters, and from these we derived an unsupervised technique for relating a product's attribute name to its attribute value. For the brand detection task, we modified a question-answering transformer model (DistilBERT) to answer the question 'what is the brand name of this product?' based on the descriptions. After fine-tuning the model we achieved a desirable accuracy.
Unsupervised detection of all possible product attributes and supervised brand name detection with our method, applied to products listed on the Amazon India website, are demonstrated in the video below.
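
To make the question-answering framing of brand detection concrete, here is a minimal sketch of how an extractive QA model turns per-token scores into an answer span. It is not the project's code: the function names `best_span` and `extract_answer`, the `max_len` limit, and the hand-written scores are all assumptions for illustration. In practice a fine-tuned model such as DistilBERT would produce the start and end scores from the question and the product description.

```python
from typing import List, Tuple

# The question posed to the model for every product description.
QUESTION = "what is the brand name of this product?"


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 5) -> Tuple[int, int]:
    """Return (start, end) token indices maximising start + end score.

    Only spans with start <= end and at most `max_len` tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for i, start in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = start + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


def extract_answer(tokens: List[str], start_scores: List[float],
                   end_scores: List[float], max_len: int = 5) -> str:
    """Join the tokens of the highest-scoring span into the answer string."""
    i, j = best_span(start_scores, end_scores, max_len)
    return " ".join(tokens[i:j + 1])


# Hypothetical description tokens with made-up scores (NOT model output).
tokens = ["Samsung", "Galaxy", "M31", "with", "6GB", "RAM"]
start_scores = [4.0, 0.5, 0.1, -1.0, 0.2, -0.5]
end_scores = [3.5, 1.0, 0.3, -1.0, 0.1, 0.0]

brand = extract_answer(tokens, start_scores, end_scores)
print(QUESTION, "->", brand)
```

Capping the span length keeps the search cheap and reflects the fact that brand names are usually only a few tokens long. That cap is a design choice of this sketch, not something stated by the project.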
method.mp4
Figures: Noun Attribute Search, Non-Noun Attribute Search, Data Cleaning, Chunking, Product Attribute Clustering, Cluster Labelling.
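
The chunking step named above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's Bi-LSTM pipeline: the part-of-speech tags are supplied by hand using Penn Treebank labels, and the rule of pairing each proper-noun chunk with the chunk that immediately follows it is a hypothetical heuristic. The function names `chunk_by_noun` and `pair_attributes` are also invented for this example.

```python
from typing import List, Tuple

Tagged = List[Tuple[str, str]]  # (token, part-of-speech tag)

# Penn Treebank tags for singular and plural proper nouns.
PROPER_NOUN_TAGS = {"NNP", "NNPS"}


def chunk_by_noun(tagged: Tagged) -> List[Tuple[bool, str]]:
    """Group consecutive tokens into (is_proper_noun, phrase) chunks."""
    chunks: List[Tuple[bool, List[str]]] = []
    for token, tag in tagged:
        is_noun = tag in PROPER_NOUN_TAGS
        if chunks and chunks[-1][0] == is_noun:
            chunks[-1][1].append(token)       # extend the current chunk
        else:
            chunks.append((is_noun, [token]))  # start a new chunk
    return [(flag, " ".join(words)) for flag, words in chunks]


def pair_attributes(tagged: Tagged) -> List[Tuple[str, str]]:
    """Pair each proper-noun chunk with the non-proper-noun chunk after it.

    Assumed heuristic: the proper-noun chunk is treated as the attribute
    name and the following chunk as its value.
    """
    chunks = chunk_by_noun(tagged)
    pairs = []
    for (is_noun, phrase), (next_is_noun, next_phrase) in zip(chunks, chunks[1:]):
        if is_noun and not next_is_noun:
            pairs.append((phrase, next_phrase))
    return pairs


# Hand-tagged example tokens; a trained tagger would supply these tags.
tagged = [
    ("Display", "NNP"), ("Size", "NNP"), ("6.4", "CD"), ("inches", "NNS"),
    ("Battery", "NNP"), ("6000", "CD"), ("mAh", "NN"),
]
pairs = pair_attributes(tagged)
print(pairs)
```

The resulting name and value pairs are the kind of units that a later clustering and cluster-labelling stage could group across many product descriptions.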
Two fine-tuned models related to this work can be found at the HuggingFace Model Hub: