Product matching model for an eCommerce platform using FastText, Simple LSTM, and Siamese MaLSTM

product-matching-model

2018 SNU-FIRA AI Agent Course, Final Capstone Project (18.11.06-12.14)

This project is supported by Seoul National University Big Data Institute.

๐Ÿ…Our team won 1st place in the final project presentation.

Goal

▪ Project Background

▪ Developing a product matching model using deep learning

Building product matching models for a Korean e-commerce platform using deep learning methods.

Dataset

░░░░CLOSED DATASETS░░░░ The data cannot be released because of a confidentiality agreement with the project's partner organization.

Requirements

Initial requirements are as follows.

 python 3.6.5
 gensim 3.6.0
 keras 2.2.4
 tensorflow 1.12.0
 numpy 1.15.4
 pandas 0.23.4
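
One way to reuse the pinned versions above is to turn them into pip-style specifiers. The sketch below is only an illustration: the `PINNED` string and the `to_pip_requirements` helper are hypothetical names, not part of this repository, and the interpreter line is skipped because Python itself is not installed through pip.

```python
# Hypothetical helper: converts the "name version" list above into pip specifiers.
PINNED = """
python 3.6.5
gensim 3.6.0
keras 2.2.4
tensorflow 1.12.0
numpy 1.15.4
pandas 0.23.4
"""


def to_pip_requirements(pinned_text):
    """Convert 'name version' lines into 'name==version' strings, skipping the interpreter."""
    specs = []
    for line in pinned_text.strip().splitlines():
        name, version = line.split()
        if name.lower() == "python":
            continue  # the interpreter version is not a pip package
        specs.append(f"{name}=={version}")
    return specs


if __name__ == "__main__":
    # Prints one specifier per line, suitable for pasting into a requirements file.
    print("\n".join(to_pip_requirements(PINNED)))
```

The printed lines can be saved to a requirements file and installed inside a Python 3.6 environment.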

Result

*---main.py---*

✔ | 히말라야 고수분크림 (인텐시브) 150ml ===> 인텐시브 모이스처라이징 크림@150ml
✔ | [HANYUL]어린쑥 수분 진정크림 50ml ===> 어린쑥 수분 진정크림@50ml
✔ | 마몽드 모이스처 세라마이드 인텐스 크림 ===> 모이스처 세라마이드 인텐스 크림@50ml
✔ | 토니모리 더 촉촉 그린 티 수분 크림 60ml ===> 더 촉촉 그린티 수분 크림@60ml
✔ | 아이오페 더마 리페어 시카크림 ===> 더마 리페어 시카크림@50ml
✔ | 헤라 D_헤라 로지 사틴 크림 50ml ===> 로지 사틴 크림@50ml
✔ | 마몽드 모이스처 세라마이드 인텐스 크림 50ml -선물포장1 ===> 모이스처 세라마이드 인텐스 크림@50ml
✔ | (27%할인)설화수 탄력크림 75ml ===> 탄력 크림@75ml
✔ | [카드 5% 할인][CJmall]아이오페 [구매금액증정제외]아이오페 모이스트젠 크림 스킨 하이드레이션 50ml ===> 모이스트젠 크림 스킨 하이드레이션@50ml
✖ | 히말라야 정품-50ml 히말라야인텐시브수분크림/히말라야인텐시 ===> 너리싱 스킨 크림@50ml | ✪Correct answer: 인텐시브 모이스처라이징 크림 ...
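
Each line above appears to pair a listing title with the catalog product the model selected. As a rough sketch of how a Siamese MaLSTM typically scores such a pair, the two encodings are compared with exp(-L1 distance). This is the standard Manhattan-similarity formulation, not code taken from this repository. The function names and the toy vectors below are hypothetical; in a real model the vectors would come from the shared LSTM encoder.

```python
import math


def manhattan_similarity(h_left, h_right):
    """Return exp(-L1 distance) between two encodings; the value lies in (0, 1]."""
    if len(h_left) != len(h_right):
        raise ValueError("encodings must have the same length")
    l1_distance = sum(abs(a - b) for a, b in zip(h_left, h_right))
    return math.exp(-l1_distance)


def best_match(query, candidates):
    """Return the candidate name whose encoding is most similar to the query encoding."""
    return max(candidates, key=lambda name: manhattan_similarity(query, candidates[name]))


if __name__ == "__main__":
    # Toy encodings standing in for LSTM outputs (assumed values, for illustration only).
    query = [0.2, 0.9]
    catalog = {
        "cream_50ml": [0.25, 0.85],
        "toner_150ml": [0.9, 0.1],
    }
    print(best_match(query, catalog))
```

Because the similarity is bounded between 0 and 1, it can be trained directly against a binary match/no-match label.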

Team Members

References

  • Joulin, Armand, et al. "FastText.zip: Compressing text classification models." arXiv preprint arXiv:1612.03651 (2016).
  • Shah, Kashif, Selcuk Kopru, and Jean David Ruvini. "Neural Network based Extreme Classification and Similarity Models for Product Matching." Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers). 2018.
  • How to predict Quora Question Pairs using Siamese Manhattan LSTM