avito-duplicate-ads-detection

Code and solution for the Kaggle competition Avito Duplicate Ads Detection (team luoq)

solution

Please read solution.md

slide

A slide deck discussing this solution

environment setup

The base environment is Linux with Anaconda3.

Many extra libraries are needed to run this code; a non-exhaustive list is:

A GPU is highly recommended to run mxnet. It takes about 5 days to generate the features.

how to run the code

  1. Extract the data (except the images) to data/data_files
  2. Copy config.example.json to config.json, then edit config.json to match your data directory
  3. Change the working directory to the root of this repo
  4. Run prepare_data.sh to generate the features
  5. Run leaderboad_solution.py to generate the final solution
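
The steps above can be sketched as a small Python driver. This is only an illustration, not part of the repository: the helper names `prepare_config`, `build_commands`, and `run_pipeline` are hypothetical, and it assumes that `prepare_data.sh` runs under `bash` and that `leaderboad_solution.py` (spelled as in the steps above) runs with the current Python interpreter. The keys inside config.json are not listed here, so the sketch only copies and loads the file; editing it to point at the data directory is still a manual step.

```python
import json
import shutil
import subprocess
import sys
from pathlib import Path


def prepare_config(repo_root):
    """Step 2: copy config.example.json to config.json if it is missing."""
    root = Path(repo_root)
    example = root / "config.example.json"
    config = root / "config.json"
    if not config.exists():
        shutil.copy(example, config)
    # The config keys are not documented here, so the file is only loaded.
    # Edit config.json by hand so that it matches your data directory.
    with config.open() as f:
        return json.load(f)


def build_commands():
    """Steps 4 and 5 as argument lists, in the order they must run."""
    return [
        ["bash", "prepare_data.sh"],  # assumption: the script is run with bash
        [sys.executable, "leaderboad_solution.py"],  # name as given in step 5
    ]


def run_pipeline(repo_root):
    """Run steps 4 and 5 with the repo root as working directory (step 3)."""
    root = Path(repo_root)
    if not (root / "config.json").exists():
        raise FileNotFoundError("config.json is missing; complete step 2 first")
    for cmd in build_commands():
        # check=True stops the pipeline at the first failing step.
        subprocess.run(cmd, cwd=str(root), check=True)
```

A typical session would call `prepare_config(".")` once, edit config.json, and then call `run_pipeline(".")` from the repository root. Feature generation is the long-running part, so running it inside a terminal multiplexer or as a background job may be practical.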