A Multi-task Learning Framework with Disposable Auxiliary Networks for Early Prediction of Product Success

1. Introduction

Consider the scenario in which an investor seeks to identify potential products before they are unveiled to the public. For such a scenario, the investor may pose questions such as “What characteristic better represents a product?” or “What features make a product popular?” Unlike traditional recommendation problems, in this case, there is no user feedback for such upcoming products, which makes associated prediction extremely challenging. To address this challenging yet common scenario, in this paper, we present a multi-task learning framework that trains the prediction model on information for mature products that have user feedback, and then uses the model to predict the success of upcoming products without any user feedback. To achieve this goal, the framework consists of a main task network to extract product features from their descriptions and a novel disposable auxiliary network that learns domain-specific words and popular trends from user reviews at the same time. This disposable auxiliary network is beneficial during the training of the main task network, and is unused at the inference stage. Empirical results on two real-world datasets demonstrate that this multi-task learning framework not only significantly improves the overall rating prediction for products but also effectively identifies the top successful products without any user reviews.
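
The train-with-reviews, predict-without-reviews idea can be illustrated with a toy sketch in plain Python. This is not the repository's implementation: the scalar "encoder", the squared-error losses, the `aux_weight` factor, and all function names are assumptions made only for illustration. A main head and an auxiliary head share one encoder weight during training, and only the encoder and main head are kept afterwards.

```python
def make_samples():
    """Toy data (hypothetical): x = description feature, r = review signal, y = rating."""
    samples = []
    for i in range(1, 11):
        x = i / 10.0
        y = 2.0 * x          # rating to predict
        r = y                # toy assumption: the review signal tracks the rating
        samples.append((x, r, y))
    return samples


def train(samples, epochs=300, lr=0.01, aux_weight=0.5):
    """Jointly fit a main head and an auxiliary head that share one encoder weight."""
    w, a = 0.5, 0.5          # shared encoder weight and main-head weight
    b, c = 0.5, 0.5          # auxiliary-head weights, discarded after training
    for _ in range(epochs):
        for x, r, y in samples:
            h = w * x                        # shared feature from the description
            err_main = a * h - y             # main task sees the description only
            err_aux = b * h + c * r - y      # auxiliary task also sees the review
            # Gradients of err_main**2 + aux_weight * err_aux**2
            grad_w = 2 * err_main * a * x + 2 * aux_weight * err_aux * b * x
            grad_a = 2 * err_main * h
            grad_b = 2 * aux_weight * err_aux * h
            grad_c = 2 * aux_weight * err_aux * r
            w -= lr * grad_w
            a -= lr * grad_a
            b -= lr * grad_b
            c -= lr * grad_c
    # Only the main-task parameters are returned; the auxiliary head is dropped.
    return w, a


def predict(params, x):
    """Inference uses the description feature only; no review input is required."""
    w, a = params
    return a * w * x


params = train(make_samples())
```

Because `predict` never touches the auxiliary weights, a product without any reviews can still be scored, while the shared weight `w` has been shaped by both losses during training.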

1.1. Requirements

python3.X
pytorch
numpy
gensim
wikipedia2vec

1.2. Datasets

We provide two datasets, IMDB and filmark.

|--data/
  |--imdb
  |--filmark
  |--sample

The IMDB dataset is in English; the other dataset was crawled from filmark and is in Japanese.

1.3. Pretrained Word Embeddings

Two pre-trained word embeddings are needed: one for English and one for Japanese.

Download

$ cd ./pretrained_embedding
$ bash get_pkl.sh

1.4. Getting Started

Download

$ git clone
$ cd ./

2. Usage

2.1. Train a new model

Parameters

$ cd ./code

$ python main.py -h
usage: Training [-h] [--gpu GPU] [--epoch EPOCH] [--batch BATCH]
                [--sample_size SAMPLE_SIZE] [--lang LANG] [--task TASK]
                [--train TRAIN]

Arguments

optional arguments:
  -h, --help                                       show this help message and exit
  --gpu GPU, -g GPU                                -1=cpu, 0, 1,...=gpu
  --epoch EPOCH, -epoch EPOCH
  --batch BATCH, -batch BATCH                      batch size
  --sample_size SAMPLE_SIZE, -sample SAMPLE_SIZE
  --lang LANG, -lang LANG                          en=English, jp=Japanese
  --task TASK, -task TASK                          reg=Regression, rank=Ranking
  --train TRAIN, -train TRAIN                      path of training data
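
For readers who want to script around these options, the following is a minimal sketch of an argparse parser that accepts the same flags. It is a reconstruction from the help text above, not the code in main.py: the argument types, the default values, and the use of `choices` are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags listed in the help text (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="Training")
    parser.add_argument("--gpu", "-g", type=int, default=-1,
                        help="-1=cpu, 0, 1,...=gpu")
    parser.add_argument("--epoch", "-epoch", type=int, default=10)      # assumed default
    parser.add_argument("--batch", "-batch", type=int, default=64,      # assumed default
                        help="batch size")
    parser.add_argument("--sample_size", "-sample", type=int, default=100)  # assumed default
    parser.add_argument("--lang", "-lang", choices=["en", "jp"], default="en",
                        help="en=English, jp=Japanese")
    parser.add_argument("--task", "-task", choices=["reg", "rank"], default="reg",
                        help="reg=Regression, rank=Ranking")
    parser.add_argument("--train", "-train", type=str, default=None,
                        help="path of training data")
    return parser


# Example: CPU run on the Japanese data with the ranking task.
args = build_parser().parse_args(["-g", "-1", "--lang", "jp", "-task", "rank", "-batch", "32"])
```

With a parser shaped like this, `args.gpu`, `args.lang`, `args.task`, and `args.batch` hold the parsed values, and any flag that is omitted falls back to its (assumed) default.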

cnclabs/codes.review.multitask
