<1> Introduction

code of Camel in WWW2018 paper: Camel: Content-Aware and Meta-path Augmented Metric Learning for Author Identification

Contact: Chuxu Zhang (czhang11@nd.edu)

<2> How to use

(install tensorflow, keras, de-compress word_embedding.txt.zip)

python Camel.py [parameters]

#dataset used this demo corresponds with AMiner-T data (T = 2012) in the paper

<3> Data requirement

paper-author-list-train.txt: author list of paper in training data

paper-author-list-test.txt: author list of paper in test data

paper_author_neg_ids.txt: negative author candidate of test paper for evaluation

metapath_walk.txt: random/metapath walk for indirect relation augmentation

word_embedding.txt: pre-train word embedding of paper abstract

content.pkl: paper abstract content (paper_content, paper_content_id)

<4> use data_process.py to generate related data for Camel

small_data_with_map_id.txt: AMiner-T raw data with new (author, paper, venue) map id

find original data from: https://aminer.org/citation

<5> use word2vec.py to generate word embedding of paper content