FeedForwardBLL

Hacettepe University Computer Science Text Mining Graduate Course Project

Download Links

Python packages

  • nltk
  • google-cloud-translate
  • numpy
  • gensim (word2vec)
  • fasttext
  • pytorch
  • matplotlib
  • scikit-learn

Dataset

Each entry is a pair: [English word, Italian word]

  • Total words: 15278
    • Training set: 11590
    • Validation set: 2939
    • Test set: 749
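A split with these sizes could be produced along the following lines. This is only a minimal sketch: the function name `split_words`, the fixed seed, and the placeholder entries are assumptions for illustration, and the project may have split the data differently.

```python
import random


def split_words(pairs, n_train, n_val, seed=0):
    """Shuffle the pairs and cut them into train/validation/test lists.

    Everything left after the training and validation cuts becomes the test set.
    """
    if n_train + n_val > len(pairs):
        raise ValueError("requested split sizes exceed the dataset size")
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder entries standing in for the 15278 [English, Italian] pairs.
dataset = [(f"en_{i}", f"it_{i}") for i in range(15278)]
train, val, test = split_words(dataset, n_train=11590, n_val=2939)
print(len(train), len(val), len(test))  # 11590 2939 749
```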

We pair each English word with its true translation and with a wrong translation.

  • After filtering out words that are not in the word2vec vocabulary:
    • Training pairs: 20977
    • Validation pairs: 5322
    • Testing pairs: 1381
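The pairing and filtering steps might look roughly like the sketch below. It rests on several assumptions: the wrong translation is drawn at random from the other Italian words, filtering happens before pairing, and the vocabularies are plain Python sets. The name `build_pairs` and the toy data are hypothetical and not taken from the project.

```python
import random


def build_pairs(translations, en_vocab, it_vocab, seed=0):
    """Return (english, italian, label) triples.

    label 1 marks the true translation, label 0 a randomly drawn wrong one.
    """
    rng = random.Random(seed)
    # Keep only entries where both words have an embedding.
    kept = [(en, it) for en, it in translations
            if en in en_vocab and it in it_vocab]
    italian_pool = sorted({it for _, it in kept})
    pairs = []
    for en, it in kept:
        pairs.append((en, it, 1))
        if len(italian_pool) > 1:
            # Resample until the drawn word differs from the true translation.
            wrong = rng.choice(italian_pool)
            while wrong == it:
                wrong = rng.choice(italian_pool)
            pairs.append((en, wrong, 0))
    return pairs


# Toy data: the last entry is dropped because it is in neither vocabulary.
translations = [("dog", "cane"), ("cat", "gatto"), ("house", "casa"), ("zzz", "qqq")]
en_vocab = {"dog", "cat", "house"}
it_vocab = {"cane", "gatto", "casa"}
pairs = build_pairs(translations, en_vocab, it_vocab)
print(len(pairs))  # 6
```

With this scheme each surviving word contributes one positive and one negative pair, which keeps the two classes balanced.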

Experiments

The shallow2-3 model has 258817 parameters; the earlyfusion model has 550593 parameters.

adam/earlyfusion/first (all-bn-relu)

Test accuracy: 86.96596669080377, precision: 0.8593530239099859, recall: 0.8842257597684515, F1: 0.8716119828815977

adam/earlyfusion/lrelu (all-bn)

Test accuracy: 88.26937002172339, precision: 0.8598639455782313, recall: 0.914616497829233, F1: 0.8863955119214586

adam/earlyfusion/no-bn-lrelu

Test accuracy: 90.65894279507603, precision: 0.9194029850746268, recall: 0.8914616497829233, F1: 0.9052167523879501

adam/earlyfusion/no-bn-relu

Test accuracy: 89.93482983345402, precision: 0.896551724137931, recall: 0.9030390738060782, F1: 0.8997837058399423

adam/shallow2-3/all-bn-lrelu

Test accuracy: 83.7074583635047, precision: 0.8218232044198895, recall: 0.8610709117221418, F1: 0.8409893992932863

adam/shallow2-3/all-bn-relu

Test accuracy: 84.79362780593772, precision: 0.8160315374507228, recall: 0.8986975397973951, F1: 0.8553719008264463

adam/shallow2-3/no-bn-relu

Test accuracy: 74.14916727009413, precision: 0.7021791767554479, recall: 0.8393632416787264, F1: 0.7646671061305208

adam/shallow2-3/no-bn-lrelu

Test accuracy: 80.30412744388124, precision: 0.7668789808917198, recall: 0.8712011577424024, F1: 0.8157181571815719
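For reference, the four reported metrics can all be derived from a binary confusion matrix, as in the sketch below. The function name `classification_metrics` is hypothetical, and the confusion counts in the example are not taken from the project: they were inferred as one set of counts over 1381 test pairs that is consistent with the first earlyfusion run, so treat them as illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy (in percent), precision, recall and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = 100.0 * (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Inferred, illustrative counts (tp + fp + fn + tn = 1381 test pairs).
acc, p, r, f1 = classification_metrics(tp=611, fp=100, fn=80, tn=590)
print(f"Test accuracy: {acc:.2f}, precision: {p:.4f}, recall: {r:.4f}, F1: {f1:.4f}")
# Test accuracy: 86.97, precision: 0.8594, recall: 0.8842, F1: 0.8716
```

Note that accuracy is reported as a percentage while precision, recall and F1 are fractions between 0 and 1.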

Todo

  • Tokens will be extracted from the corpora
  • Translation pairs will be generated with Google Translate
  • The data will be split into training, validation and test sets
  • Python word2vec models for Italian (it) and English (en) will be installed
  • A custom PyTorch dataloader will be implemented
  • Models will be trained in PyTorch
  • Evaluation will be implemented