Build a summarization model that combines a transformer with a pointing mechanism.

Primary language: Python. License: MIT.

Pointer_Transformer_Generator (TensorFlow 2.0.0)

For the abstractive summarization task, I wanted to experiment with the transformer model. I reimplemented a transformer (thanks to the TensorFlow transformer tutorial) and added a pointer module. For more information on the pointer-generator network, have a look at this paper: https://arxiv.org/abs/1704.04368

PS: I will soon add a section explaining how the pointer module is integrated into the transformer.
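As a rough illustration of the pointing idea from the paper linked above, here is a minimal NumPy sketch of the final-distribution step for a single decoding step. It is not the code used in this repository: the function name `final_distribution`, its arguments, and the toy values are all assumptions made for the example. It mixes the vocabulary distribution with the attention weights over the source tokens, weighted by a generation probability `p_gen`.

```python
import numpy as np


def final_distribution(p_vocab, attention, source_ids, p_gen, extended_vocab_size):
    """Mix generation and copy distributions for one decoding step (illustrative sketch).

    p_vocab:    (vocab_size,) softmax over the fixed vocabulary
    attention:  (src_len,) attention weights over the source tokens, summing to 1
    source_ids: (src_len,) ids of the source tokens in the extended vocabulary
    p_gen:      scalar in [0, 1], probability of generating rather than copying
    """
    final = np.zeros(extended_vocab_size, dtype=np.float64)
    # Generation part: scaled vocabulary distribution
    final[: p_vocab.shape[0]] = p_gen * p_vocab
    # Copy part: np.add.at accumulates correctly when a source token repeats
    np.add.at(final, source_ids, (1.0 - p_gen) * attention)
    return final


# Toy example: fixed vocabulary of 4 words, one out-of-vocabulary source word (id 4)
p_vocab = np.array([0.1, 0.2, 0.3, 0.4])
attention = np.array([0.5, 0.25, 0.25])
source_ids = np.array([2, 4, 2])
dist = final_distribution(p_vocab, attention, source_ids, p_gen=0.8, extended_vocab_size=5)
print(dist, dist.sum())
```

In this toy case the out-of-vocabulary source word receives probability mass only through the copy term, which is what lets a pointer model reproduce words absent from its vocabulary.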

Follow these steps to launch the project:

Step 1 : The data

Option 1 : Download the data

Download the preprocessed data (chunked files in TFRecord format): https://drive.google.com/open?id=1uHrMWd7Pbs_-DCl0eeMxePbxgmSce5LO

Option 2 : Download raw data and process it

Download the raw data and convert it to TFRecords with this project: https://github.com/steph1793/CNN-DailyMail-Bin-To-TFRecords

Step 2 : Launch the project

python main.py --max_enc_len=400 \
--max_dec_len=100 \
--batch_size=16 \
--vocab_size=50000 \
--num_layers=3 \
--model_depth=512 \
--num_heads=8 \
--dff=2048 \
--seed=123 \
--log_step_count_steps=1 \
--max_steps=230000 \
--mode=train \
--save_summary_steps=10000 \
--checkpoints_save_steps=10000 \
--model_dir=model_folder \
--data_dir=data_folder \
--vocab_path=vocab

PS: Feel free to change the hyperparameters. Run python main.py --help for more details on each of them.
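For readers who want to see how flags like these are typically declared, here is a minimal argparse sketch covering a subset of them. It is a hypothetical illustration, not the actual contents of main.py: the function name `build_parser` is made up, and the defaults are simply copied from the launch command above, so the real defaults may differ.

```python
import argparse


def build_parser():
    """Declare a subset of the flags from the launch command (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Pointer transformer summarizer (sketch)")
    # Defaults below mirror the example command; they are assumptions, not the real defaults.
    parser.add_argument("--max_enc_len", type=int, default=400)
    parser.add_argument("--max_dec_len", type=int, default=100)
    parser.add_argument("--batch_size", type=int, default=16)
    parser.add_argument("--vocab_size", type=int, default=50000)
    parser.add_argument("--num_layers", type=int, default=3)
    parser.add_argument("--model_depth", type=int, default=512)
    parser.add_argument("--num_heads", type=int, default=8)
    parser.add_argument("--dff", type=int, default=2048)
    parser.add_argument("--mode", type=str, default="train")
    parser.add_argument("--model_dir", type=str, default="model_folder")
    return parser


# Override two hyperparameters; the others keep the values declared above.
args = build_parser().parse_args(["--batch_size=32", "--num_layers=4"])
print(args.batch_size, args.num_layers, args.model_depth)
```

Passing an explicit list to `parse_args` makes the parser easy to try out in a notebook or a test without touching the real command line.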

Requirements

  • python >= 3.6
  • tensorflow 2.0.0
  • argparse, os, glob (part of the Python standard library, no installation needed)
  • numpy