MART

Implementation of "MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning"