keras-team/keras-nlp

Data-Parallel Training with KerasNLP and tf.distribute example dataset problem

Opened this issue · 4 comments

Describe the bug
The "Data-Parallel Training with KerasNLP and tf.distribute" example uses a dataset whose download link returns 403: Forbidden with the message "Access Denied."

To Reproduce
Provide a link to a Colab Notebook, which reproduces the bug.

Expected behavior
The dataset should download successfully so that the example runs without an "Access Denied" error.

Additional context

Would you like to help us fix it?

Yes, if I can help in any way, I will be glad to do so!

Hi @sitamgithub-MSIT,

I think the link https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-v1.zip is not working, right? Please feel free to fix the issue if you want to contribute. Thanks!

Yes, the link is not working; it returns "Access Denied." Since this is a problem with the dataset link itself, I don't think I can do much. It would be better to tag the author of the example, who can probably suggest a solution.

@shivance Can you please help with this?

Hi @sitamgithub-MSIT

Thanks for reporting this issue. As discussed here, it seems that the dataset link doesn't work anymore. I think this dataset is also available on Hugging Face: https://huggingface.co/datasets/wikitext
I'll update the Colab with the Hugging Face dataset. The Hugging Face wikitext-2-v1 dataset seems to be in a different format, though, so I'll need to spend some time figuring this out.
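
For reference, a rough sketch of how the format difference might be bridged. This is untested against the actual example and rests on a few assumptions: that the Hugging Face dataset exposes each line of the corpus as a row with a `text` field, that the `wikitext` / `wikitext-2-v1` identifiers from the page linked above still resolve, and that the example only needs plain text lines above some minimum length. The helper names `rows_to_lines` and `load_wikitext_lines` are hypothetical and are not part of KerasNLP or `datasets`.

```python
from typing import Iterable, Iterator, List, Mapping


def rows_to_lines(
    rows: Iterable[Mapping[str, str]], min_chars: int = 0
) -> Iterator[str]:
    """Yield stripped text lines from row dicts, skipping short or empty ones."""
    for row in rows:
        line = row["text"].strip()
        # Keep only lines strictly longer than min_chars characters.
        if len(line) > min_chars:
            yield line


def load_wikitext_lines(split: str = "train", min_chars: int = 100) -> List[str]:
    """Hypothetical loader: fetch one split from Hugging Face as a list of lines."""
    # Imported lazily so rows_to_lines works without the library installed.
    from datasets import load_dataset  # pip install datasets

    # Assumption: dataset and config names as shown on the Hugging Face page.
    ds = load_dataset("wikitext", "wikitext-2-v1", split=split)
    return list(rows_to_lines(ds, min_chars))
```

The resulting list of strings could then be wrapped with something like `tf.data.Dataset.from_tensor_slices(lines).batch(batch_size)` in place of the file-based dataset, though the minimum length of 100 here is only a placeholder and should match whatever the example actually uses.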