Is there any issue with extending context length to 1 million tokens using your script?
vkaul11 opened this issue · 1 comment
vkaul11 commented
Just checking whether there is any reason to restrict use to 128k, or can we use the script for 1 million tokens as well? Secondly, is there a NOTICE file we have to include if we modify the code and use it somewhere?
hsiehjackson commented
You can definitely use our script to test 1 million tokens! A potential problem is that generating the dataset files can be slow. I think we could use a better approach (binary search) to find the size of the distraction, since for now we only increase it linearly in here.
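To make the binary search idea concrete, here is a minimal sketch. It is not the repository's code: `find_max_units`, `toy_token_count`, `FILLER`, and `NEEDLE` are hypothetical names, and a whitespace split stands in for a real tokenizer. It also assumes the token count never decreases as more distraction is added, which is what makes binary search valid.

```python
from typing import Callable

def find_max_units(token_count: Callable[[int], int], max_tokens: int, upper: int) -> int:
    """Return the largest n in [0, upper] with token_count(n) <= max_tokens.

    Assumes token_count is non-decreasing in n. If even n = 0 does not fit,
    0 is still returned, so the caller should check that case separately.
    """
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if token_count(mid) <= max_tokens:
            lo = mid       # mid fits, so the answer is mid or larger
        else:
            hi = mid - 1   # mid is too long, so the answer is smaller
    return lo

# Toy stand-ins (hypothetical): one unit of distraction text and the target text.
FILLER = "The grass is green. The sky is blue."
NEEDLE = "The special magic number is 42."

def toy_token_count(n: int) -> int:
    """Count whitespace-separated 'tokens' in a sample with n filler units."""
    text = " ".join([FILLER] * n + [NEEDLE])
    return len(text.split())

if __name__ == "__main__":
    n = find_max_units(toy_token_count, max_tokens=1000, upper=1000)
    print(n, toy_token_count(n))
```

A linear scan builds and tokenizes about as many candidate samples as there are steps up to the answer, while this needs roughly log2(upper) of them, which matters more as the target length grows toward 1 million tokens. The caller has to supply `upper`; if no safe bound is known, doubling a guess until it no longer fits is one way to find one.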
For the NOTICE file, you can copy the Apache-2.0 LICENSE.