YouTube video walk-through of this codebase
gordicaleksa opened this issue · 2 comments
Hi @kuprel!
First of all, awesome work! You made my job that much easier. :)
I created a YouTube video where I do a deep dive/walk-through of this repo.
Maybe someone will find it useful:
https://youtu.be/x_8uHX5KngE
Hopefully it's OK to share it here in the form of an issue. Do let me know!
Wow, this is great! I just added your video to the readme. You're right, the clamping is unnecessary. It originally served to avoid a cryptic CUDA runtime error. Later I implemented a more precise solution: limiting the BART decoder to 2**14 tokens to match the VQGAN. I'm not sure why there's a mismatch in vocabulary counts. Also, I didn't realize those are shared weights. There's probably a simpler solution here. Great video!
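For anyone reading along, here is a rough sketch of the difference between the two approaches mentioned above. It is not the repo's actual code: the function names (`clamp_token`, `sample_limited`) are made up for illustration, it uses plain Python lists instead of tensors, and it assumes the runtime error came from a sampled token id falling outside the 2**14-entry VQGAN codebook.

```python
import math
import random

# Codebook size mentioned in the thread (2**14 image tokens for the VQGAN).
IMAGE_VOCAB_COUNT = 2 ** 14

def clamp_token(token, vocab_count=IMAGE_VOCAB_COUNT):
    """Workaround: force an already-sampled id back into the valid range."""
    return max(0, min(token, vocab_count - 1))

def sample_limited(logits, vocab_count=IMAGE_VOCAB_COUNT, rng=random):
    """More precise fix: drop the extra logits *before* sampling,
    so an out-of-range id can never be produced in the first place."""
    limited = logits[:vocab_count]
    peak = max(limited)
    # Unnormalized softmax weights; subtracting the peak avoids overflow.
    weights = [math.exp(x - peak) for x in limited]
    return rng.choices(range(len(limited)), weights=weights, k=1)[0]

# Hypothetical decoder output with a few more entries than the codebook has.
logits = [0.0] * (IMAGE_VOCAB_COUNT + 10)
logits[-1] = 100.0  # an out-of-range entry dominates the distribution

token = sample_limited(logits)
print(token < IMAGE_VOCAB_COUNT)
```

With clamping, every out-of-range sample collapses onto the last valid id, which skews the distribution. Slicing the logits first keeps the relative probabilities of the valid ids unchanged.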