document sampling parameters and/or minimal "viable" codegen model?
laurentperez opened this issue · 1 comment
Hello. Your work is great 👍
I wrapped your binary in my bot/API project https://github.com/laurentperez/ava#what-models-or-apis-does-it-support-
I'm mostly interested in code (Python) generation from Bloom as a developer assistant; I'm not using it for creative writing. However, I'm playing with translations to evaluate how the 7b1 model might respond to more complex Python prompts.
I ran inference with the bloomz-1b1, bloomz-3b, and bloomz-7b1 models. So far, the 7b1 model gives the best results, but it's "too creative".
See the example below: the "Me encuentro muy bien" / "me alegro" parts were too creative; they did more than a translation:
```
curl -v -XPOST -H 'Content-Type: application/json' -d '{"msg":"translate \"Hi, how are you?\" in Spanish:"}' http://localhost:8080/hf/bloom

sampling parameters: temp = 0.800000, top_k = 40, top_p = 0.950000, repeat_last_n = 64, repeat_penalty = 1.300000

translate "Hi, how are you?" in Spanish: Me encuentro muy bien. ¿Cómo estas tú? Yo estoy?: me alegro</s> [end of text]
```
- Is the creative answer related to the sampling parameters? If so, how would you change them to make the model more concise and deterministic, i.e. less creative?
- Is 7b1 just "big enough" for code generation? It was trained on https://huggingface.co/datasets/bigscience/xP3. In your opinion, what would be the best Hugging Face model for Python code generation, besides https://huggingface.co/Salesforce/codegen-350M-mono?

Thanks!
- I highly recommend this amazing blog post, https://huggingface.co/blog/how-to-generate, to understand how sampling parameters affect the generated text (see the first sketch after this list)
- AFAIK, the best code model on Hugging Face under 2B parameters is https://huggingface.co/bigcode/santacoder; for larger models, https://huggingface.co/Salesforce/codegen-6B-mono and https://huggingface.co/Salesforce/codegen-16B-mono are the best (see the second sketch after this list)
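
To make the point concrete, here is a minimal sketch of greedy vs. sampled decoding with the transformers `generate()` API that the blog post covers. It runs in Python rather than through the C++ binary; the model name (bigscience/bloomz-560m) is only illustrative, and the sampling values mirror the log above:

```python
# Minimal sketch: comparing deterministic (greedy) and sampled decoding.
# The checkpoint is illustrative; any bloomz checkpoint behaves the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")

inputs = tokenizer('translate "Hi, how are you?" in Spanish:', return_tensors="pt")

# Greedy decoding: do_sample=False always picks the most probable next token,
# so the same prompt yields the same, more literal answer every time.
greedy = model.generate(**inputs, max_new_tokens=32, do_sample=False)

# Sampled decoding with the parameters from the log above: temperature, top_k,
# and top_p widen the pool of candidate tokens, which is what produces the
# "too creative" drift. Lowering temperature (e.g. to 0.2) narrows it again.
creative = model.generate(
    **inputs,
    max_new_tokens=32,
    do_sample=True,
    temperature=0.8,
    top_k=40,
    top_p=0.95,
    repetition_penalty=1.3,
)

print(tokenizer.decode(greedy[0], skip_special_tokens=True))
print(tokenizer.decode(creative[0], skip_special_tokens=True))
```

So for a developer-assist use case, the short answer is: lower the temperature, or switch to greedy decoding entirely, before reaching for a different model.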
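
And a second sketch, a hypothetical usage example rather than an official snippet, for trying santacoder on a Python completion. Note that `trust_remote_code=True` is required because the checkpoint ships a custom model class:

```python
# Minimal sketch: deterministic Python code completion with bigcode/santacoder.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/santacoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")

# Greedy decoding keeps completions reproducible, which usually suits a
# code assistant better than creative sampling.
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```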