I just made a version of this that actually works. I made this repo when I was screwing around; now I have an actual model (a Llama 2 merge) finetuned on a dataset formatted with a roleplay prompt, so it produces sensible outputs.
Do not judge me by this thing here.
I'll release the quant and code for the better version when I have time, hopefully in about a week.