An easy-to-deploy/run kudos model training script
Opened this issue · 2 comments
Jug did the initial training of the model. Not having looked into it, and simply not knowing the answers at this time, I wonder:
- Would the process work out of the box, as is, with little setup/config (and therefore be easy to update incrementally when performance-altering patches are released)?
- Would it be possible to trivially add input variables (such as the proposed `n_iter`)?
- Could we somehow reconcile results from a baseline machine (I believe the initial model used in production was trained on a 3080) with results from machines on the extremes (i.e., low-end cards and high-end cards), for the purposes of drawing better kudos distribution curves?
- Can the process be easily adapted to text gen?
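On the `n_iter` question: if the training script fits something like a regression from payload fields to observed job time, adding an input variable should mostly be a matter of appending one more feature column and re-recording data. A minimal sketch (the feature names and the `seconds` field are my assumptions, not the actual script's schema):

```python
import numpy as np

# Hypothetical feature set for the kudos time model; adding a new input
# variable such as "n_iter" is just one more entry in this list (plus
# recording it in the training data).
FEATURES = ["width", "height", "steps", "n_iter"]

def fit_time_model(rows):
    """Fit a simple least-squares model: job seconds ~ linear in FEATURES.

    `rows` is a list of dicts containing the FEATURES keys plus "seconds".
    """
    # Design matrix with a trailing 1.0 column for the intercept.
    X = np.array([[r[f] for f in FEATURES] + [1.0] for r in rows])
    y = np.array([r["seconds"] for r in rows])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_seconds(coef, payload):
    """Predict job time for a payload using the fitted coefficients."""
    x = np.array([payload[f] for f in FEATURES] + [1.0])
    return float(x @ coef)
```

The real model may well be nonlinear (e.g. gradient boosting), but the point stands either way: new variables slot in as extra columns, at the cost of needing fresh recordings that include them.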
It can only be updated using the same variables — so basically the same PC and the same worker. To update the model, one of us would effectively have to start recording values from scratch.
PS: This time I would also like to capture VRAM requirements per payload, and have the model calculate how much VRAM each type of request would need, so that number can be used in the kudos calculation as well.
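Capturing peak VRAM per payload could be as simple as logging one extra number next to the timing fields, then keying estimates off the payload "shape". A rough sketch of what that recorder could look like (the field names and the bucketing by width/height/`n_iter` are assumptions for illustration; the actual peak-VRAM value would come from the worker, e.g. via NVML or the inference framework's own memory stats):

```python
from collections import defaultdict

class VramEstimator:
    """Record observed peak VRAM (MiB) per payload and estimate the
    requirement for future requests of the same shape."""

    def __init__(self):
        self._peaks = defaultdict(list)

    @staticmethod
    def _key(payload):
        # Bucket requests by the fields that plausibly drive VRAM use.
        return (payload["width"], payload["height"], payload.get("n_iter", 1))

    def record(self, payload, peak_vram_mib):
        """Log the peak VRAM observed while serving this payload."""
        self._peaks[self._key(payload)].append(peak_vram_mib)

    def estimate(self, payload):
        """Worst observed peak for this shape, or None if never seen."""
        peaks = self._peaks.get(self._key(payload))
        return max(peaks) if peaks else None
```

Taking the worst observed peak per shape is deliberately conservative; a fitted model over resolution/batch size would generalize to unseen shapes, at the cost of needing the same from-scratch recording run mentioned above.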
I agree; I was thinking of this when I mentioned reconciling high-end/low-end results. It would be straightforward to identify the typical rollover thresholds at which VRAM is saturated and spills into regular RAM (which is characterized by a massive slowdown).
The API could also have optional (opt-in only) reports for this, covering VRAM and card generation?