tweaks to data collection
willwade opened this issue · 3 comments
- Can we change the way a user gets a new uniqueID on changing settings? i.e. if a user changes settings, just keep their ID the same (since we can easily separate sessions of one user with different settings in the analytics stage).
- Somehow get better data per letter. It would be very useful to record the number of right and wrong choices for each letter in the JSON progress field. Maybe we keep the score and add a different string per letter, or a list per letter.
1 - If we didn't give people a new ID it would mess up the aggregate table: if a user changes settings after completing the whole thing, it would appear as if they had completed the whole thing with the new settings, which would be inaccurate. Maybe we don't need to maintain an aggregate table and all the calculations can be done at the analytics stage? Alternatively, if we want to keep the aggregate table, we could keep the ID switching process and add an additional table that tracks ID changes, i.e.:
Old ID | New ID |
---|---|
1 | 2 |
3 | 4 |
With this setup you can basically just count how many entries there are to see how many times settings have been changed. This way we can still maintain an accurate aggregate table.
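As a rough sketch of how the analytics stage could use such a table (the function name and the in-memory list of pairs are hypothetical, not part of the existing code), the ID changes can be counted and also followed back to the first ID each user had. It assumes every new ID is fresh, so chains never loop:

```python
def resolve_original_ids(id_changes):
    """Map every ID to the first ID in its chain of settings changes.

    id_changes is a list of (old_id, new_id) pairs, one per row of the
    hypothetical ID-change table. Assumes new IDs are never reused, so
    following old IDs backwards always terminates.
    """
    # For each new ID, remember which ID it replaced.
    previous = {new_id: old_id for old_id, new_id in id_changes}

    def root(user_id):
        # Walk back through the chain until we reach an ID that
        # never appeared as a "new" ID.
        while user_id in previous:
            user_id = previous[user_id]
        return user_id

    all_ids = {user_id for pair in id_changes for user_id in pair}
    return {user_id: root(user_id) for user_id in all_ids}


id_changes = [(1, 2), (3, 4)]          # the example rows from the table above
settings_changes = len(id_changes)     # one row per settings change
original_ids = resolve_original_ids(id_changes)
```

Grouping sessions by the resolved original ID would let the analytics treat IDs 1 and 2 as the same person while still seeing that the settings changed.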
- Seems possible. Right now it looks like this:
{
a: 1,
b: 3,
c: 2
}
Probably wouldn't be hard to switch to:
{
a: {
score: 0,
correctGueses: 4,
wrongGueses: 2
},
b: {
score: 3,
correctGueses: 3,
wrongGueses: 0
},
c: {
score: 2,
correctGueses: 4,
wrongGueses: 1
}
}
It would be a fairly easy change to make.
However, I have a few questions about how you are doing your parsing. Right now we just dump the progress as a JSON string. If I changed the shape of the JSON as suggested, you would have to make your code handle both shapes being in the database. Alternatively, if you don't want to do that, I could add an extra column to the DB to collect this data?
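For reference, handling both shapes on the parsing side might look something like the sketch below. The function name is made up for illustration, the field names are copied as spelled in the example above, and it assumes the stored string is valid JSON with one entry per letter:

```python
import json


def normalise_progress(raw):
    """Parse a progress JSON string in either shape into the detailed shape.

    Old shape: {"a": 1}  (letter -> score)
    New shape: {"a": {"score": 0, "correctGueses": 4, "wrongGueses": 2}}
    Old-shape rows have no guess counts, so those fields come back as None.
    """
    data = json.loads(raw)
    normalised = {}
    for letter, value in data.items():
        if isinstance(value, dict):
            # New shape: copy the known fields, tolerating missing ones.
            normalised[letter] = {
                "score": value.get("score"),
                "correctGueses": value.get("correctGueses"),
                "wrongGueses": value.get("wrongGueses"),
            }
        else:
            # Old shape: the value is just the score.
            normalised[letter] = {
                "score": value,
                "correctGueses": None,
                "wrongGueses": None,
            }
    return normalised
```

The `None` values keep "no data recorded" distinct from "zero guesses", which matters when averaging later.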
We don't need to maintain the aggregate table. We can create aggregated data pretty easily now...
About parsing: it won't be hard to parse it anyway, but if it was in an extra column, that would be easier for sure!
Cool, I'll just stop putting anything into the aggregate table. This means the Grafana dashboards will stop updating, but that's not the end of the world, just something to bear in mind.
I'll keep the data in `progressDump` in the exact same shape as it currently is, so as not to mess with any work already done. Then I'll add a new column called `progressDumpFull` that will hold the new shape of data, and `null` for any rows that were inserted previously, so you can just ignore the nulls.
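A minimal sketch of what ignoring the nulls could look like at the analytics stage, assuming each row arrives as a dict keyed by column name and that each row's `progressDumpFull` counts can simply be summed (both are assumptions; the function name is hypothetical):

```python
import json


def letter_accuracy(rows):
    """Return the fraction of correct guesses per letter.

    Rows whose progressDumpFull is None (inserted before the column
    existed) are skipped. Letters with no guesses at all are left out.
    """
    totals = {}  # letter -> (correct, wrong)
    for row in rows:
        full = row.get("progressDumpFull")
        if full is None:
            continue  # older row, only has the score-only progressDump
        for letter, stats in json.loads(full).items():
            correct, wrong = totals.get(letter, (0, 0))
            totals[letter] = (
                correct + stats["correctGueses"],
                wrong + stats["wrongGueses"],
            )
    return {
        letter: correct / (correct + wrong)
        for letter, (correct, wrong) in totals.items()
        if correct + wrong > 0
    }
```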
Tried to be a bit lazy and avoid a DB schema migration by storing JSON data as strings, but looks like that's come back to bite me. Lesson learned.