Split select multiple column order is not stable when choice list is randomized
chrissyhroberts opened this issue · 8 comments
Software versions
Briefcase v1.17.0 - v1.17.3
openjdk version "13.0.2" 2020-01-14
OpenJDK Runtime Environment (build 13.0.2+8)
OpenJDK 64-Bit Server VM (build 13.0.2+8, mixed mode, sharing)
OS X 10.15
Aggregate v2.0.3
Problem description
Using the -ssm option together with -smart-append does not work.
Each time -ssm parses a select-multiple question, the column order of the output changes.
I ran three exports:
Export 1
rumours.trust_rumours_source.internet
rumours.trust_rumours_source.facebook_priv
rumours.trust_rumours_source.other
rumours.trust_rumours_source.friends
rumours.trust_rumours_source.radio
rumours.trust_rumours_source.twitter
rumours.trust_rumours_source.family
rumours.trust_rumours_source.facebook_pub
rumours.trust_rumours_source.coworkers
rumours.trust_rumours_source.church
rumours.trust_rumours_source.tv_news
rumours.trust_rumours_source.teachers
rumours.trust_rumours_source.whatsapp
rumours.trust_rumours_source.school
rumours.trust_rumours_source.papers
rumours.trust_rumours_source.congregation
Export 2
rumours-trust_rumours_source.twitter
rumours-trust_rumours_source.whatsapp
rumours-trust_rumours_source.congregation
rumours-trust_rumours_source.internet
rumours-trust_rumours_source.family
rumours-trust_rumours_source.coworkers
rumours-trust_rumours_source.church
rumours-trust_rumours_source.radio
rumours-trust_rumours_source.friends
rumours-trust_rumours_source.papers
rumours-trust_rumours_source.tv_news
rumours-trust_rumours_source.other
rumours-trust_rumours_source.teachers
rumours-trust_rumours_source.facebook_priv
rumours-trust_rumours_source.school
rumours-trust_rumours_source.facebook_pub
Export 3
rumours-trust_rumours_source.facebook_pub
rumours-trust_rumours_source.other
rumours-trust_rumours_source.congregation
rumours-trust_rumours_source.family
rumours-trust_rumours_source.friends
rumours-trust_rumours_source.papers
rumours-trust_rumours_source.twitter
rumours-trust_rumours_source.whatsapp
rumours-trust_rumours_source.teachers
rumours-trust_rumours_source.facebook_priv
rumours-trust_rumours_source.church
rumours-trust_rumours_source.coworkers
rumours-trust_rumours_source.tv_news
rumours-trust_rumours_source.radio
rumours-trust_rumours_source.school
rumours-trust_rumours_source.internet
and ended up with garbage data in the incrementally saved CSV.
For now the fix is to not use -ssm and -smart-append together.
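To illustrate why the appended CSV ends up with garbage data, here is a minimal toy model, not Briefcase's actual export code. It assumes that split select multiples are written as one 1/0 cell per choice and that an appending run reuses the header written by the first run. The choice lists and the names `split_row` and `simulate_append` are made up for the illustration.

```python
import csv
import io

# Same three choices, seen in two different orders by two export runs.
CHOICES_RUN_1 = ["internet", "facebook_priv", "other"]
CHOICES_RUN_2 = ["other", "internet", "facebook_priv"]


def split_row(selected, choice_order):
    """Return one 1/0 cell per choice, laid out in the given column order."""
    return [1 if choice in selected else 0 for choice in choice_order]


def simulate_append():
    """Write a header and one row, then append a row built with a reshuffled order."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # First run: header and first submission agree on the column order.
    writer.writerow(CHOICES_RUN_1)
    writer.writerow(split_row({"internet"}, CHOICES_RUN_1))
    # Appending run: the header is kept, but the cells follow the new order.
    writer.writerow(split_row({"internet"}, CHOICES_RUN_2))
    return out.getvalue()


if __name__ == "__main__":
    print(simulate_append())
```

Both submissions selected only `internet`, but under the original header the appended row reads as if `facebook_priv` had been selected.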
Steps to reproduce the problem
Run -ssm and -smart-append together.
This is possibly triggered by an update to the form definition with a version update.
I first saw the odd behaviour after updating the form.
Expected behavior
No reshuffling of the -ssm data columns between exports. Observed instead: the columns are reshuffled, possibly only after the form is updated.
I'm really glad you caught that, @chrissyhroberts. Thanks for the detailed report.
With @ggalmazor no longer focused on ODK we have less capacity on Briefcase but will try to get this resolved ASAP.
"this is possibly triggered by update to form definition with version update."
Hi @chrissyhroberts. I’m trying to reproduce the problem. Is this a necessary step?
Thanks for looking into it, @dcbriccetti! I'm guessing not but it's something to verify. I would start with a form with just a single select question with a few choices. I'd make a few submissions, run an export, add a few more submissions, run an export and do that until it repros or up to ~5 times. If it's not reproing then I'd try to update the form definition in some way.
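When repeating the export step, one way to tell whether the problem has reproduced is to compare the split select multiple columns in the header of each export. The sketch below assumes each export was written to its own file (not appended) and that the split columns share a common prefix; `ssm_columns` and `same_order` are hypothetical helper names, and the sample headers are invented.

```python
import csv
import io


def ssm_columns(csv_text, prefix):
    """Return the header columns that start with `prefix`, in file order."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [col for col in header if col.startswith(prefix)]


def same_order(csv_a, csv_b, prefix):
    """True if both exports list the prefixed columns in the same order."""
    return ssm_columns(csv_a, prefix) == ssm_columns(csv_b, prefix)


if __name__ == "__main__":
    export_a = "id,q.twitter,q.whatsapp\n1,0,1\n"
    export_b = "id,q.whatsapp,q.twitter\n2,1,0\n"
    print(same_order(export_a, export_b, "q."))
```

For real files, the contents can be read with `open(path, newline="").read()` before being passed in.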
@dcbriccetti I'm also in Slack if you want to discuss. Another approach might be to try to write a test for the case without reproducing with a real setup. I have some time this afternoon, so let me know if you want to collaborate.
@chrissyhroberts We've tried a number of things to try to reproduce or track down with the information provided but haven't been able to yet. Can you please share the form? You could email it to me if it's sensitive. Alternately you could share a form with just that question, any groups or repeats it's in, and its choices.
We've tried a very simple select multiple, one with the choices you've provided, updating the form version, using a choice filter. We've also audited the code. In our scenarios the choice order has been stable and based on the form definition order of the choices.
I have confirmed that if you change the order of the choices in your form update, you will change the order in the export as well. Could this have happened? Were those three runs back to back with no pull in between?
Thanks for sending the form definition, @chrissyhroberts. I haven't reproduced it, but from looking at the definition, I think it's almost certainly because of choice order randomization. When Briefcase loads the form definition, it loads it the same way that a form-filling client does, so the choice order is randomized. The export order is based on the loaded form definition, so that order is no longer stable.
To support this, we'd need to either go back to the original form definition XML or intercept the choice order before JavaRosa randomizes it.
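As a rough illustration of the first option, the choice order can be read straight from the form XML in document order, which does not change between runs. This is only a sketch in Python, not the Briefcase or JavaRosa implementation. It assumes the choices live in a secondary instance made of `item` elements with a `name` child, and the sample form and the instance id `letters` are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal form with choices stored in a secondary instance.
SAMPLE_FORM = """<h:html xmlns="http://www.w3.org/2002/xforms"
        xmlns:h="http://www.w3.org/1999/xhtml">
  <h:head><model>
    <instance id="letters"><root>
      <item><name>a</name><label>A</label></item>
      <item><name>b</name><label>B</label></item>
      <item><name>c</name><label>C</label></item>
    </root></instance>
  </model></h:head>
</h:html>"""


def local(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def choice_order(xform_xml, instance_id):
    """Return choice names in document order for the given secondary instance."""
    root = ET.fromstring(xform_xml)
    for el in root.iter():
        if local(el.tag) == "instance" and el.get("id") == instance_id:
            names = []
            for item in el.iter():
                if local(item.tag) != "item":
                    continue
                for child in item:
                    if local(child.tag) == "name":
                        names.append((child.text or "").strip())
            return names
    return []  # no instance with that id


if __name__ == "__main__":
    print(choice_order(SAMPLE_FORM, "letters"))
```

Because the order comes from the XML text itself, repeated calls return the same list regardless of any randomization applied when the form is loaded for filling.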
I have reproduced using this simple form with randomized select choices. I sent in a few submissions where I selected only 'a', did an export. Then I made another submission selecting only 'a', did an export with -ssm and -sa and the choice was marked in a different column.
java -jar /Users/ln/Downloads/ODK-Briefcase-v1.17.3.jar -U https://sandbox.aggregate.getodk.org -plla -id briefcase-ssm-order -sd /Users/ln/Documents/projects/odk
java -jar /Users/ln/Downloads/ODK-Briefcase-v1.17.3.jar -e -ed /Users/ln/Downloads -f rand-order.csv -id briefcase-ssm-order -sd /Users/ln/Documents/projects/odk -ssm -sa
@chrissyhroberts The fix is in the just-released Briefcase v1.17.4.