No way to figure out read orientation used after executing an action

Question

Opened this issue 7 years ago · 5 comments

Current Behavior
From the help text in classify-sklearn:

  --p-read-orientation [reverse-complement|same]
                                  [optional]
                                  Direction of reads with respect
                                  to reference sequences. same will cause
                                  reads to be classified unchanged; reverse-
                                  complement will cause reads to be reversed
                                  and complemented prior to classification.
                                  Default is to autodetect based on the
                                  confidence estimates for the first 100
                                  reads.

When this parameter is not specified, there doesn't seem to be a way to figure out what orientation was used. For example, when I look at the provenance information of this artifact, this parameter is listed as "null":

Answer 1 · 2017-09-06T18:12:46.000Z

Perhaps having a separate action that guesses the read orientation for you would solve the issue? I'm not sure how to record this in provenance. Or would it work to have classify-sklearn print what the guessed read orientation is (e.g. so it's available in the debug logs)?
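
To make the "print the guess" option concrete, here is a minimal sketch of what it could look like. It is not the plugin's actual implementation: `guess_orientation`, `reverse_complement`, and the `confidence` callable are hypothetical names, and comparing mean confidence over the first 100 reads is an assumption based only on the help text quoted above.

```python
from statistics import mean
from typing import Callable, Sequence

# Translation table for complementing DNA bases (hypothetical helper).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def guess_orientation(
    reads: Sequence[str],
    confidence: Callable[[str], float],
    sample_size: int = 100,
) -> str:
    """Guess the read orientation from the first `sample_size` reads.

    `confidence` stands in for whatever per-read confidence estimate the
    classifier provides; it is an assumption of this sketch.
    """
    sample = list(reads[:sample_size])
    if not sample:
        raise ValueError("need at least one read to guess the orientation")

    # Score the sample as-is and after reverse-complementing each read.
    same_score = mean(confidence(read) for read in sample)
    revcomp_score = mean(confidence(reverse_complement(read)) for read in sample)

    orientation = "same" if same_score >= revcomp_score else "reverse-complement"

    # Printing the guess makes it visible in whatever captures stdout.
    print(f"Guessed read orientation: {orientation}")
    return orientation
```

With something like this, the guessed value would at least show up in the output of a verbose run, even while provenance still records the parameter as null.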

Answer 2 · 2017-09-06T18:23:06.000Z

I would say it makes more sense to make the provenance indicate that the parameter was "guessing", and also print the information as part of the debug logs. It could also be included as part of the results, as a new column (maybe read orientation).

Answer 3 · 2017-09-06T21:09:29.000Z

Just my two cents: `read_orientation: null` already means that it’s guessing, but perhaps we could include “guess” as the default parameter option for greater readability. Printing the guessed read orientation probably makes a bit more sense. We could include it as another column but it would be the same for every row.
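
As a rough illustration of the "guess" idea, the parameter could be normalized so that an unset value is shown as an explicit label. This is only a sketch: `normalize_read_orientation` is a hypothetical helper, not part of the plugin, and the accepted values are taken from the help text in the issue.

```python
from typing import Optional

# Orientation values listed in the classify-sklearn help text.
EXPLICIT_ORIENTATIONS = ("same", "reverse-complement")


def normalize_read_orientation(read_orientation: Optional[str]) -> str:
    """Map the raw parameter value to a readable label.

    None (what provenance currently records as null) and "guess" both
    mean that the orientation is autodetected.
    """
    if read_orientation is None or read_orientation == "guess":
        return "guess"
    if read_orientation in EXPLICIT_ORIENTATIONS:
        return read_orientation
    raise ValueError(f"unknown read orientation: {read_orientation!r}")
```

This only changes how the setting reads; it still does not tell you which orientation the guess produced.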

Answer 4 · 2017-09-06T21:30:06.000Z

It could also be included as part of the column name: instead of Confidence, the column would be called "Confidence (reverse complemented sequences)" or something like that.

Answer 5 · 2017-09-06T21:46:08.000Z

I'd prefer printing the guessed read orientation while the method is executing, and possibly updating the default parameter value to 'guess' (though None seems fine, since the behavior is documented in the command's help text).

I'm less excited about renaming the Confidence column to indicate read orientation because confidence isn't associated with read orientation, it's the confidence of the classification for each feature. The guessed read orientation seems more informational in nature -- one day provenance will capture logging info, print statements, warnings, etc. that are emitted by actions, so perhaps printing the guessed orientation is a quick & easy way to go for now, and will be future-proof when provenance supports logging 🚀 🌞