Typo in README for single-end example CSV

Question

Typo in README for single-end example CSV

hoelzer opened this issue 3 years ago · 4 comments

I think there is something wrong. The README states:

Source labels are optional - the header is still required, the value can be empty as in the single-end example above.

and the example shows:

Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,,mock,A,0
mock_rep2,/path/to/reads/mock2.fastq.gz,,mock,B,0
mock_rep3,/path/to/reads/mock3.fastq.gz,,mock,C,0
treated_rep1,/path/to/reads/treat1.fastq.gz,,treated,A,0
treated_rep2,/path/to/reads/treat2.fastq.gz,,treated,B,0
treated_rep3,/path/to/reads/treat3.fastq.gz,,treated,C,0

but should this not be

Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,mock,,0
mock_rep2,/path/to/reads/mock2.fastq.gz,mock,,0
mock_rep3,/path/to/reads/mock3.fastq.gz,mock,,0
treated_rep1,/path/to/reads/treat1.fastq.gz,treated,,0
treated_rep2,/path/to/reads/treat2.fastq.gz,treated,,0
treated_rep3,/path/to/reads/treat3.fastq.gz,treated,,0

??

Answer 1 · 2022-04-01T15:59:16.000Z

another typo:

Genomes and annotation can also be specified via --genome and --annotaion, see here.

Answer 2 · 2022-04-01T16:31:06.000Z

... and in this context we could add some info on how multi-mapped reads are handled. By default, we only count uniquely mapped reads via featureCounts, but the user can adjust this via

--featurecounts_additional_params '-t exon -g gene_id -M'

Here, it's important to also provide the -t and -g parameters, because providing only -M would overwrite them. That would basically be fine, because they are the featureCounts defaults anyway, but it is better to pass them together with -M to keep control over which features are actually counted and accumulated.
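To illustrate the overwrite behaviour described above, here is a minimal Python sketch. It is not the pipeline's code: `ASSUMED_DEFAULT` and `effective_params` are hypothetical names, and the default string is only inferred from this comment, assuming the user-supplied value replaces the default rather than being merged with it.

```python
import shlex

# Assumption: the pipeline's default extra parameters, inferred from the
# comment above; this string is not copied from the pipeline source.
ASSUMED_DEFAULT = "-t exon -g gene_id"


def effective_params(user_value=None):
    """Return the extra featureCounts arguments as a list.

    A user-supplied string replaces the assumed default completely;
    nothing is merged.
    """
    chosen = ASSUMED_DEFAULT if user_value is None else user_value
    return shlex.split(chosen)


# Passing only -M drops the explicit -t and -g from the argument list.
only_m = effective_params("-M")

# Passing all three keeps the feature type and attribute explicit.
full = effective_params("-t exon -g gene_id -M")

print(only_m)
print(full)
```

With only `-M`, featureCounts would fall back to its own built-in defaults for `-t` and `-g`, which is why the result is the same in practice but less explicit.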

Answer 3 · 2022-04-19T08:11:26.000Z

I think there is something wrong. The README states:

Source labels are optional - the header is still required, the value can be empty as in the single-end example above.

and the example shows:

Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,,mock,A,0
mock_rep2,/path/to/reads/mock2.fastq.gz,,mock,B,0
mock_rep3,/path/to/reads/mock3.fastq.gz,,mock,C,0
treated_rep1,/path/to/reads/treat1.fastq.gz,,treated,A,0
treated_rep2,/path/to/reads/treat2.fastq.gz,,treated,B,0
treated_rep3,/path/to/reads/treat3.fastq.gz,,treated,C,0

but should this not be

Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,mock,,0
mock_rep2,/path/to/reads/mock2.fastq.gz,mock,,0
mock_rep3,/path/to/reads/mock3.fastq.gz,mock,,0
treated_rep1,/path/to/reads/treat1.fastq.gz,treated,,0
treated_rep2,/path/to/reads/treat2.fastq.gz,treated,,0
treated_rep3,/path/to/reads/treat3.fastq.gz,treated,,0

??

Am I understanding you right that you want the source labels to be removed from the CSV example because they are optional? Because then we could also remove the strandedness mode, which is also (kind of) optional, right? Well, at least the pipeline will start without the parameter being set in either the CSV or on the command line.

Answer 4 · 2022-04-24T14:52:18.000Z

@fischer-hub No, the labels should stay. The thing is that in the README we explain that the Source header is needed but its value can be empty, and then we refer to the single-end example:

Source labels are optional - the header is still required, the value can be empty as in the single-end example above.

But then the example shows non-empty values in the Source column. So I simply suggest changing

Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,,mock,A,0
mock_rep2,/path/to/reads/mock2.fastq.gz,,mock,B,0
mock_rep3,/path/to/reads/mock3.fastq.gz,,mock,C,0
treated_rep1,/path/to/reads/treat1.fastq.gz,,treated,A,0
treated_rep2,/path/to/reads/treat2.fastq.gz,,treated,B,0
treated_rep3,/path/to/reads/treat3.fastq.gz,,treated,C,0

into

Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,,mock,,0
mock_rep2,/path/to/reads/mock2.fastq.gz,,mock,,0
mock_rep3,/path/to/reads/mock3.fastq.gz,,mock,,0
treated_rep1,/path/to/reads/treat1.fastq.gz,,treated,,0
treated_rep2,/path/to/reads/treat2.fastq.gz,,treated,,0
treated_rep3,/path/to/reads/treat3.fastq.gz,,treated,,0

However, the point you raise about strandedness is also important and is already discussed in #176. Let's keep the discussion about strandedness in that separate issue.
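As a quick way to sanity-check the layouts discussed in this thread, here is a small Python sketch. It is not part of the pipeline: `check_rows` is a hypothetical helper, and the only assumption is that every data row must have exactly as many fields as the header. It shows that the corrected single-end example keeps six fields per row with empty R2 and Source values, while a row that drops a comma ends up with five.

```python
import csv
import io

# Corrected single-end layout from this thread (shortened to two samples).
SAMPLESHEET = """\
Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,,mock,,0
treated_rep1,/path/to/reads/treat1.fastq.gz,,treated,,0
"""

# A row with one comma too few, as in the first suggestion in this issue.
SHORT_ROW_SHEET = """\
Sample,R1,R2,Condition,Source,Strandedness
mock_rep1,/path/to/reads/mock1.fastq.gz,mock,,0
"""


def check_rows(text):
    """Return (line_number, field_count) for rows whose length differs from the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    problems = []
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append((line_no, len(row)))
    return problems


good_problems = check_rows(SAMPLESHEET)
bad_problems = check_rows(SHORT_ROW_SHEET)

# Read the corrected sheet by column name to confirm which values are empty.
rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))

print(good_problems, bad_problems)
print(repr(rows[0]["R2"]), repr(rows[0]["Source"]))
```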