Not summarized values across conditions
elenichri opened this issue · 3 comments
@jleechung @jonathangoeke
I got past the error (see issue #30) after talking to the server admin. I now run proActiv 1.0.0, which is the version compatible with Bioconductor 3.12 (the one the server is running). The problem is that I now get a warning when the proActiv function finishes computing the result:
... ... ... ... ... ...
Calculating junction counts
Calculating normalized read counts...
converting counts to integer mode
Calculating log2 absolute promoter activity...
Calculating gene expression...
Calculating relative promoter activity...
Calculating positions of promoters...
Warning message:
In proActiv(files = paste0("~/Alternative_Promoters/GBM_junction_TCGA/", :
Condition argument is invalid.
Please ensure a 1-1 map between each condition and each file.
Returning results not summarized across conditions.
Do you have any idea what the problem might be? My files are all unique, so I cannot understand why I get the "not 1-1 map" warning. The gene expression ends up in the metadata slot rather than among the assays:
show(result)
class: SummarizedExperiment
dim: 98780 155
metadata(1): geneExpression
assays(4): promoterCounts normalizedPromoterCounts absolutePromoterActivity relativePromoterActivity
rownames(98780): 1 2 ... 122631 122633
rowData names(8): promoterId geneId ... promoterPosition txId
colnames(155): X00b3b72b.0ac4.4016.bfa2.0bfcd5645d4f_junctions X03ea882b.fd10.4f19.8760.e32c763d31da_junctions ...
fc4c962b.1149.4569.a6a7.8082e32ce505_junctions fe10e7c4.6e22.4b9d.a658.ac316086125d_junctions
colData names(0):
In addition, rowData(result) does not return the summarized expression across conditions, as it should according to the vignette.
rowData(result)
DataFrame with 98780 rows and 8 columns
promoterId geneId seqnames start strand internalPromoter promoterPosition
<integer> <character> <factor> <integer> <factor> <logical> <integer>
1 1 ENSG00000000003.15 chrX 100637104 - TRUE 2
2 2 ENSG00000000003.15 chrX 100639991 - FALSE 1
3 3 ENSG00000000005.6 chrX 100584936 + FALSE 1
4 4 ENSG00000000005.6 chrX 100593624 + TRUE 2
5 5 ENSG00000000419.12 chr20 50945861 - TRUE 2
... ... ... ... ... ... ... ...
122628 122628 ENSG00000288597.1 chrX 103919047 - FALSE 1
122629 122629 ENSG00000288598.1 chr13 41517168 + FALSE 1
122630 122630 ENSG00000288598.1 chr13 41547948 + FALSE 2
122631 122631 ENSG00000288600.1 chr13 41488357 + FALSE 1
122633 122633 ENSG00000288602.1 chr8 66667596 + FALSE 1
txId
<list>
1 ENST00000373020.9,ENST00000612152.4,ENST00000614008.4,...
2 ENST00000373020.9,ENST00000612152.4,ENST00000614008.4,...
3 ENST00000373031.5,ENST00000485971.1
4 ENST00000373031.5,ENST00000485971.1
5 ENST00000371588.9,ENST00000466152.5,ENST00000371582.8,...
... ...
122628 ENST00000674469.1,ENST00000674236.1,ENST00000674363.1,...
122629 ENST00000674506.1,ENST00000674416.1,ENST00000674320.1
122630 ENST00000674506.1,ENST00000674416.1,ENST00000674320.1
122631 ENST00000674216.1
122633 ENST00000520044.5,ENST00000519289.
Note that I use TopHat2 SJ files. A colleague of mine ran proActiv successfully on these files with Bioconductor v3.11.1 and proActiv v0.1.0.
Hi @elenichri, the warning indicates that the condition vector you pass in has a different length from the number of input files. Can you check whether they are the same length?
proActiv does not summarize results by condition if the length of the condition vector and the length of the input files are different - there needs to be a 1 to 1 correspondence between your condition vector and input files.
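A quick way to catch this before running proActiv is to compare the two lengths yourself. The snippet below is only a minimal sketch in base R: the file names, the condition labels, and the helper `check_condition_map` are made up for illustration and are not part of proActiv.

```r
# Hypothetical inputs: replace with your own junction files and condition labels.
files <- c("sample1_junctions.bed", "sample2_junctions.bed", "sample3_junctions.bed")
condition <- c("tumour", "tumour", "normal")

# Illustrative helper (not a proActiv function): TRUE when there is
# exactly one condition label per input file.
check_condition_map <- function(files, condition) {
  length(files) == length(condition)
}

check_condition_map(files, condition)      # TRUE: 3 files, 3 labels
check_condition_map(files, condition[-1])  # FALSE: 3 files, 2 labels
```

If the check returns FALSE, comparing `length(files)` with `length(condition)` shows how many labels are missing or extra.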
Thank you very much @jleechung. My vectors had different lengths. I fixed them and now it works. It was so simple... I somehow missed it.
Glad it works!