seekr_domain_pearson percentile function overwrites data in rows with identical names
Leaving a note for a future fix: the percentile output of seekr_domain_pearson overwrites data in rows with identical names; the r value output does not overwrite data in rows with identical names.
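For anyone picking this up later, here is a minimal sketch of one way identical row names can lead to this kind of overwrite. It is an assumption about the general mechanism, not a reading of seekr's source: the names `by_name` and `by_position` and the percentile values are made up for illustration.

```python
# Hypothetical illustration only; this is NOT seekr's implementation.
# Three rows, two of which share a name (values are invented).
names = ["5p_tarSL_1_407_407", "5p_tarSL_1_407_407", "full_length_1_9173_9173"]
percentiles = [12.5, 87.0, 40.0]

# Storing results keyed by name: the second write replaces the first.
by_name = {}
for name, value in zip(names, percentiles):
    by_name[name] = value

# Storing results by position: every row keeps its own value.
by_position = list(zip(names, percentiles))

print(len(by_name))      # 2 entries, one row was lost
print(len(by_position))  # 3 entries, nothing was lost
```

If the percentile path assigns results by row label while the r value path assigns them by position, that difference alone would explain why only one of the two outputs is affected.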
@jmcalabr Would it be easy for you to upload a small dataset that triggers this bug? Also, which function (or command-line call) was run? seekr has a few percentile functions: percentileofscore, calc_percentiles and calc_internal_percentiles.
Hey Jessime, thanks! Absolutely nothing critical, but I'm pasting data to use below. For the percentile to work in seekr_domain_pearson, I think you would need to pass an additional large set of sequences, like all gencode lncRNAs. Also, I was not aware of (or have forgotten about) those other percentile functions. Are they documented in the help file/description for the seekr command line? If not, that would be another tiny item to update.
seekr_domain_pearson testdata.fa testseq.fa mean4.npy std4.npy -rp gencode.vM25.lncRNA_transcripts.fa -k 4 -r r_values.csv -p percentiles.csv -s 40 -w 400
testdata.fa
5p_tarSL_1_407_407
CTTCAGAGTGCGCGAACATGAAGCACAGAACCCACCAGGGCATAGAGACTCAAAACTCCGGAGTGCGTAATACGCCCTCCCGCACGTGCGTTTGGCAAATTATCATTGGATATTAGAGAGCCCCACGCATAACAAGTTACCCACCAACGTCCCTGGTCCACTTAAATCATGACGATGTGTCGGGCAACGTTAGAATGGAATGGTATGTCGGATGCCCGCGAAGACGGGGGGATTAGGGTTAATGTCAGATGCTTACCCGACGTGCATCCATGTCGGTTGCGTACCTGAAAGCGGGTCGTCAGGAATTGAGAATCAGGCCCAAAGGATGATATCCAGGATCCACCGATATGGCTTACCGGTGGTTATTGTTAGTCGCCATCTGGCCTTGGGCCATGAGGTAGCTCGCA
5p_tarSL_1_407_407
CTGCGTCGAATCAGGTTTTCTACGGCACGAAGGTAGGTTGTTAAGCACGGTGTGGGCGAGTGGAGACTTAGGTACGACAAGGGACAGCAGCCAAAATGCACGTGTCACCGTCGGTACAACTACCTATCACGTGGTAACGCTTCAACTAAGCCATTTACTTAAAGAAGAAGAATCCCTTTCTTGTTTCGTAGTTCGTCTCATGTCTCCGTACGGTCGAAGGCTGCAAGTGAACTGACACTTACATAATGAGCAAAATCGTGTTATGCGACAGCGATACCTTAGGAAAGTAAGGTCACAAAATGAATAGATGCATGTGTGGGGGACATATGACAAGCACTGTTGATGATTCAGCCTCGCAGCAGAATGTTGGTGGCGATGTCTCGTACCGCTAATTGTCCTTCGACATT
5p_tarSL_1_407_407
ATTGACGGAACTCTGGTGTGAAATCCAGGACGAAACCACCGTTAGGCCCGTACAATTCTGGAGGCAGGCTCTTACTGAATCGGCTAACGTAGTCGAAACTAAAGTCACGCACTATTCCAAAGGGATCTCATAAAGCATGAAACATAACCTTCGGCGACATTGGGCGCAGTTACGCATACGTATAGAAAGTCCTCTCTGGCTATGCGTTCGTCTTGAGGGATAGGCTGAAAGTCCCCATCTTCAGTAAAAAATCTAGGTTTAGAAGAGTTCCGACGGGCATGGGCCGAGTACCGACCGCTGCACGGTGGCTGCAAGCTGGCCCCTAGTTGGATCGTGCGCTCTCTCTAGTGCTGGTGACCGCAAATATATGAGAAATAGCTAGCTCCGCGGAACCATGACTGTACGGT
5p_tarSL_1_407_407
TGCGAAAAACTAAGTTGAGTCTTTCTTGGTGGGAATGGACTCTTGCACTGAATGGGAAACGTAGATACCTTGGGTACCACCCAGGGGAGGTACTAAGCCTGTGCGTGGAACCTTAACGAAGATAGGAAAGTTATTCGTGTGTGATGTCATGGACCCGGAAAAATAGACCTCCAAGCATGGTATCCTAGGTGGTTATCCTTCCTATTCTGTGCTGCATAGGAGTCCATTGCCATAAGGGAAAGAAATAGCAGGGTTAGCGTCGATGGGATAAGAAGACCGCGCAGTACCGGAGTTCAGAGAGTCAGGAACAAACATCCGCTGGGATTGAGCCGGGGGAAAGGTCGGGACGAAGATAAGCCATGGGAGGAGGGGAATCACTTCACCTGGGGAAGCCAAGCAAACACTCG
5p_tarSL_1_407_407
GGAAAACGGACGTGATGAAAGGGAGAGAGGAGTGAACAGAGTCACGCTAGGCATCAGCAGCTAGTCGCCGGGCCCCGCACTTCGAATCAGGAGTGGCTTACGCGGGATTGGATATCCGTTGCAATGGTCACTATGAAGATTATATCTCGAGACGGTCGACATCAAAAACAGAGCATTTAACCCGTACAGTGCGTGCTACGTAGACGCCGAACCTATCCTCAATAGGACTATGGTTGTTGCGCCGAAGAAATCTAGCGGAGGGGTAAAGTGTAAGTAGCAAAGATGAGCGCTCAAATTGTGCCATTTACCGGAACTTATTCGCGCTGGTGTCCGATGTACTCGCCTAGGTACTTTGATAGCTGGCTCCCTTGAGGATGCATCTCGGGCATAGCATGCAAATTTGCGGG
testseq.fa
full_length_1_9173_9173
GGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTCAAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTGGCGCCCGAACAGGGACTTGAAAGCGAAAGTAAAGCCAGAGGAGATCTCTCGACGCAGGACTCGGCTTGCTGAAGCGCGCACGGCAAGAGGCGAGGGGCGGCGACTGGTGAGTACGCCAAAAATTTTGACTAGCGGAGGCTAGAAGGAGAGAGATGGGTGCGAGAGCGTCGGTATTAAGCGGGGGAGAATTAGATAAATGGGAAAAAATTCGGTTAAGGCCAGGGGGAAAGAAACAATATAAACTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAATCCTGGCCTTTTAGAGACATCAGAAGGCTGTAGACAAATACTGGGACAGCTACAACCATCCCTTCAGACAGGATCAGAAGAACTTAGATCATTATATAATACAATAGCAGTCCTCTATTGTGTGCATCAAAGGATAGATGTAAAAGACACCAAGGAAGCCTTAGATAAGATAGAGGAAGAGCAAAACAAAAGTAAGAAAAAGGCACAGCAAGCAGCAGCTGACACAGGAAACAACAGCCAGGTCAGCCAAAATTACCCTATAGTGCAGAACCTCCAGGGGCAAATGGTACATCAGGCCATATCACCTAGAACTTTAAATGCATGGGTAAAAGTAGTAGAAGAGAAGGCTTTCAGCCCAGAAGTAATACCCATGTTTTCAGCATTATCAGAAGGAGCCACCCCACAAGATTTAAATACCATGCTAAACACAGTGGGGGGACATCAAGCAGCCATGCAAATGTTAAAAGAGACCATCAATGAGGAAGCTGCAGAATGGGATAGATTGCATCCAGTGCATGCAGGGCCTATTGCACCAGGCCAGATGAGAGAACCAAGGGGAAGTGACATAGCAGGAACTACTAGTACCCTTCAGGAACAAATAGGATGGATGACACATAATCCACCTATCCCAGTAGGAGAAATCTATAAAAGATGGATAATCCTGGGATTAAATAAAATAGTAAGAATGTATAGCCCTACCAGCATTCTGGACATAAGACAAGGACCAAAGGAACCCTTTAGAGACTATGTAGACCGATTCTATAAAACTCTAAGAGCCGAGCAAGCTTCACAAGAGGTAAAAAATTGGATGACAGAAACCTTGTTGGTCCAAAATGCGAACCCAGATTGTAAGACTATTTTAAAAGCATTGGGACCAGGAGCGACACTAGAAGAAATGATGACAGCATGTCAGGGAGTGGGGGGACCCGGCCATAAAGCAAGAGTTTTGGCTGAAGCAATGAGCCAAGTAACAAATCCAGCTACCATAATGATACAGAAAGGCAATTTTAGGAACCAAAGAAAGACTGTTAAGTGTTTCAATTGTGGCAAAGAAGGGCACATAGCCAAAAATTGCAGGGCCCCTAGGAAAAAGGGCTGTTGGAAATGTGGAAAGGAAGGACACCAAATGAAAGATTGTACTGAGAGACAGGCTAATTTTTTAGGGAAGATCTGGCCTTCCCACAAGGGAAGGCCAGGGAATTTTCTTCAGAGCAGACCAGAGCCAACAGCCCCACCAGAAGAGAGCTTCAGGTTTGGGGAAGAGACAACAACTCCCTCTCAGAAGCAGGAGCCGATAGACAAGGAACTGTATCCTTTAGCTTCCCTCAGATCACTCTTTGGCAGCGACCCCTCGTCACAATAAAGATAGGGGGGCAATTAAAGGAAGCTCTATTAGATACAGGAGCAGATGATACAGTATTAGAAGAAATGAATTTGCCAGGAAGATGGAAACCAAAAATGATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAGATACTCATAGAAATCTGCG
GACATAAAGCTATAGGTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGAAATCTGTTGACTCAGATTGGCTGCACTTTAAATTTTCCCATTAGTCCTATTGAGACTGTACCAGTAAAATTAAAGCCAGGAATGGATGGCCCAAAAGTTAAACAATGGCCATTGACAGAAGAAAAAATAAAAGCATTAGTAGAAATTTGTACAGAAATGGAAAAGGAAGGAAAAATTTCAAAAATTGGGCCTGAAAATCCATACAATACTCCAGTATTTGCCATAAAGAAAAAAGACAGTACTAAATGGAGAAAATTAGTAGATTTCAGAGAACTTAATAAGAGAACTCAAGATTTCTGGGAAGTTCAATTAGGAATACCACATCCTGCAGGGTTAAAACAGAAAAAATCAGTAACAGTACTGGATGTGG
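A quick way to check whether an input file contains the repeated names that trigger this is sketched below. It assumes standard FASTA with `>` header lines; the function name `duplicate_fasta_headers` and the toy sequences are hypothetical and not part of seekr.

```python
from collections import Counter


def duplicate_fasta_headers(fasta_text):
    """Return {header: count} for every header that appears more than once."""
    headers = [
        line[1:].strip()
        for line in fasta_text.splitlines()
        if line.startswith(">")
    ]
    counts = Counter(headers)
    return {name: n for name, n in counts.items() if n > 1}


# Toy input, not the data from this issue.
example = ">seqA\nACGT\n>seqA\nTTGA\n>seqB\nGGCC\n"
print(duplicate_fasta_headers(example))  # {'seqA': 2}
```

To run it on a real file, read the file contents into a string first and pass that string to the function.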
Mauro, I don't get notifications whenever issues are opened up here, unfortunately. Did Jessime ever get to this bug? I'd be happy to take a looksie
Hey Dan,
- Nope, haven't gotten around to fixing this.
- If you do want to get email notifications (no pressure to), you can click the Watch button on the top right.
lmao how many years of using git and I never knew that. I could give this a look and pass along to you for final approval/style editing :)
Sure; I'd be happy to review a PR if you put one up