Sort alignments by Base

Question

ifdongs opened this issue 4 months ago · 6 comments

Hi,
This is a very powerful tool.
When I use IGV, I often need to cluster reads that carry the same base together for display. Is there a setting in GW that can achieve something similar?
Thanks

Answer 1 · 2024-08-06T09:19:35.000Z

Hi @ifdongs,
Thanks for the suggestion. I think it could be done, but I will need some more details.
Could you describe what you mean a bit more clearly, or say which IGV function you want to replicate? It sounds like you want all of the reads with the T variant at the top of the screen, and all other reads below that. In IGV you can right-click, then select Group by -> base at position X. Is that what you mean?

Currently there is no equivalent function in GW, but it could be added. In GW you can almost achieve the same using the filter command with a kmer of interest, but this will hide all the other reads without that kmer.
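
To illustrate the difference between grouping and filtering, here is a minimal Python sketch. It is not GW or IGV code: the `Read` tuple, `base_at`, `group_by_base` and `filter_by_kmer` are hypothetical names, and reads are assumed to align without gaps, which real alignments often do not.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Read(NamedTuple):
    """Toy read: hypothetical structure, not a GW or IGV type."""
    name: str
    start: int  # 0-based reference position of the first base
    seq: str    # read bases, assumed to align without gaps


def base_at(read: Read, pos: int) -> Optional[str]:
    """Return the read base at reference position pos, or None if not covered."""
    offset = pos - read.start
    if 0 <= offset < len(read.seq):
        return read.seq[offset]
    return None


def group_by_base(reads, pos):
    """Grouping keeps every read and buckets it by its base at pos."""
    groups = defaultdict(list)
    for read in reads:
        groups[base_at(read, pos)].append(read.name)
    return dict(groups)


def filter_by_kmer(reads, kmer):
    """Filtering keeps only reads containing kmer; all others are hidden."""
    return [read.name for read in reads if kmer in read.seq]


reads = [
    Read("r1", 100, "ACGTACGT"),
    Read("r2", 102, "GTACGTAC"),
    Read("r3", 101, "CGTTCGTA"),
    Read("r4", 110, "ACGTACGT"),
]

grouped = group_by_base(reads, 104)
filtered = filter_by_kmer(reads, "GTTCG")
print(grouped)
print(filtered)
```

Grouping returns every read under some key (including `None` for reads that do not cover the position), whereas the k-mer filter returns only the matching reads.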

Answer 2 · 2024-08-07T09:29:34.000Z

@kcleal, yes.
The specific steps in IGV are: right-click -> Sort alignments by -> base,
or use the hotkey Ctrl+S.

In some plasma sequencing, the depth is more than 5000× and the mutation signal is very weak, often less than 0.5%, so the mutant reads are scattered randomly through the GW screenshot and are difficult to identify.
If the mutant reads were at the top of the screen, GW screenshots would display them much better.
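
As a rough illustration of those figures, the sketch below works out how many mutant reads to expect at a given depth and allele fraction. The function name `expected_mutant_reads` is made up for this example, and the inputs are simply the boundary values quoted above (5000× and 0.5%).

```python
def expected_mutant_reads(depth: int, allele_fraction: float) -> int:
    """Expected number of reads carrying the mutation, rounded to a whole read."""
    return round(depth * allele_fraction)


depth = 5000             # sequencing depth at the site
allele_fraction = 0.005  # 0.5% mutant allele fraction

mutant = expected_mutant_reads(depth, allele_fraction)
print(f"{mutant} mutant reads among {depth} total")
```

With these inputs only about 25 of 5000 reads carry the mutation, which is why they are hard to spot when scattered through an unsorted pileup.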

I have only used GW on Linux without X11 so far. Can the kmer filter be set up through gw.ini?

Thanks

Answer 3 · 2024-08-07T09:36:10.000Z

I see. I think this can be achieved, but it will take a little bit of thought to make it work in a dynamic way. Currently you can use the filter command, and also the count command, to identify those reads.

1. First, find one of the reads with a mutant allele.
2. Click on the read with the mutation; it will be printed to the terminal. You should be able to see the mutation as a coloured base.
3. Use the mouse to copy a bit of its sequence around the mutation, e.g. 10-30 base pairs.
4. Use the command filter seq contains SEQ, where SEQ is the bit of sequence you copied.
5. Use the count command to count the number of reads.
6. Remove the filter by using refresh or r.

I will add the functionality you suggested, but it will probably be a week or two before it's finished.

Also, filter seq omit SEQ will perform the opposite, so it could be useful for counting reference alleles, for example.
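
The idea behind counting with filter seq contains and filter seq omit can be sketched in Python as below. This is only an illustration under simple assumptions, not how GW implements its filters: matching is a plain substring test on the read sequence, and `count_contains`, `count_omit` and the example sequences are invented for the sketch.

```python
def count_contains(seqs, kmer):
    """Number of read sequences that contain kmer (mutant-supporting reads)."""
    return sum(kmer in seq for seq in seqs)


def count_omit(seqs, kmer):
    """Number of read sequences that do not contain kmer."""
    return sum(kmer not in seq for seq in seqs)


# Invented sequences: two carry the mutant k-mer, two carry the
# reference version, and the last one does not span the site at all.
seqs = [
    "TTGACCTTAGGC",
    "TTGACCGTAGGC",
    "GACCTTAGGCAA",
    "TTGACCGTAGGC",
    "AAAATTGA",
]
mutant_kmer = "CCTTAG"

n_contains = count_contains(seqs, mutant_kmer)
n_omit = count_omit(seqs, mutant_kmer)
print(n_contains, n_omit)
```

Note that in this sketch a read that does not span the site also falls into the omit count, so the omit count is an upper bound on reference-supporting reads rather than an exact tally.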

Answer 4 · 2024-08-08T09:01:46.000Z

exciting 👍 👍
Thanks

Answer 5 · 2024-09-12T11:56:46.000Z

Hi @ifdongs,
I have added this in v1.1.0. You will need to clone the repo and build from source if you would like to test before the release is finalized. The full release will take a bit longer to prepare.
Reads can be sorted using the 'S' hot key (mouse over the target base), or you can use a command such as sort 120000 or sort strand 120000. Sorting can also be applied from the CLI using the new --command or -c option. For example, --command 'sort 12000' can be used.
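
For readers curious about what sorting by base at a position means in practice, here is a small Python sketch of one possible ordering: reads with a non-reference base first, then reference reads, then reads that do not cover the position. It is an assumption-laden illustration and may not match the ordering GW v1.1.0 actually uses; `Read`, `base_at` and `sort_by_base` are hypothetical names, and reads are assumed to align without gaps.

```python
from typing import NamedTuple, Optional


class Read(NamedTuple):
    """Toy read: hypothetical structure, not a GW type."""
    name: str
    start: int  # 0-based reference position of the first base
    seq: str    # read bases, assumed to align without gaps


def base_at(read: Read, pos: int) -> Optional[str]:
    """Return the read base at reference position pos, or None if not covered."""
    offset = pos - read.start
    if 0 <= offset < len(read.seq):
        return read.seq[offset]
    return None


def sort_by_base(reads, pos, ref_base):
    """Order reads: non-reference bases first, then reference, then uncovered."""
    def key(read):
        base = base_at(read, pos)
        if base is None:
            return (2, "")    # read does not cover pos: bottom
        if base == ref_base:
            return (1, base)  # reference allele: middle
        return (0, base)      # mismatch: top, ordered by base letter
    # sorted() is stable, so reads with equal keys keep their input order
    return sorted(reads, key=key)


reads = [
    Read("r1", 100, "ACGTACGT"),
    Read("r2", 102, "GTTCGTAC"),
    Read("r3", 110, "ACGTACGT"),
    Read("r4", 103, "CAGG"),
    Read("r5", 104, "GCCC"),
]

ordered = [read.name for read in sort_by_base(reads, 104, ref_base="A")]
print(ordered)
```

A stable sort is used so that reads sharing the same base keep their original relative order, which keeps the display from reshuffling more than necessary.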

Answer 6 · 2024-09-13T08:47:46.000Z

v1.1.0 is now available