A few helper scripts for working with samtools.
Put the path to this repo on your $PATH:
echo 'export PATH="$PATH:/path/to/samtools-helpers"' >> ~/.bashrc
For some handy aliases, source .samtools-rc in this repo:
echo 'source /path/to/samtools-helpers/.samtools-rc' >> ~/.bashrc
The main useful scripts here are samtools-view (alias sv) and variants of it (samtools-view-with-header a.k.a. svh, samtools-view-less a.k.a. svl).
Each of them takes a .sam file, runs samtools view on it, and then makes the following improvements:
- converts the "bit flag" field to 12 0s and 1s
- formats the file as a table, so that e.g. longer vs. shorter read names in the first column don't mess up the alignment of subsequent columns.
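The two improvements can be sketched as a small shell filter. This is only an illustration, not the code of the scripts in this repo: the function name sam_flag_bits is hypothetical, and it assumes tab-separated samtools view output on stdin with the integer FLAG in the second field.

```shell
# Hypothetical helper (not part of this repo): rewrite the FLAG field
# (column 2) of tab-separated SAM records as 12 binary digits,
# most significant bit first.
sam_flag_bits() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    flag = $2
    bits = ""
    # Peel off the lowest bit 12 times, prepending each digit.
    for (i = 0; i < 12; i++) {
      bits = (flag % 2) bits
      flag = int(flag / 2)
    }
    $2 = bits
    print
  }'
}

# Possible usage, assuming samtools and column are installed:
#   samtools view NA12878.sam | head -n 5 | sam_flag_bits | column -t -s "$(printf '\t')"
```

Piping the result through column -t, split on tabs, is one way to pad every field to a common width so that the columns line up.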
sv 5 NA12878.sam
20FUKAAXX100202:3:6:15018:84106 000010100011 20 224759 60 101M = 225025 366 ACCCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA ?@BBBCEEDFEFEEEFDEEFEEEEBFEDEFCFDDEEFEDFDFEEEFEEEECEEFEEFCEFDEEFFEFEDEEEFFFDECEDCEFEEDDFFBFEFGEAEDCCC MD:Z:101 PG:Z:BWA RG:Z:20FUK.3 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHHFHHHGHHHHHIIHHDHHHHHEHHHHH UQ:i:0
20GAVAAXX100126:8:62:5578:2527 001001010011 20 224759 60 101M = 224453 -406 ACCCAAAGCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA 834:/,1(:8::8::<98;-(-;>5?08/:;/+7<;=>?@:9>;==<=:<8<>?4>B>AABAAB@@;;<<=>===9>9?=9>=?==;=:;<?>><?@3@;1 MD:Z:7T93 PG:Z:BWA RG:Z:20GAV.8 AM:i:25 NM:i:1 SM:i:37 MQ:i:60 OQ:Z:C4541/1.55555555544008??9?1514401555?AAA;5554444555?A?7AFEFFFFFFDF55555444454445555444@5@==5555555555 UQ:i:7
20FUKAAXX100202:4:47:20584:49257 000010100011 20 224761 60 101M = 225058 387 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?ACDBBCEDFEDEFEEEFEDBECFBFEFCFDEEEFEDFDFEEEFEEEECEEFEEFCEFFEEFFEFEDEAEFFFAECEFCDFEEFBFFDBEEC:@6A?C4>B MD:Z:101 PG:Z:BWA RG:Z:20FUK.4 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHHDHHHHHIHHHHFHGIGHFE;D9BBD7AH UQ:i:0
20GAVAAXX100126:7:47:4730:37293 000010100011 20 224761 60 101M = 225073 412 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?BB@BCBFDDECC=E@@DB;BDCFDE<<AEB@B>BADD>?C?EDEB>@AC=<?=DAE?E=CAC?;<C=@ADD?ACACCAC>:>4=B676<17@@<:AA<;6 MD:Z:101 PG:Z:BWA RG:Z:20GAV.7 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:BBA>AB@BB@BA?>B==??7>@BBA@:6@@@@@@A@BAA>A?B@BA?=?>9=????@?@>>>@?67@<;??@>?@????@9:96=>2236-39=73@:652 UQ:i:0
20GAVAAXX100126:5:46:21151:39489 000001010011 20 224761 60 101M = 224465 -396 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT >9<=BBB>BB>EFFEEEFEEECEFEEFDEFEEEFFEEFEEFDDEEEEDEEFFDDDDFFFDDFFDEFDEEDFFEEEEEEEEEFEEEEEFFEFEFEF=DED=A MD:Z:101 PG:Z:BWA RG:Z:20GAV.5 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:DBGGFDFCFFBHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHGH UQ:i:0
It's still on you to know which of the 12 bits mean what, but it's a lot better than doing the binary conversion in your head!
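As a reminder of what the bits mean, here is a minimal sketch that lists the bits set in a FLAG integer. The function name flag_names and the short labels are hypothetical and not part of this repo; the bit meanings follow the FLAG definitions in the SAM specification. In the 12-digit form shown above, the rightmost digit is 0x1 and the leftmost is 0x800.

```shell
# Hypothetical helper (not part of this repo): print the SAM FLAG bits
# that are set in an integer, lowest bit first, one per line.
# The body runs in a subshell so its variables do not leak.
flag_names() (
  flag=$1
  bit=1
  for name in paired proper_pair unmapped mate_unmapped \
              reverse mate_reverse first_in_pair second_in_pair \
              secondary qc_fail duplicate supplementary
  do
    if [ $(( flag & bit )) -ne 0 ]; then
      printf '0x%x\t%s\n' "$bit" "$name"
    fi
    bit=$(( bit * 2 ))
  done
)

flag_names 163
```

For 163 (000010100011) this reports 0x1 paired, 0x2 proper_pair, 0x20 mate_reverse, and 0x80 second_in_pair.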
$ samtools view NA12878.sam | head -n 5
20FUKAAXX100202:3:6:15018:84106 163 20 224759 60 101M = 225025 366 ACCCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA ?@BBBCEEDFEFEEEFDEEFEEEEBFEDEFCFDDEEFEDFDFEEEFEEEECEEFEEFCEFDEEFFEFEDEEEFFFDECEDCEFEEDDFFBFEFGEAEDCCC MD:Z:101 PG:Z:BWA RG:Z:20FUK.3 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGHHHHHHHHHHHHHHFHHHGHHHHHIIHHDHHHHHEHHHHH UQ:i:0
20GAVAAXX100126:8:62:5578:2527 595 20 224759 60 101M = 224453 -406 ACCCAAAGCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAA 834:/,1(:8::8::<98;-(-;>5?08/:;/+7<;=>?@:9>;==<=:<8<>?4>B>AABAAB@@;;<<=>===9>9?=9>=?==;=:;<?>><?@3@;1 MD:Z:7T93 PG:Z:BWA RG:Z:20GAV.8 AM:i:25 NM:i:1 SM:i:37 MQ:i:60 OQ:Z:C4541/1.55555555544008??9?1514401555?AAA;5554444555?A?7AFEFFFFFFDF55555444454445555444@5@==5555555555 UQ:i:7
20FUKAAXX100202:4:47:20584:49257 163 20 224761 60 101M = 225058 387 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?ACDBBCEDFEDEFEEEFEDBECFBFEFCFDEEEFEDFDFEEEFEEEECEEFEEFCEFFEEFFEFEDEAEFFFAECEFCDFEEFBFFDBEEC:@6A?C4>B MD:Z:101 PG:Z:BWA RG:Z:20FUK.4 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHHDHHHHHIHHHHFHGIGHFE;D9BBD7AH UQ:i:0
20GAVAAXX100126:7:47:4730:37293 163 20 224761 60 101M = 225073 412 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT ?BB@BCBFDDECC=E@@DB;BDCFDE<<AEB@B>BADD>?C?EDEB>@AC=<?=DAE?E=CAC?;<C=@ADD?ACACCAC>:>4=B676<17@@<:AA<;6 MD:Z:101 PG:Z:BWA RG:Z:20GAV.7 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:BBA>AB@BB@BA?>B==??7>@BBA@:6@@@@@@A@BAA>A?B@BA?=?>9=????@?@>>>@?67@<;??@>?@????@9:96=>2236-39=73@:652 UQ:i:0
20GAVAAXX100126:5:46:21151:39489 83 20 224761 60 101M = 224465 -396 CCAAATCTAATCAAGGCTCCCACTCTAACTCCCAAGCTCTAGGATATACCAAGGACAAAGGAAGATCATGAAATACCACCATGGGGATTCAATCAGCAAAT >9<=BBB>BB>EFFEEEFEEECEFEEFDEFEEEFFEEFEEFDDEEEEDEEFFDDDDFFFDDFFDEFDEEDFFEEEEEEEEEFEEEEEFFEFEFEF=DED=A MD:Z:101 PG:Z:BWA RG:Z:20GAV.5 AM:i:37 NM:i:0 SM:i:37 MQ:i:60 OQ:Z:DBGGFDFCFFBHHHHHHHHHHGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHEHHHGH UQ:i:0
Note the opaque binary-flag integers in the second field, and the misalignments of some columns.
sv NA12878.sam
# or:
samtools-view NA12878.sam
svh NA12878.sam
# or:
samtools-view-with-header NA12878.sam