proportional_htseq

This script, designed for Linux environments, runs htseq-count while taking multi-mapped reads (multihits) into account proportionally. Each read is weighted by the inverse of the number of locations it maps to. For example, a read mapped to 5 different positions adds 0.2 to the count of the feature at each of those positions.
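The weighting rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the script's actual code: the function name `proportional_counts`, the input layout (read name mapped to one feature ID per mapped location) and the gene names are all hypothetical.

```python
from collections import defaultdict


def proportional_counts(read_hits):
    """Sum fractional counts per feature.

    read_hits (hypothetical layout): dict mapping a read name to a list with
    one feature ID per mapped location, or None where no feature is hit.
    """
    counts = defaultdict(float)
    for features in read_hits.values():
        if not features:
            continue
        # A read mapped to N locations contributes 1/N at each location.
        weight = 1.0 / len(features)
        for feature in features:
            if feature is not None:
                counts[feature] += weight
    return dict(counts)


hits = {
    "read1": ["geneA"],                                      # unique hit: weight 1.0
    "read2": ["geneA", "geneB", "geneC", "geneD", "geneE"],  # 5 hits: weight 0.2 each
}
counts = proportional_counts(hits)
```

Here `geneA` receives 1.0 from the uniquely mapped read plus 0.2 from the multi-mapped one, while each of the other four genes receives 0.2.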

Dependencies:

samtools 0.1.18 or above (http://samtools.sourceforge.net/)
htseq 0.5.3p3 or above (http://www-huber.embl.de/users/anders/HTSeq/doc/count.html)

Usage: python htseq.py [options] <mandatory>

Options:
    -h, --help:
             show this help message and exit
    -m, --mode:
             mode to handle reads overlapping more than one feature (choices: union, intersection-strict, intersection-nonempty; default: union)
    -s, --stranded:
             whether the data is from a strand-specific assay. Specify 'yes', 'no', or 'reverse' (default: yes). 'reverse' means 'yes' with reversed strand interpretation
    -a, --minaqual:
             skip all reads with alignment quality lower than the given minimum value (default: 0)
    -t, --type:
             feature type (3rd column in GFF file) to be used, all features of other type are ignored (default, suitable for Ensembl GTF files: exon)
    -i, --idattr:
             GFF attribute to be used as feature ID (default, suitable for Ensembl GTF files: gene_id)
    -o, --samout:
             write out all SAM alignment records into an output SAM file called SAMOUT, annotating each line with its feature assignment (as an optional field with tag 'XF')
    -n, --sort:
             sort the bam file by name (necessary for paired-end reads)
    -d, --destiny:
             output directory (default: the directory of execution)
    -c, --clean_up:
             do not remove the intermediate files generated (default: remove intermediate files)
    -p, --threads:
             number of threads to run (default: 1)
Mandatory:
    -b, --bam:
             bam/sam file to read
    -g, --gtf:
             gtf file
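
As a rough illustration of how the options and mandatory arguments listed above combine, the following sketch assembles a command line following the Usage pattern. The helper `build_command` and the file names `sample.bam`, `annotation.gtf` and `results` are hypothetical, and the sketch only prints the command rather than running it.

```python
import shlex


def build_command(bam, gtf, mode="union", stranded="yes", threads=1, outdir=None):
    """Assemble an argument list: options first, then the mandatory -b and -g."""
    cmd = ["python", "htseq.py", "-m", mode, "-s", stranded, "-p", str(threads)]
    if outdir is not None:
        # -d sets the output directory; omit it to use the directory of execution.
        cmd += ["-d", outdir]
    cmd += ["-b", bam, "-g", gtf]
    return cmd


# Hypothetical input files for a non-strand-specific library, using 4 threads.
cmd = build_command("sample.bam", "annotation.gtf",
                    stranded="no", threads=4, outdir="results")
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` from the directory that contains `htseq.py`.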
