Python tool for basic statistics on quality of raw Illumina reads
QC-TError is a tool for computing basic statistics on your raw Illumina reads. It provides the following information:
- basic statistics
  - lengths of the reads present in the data
  - total number of reads
  - average GC content of the sequences
  - number of reads that have poor quality
- visualisation of:
  - average per base quality of reads
  - sequence quality per tile
  - distribution of average read quality score
  - per base sequence content
  - per sequence GC content
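
To make the basic statistics above concrete, here is a minimal sketch of how they could be computed from FASTQ records. It is not the code used by QC-TError: the function names, the Phred+33 quality encoding, the 4-lines-per-record layout, and the `min_quality=20` cutoff for "poor quality" are all assumptions chosen for illustration.

```python
from statistics import mean


def parse_fastq(lines):
    """Yield (sequence, quality) pairs, assuming exactly 4 lines per record."""
    lines = [line.rstrip("\n") for line in lines]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i + 1], lines[i + 3]


def gc_percent(seq):
    """Percentage of G and C bases in one read."""
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GCgc" for base in seq) / len(seq)


def mean_phred(qual, offset=33):
    """Mean quality score of one read (Phred+33 encoding is an assumption)."""
    if not qual:
        return 0.0
    return mean(ord(ch) - offset for ch in qual)


def basic_stats(records, min_quality=20):
    """Summarise reads; min_quality is a hypothetical cutoff, not the tool's."""
    records = list(records)
    if not records:
        return {"total_reads": 0, "read_lengths": [],
                "mean_gc_percent": 0.0, "poor_quality_reads": 0}
    return {
        "total_reads": len(records),
        "read_lengths": sorted({len(seq) for seq, _ in records}),
        "mean_gc_percent": mean(gc_percent(seq) for seq, _ in records),
        "poor_quality_reads": sum(mean_phred(q) < min_quality for _, q in records),
    }


# Two made-up reads: one all-GC with high quality, one all-AT with low quality.
example = [
    "@read1", "GGCC", "+", "IIII",
    "@read2", "ATAT", "+", "!!!!",
]
stats = basic_stats(parse_fastq(example))
print(stats)
```

For a real file, the same functions could be fed an open file handle, for example `basic_stats(parse_fastq(open("example.fastq")))`.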
You can do a test run on the provided example.fastq. The data was taken from
https://figshare.com/articles/dataset/amp_res_2_fastq_zip/10006541. The data was additionally modified: it consists of the first 60 reads, and some of them were shortened so that the differences are visible in the read length plot.
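
If you want to check how varied the read lengths in a FASTQ file are before plotting, a short sketch like the one below can help. It is an illustration only, not part of QC-TError: the function name is hypothetical, and it assumes each record occupies exactly four lines with the sequence on the second one.

```python
from collections import Counter


def read_length_counts(lines):
    """Count how many reads have each length (4-line FASTQ records assumed)."""
    counts = Counter()
    for line_number, line in enumerate(lines):
        # The sequence is the second line of every 4-line record.
        if line_number % 4 == 1:
            counts[len(line.strip())] += 1
    return counts


# Made-up records: two reads of length 6 and one shortened read of length 3.
demo = [
    "@r1\n", "ACGTAC\n", "+\n", "IIIIII\n",
    "@r2\n", "ACG\n", "+\n", "III\n",
    "@r3\n", "TTGGCC\n", "+\n", "IIIIII\n",
]
lengths = read_length_counts(demo)
for length, n_reads in sorted(lengths.items()):
    print(length, n_reads)
```

To run it on the example data, pass an open file instead of the demo list: `with open("example.fastq") as handle: read_length_counts(handle)`.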
- Clone this repository to your local machine.
- It is better to use a virtual environment (conda or virtualenv).
- Before the first start, run pip install -r requirements.txt in the terminal (preferably inside your virtual environment); after that you can start.
- The tool has been tested on Ubuntu 20.04 and Windows 10, but if you can choose, Ubuntu is preferable.
- The script works correctly with Python 3.8.
- To run the program, type python main.py -i file.fastq (or use your python3 interpreter) in the project directory
- If you want, you can specify an output directory: python main.py -i file.fastq -o path/to/dir
- The program will count the number of reads, create the folder QCTerror_res, and exit
- All pictures and tables are saved into the QCTerror_res/pictures/ and QCTerror_res/tables/ folders
- The PDF report generation takes some time, so you can take a coffee break or just comment out (#) the last line in amateur_maker() (in main.py)
- Enjoy :)
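
The command-line interface described above can be pictured with the following sketch. It mirrors only the -i and -o flags and the QCTerror_res/pictures and QCTerror_res/tables folder names from this README. The long option names, the default output directory, the helper names, and the placement of QCTerror_res inside the output directory are assumptions, so the actual main.py may differ.

```python
import argparse
from pathlib import Path


def build_parser():
    """Hypothetical parser mirroring the -i / -o flags shown in this README."""
    parser = argparse.ArgumentParser(description="Basic QC statistics for FASTQ reads")
    # The long names --input / --output are assumptions for readability.
    parser.add_argument("-i", "--input", required=True, help="input FASTQ file")
    parser.add_argument("-o", "--output", default=".",
                        help="output directory (default is an assumption)")
    return parser


def result_dirs(output_dir):
    """Return the pictures and tables folders under QCTerror_res."""
    # Assumption: QCTerror_res is created inside the chosen output directory.
    root = Path(output_dir) / "QCTerror_res"
    return root / "pictures", root / "tables"


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-i", "file.fastq", "-o", "path/to/dir"])
pictures, tables = result_dirs(args.output)
print(pictures.as_posix())
print(tables.as_posix())
```

Creating the folders would then be a matter of calling `pictures.mkdir(parents=True, exist_ok=True)` and the same for `tables`.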
- Anna Koroleva, https://github.com/Ann-Krlv
  - Contributed to quality_per_base, gc_counter, duplicate_counter, per_sequence_quality_score_print, testing - IDEA DEVELOPER
- Valeria Ladygina, https://github.com/ValeriiaLadyhina
  - Responsible for README.md, created the logo, contributed to quality_per_base, gc_counter, base_content, read length, flake8 - PLOT BEAUTIMAKER
- Leonid Zhozhikov, https://github.com/LyonyaZhozhikov
  - Responsible for the part of the tool that creates the report - GOD of REPORTS
- Oxana Kolpakova, https://github.com/OxanaKolpakova
  - Contributed to per_sequence_quality_score_print, requirements, last fixes for flake8, testing - SUPER TESTER
If you have any questions, bug reports, or complaints, please contact the authors of FastQ Filtrator via email:
valeriia.ladyhina@gmail.com - responsible for communication with clients