BX Software Engineer Coding Test

Task 1:

Part 1: fastq_analyzer.py

  • Description: Analyzes FastQ files. Input files are in the sample_files/fastq folder.

Part 2: find_most_frequent_sequences.py

  • Description: Identifies the most frequent sequences in the input file sample.fasta.
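
One way the counting step could look is sketched below. It is an illustration under stated assumptions, not the contents of find_most_frequent_sequences.py: it assumes that "sequence" means the full sequence of each FASTA record (not k-mers), that comparison is case-insensitive, and that the ten most common are wanted. The function names and the top_n parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def read_fasta_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield one sequence per FASTA record, joining wrapped sequence lines."""
    chunks: List[str] = []
    in_record = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if in_record:
                yield "".join(chunks)
            chunks = []
            in_record = True
        elif in_record:
            # Assumption: sequences are compared case-insensitively.
            chunks.append(line.upper())
    if in_record:
        yield "".join(chunks)


def most_frequent_sequences(lines: Iterable[str], top_n: int = 10) -> List[Tuple[str, int]]:
    """Return the top_n (sequence, count) pairs, most frequent first."""
    return Counter(read_fasta_sequences(lines)).most_common(top_n)


if __name__ == "__main__":
    example = [">r1", "ACGT", ">r2", "GGCC", ">r3", "AC", "GT"]
    print(most_frequent_sequences(example, top_n=1))
```

With a real file, the same function can be given an open file handle, for example `most_frequent_sequences(open("sample.fasta"))`.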

Part 3: annotate_coordinates.py

  • Description: Annotates coordinates using the input files coordinates_to_annotate.txt and hg19_annotations.gtf, and writes the output file annotated_output.txt.
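
A minimal sketch of the lookup is shown below. It assumes that each coordinate is a chromosome plus a single position and that the annotation wanted is the gene_name attribute of every overlapping GTF feature; neither detail is stated here, so treat both as assumptions. The helper names are hypothetical. GTF lines are tab-separated with nine fields, and start and end are 1-based and inclusive.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumption: the annotation of interest is the gene_name attribute.
GENE_NAME = re.compile(r'gene_name "([^"]+)"')

Intervals = Dict[str, List[Tuple[int, int, str]]]


def load_gtf_intervals(gtf_lines: Iterable[str]) -> Intervals:
    """Group (start, end, label) tuples by chromosome."""
    by_chrom: Intervals = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip malformed lines
        chrom, start, end, attrs = fields[0], int(fields[3]), int(fields[4]), fields[8]
        match = GENE_NAME.search(attrs)
        # Fall back to the feature type when no gene_name is present.
        label = match.group(1) if match else fields[2]
        by_chrom[chrom].append((start, end, label))
    return by_chrom


def annotate(chrom: str, pos: int, by_chrom: Intervals) -> List[str]:
    """Return the sorted, de-duplicated labels of features covering pos."""
    hits = {label for start, end, label in by_chrom.get(chrom, []) if start <= pos <= end}
    return sorted(hits)


if __name__ == "__main__":
    gtf = [
        'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "G1"; gene_name "ABC";\n',
        'chr1\tsrc\texon\t150\t300\t.\t-\t.\tgene_id "G2"; gene_name "XYZ";\n',
    ]
    index = load_gtf_intervals(gtf)
    print(annotate("chr1", 160, index))
```

The linear scan per query keeps the sketch short. For a full hg19 annotation and many coordinates, sorting the intervals per chromosome or using an interval tree would likely be faster.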

Task 2: Calculate Mean Coverage

calculate_mean_coverage.py

  • Description: Calculates mean coverage from the input file Example.hs_intervals.txt and writes the output file mean_coverage_by_gc_bins.txt.
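
The binning step might be sketched as follows. Several details are assumptions, not facts about Example.hs_intervals.txt: that the file is tab-delimited with a header row, that it has columns named %gc (a fraction between 0 and 1) and mean_coverage, and that bins are 10 percentage points wide. The function name and the bin_pct parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def mean_coverage_by_gc_bin(rows: Iterable[Mapping[str, str]], bin_pct: int = 10) -> Dict[int, float]:
    """Average mean_coverage per GC bin, keyed by the bin's lower bound in percent."""
    sums: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    last_bin = 100 // bin_pct - 1
    for row in rows:
        # Assumption: %gc is a fraction in [0, 1]; rounding limits float noise
        # at bin boundaries.
        pct = round(float(row["%gc"]) * 100, 6)
        # Clamp so that exactly 100% GC falls into the top bin.
        idx = min(int(pct // bin_pct), last_bin)
        sums[idx] += float(row["mean_coverage"])
        counts[idx] += 1
    return {idx * bin_pct: sums[idx] / counts[idx] for idx in sorted(sums)}


if __name__ == "__main__":
    rows = [
        {"%gc": "0.25", "mean_coverage": "10"},
        {"%gc": "0.27", "mean_coverage": "30"},
        {"%gc": "0.55", "mean_coverage": "8"},
    ]
    print(mean_coverage_by_gc_bin(rows))
```

If the column names match, rows can be read from the real file with `csv.DictReader(handle, delimiter="\t")`. This version gives every interval equal weight; weighting by interval length is another reasonable choice.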

Task 3: Fetch Variants Information

fetch_variants_info.py

  • Description: Fetches information about genetic variants. The input file is variant_ids.txt.