This is a re-work of the Yaari lab pipeAIRR script written by Ayelet Peres.
The pipeline will run on Linux, or Windows Subsystem for Linux (WSL).
Beyond this repo, you will need to install:
- Nextflow
- Docker client or Singularity
- Some Python modules (see python/requirements.txt)
The examples assume Docker is used.
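Before going further, it can help to confirm that the prerequisites are on your PATH. The following is a minimal sketch, not part of the repo: the function names `missing_tools` and `check_prerequisites` are hypothetical, and it assumes the tools are installed under their usual executable names (`nextflow`, `docker`, `singularity`, `python3`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each named command that is NOT found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Hypothetical helper: check the prerequisites listed above.
# Returns 0 if everything needed is present, 1 otherwise.
check_prerequisites() {
    local status=0
    local tool

    for tool in $(missing_tools nextflow python3); do
        echo "missing: $tool" >&2
        status=1
    done

    # Either container runtime is sufficient.
    if [ -n "$(missing_tools docker)" ] && [ -n "$(missing_tools singularity)" ]; then
        echo "missing: docker or singularity" >&2
        status=1
    fi

    return "$status"
}

# Example use:
#   check_prerequisites && pip install -r python/requirements.txt
```

The check only confirms that the executables can be found. It does not verify versions, or that the Docker daemon is running.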
- Review the configuration in processed_samples/FLAIRRSeq.config and make any changes needed to reflect the location of your reference sets, the number of threads, etc.
- Review the script in processed_samples/ and make changes to the wildcards and directories, to suit the naming of your samples and FASTQ files.
- Edit processed_samples/flairr_logs.toml and modify the first line to reflect the absolute location of the processed_samples directory in your file system. This TOML file is not used by Nextflow, but it is used by subsequent Python scripts to produce summary analyses.
- Run the pipeline with the following command:
cd processed_samples
The script shows both the use of 'standard' germline reference sets and the use of personalised reference sets for each sample, which could, for example, be taken from genomic sources. See the params at the top of preprocess/ and annotate/ for configuration, e.g. to annotate different loci. Any params setting can be overridden on the command line, as shown in
Output is created in subdirectories under processed_samples. The script as written retains the Nextflow work directories ./.nextflow, both in processed_samples itself and in the subdirectories. These work directories can grow large and should be deleted unless they are needed for debugging.
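To see how much space the retained directories occupy before deleting them, something like the following sketch may be useful. It is not part of the repo: the function names are hypothetical, and the directory name `.nextflow` is taken from the description above and can be changed through the second argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list directories with the given name (default .nextflow)
# under a root directory. -prune stops find from descending into a match,
# so nested copies inside a matched directory are not listed separately.
list_nextflow_dirs() {
    local root="${1:-.}"
    local name="${2:-.nextflow}"
    find "$root" -type d -name "$name" -prune -print | sort
}

# Hypothetical helper: print the disk usage of each directory found.
report_sizes() {
    list_nextflow_dirs "$@" | while IFS= read -r dir; do
        du -sh "$dir"
    done
}

# Review first:
#   report_sizes processed_samples
# Then, once you are sure the directories are not needed for debugging:
#   list_nextflow_dirs processed_samples | while IFS= read -r d; do rm -rf "$d"; done
```

Listing and deleting are kept as separate steps so that nothing is removed before you have reviewed what was found.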
The pipeline uses the Immcantation Docker image. The current image contains an installation of usearch which is not compatible with WSL. To use the pipeline in WSL, you will need to build a new Docker image, as follows:
- Run the container interactively:
docker run -it --cpus 1 immcantation/suite:4.4.0 /bin/bash
- In another terminal, find the container ID:
docker ps
- In the container, type:
mkdir muscledir
cd muscledir
tar xzvf muscle_src_3.8.1551.tar.gz
The last stage of the build (ln) will fail. Repeat the ln command, but without the -static option. Then test that muscle has built correctly:
./muscle --help
Move it to the bin directory and clean up:
mv muscle /usr/local/bin
cd ..
rm -rf muscledir
- Exit the container and commit the changes:
docker commit <container_id> my_immcantation_suite:4.4.0
- Modify processed_samples/FLAIRRSeq.config to use my_immcantation_suite:4.4.0 rather than immcantation/suite:4.4.0.
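The lookup and commit steps above can also be combined in a small helper. This is a sketch under stated assumptions, not part of the repo: the function name `commit_modified_container` is hypothetical, and it assumes exactly one container (running or exited) was created from the base image. If several match, it stops so that you can run docker commit with the ID you want.

```shell
#!/usr/bin/env bash
# Hypothetical helper: find the single container created from the base image
# and commit it under a new image name.
commit_modified_container() {
    local base_image="${1:-immcantation/suite:4.4.0}"
    local new_image="${2:-my_immcantation_suite:4.4.0}"
    local ids count

    # -a includes exited containers, -q prints IDs only,
    # and the ancestor filter keeps containers created from base_image.
    ids="$(docker ps -a -q --filter "ancestor=${base_image}")"

    # Count non-empty lines; "|| true" keeps going when grep finds none.
    count="$(printf '%s\n' "$ids" | grep -c . || true)"

    if [ "$count" -ne 1 ]; then
        echo "expected exactly one container from ${base_image}, found ${count}" >&2
        return 1
    fi

    docker commit "$ids" "$new_image"
}

# Example use, after exiting the container:
#   commit_modified_container immcantation/suite:4.4.0 my_immcantation_suite:4.4.0
```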