SziKayLeung/Whole_Transcriptome_Paper

run_LIMA function


Hi!
I have some questions about the run_LIMA function in the All_HumanFunctions.sh file. First of all, I wanted to ask what the difference is between multiplex and no-multiplex. Are some samples multiplexed and others not?
Also, I can't find the Targeted_FASTA file, and I don't know whether you provided it.

My other question is: what do $1, $2, $3 and $4 represent?

Below is the code that I'm referring to:

run_LIMA(){
  source activate isoseq3

  cd $3
  echo "Processing $1 file for demultiplexing"
    if [ -f $2/$1.fl.json ]; then
      echo "$1.fl.bam file already exists; LIMA no need to be processed on Sample $1"
    elif [ $4 = "multiplex" ]; then
      echo "Multiplex: use Targeted_FASTA"
      #lima <input.ccs.merged.consensusreadset.xml> <input.primerfasta> <output.fl.bam>
      time lima $2/$1.ccs.bam $TARGETED_FASTA $1.fl.bam --isoseq --dump-clips --dump-removed --peek-guess
      echo "lima $1 successful"
      ls $1.fl*
    else
      echo "No-Multiplex: use FASTA"
      time lima $2/$1.ccs.bam $FASTA $1.fl.bam --isoseq --dump-clips --dump-removed
      echo "lima $1 successful"
      ls $1.fl*
    fi
    source deactivate
}
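
To make the question about $1 to $4 concrete, here is how I currently read the arguments. This is only a sketch of my guess from the function body: `describe_lima_args` is a hypothetical helper I wrote (it is not in the repository), the paths and sample name are made up, and it does not call lima at all.

```shell
#!/usr/bin/env bash
# Hypothetical helper (NOT part of All_HumanFunctions.sh): prints how run_LIMA
# appears to interpret its positional arguments, without invoking lima.
describe_lima_args() {
  local sample="$1"      # $1: sample name, used as the prefix of <sample>.ccs.bam
  local input_dir="$2"   # $2: directory that holds <sample>.ccs.bam
  local output_dir="$3"  # $3: directory the function cd's into, so outputs land here
  local mode="$4"        # $4: "multiplex" selects $TARGETED_FASTA, anything else $FASTA

  echo "input:  ${input_dir}/${sample}.ccs.bam"
  echo "output: ${output_dir}/${sample}.fl.bam"
  if [ "$mode" = "multiplex" ]; then
    echo "primers: TARGETED_FASTA"
  else
    echo "primers: FASTA"
  fi
}

# Example call with made-up values
describe_lima_args SampleA /path/to/ccs /path/to/lima multiplex
```

If that reading is right, I would expect run_LIMA to be called as `run_LIMA <sample> <ccs_dir> <lima_output_dir> multiplex`, but please correct me if the arguments mean something else.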

Thank you so much for providing the dataset and the code! Have a nice day :)