ImmPrint answers the question "Were these two samples extracted from the same individual?" by looking at the number and sequences of unique TCRs shared between the two samples.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

ImmPrint

Immprint is a command-line tool built to identify the origin of T-cell samples. Given two samples of TCR sequences, it can decide whether or not they come from the same individual.

Installation

Immprint requires a recent version of Python 3 and pip3. Clone or download the repository, then run:

cd Immprint
pip3 install ./olga3 .

Usage

immprint sampleA.csv sampleB.csv

Both CSV files must contain a cdr3_nt column, which gives the CDR3 sequence of the receptor in nucleotides. ImmPrint can also use clonal frequency information if a count column is provided.
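As an illustration of the expected input, the following minimal sketch writes a small table with the cdr3_nt and count columns. It assumes comma-separated files with a header row. The sequences are made-up placeholders (not real data), and the helper name write_sample is hypothetical and not part of Immprint.

```python
import csv
import io

# Placeholder nucleotide sequences, for illustration only (not real data).
rows = [
    {"cdr3_nt": "TGTGCCAGCAGTTTAGGGACAGATACGCAGTATTTT", "count": 12},
    {"cdr3_nt": "TGTGCCAGCAGCCAAGGGGGCTACGAGCAGTACTTC", "count": 3},
]


def write_sample(handle, rows):
    """Write rows to an open text handle as CSV with a header line.

    Hypothetical helper: the column names follow the README, the
    comma delimiter is an assumption.
    """
    writer = csv.DictWriter(handle, fieldnames=["cdr3_nt", "count"])
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer so the sketch has no side effects.
buffer = io.StringIO()
write_sample(buffer, rows)
sample_text = buffer.getvalue()
print(sample_text)
```

To produce a file on disk, the same helper could be called with a handle from open("sampleA.csv", "w", newline=""). The count column can presumably be left out when clonal frequencies are not available, since the README describes it as optional.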

Additional arguments:

  • -h, --help: Print the help message.
  • -S, --no-I: Don't use the recombination probabilities (faster but less precise).
  • -n, --no-counts: Don't use the counts, even if provided.
  • -f, --full: Use both chains of the receptor. In that case, every element of the cdr3_nt column must follow this format: TRA:TGCA...ACCTTT;TRB:TGTGC...TCTTC.
  • -g, --gamma: Modify the value of gamma, which influences the precision of Immprint (only when using the recombination probabilities).
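The paired-chain format expected with the full-receptor option can be sketched as follows. This is only an illustration of the TRA:...;TRB:... layout described above. The helpers format_full and parse_full are hypothetical names and not part of Immprint, and the sequences are short placeholders.

```python
def format_full(tra_nt, trb_nt):
    """Join alpha and beta CDR3 nucleotide sequences into one cdr3_nt entry."""
    return f"TRA:{tra_nt};TRB:{trb_nt}"


def parse_full(entry):
    """Split a 'TRA:<seq>;TRB:<seq>' entry into a dict keyed by chain name."""
    chains = {}
    for part in entry.split(";"):
        name, sep, seq = part.partition(":")
        # Reject parts without a colon or with an unexpected chain label.
        if not sep or name not in ("TRA", "TRB"):
            raise ValueError(f"unexpected chain field: {part!r}")
        chains[name] = seq
    if set(chains) != {"TRA", "TRB"}:
        raise ValueError("entry must contain both a TRA and a TRB field")
    return chains


# Placeholder sequences, for illustration only (not real data).
entry = format_full("TGTGCAGTT", "TGTGCCAGC")
print(entry)
print(parse_full(entry))
```

A check like parse_full can be run over every row before calling the tool, so that malformed entries are caught early rather than inside the analysis.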

Examples

The folder "examples" contains a few test datasets. From the base folder:

	immprint examples/P1_a.csv examples/P1_b.csv 

Or with full receptors:

	immprint --full examples/P1_full_a.csv examples/P1_full_b.csv

Caveats

  • Only works for human TCRs, and only for the beta chain in single-chain mode.
  • For datasets sequenced on the same lane, with barcodes, some amount of contamination is expected. This shouldn't influence immprint, except potentially for small samples when using the full receptor.

Companion paper

The GitHub repository associated with the paper "Immune fingerprinting" can be found here.