10XGenomics/cellranger

mkref gtf errors after v4.0.0

Closed this issue · 2 comments

cap76 commented

I want to use cellranger to analyse a multiplexed dataset (i.e. with cellranger multi), so I need a newer version (roughly 6 or later). However, from version 5.0.0 onwards, cellranger mkref does not appear to be able to generate a reference. If I run:

cellranger mkref --genome=GRCh38 --fasta=genome.fa --genes=genes.gtf

I get the following error:

Creating new reference folder at /bi/group/rugggunn/Implantation/GRCh38_v7
...done

Writing genome FASTA file into reference folder...
...done

Indexing genome FASTA file...
...done

Writing genes GTF file into reference folder...
mkref has failed: error building reference package
Error while parsing GTF file
Invalid contig name encountered on GTF line 276013: chr10_GL383545v1_alt. The FASTA file has contigs:
[ ....]
Please fix your GTF and start again.

I tried versions 5.0.0, 6.0.0, and 7.0.0, and they all throw the same error. This is a standard, albeit quite old, GTF file for GRCh38 that runs fine with cellranger 2.0.0-4.0.0. Has anyone encountered this error before, and if so, is there a fix?

evolvedmicrobe commented

This behavior is expected and was added to better detect mismatches between GTF and FASTA files. You can likely verify that by checking that this contig isn't in the FASTA file; the following should print 0 if it is absent:

grep -c 'chr10_GL383545v1_alt' genome.fa
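
To check every contig at once rather than one name at a time, something like the following sketch could help. It is not part of cellranger; the function names are made up for illustration, and it assumes that the contig name is the first whitespace-delimited token after `>` in each FASTA header, which may not match exactly how mkref parses names.

```python
def fasta_contigs(fasta_lines):
    """Collect contig names from FASTA header lines.

    Assumption: the name is the first whitespace-delimited token after '>'.
    """
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_contigs(gtf_lines):
    """Map each contig in GTF column 1 to the first line (1-based) it appears on."""
    first_seen = {}
    for lineno, line in enumerate(gtf_lines, start=1):
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        contig = line.split("\t", 1)[0]
        first_seen.setdefault(contig, lineno)
    return first_seen


def missing_contigs(fasta_lines, gtf_lines):
    """Return {contig: first GTF line number} for contigs absent from the FASTA."""
    known = fasta_contigs(fasta_lines)
    return {c: n for c, n in gtf_contigs(gtf_lines).items() if c not in known}


def report(fasta_path, gtf_path):
    """Print each GTF contig that has no matching FASTA record."""
    with open(fasta_path) as fa, open(gtf_path) as gtf:
        missing = missing_contigs(fa, gtf)
    for contig, lineno in sorted(missing.items(), key=lambda kv: kv[1]):
        print(f"{contig}\tfirst seen on GTF line {lineno}")
```

Calling `report("genome.fa", "genes.gtf")` would list every offending contig together with the first GTF line it appears on, so all mismatches can be fixed in one pass instead of rerunning mkref after each error.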

You can either modify your FASTA or GTF file so that this contig is included in both or excluded from both, or download one of our pre-built references.
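
As a rough sketch of the "exclude" option, the snippet below drops GTF records whose contig has no FASTA entry. The function names and the output filename are hypothetical, and it again assumes the contig name is the first token after `>` in the FASTA header. Note that this discards any annotations on the removed contigs, so it only makes sense if those regions are not needed.

```python
def filter_gtf_lines(gtf_lines, allowed_contigs):
    """Yield GTF lines whose contig (column 1) is in allowed_contigs.

    Comment and blank lines are passed through unchanged.
    """
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            yield line
        elif line.split("\t", 1)[0] in allowed_contigs:
            yield line


def write_filtered_gtf(fasta_path, gtf_in_path, gtf_out_path):
    """Write a copy of the GTF restricted to contigs present in the FASTA."""
    with open(fasta_path) as fa:
        # Assumption: contig name = first whitespace-delimited token after '>'.
        allowed = {
            line[1:].split()[0]
            for line in fa
            if line.startswith(">") and line[1:].strip()
        }
    with open(gtf_in_path) as src, open(gtf_out_path, "w") as dst:
        dst.writelines(filter_gtf_lines(src, allowed))
```

For example, `write_filtered_gtf("genome.fa", "genes.gtf", "genes.filtered.gtf")` would produce a filtered file that can then be passed to mkref via `--genes=genes.filtered.gtf`.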

cap76 commented

Perfect, thanks for the quick response @evolvedmicrobe, that saved me a lot of head scratching :-)