The SG-NEx project is an international collaboration initiated at the Genome Institute of Singapore to provide reference transcriptomes for 5 of the most commonly used cancer cell lines using Nanopore long-read RNA-Seq data.
Transcriptome profiling is done using PCR-cDNA sequencing ("PCR-cDNA"), amplification-free cDNA sequencing ("direct cDNA"), direct sequencing of native RNA ("direct RNA"), and short-read RNA-Seq. All samples are sequenced with at least 3 high-quality replicates. For a subset of samples, spike-in RNAs are included and matched m6A profiling is available.
The raw, aligned, and processed data is hosted on the AWS Open Data Registry (see below for data access and analysis tutorials).
- Email list
- Latest Data Release and Access
- Browse the data
- Data Processing
- Use Cases and Applications
- Data Analysis Tutorials
- Contributors
- Citing the SG-NEx project
- Contact
You can sign up for the sg-nex-updates email list to receive notifications about upcoming data releases:
https://groups.google.com/forum/#!forum/sg-nex-updates/join
Latest Release (v0.3)
This release includes 86 samples from 11 different cell lines.
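To check counts such as the number of cell lines, or the number of replicates per protocol, against the sample spreadsheet, the sample names can be parsed and tallied. The sketch below assumes a naming scheme of the form `SGNex_<cellLine>_<protocol>_replicate<N>_run<M>`; that pattern, the function name `summarise_samples`, and the example names are illustrative assumptions, so adapt them to the names actually listed in the spreadsheet.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) naming scheme, for illustration only:
#   SGNex_<cellLine>_<protocol>_replicate<N>_run<M>
SAMPLE_PATTERN = re.compile(
    r"^SGNex_(?P<cell_line>[^_]+)_(?P<protocol>[^_]+)"
    r"_replicate(?P<replicate>\d+)_run(?P<run>\d+)$"
)


def summarise_samples(sample_names):
    """Return the set of cell lines and the number of distinct replicates
    per (cell line, protocol) for names matching the assumed scheme."""
    cell_lines = set()
    replicates = defaultdict(set)
    for name in sample_names:
        match = SAMPLE_PATTERN.match(name)
        if match is None:
            continue  # skip names that do not follow the assumed scheme
        cell_lines.add(match["cell_line"])
        key = (match["cell_line"], match["protocol"])
        replicates[key].add(int(match["replicate"]))
    return cell_lines, {key: len(reps) for key, reps in replicates.items()}


# Illustrative names only; real names come from the sample spreadsheet.
example_names = [
    "SGNex_A549_directRNA_replicate1_run1",
    "SGNex_A549_directRNA_replicate2_run1",
    "SGNex_A549_directRNA_replicate3_run1",
    "SGNex_K562_directcDNA_replicate1_run1",
    "SGNex_K562_directcDNA_replicate1_run2",
]
cell_lines, replicate_counts = summarise_samples(example_names)
print(sorted(cell_lines))
print(replicate_counts)
```

Multiple runs of the same replicate are counted once, so the tally reflects replicates rather than sequencing runs.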
Data Access
You can access the following data through the AWS Open Data Registry:
- raw files (fast5)
- raw files (blow5)
- basecalled files (fastq)
- aligned reads (genome and transcriptome) (bam)
- tracks for visualisation (bigwig and bigbed)
- processed data for differential RNA modification analysis (json, for use with xPore)
- processed data for identification of m6A (json, for use with m6Anet)
- annotation files
- detailed sample and experiment information
You can browse the S3 data here.
Please refer to the data access tutorial, which describes the S3 data structure and how to access files with the AWS CLI. The direct links to the data are listed in the sample spreadsheet.
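As a rough illustration of AWS CLI access, the following Python sketch builds `aws s3 ls` and `aws s3 sync` commands for a public bucket. It rests on several assumptions: that the bucket is named `sg-nex-data` (mirroring the registry entry), that it allows unsigned requests, and that files carry the suffixes shown. The prefix `some/prefix/` and the helper names are hypothetical. The data access tutorial remains the authoritative description of the bucket layout.

```python
import shlex

# Assumption: the bucket name mirrors the registry entry "sg-nex-data".
# Confirm the actual bucket and prefixes in the data access tutorial.
BUCKET = "sg-nex-data"

# Glob patterns per data type, following the list of available data.
# The exact suffixes used in the bucket are assumptions.
PATTERNS = {
    "fast5": "*.fast5",
    "blow5": "*.blow5",
    "fastq": "*.fastq*",
    "bam": "*.bam",
    "bigwig": "*.bigwig",
    "bigbed": "*.bigbed",
    "json": "*.json",
}


def list_command(prefix=""):
    """Build an `aws s3 ls` command for a prefix of the bucket."""
    return ["aws", "s3", "ls", "--no-sign-request", f"s3://{BUCKET}/{prefix}"]


def download_command(prefix, data_type, destination="."):
    """Build an `aws s3 sync` command restricted to one data type."""
    return [
        "aws", "s3", "sync", "--no-sign-request",
        f"s3://{BUCKET}/{prefix}", destination,
        # Exclude everything first, then re-include the wanted pattern.
        "--exclude", "*", "--include", PATTERNS[data_type],
    ]


# Print shell-ready commands; "some/prefix/" is a placeholder path.
print(shlex.join(list_command()))
print(shlex.join(download_command("some/prefix/", "bam", "bam_files")))
```

The commands are only printed here, so they can be inspected before anything is downloaded. Raw signal files in particular can be very large, so listing a prefix before syncing it is a sensible first step.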
Citation: Please cite the pre-print describing the SG-NEx data resource when using these data, and add the following details: "The SG-NEx data was accessed on [DATE] at registry.opendata.aws/sg-nex-data".
Chen, Ying, et al. "A systematic benchmark of Nanopore long read RNA sequencing for transcript level analysis in human cell lines." bioRxiv (2021). doi: https://doi.org/10.1101/2021.04.21.440736
Release History
You can find previous releases in the release history.
You can now browse the data using the UCSC Genome Browser:
View the SG-NEx data in the UCSC Genome Browser
By default, only selected tracks are shown, but you can visualise all reads (bigbed tracks) and their coverage (bigwig tracks) for each individual sample.
All data was aligned against the human genome version GRCh38 (please refer to the data access tutorial for reference files). We collaborated with nf-core to develop nanoseq, a standardized pipeline for Nanopore RNA-Seq data processing.
You can browse a list of articles that review or use the SG-NEx data here. If you have used the data for your own research, feel free to add a publication entry.
The following short tutorials are available that demonstrate how to analyse the SG-NEx data:
- Transcript discovery and quantification of SG-NEx samples (using Bambu)
- Analysing differential RNA modifications of SG-NEx samples (using xPore)
- Identification of m6A with the SG-NEx samples (using m6Anet)
Additional, more detailed workflows can be found here:
- Identification of differential RNA modifications using a METTL3 knockout cell line (using xPore)
- Analysing transcriptome-wide m6A modifications (using m6Anet)
GIS Sequencing Platform and Data Generation
Hwee Meng Low, Yao Fei, Sarah Ng, Wendy Soon, CC Khor
Cancer Genomics and RNA Modifications
Viktoriia Iakovleva, Puay Leng Lee, Lixia Xin, Hui En Vanessa Ng, Jia Min Loo, Xuewen Ong, Hui Qi Amanda Ng, Suk Yeah Polly Poon, Hoang-Dai Tran, Kok Hao Edwin Lim, Huck Hui Ng, Boon Ooi Patrick Tan, Huck-Hui Ng, N. Gopalakrishna Iyer, Wai Leong Tam, Wee Joo Chng, Leilei Chen, Ramanuj DasGupta, Yun Shen Winston Chan, Qiang Yu, Torsten Wüstefeld, Wee Siong Sho Goh
Statistical Modeling and Data Analytics
Ying Chen, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Min Hao Ling, Yu Song Chuah, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlack, Jonathan Göke
The SG-NEx resource is described in:
Chen, Ying, et al. "A systematic benchmark of Nanopore long read RNA sequencing for transcript level analysis in human cell lines." bioRxiv (2021). doi: https://doi.org/10.1101/2021.04.21.440736
Please cite this pre-print when using these data, and add the following details: "The SG-NEx data was accessed on [DATE] at registry.opendata.aws/sg-nex-data".
Questions about SG-NEx? Please add an entry in the Discussions Forum. You can also contact Jonathan Göke.