PXD023373 is an original dataset announced via ProteomeXchange.
Dataset Summary
Title | Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer |
Description | Deregulated gene expression is a hallmark of cancer; however, most studies to date have analyzed short-read RNA-sequencing data with inherent limitations. Here, we combine PacBio long-read isoform sequencing (Iso-Seq) and Illumina paired-end short-read RNA sequencing to comprehensively survey the transcriptome of gastric cancer (GC), a leading cause of global cancer mortality. We performed full-length transcriptome analysis across 10 GC cell lines covering four major GC molecular subtypes (chromosomal unstable, Epstein-Barr positive, genome stable and microsatellite unstable). We identify 60,239 non-redundant full-length transcripts, of which >66% are novel compared to current transcriptome databases. Novel isoforms are more likely to be cell-line and subtype specific, and are expressed at lower levels with a larger number of exons and longer isoform/coding sequence lengths. Most novel isoforms utilize an alternate first exon, and compared to other alternative splicing categories are expressed at higher levels and exhibit higher variability. Collectively, we observe alternate promoter usage in 25% of detected genes, with the majority (84.2%) of known/novel promoter pairs exhibiting potential changes in their coding sequences. Mapping these alternate promoters to TCGA GC samples, we identify several cancer-associated isoforms, including novel variants of oncogenes. Tumor-specific transcript isoforms tend to alter protein coding sequences to a larger extent than other isoforms. Analysis of outcome data suggests that novel isoforms may impart additional prognostic information. Our results provide a rich resource of full-length transcriptome data for deeper studies of GC and other gastrointestinal malignancies. |
HostingRepository | PRIDE |
AnnounceDate | 2021-02-02 |
AnnouncementXML | Submission_2021-02-01_20:08:46.xml |
DigitalObjectIdentifier | |
ReviewLevel | Peer-reviewed dataset |
DatasetOrigin | Original dataset |
RepositorySupport | Unsupported dataset by repository |
PrimarySubmitter | Dennis Kappei |
SpeciesList | scientific name: Homo sapiens (Human); NCBI TaxID: 9606; |
ModificationList | No PTMs are included in the dataset |
Instrument | Q Exactive HF |
Dataset History
Revision | Datetime | Status | ChangeLog Entry |
0 | 2021-01-04 02:39:06 | ID requested | |
⏵ 1 | 2021-02-01 20:08:47 | announced | |
Publication List
Huang KK, Huang J, Wu JKL, Lee M, Tay ST, Kumar V, Ramnarayanan K, Padmanabhan N, Xu C, Tan ALK, Chan C, Kappei D, Göke J, Tan P, Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer. Genome Biol, 22(1):44(2021) [pubmed] |
Keyword List
submitter keyword: Gastric Cancer, long-read sequencing, alternative promoters, isoform sequencing (Iso-Seq) |
Contact List
Dennis Kappei |
contact affiliation | Cancer Science Institute of Singapore, National University of Singapore |
contact email | dennis.kappei@nus.edu.sg |
lab head | |
Dennis Kappei |
contact affiliation | Cancer Science Institute of Singapore |
contact email | dennis.kappei@nus.edu.sg |
dataset submitter | |
Full Dataset Link List
Dataset FTP location
ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2021/02/PXD023373 |
PRIDE project URI |
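Because browsers no longer open FTP links natively, the dataset directory can also be listed from a script. The following is a minimal sketch using Python's standard `ftplib`, not an official PRIDE client. It assumes the server accepts anonymous login, and the helper names `split_ftp_url` and `list_dataset_files` are hypothetical, introduced only for illustration.

```python
from ftplib import FTP
from urllib.parse import urlparse

# FTP location taken from the dataset record above.
FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2021/02/PXD023373"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path


def list_dataset_files(url=FTP_URL, timeout=30):
    """Return the names in the dataset directory (needs network access).

    Assumption: the server permits anonymous login.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()      # no arguments means anonymous login
        ftp.cwd(path)    # change into the dataset directory
        return ftp.nlst()


if __name__ == "__main__":
    for name in list_dataset_files():
        print(name)
```

Run as a script, this prints one file name per line. Individual files could then be fetched with `FTP.retrbinary`, or with a dedicated FTP client such as FileZilla.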
Repository Record List
- PRIDE
- PXD023373
- Label: PRIDE project
- Name: Long-read transcriptome sequencing reveals abundant promoter diversity in distinct molecular subtypes of gastric cancer