
PXD008960-3

PXD008960 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: Improvements to the rice genome annotation through large-scale analysis of RNA-Seq and proteomics data sets
Description: We have performed a proteogenomics meta-analysis of data sets deposited in ProteomeXchange (PXD000265, PXD000313, PXD000923, PXD001030, PXD001058, PXD002291, PXD002739, PXD002740 and PXD003156) together with 29 RNA-Seq data sets on rice (Oryza sativa). We created a search database comprising translated reads that had been mapped onto the rice genome, as well as officially annotated rice protein sequences. The RNA-Seq database was pre-processed to identify “novel transcripts”, i.e. reads not mapping fully to an existing exon, and “novel junctions”, i.e. reads mapped with a gap, implying a potential novel splice site that was not annotated in the official gene set. Confidently identified “novel peptides”, i.e. those mapping to a novel junction or novel transcript, were post-processed to ensure that there was no better explanation for the corresponding spectra, e.g. a peptide from a canonical gene with a modification or amino acid substitution. Data were exported from the pipeline in PSI mzIdentML 1.2 format, containing chromosomal coordinates, and further converted to PSI proBed format for genome visualisation. Novel peptides were searched against other plant databases using BLAST to see whether they had been predicted in genes from other species. A total of 1584 novel peptides were identified, mapping to ~700 genomic loci in which either new genes have been predicted (~100) or updates to existing gene models have been predicted (~600).
HostingRepository: PRIDE
AnnounceDate: 2024-10-22
AnnouncementXML: Submission_2024-10-22_04:48:09.660.xml
DigitalObjectIdentifier: https://dx.doi.org/10.6019/PXD008960
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Supported dataset by repository
PrimarySubmitter: Da Qi
SpeciesList: scientific name: Oryza sativa (Rice); NCBI TaxID: 4530
ModificationList: Ammonia-loss; iTRAQ8plex:13C(6)15N(2); Oxidation; Carbamidomethyl; Gly->Val; Gln->pyro-Glu; Phospho; Glu->pyro-Glu; Deamidated; cysTMT6plex; iTRAQ8plex; Acetyl; Sulfo; Trimethyl
Instrument: LTQ Orbitrap Velos; TripleTOF 5600; Q Exactive; LTQ Orbitrap
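The pre-processing step in the Description, which separates reads into “novel junctions” and “novel transcripts”, can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' pipeline: the `AlignedRead` type, the `classify_read` function, the 1-based inclusive coordinates, and the rule that a novel junction takes precedence over a novel transcript are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

# Hypothetical convention: 1-based inclusive (start, end) on a single chromosome.
Interval = Tuple[int, int]


@dataclass
class AlignedRead:
    """A read alignment as ordered aligned blocks; more than one block means a gap."""
    blocks: List[Interval]


def classify_read(read: AlignedRead,
                  exons: List[Interval],
                  junctions: Set[Interval]) -> str:
    """Label a read 'novel junction', 'novel transcript', or 'annotated'.

    `exons` are annotated exon intervals; `junctions` are annotated splice
    junctions given as (last base of upstream exon, first base of downstream exon).
    """
    # Each gap between consecutive blocks is a candidate splice junction.
    gaps = [(left[1], right[0]) for left, right in zip(read.blocks, read.blocks[1:])]
    if any(gap not in junctions for gap in gaps):
        # Assumption: an unannotated gap is reported before exon containment is checked.
        return "novel junction"

    def inside_exon(block: Interval) -> bool:
        return any(start <= block[0] and block[1] <= end for start, end in exons)

    # A read not mapping fully to existing exons is treated as a novel transcript.
    if not all(inside_exon(block) for block in read.blocks):
        return "novel transcript"
    return "annotated"
```

For instance, with exons `[(100, 200), (300, 400)]` and the single annotated junction `(200, 300)`, a read with blocks `[(150, 180), (300, 350)]` implies the unannotated junction `(180, 300)`, while an ungapped read at `(190, 250)` runs past the end of the first exon.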
Dataset History
Revision | Datetime | Status | ChangeLog Entry
0 | 2018-02-15 01:12:25 | ID requested |
1 | 2018-10-30 09:30:03 | announced |
2 | 2019-02-15 09:06:41 | announced | Updated publication reference for PubMed record(s): 30293062.
3 | 2024-10-22 04:48:18 | announced | 2024-10-22: Updated project metadata.
Publication List
Dataset DOI: 10.6019/PXD008960
Ren Z, Qi D, Pugh N, Li K, Wen B, Zhou R, Xu S, Liu S, Jones AR. Improvements to the Rice Genome Annotation Through Large-Scale Analysis of RNA-Seq and Proteomics Data Sets. Mol Cell Proteomics, 18(1):86-98 (2019). DOI: 10.1074/mcp.RA118.000832
Keyword List
curator keyword: Technical, Biological
submitter keyword: Rice, Proteogenomics, Proteomics, RNA-Seq, Big data analysis
Contact List
Andrew R Jones (lab head)
contact affiliation: Institute of Integrative Biology, University of Liverpool
contact email: andrew.jones@liverpool.ac.uk
Da Qi (dataset submitter)
contact affiliation: BGI-Shenzhen
contact email: qida@genomics.cn
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers have discontinued native support for FTP access within the browser window. To follow this FTP link, install a separate FTP application (we recommend FileZilla) and configure your browser to launch it, or open the application directly and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2018/10/PXD008960
PRIDE project URI
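As an alternative to a graphical FTP client, the FTP location given above can also be reached from a script. The following is a minimal sketch using Python's standard `ftplib`; the helper names `split_ftp_url` and `list_dataset_files` are hypothetical, and the sketch assumes the server accepts anonymous logins, which this page does not state.

```python
from ftplib import FTP
from typing import List, Tuple
from urllib.parse import urlparse

# FTP address as given in the note on this page.
FTP_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2018/10/PXD008960"


def split_ftp_url(url: str) -> Tuple[str, str]:
    """Split an ftp:// URL into (host, directory path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    return parts.hostname, parts.path


def list_dataset_files(url: str = FTP_URL, timeout: float = 30.0) -> List[str]:
    """Return the file names in the dataset directory.

    Assumption: anonymous login is permitted by the server.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()    # no arguments means an anonymous login
        ftp.cwd(path)  # change to the dataset directory
        return ftp.nlst()
```

Calling `list_dataset_files()` opens a network connection, so it is left out of the block itself; `split_ftp_url` can be checked offline.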