
PXD010618

PXD010618 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: Peptimapper
Description: Background: Accurate structural annotation of genomes is still a challenge, despite the progress made over the past decade. The prediction of gene structure remains difficult, especially for eukaryotic species, and is often erroneous and incomplete. We used a proteogenomics strategy, taking advantage of the combination of proteomics datasets and bioinformatics tools, to identify novel protein-coding genes and splice isoforms, assign correct start sites, and validate predicted exons and genes. Results: Our proteogenomics workflow, Peptimapper, was applied to the genome annotation of Ectocarpus siliculosus, a key reference genome for both the brown algal lineage and stramenopiles. We generated proteomics data from various life cycle stages of Ectocarpus strains and sub-cellular fractions using a shotgun approach. First, we directly generated peptide sequence tags (PSTs) from the proteomics data. Second, we mapped PSTs onto the translated genomic sequence. Closely located hits (i.e., PST locations on the genome) were then clustered to detect potential coding regions based on parameters optimized for the organism. Third, we evaluated each cluster and compared it to gene predictions from existing conventional genome annotation approaches. Finally, we integrated cluster locations into GFF files for display in a genome viewer. We identified two potential novel genes, a ribosomal protein L22 and an aryl sulfotransferase, and corrected the gene structure of a dihydrolipoamide acetyltransferase. We experimentally validated the results by RT-PCR and using transcriptomics data. Conclusions: Peptimapper is a complementary tool for the expert annotation of genomes. It is suitable for any organism and is distributed through a Docker image available on two public bioinformatics Docker repositories: Docker Hub and BioShaDock. The workflow is also accessible through the Galaxy framework, for use by non-computer scientists, at https://galaxy.protim.eu.
HostingRepository: PRIDE
AnnounceDate: 2019-01-22
AnnouncementXML: Submission_2019-01-22_06:19:30.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: cloarec laetitia
SpeciesList: scientific name: Ectocarpus siliculosus (Brown alga); NCBI TaxID: 2880
ModificationList: monohydroxylated residue; iodoacetamide derivatized residue
Instrument: LTQ Orbitrap XL
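The Description above says that closely located PST hits are clustered and that cluster locations are written to GFF files. The following is a minimal sketch of that idea, not Peptimapper's actual implementation: the `Hit` type, the `max_gap` threshold, the `pst_sketch` source label, and the `hits` attribute are all hypothetical names chosen for illustration, and the real workflow uses parameters optimized for the organism.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Hit:
    """One PST location on the genome (hypothetical type for this sketch)."""
    seqid: str   # scaffold or chromosome name
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def cluster_hits(hits: List[Hit], max_gap: int = 1000) -> List[List[Hit]]:
    """Group hits on the same sequence and strand whose gap is <= max_gap.

    max_gap is an assumed, illustrative threshold in base pairs.
    """
    clusters: List[List[Hit]] = []
    for hit in sorted(hits, key=lambda h: (h.seqid, h.strand, h.start)):
        if clusters:
            last = clusters[-1]
            same_region = (last[-1].seqid == hit.seqid
                           and last[-1].strand == hit.strand)
            cluster_end = max(h.end for h in last)
            if same_region and hit.start - cluster_end <= max_gap:
                last.append(hit)
                continue
        clusters.append([hit])
    return clusters


def cluster_to_gff(cluster: List[Hit], cluster_id: str) -> str:
    """Format one cluster as a nine-column, tab-separated GFF3 line."""
    first = cluster[0]
    start = min(h.start for h in cluster)
    end = max(h.end for h in cluster)
    columns = [
        first.seqid,
        "pst_sketch",  # source column: hypothetical label
        "region",      # feature type
        str(start),
        str(end),
        ".",           # score: not computed in this sketch
        first.strand,
        ".",           # phase: not applicable
        f"ID={cluster_id};hits={len(cluster)}",
    ]
    return "\t".join(columns)
```

Given a list of `Hit` objects, `cluster_hits(hits)` returns the groups and `cluster_to_gff(group, "cluster1")` turns each group into a line that a genome viewer accepting GFF3 could load.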
Dataset History
Revision  Datetime             Status        ChangeLog Entry
0         2018-07-30 06:00:36  ID requested
1         2019-01-22 06:19:31  announced
Publication List
Guillot L, Delage L, Viari A, Vandenbrouck Y, Com E, Ritter A, Lavigne R, Marie D, Peterlongo P, Potin P, Pineau C. Peptimapper: proteogenomics workflow for the expert annotation of eukaryotic genomes. BMC Genomics, 20(1):56 (2019)
Keyword List
curator keyword: Technical
submitter keyword: bioinformatics, genome annotation, peptide sequence tag, proteogenomics, proteomics, tandem mass spectrometry
Contact List
Laetitia Guillot
contact affiliation: Protim, Univ Rennes, Inserm, EHESP, Irset – UMR_S 1085
contact email: laetitia.guillot@univ-rennes1.fr
role: lab head

cloarec laetitia
contact affiliation: université de rennes
contact email: laetitia.guillot@univ-rennes1.fr
role: dataset submitter
Full Dataset Link List
Dataset FTP location
NOTE: Most web browsers have discontinued native support for FTP access within the browser window. You can usually install a separate FTP app (we recommend FileZilla) and configure your browser to launch it when you click on this FTP link. Alternatively, open an app that supports FTP (such as FileZilla) and use this address: ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2019/01/PXD010618
PRIDE project URI
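
As an alternative to a graphical FTP client, the dataset FTP address given in the note above can be read from a script. The sketch below uses Python's standard `ftplib`. It assumes the server accepts anonymous logins and that the dataset directory is flat; the function names `split_ftp_url` and `download_dataset` are hypothetical helpers, not part of any PRIDE tooling.

```python
from ftplib import FTP, error_perm
from pathlib import Path
from typing import List, Tuple
from urllib.parse import urlparse

# Address copied from the dataset record.
DATASET_URL = "ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2019/01/PXD010618"


def split_ftp_url(url: str) -> Tuple[str, str]:
    """Split an ftp:// URL into (host, remote directory)."""
    parsed = urlparse(url)
    if parsed.scheme != "ftp" or not parsed.hostname:
        raise ValueError(f"not an FTP URL: {url}")
    return parsed.hostname, parsed.path


def download_dataset(url: str, dest_dir: str) -> List[Path]:
    """Download every plain file in the remote directory to dest_dir.

    Assumes anonymous login is allowed. Entries that cannot be
    retrieved as files (for example subdirectories) are skipped.
    """
    host, remote_dir = split_ftp_url(url)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved: List[Path] = []
    with FTP(host) as ftp:
        ftp.login()          # anonymous login
        ftp.cwd(remote_dir)
        for name in ftp.nlst():
            target = dest / Path(name).name
            try:
                with open(target, "wb") as fh:
                    ftp.retrbinary(f"RETR {name}", fh.write)
            except error_perm:
                target.unlink(missing_ok=True)  # drop the empty placeholder
                continue
            saved.append(target)
    return saved
```

Calling `download_dataset(DATASET_URL, "PXD010618")` would then fetch the files into a local `PXD010618` directory, network access permitting.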