
PXD074868

PXD074868 is an original dataset announced via ProteomeXchange.

Dataset Summary
Title: Codon Optimization Depletes Stop Codons in Alternative Reading Frames of Protein-Coding Nucleic Acid Therapeutics
Description: Out-of-frame translation events, arising from ribosomal frameshifting or non-canonical initiation, are an intrinsic property of the translational machinery that cannot be fully prevented. The consequences of such events depend on the distribution of stop codons in alternative reading frames, which determines the permissiveness of those frames: whether out-of-frame translation terminates quickly or generates extended protein products. Here, using quantitative dual-fluorescence reporters, we demonstrate that stop codons in alternative reading frames function as molecular checkpoints that terminate out-of-frame translation and prevent product accumulation. Genome-wide analysis across ten organisms reveals that natural coding sequences maintain dense stop codon distributions in alternative frames, with a median spacing of approximately 20 amino acids. We then show that codon optimization, the standard method for enhancing translation, systematically depletes this safeguard. Because all three stop codons (UAA, UAG, UGA) begin with uridine, and optimal codons exclude uridine from third positions, stop codons in the −1 reading frame become structurally impossible in codon-optimized sequences. Analysis of 120 therapeutic sequences, including FDA-approved COVID-19 mRNA vaccines, confirms widespread −1 frame stop codon depletion: out-of-frame products average 164 amino acids, six-fold longer than in natural human genes. We demonstrate that strategic restoration of stop codons in alternative reading frames through synonymous substitutions eliminates out-of-frame products detectable by mass spectrometry while preserving the intended protein. This approach requires only informed sequence design, with no changes to manufacturing or the regulatory framework. Our findings establish stop codon distribution in alternative reading frames as a critical design parameter for protein-coding nucleic acid therapeutics.
HostingRepository: iProX
AnnounceDate: 2026-02-24
AnnouncementXML: Submission_2026-02-25_00:06:12.744.xml
DigitalObjectIdentifier:
ReviewLevel: Peer-reviewed dataset
DatasetOrigin: Original dataset
RepositorySupport: Unsupported dataset by repository
PrimarySubmitter: Zheling Liu
SpeciesList: scientific name: Homo sapiens; NCBI TaxID: 9606
ModificationList: No PTMs are included in the dataset
Instrument: Orbitrap Astral
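
The uridine argument in the Description can be checked on a toy example. The sketch below is illustrative only: the two short sequences are made up for this purpose (they are not taken from the dataset), and the helper names `frame_codons`, `stop_indices`, and `third_position_uridines` are hypothetical. It assumes frame 0 starts at index 0 and that the −1 frame is read starting from the third nucleotide of the first in-frame codon.

```python
# The three stop codons of the standard genetic code (RNA alphabet).
STOP_CODONS = {"UAA", "UAG", "UGA"}


def frame_codons(mrna, shift):
    """Return the codons read when the frame is shifted by `shift` nucleotides.

    shift = 0 is the intended frame, -1 and +1 are the alternative frames.
    A -1 shift starts at index 2, i.e. on the third position of an in-frame codon.
    """
    start = shift % 3  # -1 -> 2, 0 -> 0, +1 -> 1
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


def stop_indices(mrna, shift):
    """Return nucleotide indices where a stop codon begins in the shifted frame."""
    start = shift % 3
    return [i for i in range(start, len(mrna) - 2, 3)
            if mrna[i:i + 3] in STOP_CODONS]


def third_position_uridines(mrna):
    """Return indices of in-frame third codon positions that carry a uridine."""
    return [i for i in range(2, len(mrna), 3) if mrna[i] == "U"]


# Made-up example: Met-Ala-Glu-Ala, with alanine encoded as GCU or GCC.
natural_like = "AUGGCUGAAGCC"    # GCU puts a uridine at a third position
optimized_like = "AUGGCCGAAGCC"  # synonymous GCC removes that uridine

print(frame_codons(natural_like, -1))    # ['GGC', 'UGA', 'AGC']
print(stop_indices(natural_like, -1))    # [5]
print(stop_indices(optimized_like, -1))  # []
```

Every −1 frame codon starts on an in-frame third position, so a sequence for which `third_position_uridines` returns an empty list cannot contain a −1 frame stop codon. The same single synonymous change read in reverse (GCC to GCU) reintroduces a −1 frame stop without altering the encoded protein.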
Dataset History
Revision  Datetime             Status        ChangeLog Entry
0         2026-02-25 00:05:48  ID requested
1         2026-02-25 00:06:13  announced
Publication List
Dataset with its publication pending
Keyword List
submitter keyword: Protein-Coding Nucleic Acid Therapeutics, Out-of-frame products, Frameshifting risk
Contact List
Weirui Ma (lab head)
contact affiliation: Zhejiang University
contact email: maweirui@zju.edu.cn
Zheling Liu (dataset submitter)
contact affiliation: Zhejiang University
contact email: 12107081@zju.edu.cn
Full Dataset Link List
iProX dataset URI