Four classic “de novo” genes all have plausible homologs and likely evolved from retro-duplicated or pseudogenic sequences

  • Original Article
Molecular Genetics and Genomics

Abstract

Despite previously being regarded as extremely unlikely, the idea that entirely novel protein-coding genes can emerge from non-coding sequences has gradually become accepted over the past two decades. Examples of “de novo origination”, resulting in lineage-specific “orphan” genes that lack coding orthologs, are now reported every year. However, many are likely to be duplicates that are difficult to recognize. Here, I re-examine the claims and show that four very well-known examples of genes alleged to have emerged completely “from scratch” (FLJ33706 in humans, Goddard in fruit flies, BSC4 in baker’s yeast and AFGP2 in codfish) may have plausible evolutionary ancestors in pre-existing genes. The first two are likely highly diverged retrogenes coding for regulatory proteins that have been misidentified as orphans. The antifreeze glycoprotein, moreover, may not have evolved from repetitive non-genic sequences but, as in several other related cases, from an apolipoprotein that could have become pseudogenized before later being reactivated. These findings detract from various claims made about de novo gene birth and show that there has been a tendency not to invest the necessary effort in searching for homologs beyond a very limited syntenic or phylostratigraphic methodology. To improve detection, a robust approach is used that draws upon similarities not just in statistical sequence analysis but also in biochemistry and function, so as to avoid notable failures to identify homologs.

Data availability

Not applicable.

Acknowledgements

This study received no funding or support. I am, however, grateful for many relevant comments shared directly with me by Dr. Caroline Weisman of Princeton University. I would also like to thank Prof. Ralph Bundschuh of Ohio State University for allowing me to reproduce mathematical equations related to the use of BLAST and PSI-BLAST.

Author information

Corresponding author

Correspondence to Joseph Hannon Bozorgmehr.

Ethics declarations

Conflict of interest

The sole author has no relevant financial or non-financial competing interests to disclose and did not receive funding from any organization for conducting this study.

Additional information

Communicated by Martine Collart.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hannon Bozorgmehr, J. Four classic “de novo” genes all have plausible homologs and likely evolved from retro-duplicated or pseudogenic sequences. Mol Genet Genomics 299, 6 (2024). https://doi.org/10.1007/s00438-023-02090-6

  • DOI: https://doi.org/10.1007/s00438-023-02090-6
