Phase transition in the computational complexity of the shortest common superstring and genome assembly

L. A. Fernandez, V. Martin-Mayor, and D. Yllanes
Phys. Rev. E 109, 014133 – Published 24 January 2024

Abstract

Genome assembly, the process of reconstructing a long genetic sequence by aligning and merging short fragments, or reads, is known to be NP-hard, either as a version of the shortest common superstring problem or in a Hamiltonian-cycle formulation. That is, the computing time is believed to grow exponentially with the problem size in the worst case. Despite this fact, high-throughput technologies and modern algorithms currently allow bioinformaticians to handle datasets of billions of reads. Using methods from statistical mechanics, we address this conundrum by demonstrating the existence of a phase transition in the computational complexity of the problem and showing that practical instances always fall in the “easy” phase (solvable by polynomial-time algorithms). In addition, we propose a Markov-chain Monte Carlo method that outperforms common deterministic algorithms in the hard regime.
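To make the shortest common superstring formulation concrete, the following is a minimal sketch of the textbook greedy-merge heuristic, one example of a deterministic approach to the problem. It is an illustration only, not the authors' Markov-chain Monte Carlo method and not necessarily one of the algorithms benchmarked in the paper. The function names `overlap` and `greedy_superstring` and the toy reads are assumptions introduced here.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(reads):
    """Greedy heuristic (hypothetical helper): repeatedly merge the pair of
    fragments with the largest suffix-prefix overlap. The result contains
    every read but is not guaranteed to be the shortest superstring."""
    # Drop duplicates and reads fully contained in another read.
    frags = list(dict.fromkeys(reads))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = -1, 0, 1
        # Examine every ordered pair; ties keep the first pair found.
        for i, j in permutations(range(len(frags)), 2):
            k = overlap(frags[i], frags[j])
            if k > best_k:
                best_k, best_i, best_j = k, i, j
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)

    return frags[0] if frags else ""


if __name__ == "__main__":
    # Toy reads (made up for illustration), each overlapping the next by 7 bases.
    toy_reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
    print(greedy_superstring(toy_reads))
```

Each pass scans all ordered pairs of fragments, so this sketch is only practical for small inputs. It is meant to show the structure of the problem rather than how production assemblers work.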

  • Received 17 April 2023
  • Accepted 11 December 2023

DOI: https://doi.org/10.1103/PhysRevE.109.014133

©2024 American Physical Society

Physics Subject Headings (PhySH)

  • Physics of Living Systems
  • Statistical Physics & Thermodynamics

Authors & Affiliations

L. A. Fernandez (1,2), V. Martin-Mayor (1,2), and D. Yllanes (3,2)

  • (1) Departamento de Física Teórica, Universidad Complutense, 28040 Madrid, Spain
  • (2) Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), 50018 Zaragoza, Spain
  • (3) Chan Zuckerberg Biohub — SF, 499 Illinois Street, San Francisco, California 94158, USA

Issue

Vol. 109, Iss. 1 — January 2024
