Geometric Predictors of Knotted and Linked Arcs

Inspired by how certain proteins “sense” knots and entanglements in DNA molecules, here, we ask if local geometric features that may be used as a readout of the underlying topology of generic polymers exist. We perform molecular simulations of knotted and linked semiflexible polymers and study four geometric measures to predict topological entanglements: local curvature, local density, local 1D writhe, and nonlocal 3D writhe. We discover that local curvature is a poor predictor of entanglements. In contrast, segments with maximum local density or writhe correlate as much as 90% of the time with the shortest knotted and linked arcs. We find that this accuracy is preserved across different knot types and also under significant spherical confinement, which is known to delocalize essential crossings in knotted polymers. We further discover that nonlocal 3D writhe is the best geometric readout of the knot location. Finally, we discuss how these geometric features may be used to computationally analyze entanglements in generic polymer melts and gels.


■ INTRODUCTION
Topological entanglements are ubiquitous and an essential feature of everyday materials and complex fluids, endowing them with viscous and elastic properties. Entanglements are often poorly defined, and their unambiguous identification and quantification remain elusive. 1,2 For example, a knot is a welldefined mathematical entity when tied on a closed curve, but there are many examples in physics and biology, e.g., proteins and chromatin, where knots are tied on open curves, rendering such "physical" knots much more difficult to define rigorously and unambiguously. 3−6 More broadly, a long-standing goal in polymer physics and the broader soft matter communities is to understand and control the topology of certain systems from the geometry of (often entangled) 1D curves. This goal encompasses many fields, such as liquid crystals, 7 optics, 8,9 fluids, 10 DNA, 5,11−17 proteins, 18,19 polymers, 3,20−22 soap films, 23,24 and soft matter in general. 25,26 At the same time, the unambiguous characterization of entanglements in these systems is often elusive, in turn begging for better strategies to quantify entanglements in generic soft matter systems.
A striking example of the inherent difficulty in defining entanglements is seen in polymer melts, whereby the close contact of two chains does not necessarily indicate that chains are constraining each other's motion. Instead, so-called "primitive" 27 and "isoconfigurational" 28 mean path techniques are far better placed to separate relevant entanglements from irrelevant ones. Yet, even these sophisticated techniques often struggle when polymers display nontrivial topology, e.g., rings. 29−31 Ring polymers are in fact not amenable for standard primitive path analysis as they do not entangle in the traditional sense as linear polymers do; 30,32 e.g., no "tube" can be defined around their contour, and they do not "reptate". 32 Rings display architecture-specific topological constraints called threadings, 33,34 which display the puzzling property of reducing self-similarly over time. 35,36 The development of a method to robustly and unambiguously quantify entanglements in melts of ring polymers is still an open challenge in the field of polymer physics. 37,38 In parallel to these open questions, it is clear that the geometric design of systems with specific entanglements in their microstructure could in principle allow for the control of mesoscopic material properties. 2,39,40 The realization of woven structures can now be achieved at both micro-and mesoscales using synthetic chemistry 41 or 3D printing. 40 To bypass a virtually endless trial-and-error approach, it is therefore important to be able to select the entanglement motifs to embroider in the structures in such a way that they display the desired mechanical properties. 41 Interestingly, this problem is not too dissimilar to that of knitting socks: using solely two types of stitches ("knit" and "purl"), it is possible to create many distinct motifs and socks with distinctive elastic properties. 42, 43 Another example in which topological entanglements are abundant is in molecular biology and genome organization. Two meters of genome is packed in a 10 μm nucleus in human cells. This extreme level of packaging leads to knotting and entanglement, which are resolved by topoisomerase (Topo2), a protein that is about 50 nm in size that can identify topological knots from pure geometric entanglements in DNA molecules that are more than a thousand times bigger. 44 By a still poorly understood "sensing" mechanism, 45,46 Topo2 is able to reduce the topological complexity of DNA in vivo 45,47,48 and in vitro 49 without introducing more complex knots.
Inspired by Topo2's topological sensing, which is necessarily local and unable to account for the global topology of knotted DNA, here, we investigate the possibility that there exist some geometric descriptors that correlate with the underlying topology of generic closed curves involved, for instance, in woven structures or polymer melts. To this end, we perform molecular dynamics (MD) simulations of knotted and linked semiflexible polymers in equilibrium and study the correlation between the position of the shortest knotted and linked arcs with that of four geometric descriptors: (i) regions of maximum local curvature, (ii) regions of maximum local density, (iii) regions with maximum local 1D writhe, and (iv) regions with maximum nonlocal 3D writhe. We note that while Topo2 works on a very specific polymer, the DNA double helix, here, we are interested in exploring the relationship between local geometry and global topology on generic polymers with the hope that our results may be helpful to better understand the entanglements in generic entangled polymer systems.
We discover that regions of maximum local density strongly correlate with knotted and linked arcs and outperform regions of maximum curvature. Surprisingly, we also find that this effect persists under strong confinement, where the knotted polymer is confined within a sphere smaller than its size in equilibrium. Finally, we show that 3D writhe is the best geometrical descriptors to recognize knotted arcs, and it performs consistently better than other geometric predictors. We conjecture that these local geometric descriptors could be employed to compute topological entanglements in more complex systems such as polymer melts, networks, tangles, and weavings.

Simulation Details
We model knotted and linked curves as semiflexible coarse-grained bead−spring polymers with N = 500 beads of size σ. The beads interact with each other via a purely repulsive Lennard-Jones potential l where k = 30ϵ/σ 2 is the spring constant and R 0 = 1.5σ is the maximum extension of the elastic FENE bond. This choice of potentials and parameters is essential to preclude thermally driven strand crossings and therefore ensures that the global topology is preserved at all times. 50 Finally, we add bending rigidity via a Kratky−Porod potential, U bend (θ) = k θ (1 − cos θ), where θ is the angle formed between consecutive bonds and k θ = 20k B T is the bending constant. We choose this value to mimic that of DNA, as for σ = 2.5 nm, the persistence length would be matched to l p = 50 nm, as known for DNA. 51 Each bead's motion is then evolved via a Langevin equation, i.e., by adding to the Newtonian equations of motion a friction and stochastic term related by the fluctuation−dissipation relation, where the amplitude of the stochastic delta-correlated force is given by k T 2 / B , and γ is the friction coefficient. The numerical evolution of the system is conducted using a velocity-verlet scheme with dt m 0.01 0.01 / LJ = = in LAMMPS. 52 In order to simulate knotted chains, we initialize the chain of beads using the well-known parametrization for torus knots: (x, y, z)(t) = (R(cos qt + r) cos pt, R(cos qt + r) sin (pt), − R sin (qt)) where p and q are coprime integers, R and r are two constants, and t ∈ (0, 2π).
In our paper, we want to compute the likelihood that some geometric features (to be defined below) yield accurate predictions of where the shortest knotted or linked segments are. To do this, we typically consider 1000 configurations taken by dumping the coordinates of the beads every 10 4 LAMMPS steps (or 10 2 τ LJ ). From each simulation, we obtain the fraction of instances in which our predictors (described below) correctly identify the knotted or linked arc. We then run 64 independent replicas (starting from different initial conformations) and typically plot the distribution of this fraction in the form of boxplots (see below for details).

Geometric Descriptors
As mentioned above, we consider four geometric descriptors that allow us to map polymer beads to a scalar quantity (see Figure 1 for a visual representation). They are (i) local polymer curvature (see Figure 1B) i k j j j j j j y where t j,j+1 ≡ r j+1 − r j is the tangent vector at bead j and n = 20 an averaging window; (ii) local bead density (see Figure 1C) where Θ(x) = 1 if x > 0 and 0 otherwise. In this equation, V R = 4πR 3 /3 and is the volume of a sphere of radius R, and we take R = 30σ, slightly larger than a persistence length. We have checked that other sensible choices of R give similar results.
(iii) For the local or 1D (unsigned) writhe (see Figure 1D), the equation is where l w = 50σ is the window length over which the calculation is performed. Finally, for (iv) nonlocal or 3D (unsigned) writhe (see Figure 1E), the equation is which measures the (unsigned) entanglement of a polymer length centered at bead i against the rest of the polymer contour. Equation 5 is the local generalization of the well-known "average crossing number" 53 and has been previously used to identify supercoiled plectonemes in simulated DNA, 54,55 branches in ring polymers, 56 and self-entanglements in proteins. 18 Equation 6 is a generalization of eq 5 where we do not restrict the calculation of the (unsigned) writhe to occur between contiguous polymer segments. Intuitively, eqs 5 and 6 effectively compute the average number of times the contiguous (for 1D) and noncontiguous (for 3D) segments of the polymer display crossings when observed from many different directions. Accordingly, we define the beads at which our descriptors attain their maximum value as Examples of typical curves that we get from the calculation of these observables on simulated polymers are shown in Figure 2. The snapshot in Figure 2A has been color coded from red to blue to identify the bead index. Beads 180 and 380 are colored green and purple to highlight the correspondence with the curves on the right of Figure 2. One can appreciate that the local curvature Γ ( Figure 2B) is rather noisy and does not seem to reflect an increase in entanglements around beads 180 and 380. On the contrary, local density Δ ( Figure 2C) displays three local maxima corresponding to increased density of 3D proximal segments around beads 180 and bead 380. Strikingly, 1D writhe ω 1D and 3D writhe ω 1D ( Figure 2D,E) display the most intuitive and marked trends. The 1D writhe ω 1D is best suited to detect self-entanglements over short distances (around l w ), while the 3D writhe ω 3D is able to detect self-entanglements over large distances. Intuitively, the peaks correspond to the location of the essential crossings of the trefoil knot.

Knot Localization
To identify knotted arcs in our simulated polymer we use Kymoknot, 57 a free and open-source software to identify the topology and shortest knotted arcs of closed and open polymer chains. The algorithm works by using a minimally interfering algorithm that (in either a top-down or a bottom-up direction) truncates the polymer conformation, computes the convex hull of the remaining polymer segments, joins the termini outside the so-formed convex hull, and then calculates the Alexander determinant of the closed conformation. 3 The result of Kymoknot is the interval within which the shortest knotted arc is located. For a polymer conformation that evolves over time, we can visualize the output of Kymoknot in a so-called kymograph. The blue shaded region in Figure 3A represents the shortest knotted arc within the simulated polymer as it fluctuates over time. For clarity, we also show a representative snapshot of the polymer at a given time frame where we have color coded the shortest knotted arc in blue. We then directly use the Kymoknot output to count how frequently the i X 's computed using the geometric descriptors defined above fall within the shortest knotted interval. We call this quantity the "colocalization score", ρ X .
The key point of this work is that Kymoknot recognizes the shortest knotted arc by computing a global topological invariant (the Alexander determinant) of a suitably closed open curve. On the contrary, the quantities defined in eqs 3−6 are purely geometric and have no knowledge of the global topology of the chain. Additionally, 3 of them, Δ, Γ, and ω 1D , are purely local features that can be extracted from a short polymer segment, measuring the surrounding segments in close 1D or 3D proximity.

Localization of Knotted Arcs by Geometric Descriptors
Having described the topological and geometrical observables used in this work to identify knotted and linked arcs, we now aim to address how well the geometric descriptors can predict the location of knots along polymers. To achieve this, we first visually compare the result from Kymoknot ( Figure 3A) to the ones obtained via the i X 's of the geometric descriptors ( Figure  3B−E). We first notice that the maximum of the local curvature Γ appears to be noisy and randomly scattered along the contour. This is also the case if we do not perform the window averaging of the local curvature or if we pick beads separated by a number of beads. On the contrary, the maximum of local density, 1D writhe, and 3D writhe appear to locate near the boundaries of the shortest knotted arc identified by Kymoknot ( Figure 3A). We hypothesize that this finding may be related to the concept of essential crossings 58,59 and that our geometric predictors may thus be able to identify some of the essential crossings in the knotted chain.
To more precisely quantify how well our predictors can identify the location of the shortest knotted arc, we compute the "colocalization score", ρ X . [We recall that this was defined as the number of times that the geometrically predicted i X falls within the shortest knotted interval detected by Kymoknot.] Figure 4A shows that for an unconfined, dilute polymer, ρ Γ is similar to one obtained by a random choice of bead, i.e., for a trefoil ρ rand ≃ 50%. Notice that a computed ρ rand ≃ 0.5 means that, for our choice of parameters, the shortest knotted arc occupies about half of the polymer contour; this is due to the large polymer stiffness chosen to match that of DNA and the net effect is that the knot tends to delocalize. 3 Interestingly, we observe a much larger colocalization score for the other geometric descriptors. More specifically, the local density descriptor i Δ colocalizes with the knotted arc roughly ρ Δ = 70% of the times for a trefoil and more than 80% for the other knot types ( Figure 4A). Additionally, we find that the 3D writhe is the most accurate predictor with ρ ω3D ≃ 80% for the trefoil and ρ ω3D > 90% for the more complex knots.
Interestingly, if we account for a "buffer", i.e., an additional 10 beads on either side of the knot boundaries identified by Kymoknot, we find a further increase in accuracy (see Figure  4B) with i Δ reaching more than 80% for all knot types and 3D writhe, more than 90% for all knot types, getting close to 100% for 5 1 , 7 1 , and 8 19 . While local density improves its predictive power when including the buffer, the 1D writhe does not. Perhaps the most interesting observation from Figure 4 is that, even if more complex knots delocalize and take up a larger fraction of the polymer contour (see the random value increasing up to ≃75%), our geometric descriptors are still significantly more accurate than simply a random choice.

Localization of Knotted Arcs under Spherical Confinement
Arguably, while the semiflexible nature of our chains renders knots rather delocalized over the contour, the consideration of chains that are more flexible would induce knot localization, 60,61 which is expected to facilitate their recognition by our geometric methods. Localized knots are defined such ACS Polymers Au pubs.acs.org/polymerau Article that their subtended arc scales sublinearly with the length of the polymers, i.e., l k ∼ N α with α < 1. It was previously shown that knots in flexible chains display α ≃ 0.75. 3 On the other hand, under spherical confinement, knots are extremely delocalized and display α ≃ 1. 3 Thus, we ask whether our geometric predictors (and in particular the local density Δ) remain good predictors of knot location under spherical confinement. To study this regime, we enclose polymers in spherical shells with harmonic repulsive interactions with all the beads. The radius of the shell R c is slowly reduced until the desired confinement R c /R g (with R g being the equilibrium radius of gyration of the polymer in dilute conditions) is attained. The polymer is then allowed to equilibrate. Finally, we measure the curves Γ(i), Δ(i), ω 1D (i), and ω 3D (i) as before and, in turn, the colocalization score, ρ X ( Figure 5). The only change is that we now use R = R c /8 to compute Δ(i). This is needed because under confinement the radius of gyration becomes smaller than the original value R = 30σ we set earlier for the dilute case. We have repeated this calculation for other sensible choices of R and they produce qualitatively similar results. Interestingly, we observe that Δ still outperforms a random process even at values of confinement strength R c /R g = 0.25 for both the trefoil and pentafoil knots (see Figure 5). It is rather striking that i Δ colocalizes with the knotted arc more than ρ Δ > 95% of the time, meaning that, even under these extreme conditions of self-density, the presence of a knot can be identified via purely geometric features.
Finally, we note that the accuracy trend displays a nonmonotonic behavior as a function of confinement strength. In particular, we note a curious dip in accuracy for R c /R g = 1. It would be interesting in the future to explore in detail the physical origin of this behavior.

Link Localization by Geometric Descriptors
In the last part of this paper, we consider links as prototypical examples of generic entangled chains. We perform MD simulations of two N = 500 bead-spring Kremer−Grest polymer chains tied in a simple Hopf link. We then measure the shortest linked portion using the method described in refs 62−65 and compare the resulting segment with the ones given by our geometric descriptors. Briefly, the algorithm works as follows: from a pair of linked curves with topology τ computed using the two-variable Alexander polynomial, 62 it is possible to obtain the shortest physical link by looking at all possible pairs of subchains (γ 1 , γ 2 ) on the condition that they display the same topology as the original link. The algorithm employs a top-down search scheme on the basis of a bisection method and outputs the index of the beads in chain 1 and chain 2. We then count how likely it is that the i X 's obtained using the geometric predictors fall within the shortest linked regions of the two chains.
We here compare the results from the link localization algorithm with our two best performing descriptors, i.e., the local density, Δ, and the 3D writhe, ω 3D . Since we now consider two chains, we can define Δ(i) and ω 3D (i) as "self" (when computing them considering only the chain that hosts the i th segment) or as "global" (when considering all beads in  and pentafoil (B) as a function of knot confinement, measured as R c /R g where R g is the radius of gyration of the polymer in equilibrium. As before, we compute the score over 1000 conformations and make the boxplot using one value for each of the 64 independent replicas. the system in the calculation). The trend of X s (i) reflects the entanglements of the chain with itself while X g (i) mirrors any entanglement segment to which i is subjected. In Figure 6A− C, we show that, for a randomly chosen simulation snapshot, the global features X g (i) display several maxima and the higher ones correspond to the beads forming the link. For the particular snapshot in Figure 6A, the link localization algorithm 62 detects the shortest linked arc in chain 1 (red in the figure) to be 421−460 and the shortest linked arc in chain 2 (blue in the figure) to be 461−20 (through periodic boundary conditions at N = 500). We highlighted the positions of these beads in Figure 6A−C,E,F, to show the agreement with Δ g and ω 3D,g .
The colocalization score calculated on the global geometric predictors (shown in Figure 6D) suggests that these features correlate well with the location of the link. As expected, we do not see any significant difference when comparing the accuracy of chain 1 and chain 2, and we observe that the colocalization score for the total link, i.e., the conditional probability that both linked segments contain i Xd g , appears to be roughly the product of the two colocalization scores for the single components. Importantly, Figure 6D shows that the geometric predictors significantly outperform the random prediction (even by a factor of 5 or more).
We then noted that the difference of the global and the selfcomponents of the geometric predictors, defined as dX(i) = X g (i) − X s (i), significantly decrease the fluctuations of the curves. Intuitively, dX(i) counts the contributions of interchain segments on the segment i (see Figure 6E,F). Strikingly, we find that i dX , i.e., the bead hosting the maximum value of the difference dX, yields an even better colocalization score with values around 90% for the individual link components and 80% for the total link ( Figure 6G). The ratio of the localization accuracy of the geometric predictors and the random choice is now 10 or more. Arguably, this means that the interchain correlations are the most important contribution to the entanglements. This is also in line with the situation in entangled polymer melts, where total density fluctuations are typically small, while interchain density fluctuations are more informative of the system dynamics. 66,67 ■ DISCUSSION AND CONCLUSIONS What makes a curve knotted? Inside our cells, how do certain proteins recognize complex topologies by scanning the DNA locally? How can we unambiguously identify relevant entanglements in polymeric systems? In this work, we started from the hypothesis that knotted and linked curves in 3D may harbor some geometric features that correlate with the underlying topology. To this end, we have performed MD simulations of knotted and linked curves and have analyzed four geometric predictors: (i) local curvature, (ii) local density, (iii) 1D writhe, and (iv) 3D writhe. We used the geometric predictors to locate the shortest knotted and linked arcs and compared these predictions to the ones given by state-of-theart knot and link localization algorithms (refs 3, 57, and 62).
We discovered that local curvature is equivalent to randomly choosing a bead within the contour. This is interesting as there are models arguing that Topoisomerase, a protein involved in simplifying knots in DNA, may sense curvature to locate a knotted segment. 68 Our work suggests that this would be a poor search strategy and would yield a rather inefficient topological simplification pathway. Admittedly, our model does not capture the torsional rigidity and the double-helical structure of DNA and we thus refrain from arguing that our results clarify the search strategy of Topoisomerases on DNA. At the same time, our results suggest that, in polymer melts and other generic thermally driven entangled systems, such as weavings, the points of maximum curvature of the filaments are not necessarily the most entangled.
On the other hand, we find that local density is a far better geometric predictor of topologically complex states. In our simulations, the bead in the polymer with the largest number ACS Polymers Au pubs.acs.org/polymerau Article of neighbors (largest local density) is often also part of the knotted or linked segment (with an accuracy of ≃80% for simple knots and the Hopf link and up to 90% for more complex knots or under confinement). This is rather striking in that the calculation of local density is restricted to beads that are 3D proximal to bead i and there is no information on the global topology of the curve. One consequence of our findings is that sensing the local density of DNA segments could be a good strategy for Topoisomerase to quickly locate knotted and entangled arcs. Such a binding strategy may be naturally realized by a protein design that presents abundant positively charged amino acids on the surface of the protein in such a way as to maximize unspecific interactions with negatively charged DNA. Indeed, Topoisomerases typically present a positively charged area in the region of DNA binding that is far larger than the one needed to bind DNA. 69,70 Again, we stress that our polymer model does not fully capture DNA's complexity.
In the future, we aim to perform a similar analysis on models that can capture twist 71,72 to quantify the impact of torsional rigidity on these metrics. Furthermore, it has been suggested that in knotted and closed DNA there may be an interplay of both knots and plectonemes; in this case, the geometric descriptors measured here may struggle to identify the essential crossings of the knot from the writhe of the plectoneme. Future studies will illuminate this issue. In spite of the limitations of our present model in modeling DNA, we conjecture that our results may be used to quantify entanglement motifs in tangled and weaved structures. 40,41 For instance, we expect that the pattern of local density along the entangled curves will be motif-dependent and that there may be a relationship between these patterns and the corresponding mesoscopic elasticity. Again, we hope that future work will explore this direction further. Finally, we discover that 3D writhe is our best descriptor with a consistently high (≳90%) accuracy in identifying the knotted and linked arcs. This observation is less striking than the one for the local density as 3D writhe is not (strictly speaking) a local geometric predictor. In other words, the calculation of 3D writhe has to scale as N 2 while the local curvature, density, and 1D writhe scale as N. We note that local density can make use of neighbor lists; hence, why we claim it could scale faster than N 2 .
In line with this, we note that state-of-the-art algorithms that search for knotted and linked segments on polymeric systems 57,62,73 or proteins 19,74,75 require a considerable amount of computational time. For instance, when run on a single CPU, knot localization on our N = 500 chain in dilute conditions takes about 2 ms but under confinement takes up to 300 ms per conformation. On the other hand, the calculation of the local density profile takes on average 0.3 ms. Similarly, link localization for our two N = 500 chains takes up to a minute even in dilute conditions on a single conformation. On the contrary, the calculation of the local density profile for the same link takes 30 ms per conformation. For this reason, we argue that adding a preliminary search step using geometric predictors, before launching a full blown topological search scheme, could be a way to render search algorithms more efficient in the future.
It is appropriate here to highlight that entanglements are among the most elusive and slippery topics in polymer science. Algorithms such as an isoconfigurational mean path 28 and primitive path analysis 27 are the "gold standard" to quantify relevant entanglements in polymeric systems and yet they fail in the case of ring polymers. 30 We hope that the geometric descriptors proposed here may be a complement to these tools and could be used to identify entanglements in complex polymeric systems. We speculate that (interchain) local density and 1D and 3D writhe as defined in this work may yield interesting results not only in melts of ring polymers but also in molecular (and periodic) weavings. 2,39−41 We expect that different entanglement motifs are associated with distinct patterns of our geometric observables. In turn, they may be used to predict the global elastic response of the entangled network to certain perturbations. To the best of our knowledge, these metrics have not yet been tried on polymer melts or molecular weavings.
One intriguing application of our results is on Olympic gels. 15,76−78 Indeed, there is no simple way to compute the extension of three or more components of the Gauss linking number, known as the Milnor's triple linking number, 79 on the systems of ring polymers. This means that it is extremely challenging to unambiguously discern three physically inseparable Borromean rings from three unlinked and physically separable rings. Systems made of interlinked "Olympic" rings, 76 such as the naturally occurring Kinetoplast DNA 15,80 or synthetic equivalents, 78 are likely to display Borromean and higher order Brunnian configurations of interlinked rings. 81 This means that computing the pairwise (Gauss) linking number between rings is likely not enough to predict the mesoscopic elasticity of Olympic gels, as this metric completely neglects contributions from Brunnian links. We hope that our geometric predictors may be able to offer an alternative to the lack of (simple) topological invariants to characterize these elusive conformations. For instance, a step toward this goal in the near future would be to study the behavior of our geometric predictors in simple Borromean rings in dilute conditions. Finally, we note that the data generated by our geometric predictors lend themselves fittingly to be used as input features for machine learning algorithms, e.g., neural networks, to identify knots and entanglements. This is because our predictors are invariant under translations and rotations of the conformation and under relabeling of the beads. In the future, we thus aim to couple our geometric observables to Machine Learning, as recently done in ref 82, to identify and localize knots and entanglements in more complex systems. ■ REFERENCES