Extreme-value statistics of stochastic transport processes

We derive exact expressions for the finite-time statistics of extrema (maximum and minimum) of the spatial displacement and the fluctuating entropy flow of biased random walks. Our approach captures key features of extreme events in molecular motor motion along linear filaments. For one-dimensional biased random walks, we derive exact results which tighten bounds for entropy production extrema obtained with martingale theory and reveal a symmetry between the distribution of the maxima and minima of entropy production. Furthermore, we show that the relaxation spectrum of the full generating function, and hence of any moment, of the finite-time extrema distributions can be written in terms of the Marčenko–Pastur distribution of random-matrix theory. Using this result, we obtain efficient estimates for the extreme-value statistics of stochastic transport processes from the eigenvalue distributions of suitable Wishart and Laguerre random matrices. We confirm our results with numerical simulations of stochastic models of molecular motors.


Introduction
Life is a non-equilibrium phenomenon characterized by fluxes of energy and matter at different scales. At the molecular level, molecular motors play a key role for the generation of movements and forces in cells. Examples are vesicle transport, muscle contraction, cell division and cell locomotion [1,2]. A molecular motor consumes a chemical fuel, adenosine triphosphate (ATP), that is hydrolysed to adenosine diphosphate (ADP) and inorganic phosphate. The chemical energy of this reaction is transduced to generate spontaneous movements and mechanical work. Single-molecule experiments have revealed that the activity of single or a few molecular motors displays strong fluctuations [3][4][5][6][7][8][9][10][11][12][13] which can be captured by the theory of stochastic processes [14][15][16][17][18][19][20][21].
An important question is to understand general features and universal properties that govern the statistics of fluctuations of stochastic transport processes, which include the motion of molecular motors. Universal relations for the fixed-time statistics of time-integrated currents, such as the distance travelled and the work performed, have been investigated in the framework of non-equilibrium stochastic thermodynamics [22][23][24][25]. These results provide, e.g., universal bounds for the efficiency of molecular motors, given by the ratio between the mechanical power output and the chemical power input to the motor [26]. Timing statistics of enzymatic reactions, such as those powering the motion of molecular motors, have been discussed within the framework of Kramers theory [27]. Recent theory and experiments on kinesin have revealed symmetry relations between forward and backward cycle-time distributions of enzymatic reactions [28][29][30]. Related results have been derived in the context of waiting times of active molecular processes [31] and transition-path times in folding transitions of DNA hairpins [32].
When discussing stochastic processes, it is often sufficient to study averages and small fluctuations. However, rare events and large fluctuations play an important role when the resilience and reliability of a system are investigated. In this context, the statistics of extreme values and of extreme excursions from the average are particularly relevant, as has been discussed in applications of extreme-value theory to biophysics [33][34][35].

[Figure 1(b): Example of a trajectory X(t) (black), its maximum X_max(t) (red), its minimum X_min(t) (blue) and its average over many realizations ⟨X(t)⟩ (thick grey) as a function of time t. The trajectories are obtained from a numerical simulation of a 1D biased random walk with hopping rates k_+ = 1.05 and k_− = 0.95 in the positive and negative direction, respectively. The entropy production along the trajectory X(t) is S(t) = AX(t), with A = ln(k_+/k_−) ≈ 0.1.]
An individual trajectory of a motor starting from a reference state X(0) = 0 at time t = 0 is denoted by X_[0,t] = {X(s)}_{s=0}^t. It contains jumps j = 1, 2, . . . from state x_j^− to state x_j^+ that occur at stochastic times t_j. The entropy production in units of k_B associated with this trajectory is S(t) = ln[P(X_[0,t])/P(X̃_[0,t])] = AX(t) [31]. Here P is the path probability and X̃_[0,t] = {X(t − s)}_{s=0}^t is the time-reversed path. Thus, the entropy production S(t) is a stochastic variable that undergoes a biased random walk of step size A with trajectories S_[0,t] = AX_[0,t]. For positive A, both the average velocity v = ⟨X(t)⟩/t = k_+ − k_− = 2ν sinh(A/2) and the average rate of entropy production σ = ⟨S(t)⟩/t = vA are positive. Here and in the following we denote by ⟨·⟩ averages over many realizations of the process X(t). However, due to fluctuations, the stochastic variables X(t) and S(t) can in principle take any value with finite probability and can even become negative.
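As an illustration, such a trajectory and its running extrema can be sampled directly. The sketch below uses the rates k_+ = 1.05 and k_− = 0.95 of figure 1(b); the simulation horizon is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
k_plus, k_minus = 1.05, 0.95        # hopping rates, as in figure 1(b)
A = np.log(k_plus / k_minus)        # affinity: S(t) = A X(t)
t_final = 1000.0                    # arbitrary simulation horizon

t, x = 0.0, 0
x_max, x_min = 0, 0
while t < t_final:
    # exponential waiting time between jumps, with rate k_+ + k_-
    t += rng.exponential(1.0 / (k_plus + k_minus))
    # jump +1 with probability k_+/(k_+ + k_-), else -1
    x += 1 if rng.random() < k_plus / (k_plus + k_minus) else -1
    x_max, x_min = max(x_max, x), min(x_min, x)

S_final = A * x                     # entropy produced along this trajectory
```

Recording x_max and x_min along the way gives the finite-time extrema X_max(t) and X_min(t) discussed below.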
We now derive exact expressions for the statistics of the minimum X_min(t) = min_{τ∈[0,t]} X(τ) and the maximum X_max(t) = max_{τ∈[0,t]} X(τ) of the position of the motor with respect to its initial position, see figure 1(b) for an illustration. We also discuss the global minimum and maximum of the stochastic entropy production S(t) = AX(t), denoted by S_min(t) and S_max(t), respectively. We first discuss the statistics of the global extrema of the position, X_min ≡ lim_{t→∞} X_min(t) and X_max ≡ lim_{t→∞} X_max(t), and of the entropy production, S_min and S_max. The probability that the global minimum of the discrete position is −x, with x ≥ 0, is P(X_min = −x) = P_abs(−x) − P_abs(−x − 1), where P_abs(−x) = e^{−Ax} [31] is the probability that X(t) reaches an absorbing site at −x at a finite time. Thus, the global minimum follows a geometric distribution,

P(X_min = −x) = e^{−Ax}(1 − e^{−A}),  x ≥ 0,     (3)

and P(X_min = x) = P(S_min = Ax) = 0 for x > 0. From equation (3) we obtain the mean global minimum of a 1D biased random walk and of its associated entropy production:

⟨X_min⟩ = −1/(e^A − 1),  ⟨S_min⟩ = −A/(e^A − 1).     (4)

Therefore, the mean global minimum of the position diverges in the limit of a small bias A, whereas the mean entropy-production minimum is bounded for all A ≥ 0 and obeys the infimum law ⟨S_min⟩ ≥ −1 [31]. This bound is saturated in the limit of small affinity, which corresponds to the diffusion limit [60]. Because S(t) and X(t) have positive drift, the average global maxima of entropy production and displacement are not defined. However, the difference lim_{t→∞}[⟨S_max(t)⟩ − ⟨S(t)⟩] = A/(e^A − 1) is finite and obeys symmetry properties that we discuss below.
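The geometric law for the global minimum and the infimum law can be checked with a quick Monte Carlo sketch. The global minimum is approximated by running each walk for a long but finite number of steps; since the waiting times do not affect the sequence of visited sites, the embedded discrete-time walk suffices.

```python
import numpy as np

rng = np.random.default_rng(1)
k_plus, k_minus = 1.05, 0.95
A = np.log(k_plus / k_minus)
p_plus = k_plus / (k_plus + k_minus)
n_traj, n_steps, batch = 20000, 2000, 1000

minima = np.empty(n_traj)
for start in range(0, n_traj, batch):
    # embedded discrete-time walk: +1 with probability p_plus, else -1
    steps = np.where(rng.random((batch, n_steps)) < p_plus, 1, -1)
    pos = np.cumsum(steps, axis=1)
    # include X(0) = 0, so the recorded minimum is never positive
    minima[start:start + batch] = np.minimum(pos.min(axis=1), 0)

mean_X_min = minima.mean()
theory_X_min = -1.0 / np.expm1(A)   # <X_min> = -1/(e^A - 1)
mean_S_min = A * mean_X_min         # infimum law: <S_min> >= -1
```

For this bias the drift carries the walker away quickly, so 2000 steps per trajectory are ample for the global minimum to have been attained.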
Finite-time extrema statistics of the 1D biased random walk may be obtained from the finite-time absorption probabilities, where P_abs(−x; t) is the probability that X reaches an absorbing site at −x at any time smaller than or equal to t. The absorption probability is P_abs(x; t) = δ_{x,0} + ∫_0^t P_fpt(T; x)dT, with δ_{i,j} Kronecker's delta and

P_fpt(T; x) = (|x|/T) e^{Ax/2} e^{−T/τ_1} I_{|x|}(T/τ_2)     (6)

the first-passage-time probability density for the walker to first reach an absorbing site at x, with |x| ≥ 1, at time T ≥ 0 [61,62], see appendix A. Here I_x denotes the xth-order modified Bessel function of the first kind. Note that ∫_0^∞ dT P_fpt(T; x) = P_abs(x) ≤ 1. We identify in equation (6) two timescales. The smaller timescale τ_1 = (k_+ + k_−)^{−1} = (2ν cosh(A/2))^{−1} is the average waiting time between two jumps, and τ_2 = (2√(k_+k_−))^{−1} = (2ν)^{−1} is inversely proportional to the geometric mean of the transition rates; their ratio τ_2/τ_1 = cosh(A/2) ≥ 1 increases with the bias strength. Normalizing (6) by P_abs(x), we obtain the mean ⟨T⟩ = |x|A/σ and variance Var[T] = (coth(A/2)/|x|)⟨T⟩² of the first-passage time, in agreement with the first-passage-time uncertainty relation Var[T]/⟨T⟩ ≥ 2/σ [63]. Furthermore, the first-passage-time probability density (6) obeys the following symmetry properties. First, the ratio P_fpt(T; x)/P_fpt(T; −x) = e^{Ax} is independent of T, as follows from the stopping-time fluctuation theorem [28,31,64]. Second, the 'conjugate' first-passage-time probability P̃_fpt(T; x), obtained by exchanging k_+ with k_− (i.e. A with −A), obeys P̃_fpt(T; x) = e^{−Ax} P_fpt(T; x). These two properties imply P̃_fpt(T; x) = P_fpt(T; −x), which has interesting consequences for random walks [48,65] and for the extrema statistics of S(t), see below.
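The normalization, conditional mean and symmetry of the first-passage-time density (6) can be verified numerically. The sketch below uses the exponentially scaled Bessel function ive (equal to iv(n, z) e^{−z}), which avoids overflow at large arguments; the integration cutoff is an arbitrary but generous choice.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ive

k_plus, k_minus = 1.05, 0.95
nu, A = np.sqrt(k_plus * k_minus), np.log(k_plus / k_minus)
v = k_plus - k_minus

def p_fpt(T, x):
    """First-passage-time density of equation (6), using ive(n, z) = iv(n, z) e^{-z}."""
    z = 2.0 * nu * T
    return (abs(x) / T) * np.exp(A * x / 2 - (k_plus + k_minus) * T + z) * ive(abs(x), z)

# normalization: int_0^infty P_fpt(T; x) dT = P_abs(x) <= 1
P_abs_p1, _ = quad(p_fpt, 0, 2e4, args=(1,), limit=500)   # downhill site: -> 1
P_abs_m1, _ = quad(p_fpt, 0, 2e4, args=(-1,), limit=500)  # uphill site:  -> e^{-A}

# conditional mean first-passage time to x = +1: <T> = |x| A / sigma = 1/v
mean_T, _ = quad(lambda T: T * p_fpt(T, 1), 0, 2e4, limit=500)
```

The symmetry P̃_fpt(T; x) = P_fpt(T; −x) follows directly from the explicit form, since I_{|x|} is even in x; numerically, p_fpt(T, −x) = e^{−Ax} p_fpt(T, x) for any T.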
In order to derive exact finite-time extrema statistics, it is often convenient to use generating functions and Laplace transforms. The generating functions of the distributions of finite-time entropy-production extrema are defined as G_min/max(z; t) = Σ_{x=−∞}^{∞} z^x P(S_min/max(t) = Ax) Θ(∓x), with Θ the Heaviside function. Their Laplace transforms Ĝ_min/max(z; s) are given by equations (7) and (8). Here, P̂_fpt(s; x) = e^{Ax/2} e^{−|x| cosh^{−1}[s/(2ν)+cosh(A/2)]} − δ_{x,0} is the Laplace transform of the first-passage-time probability density at site x. These expressions enable the computation of the Laplace transforms of all the moments of the extrema from successive derivatives of the generating functions with respect to ln z. In particular, the Laplace transform of the average minimum of entropy production reads s⟨Ŝ_min(s)⟩ = −A/[P̂_fpt(s; −1)^{−1} − 1]. In the time domain, we may write this equality as ⟨S_min(t)⟩ = −A ∫_0^t dT Σ_{x=1}^∞ P_fpt(T; −x) [55], which can be evaluated in closed form (equation (9), see appendix B). Numerical simulations of the 1D biased random walk are in excellent agreement with equation (9) (figure 2, blue symbols). Note that equation (9) can also be expressed in terms of the Kampé de Fériet function. Interestingly, our simulations reveal (figure 2, red symbols) that the average maximum of entropy production minus the average entropy production at time t equals minus the right-hand side of equation (9): ⟨S_max(t)⟩ − ⟨S(t)⟩ = −⟨S_min(t)⟩ (equation (10)). Equation (10) reflects a symmetry between the distributions of the maxima and minima of entropy production, P(S_max(t) − S(t) = Ax) = P(S(0) − S_min(t) = Ax) (equation (11)), where, in this case, S(0) = 0. Figure 3 shows empirical distributions of entropy-production minima and maxima obtained from numerical simulations, which fulfil the symmetry relation (11).
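The time-domain expression ⟨S_min(t)⟩ = −A ∫_0^t dT Σ_{x≥1} P_fpt(T; −x) can be evaluated by direct numerical quadrature. A minimal sketch, where the sum over sites is truncated at an x_max that is ample for the times considered:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ive

k_plus, k_minus = 1.05, 0.95
nu, A = np.sqrt(k_plus * k_minus), np.log(k_plus / k_minus)

def sum_p_fpt(T, x_max=400):
    """Sum over x >= 1 of P_fpt(T; -x), using the scaled Bessel function ive."""
    x = np.arange(1, x_max + 1)
    z = 2.0 * nu * T
    return np.sum((x / T) * np.exp(-A * x / 2 - (k_plus + k_minus) * T + z) * ive(x, z))

def mean_S_min(t):
    """<S_min(t)> = -A int_0^t dT sum_{x>=1} P_fpt(T; -x)."""
    return -A * quad(sum_p_fpt, 0, t, limit=300)[0]

S5, S20 = mean_S_min(5.0), mean_S_min(20.0)
S_inf = -A / np.expm1(A)   # long-time limit from equation (4)
```

The values decrease monotonically with t and stay above the long-time limit −A/(e^A − 1), consistent with the relaxation picture discussed in the next section.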

From a relaxation spectrum to a random-matrix approach
We now explore a connection between entropy-production extrema and random-matrix theory. More precisely, we relate the previously derived expressions for the average and distribution of extrema to the eigenvalue distributions of specific random matrices. Equation (9) can also be written in terms of a finite relaxation spectrum (equation (12), see appendix C), where times are normalized by τ̄ ≡ k_+/(k_+ − k_−)² and ρ is the Marčenko-Pastur distribution with parameter δ = e^{−A}. Here τ_0 = λ_− τ̄ = (√k_+ + √k_−)^{−2} is the minimal relaxation time of the extreme-value statistics and τ_∞ = λ_+ τ̄ = (√k_+ − √k_−)^{−2} is the maximal extrema relaxation time. The Marčenko-Pastur distribution is given by [66]

ρ(λ) = √((λ_+ − λ)(λ − λ_−)) / (2πδλ),  λ_− ≤ λ ≤ λ_+,     (13)

with λ_± = (1 ± √δ)². Interestingly, the Marčenko-Pastur theorem states that ρ(λ) is the distribution of eigenvalues, in the large-size limit, of Hermitian matrices drawn from the ensemble of Wishart-Laguerre random matrices [67][68][69], whose structure is explained below. Because the distribution of 1/λ can also be expressed in terms of the Marčenko-Pastur distribution (see appendix C), equation (12) can be written in terms of Marčenko-Pastur-distributed relaxation rates k = τ^{−1}, see equation (C2). The average minimum given by equation (12) can be obtained from the generating function of the minimum distribution (equation (14), see appendix C). Equations (12)-(14) imply that the time at which the distribution of the extrema relaxes to its long-time limit is given by the largest timescale of the relaxation spectrum, τ_∞ = τ̄λ_+. We illustrate this result in the inset of figure 2, which shows this for the case of ⟨S_min(t)⟩. Furthermore, the Marčenko-Pastur theorem implies a trace formula (equation (15)), where M is an m × m random matrix drawn from the Wishart or Laguerre ensembles, obtained as M = RR^T/n (equation (16)). Here, R is an m × n random matrix and R^T its transpose; n = e^A βm plays the role of a number of degrees of freedom, with β the Dyson index of the Wishart or the Laguerre ensemble [70]. Hence, the rectangularity of R is linked to the bias of the random walk through n/m = βe^A.
In the case of real Wishart matrices (β = 1) with A > 0, R is an m × n rectangular random matrix whose real entries are independent and identically distributed Gaussian random numbers with zero mean and unit variance. This implies that M is drawn from the distribution P(M) ∼ (det M)^{(n−m−1)/2} e^{−(n/2)Tr M}. An alternative is the use of Laguerre matrices. The construction of these matrices and further details are given in appendix D. Equation (15) can be approximated numerically using random matrices with finite but sufficiently large m. In practice, we use the estimate of equation (17), where λ_i is the ith eigenvalue of M drawn from either the Wishart or Laguerre random-matrix ensembles. This is illustrated in figure 4 using a 16 × 16 Wishart random matrix, shown in figure 4(a). This matrix has a set of eigenvalues λ whose distribution is shown as a histogram in figure 4(b), together with the Marčenko-Pastur distribution, which is reached in the infinite-matrix-size limit. The corresponding extrema relaxation times are given by τ̄λ. In figure 5 we show the approximation of equation (17) for different random-matrix ensembles of different sizes. We compute numerically the eigenvalues of a single random matrix of the Wishart ensemble and of a matrix drawn from the β-Laguerre ensemble with parameter β = 2. Notably, using a single 64 × 64 random matrix from the Wishart or Laguerre ensembles, we obtain an estimate of the average entropy-production minimum that differs from the exact value by at most 2% (at small times).
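The convergence of Wishart eigenvalues to the Marčenko-Pastur law can be checked in a few lines. The sketch below covers the real case β = 1, with the standard normalization M = RR^T/n and aspect ratio n/m ≈ e^A; the matrix size m is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
A = 0.1
m = 200                              # matrix size (arbitrary, finite)
n = int(round(m * np.exp(A)))        # rectangularity encodes the bias: n/m ~ e^A

R = rng.standard_normal((m, n))      # i.i.d. Gaussian entries, mean 0, variance 1
M = R @ R.T / n                      # real Wishart matrix
eigs = np.linalg.eigvalsh(M)

delta = m / n                        # Marčenko-Pastur parameter, ~ e^{-A}
lam_minus = (1.0 - np.sqrt(delta)) ** 2   # edges of the limiting support
lam_plus = (1.0 + np.sqrt(delta)) ** 2
```

Already at this modest size the eigenvalues have unit mean and fill the interval [λ_−, λ_+] up to edge fluctuations, so the corresponding relaxation times τ̄λ_i sample the spectrum of equation (12).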

Extrema statistics of molecular motors
We now investigate whether similar results also hold for more complex stochastic models of molecular motors. We consider a biochemical process in which the fluctuating motion of a molecular motor is described by a continuous-time Markov jump process on a two-dimensional potential energy surface with coordinates x and y (figure 6(a)). Here x denotes the spatial displacement of the motor along a discrete track of period ℓ, and y is a chemical reaction coordinate denoting the net number of fuel molecules consumed by the motor.
The motion of the motor is biased along the track by a mechanical force f_ext applied to the motor. In addition, the motor hydrolyses ATP with chemical-potential difference Δμ. We consider both f_ext and Δμ to be independent of the state of the motor, which corresponds to the limit where the external force and the concentration of fuel molecules are stationary. States (x, y) of the motor are in local equilibrium at temperature T = β^{−1}. The dynamics of the motor is as follows. From a given state the motor can perform, at a random time, a jump to eight adjacent states, corresponding to the following four transitions and their reversals: (i) sliding along the track by a distance ℓ (−ℓ) without consuming fuel but generating work in (against) the direction of the force, at a rate k_m^+ (k_m^−); (ii) hydrolysis (synthesis) of one fuel molecule without spatial displacement, at a rate k_c^+ (k_c^−); (iii) work generation in (against) the external force using ATP (ADP), at a rate k_mc^+ (k_mc^−), with k_mc^− = k_mc^+ e^{−β(f_ext ℓ + Δμ)}; and (iv) work generation against (in) the external force using ATP (ADP), at a rate k_cm^+ (k_cm^−). We use transition rates of the form k_α^± = ν_α e^{±A_α/2}, where the ν_α give different weights to each transition type. A single trajectory of the motor is a 2D random walk containing snapshots (X(t), Y(t)) of the state of the motor at time t. Here X(t) is the spatial coordinate of the motor (with respect to its initial position) and Y(t) is the reaction coordinate representing the net number of ATP molecules consumed up to time t. Note that when Y(t) is negative, the motor has consumed more ADP than ATP molecules. The entropy production associated with a single trajectory of the molecular motor is

S(t) = A_m X(t)/ℓ + A_c Y(t),

where A_m = βf_ext ℓ and A_c = βΔμ are the mechanical and chemical affinities. Thus S(t) is a random walk with four different step lengths A_m, A_c, A_mc ≡ A_m + A_c and A_cm ≡ −A_m + A_c, corresponding to jumps along the X, Y and the two diagonal directions, respectively.
We perform numerical simulations of this 2D stochastic model of the molecular motor using Gillespie's algorithm, and evaluate the entropy flow associated with different trajectories of the motor. Obtaining exact extreme-value statistics in this model is challenging; however, a simple approximation provides good estimates: the averages of the entropy-production extrema obtained from simulations (figures 6(b) and 7(b)) can be approximated by equations (4)-(9), replacing A and ν by effective parameters.

[Figure 6: (b) Average entropy-production extrema compared with equation (9) with effective parameters A_eff and ν_eff given by equations (18) and (19), see text for further details. (c) Cumulative distribution of −S_min(t) (blue open symbols) and of S_max(t) − S(t) (red filled symbols) for f_ext = −1.5 pN (circles) and 0.5 pN (down triangles), and t = 50 ms. The black symbols are estimates given by equation (3) with effective parameters A_eff, and the orange line is an exponential distribution with mean one. Values of the simulation parameters: k_B T = 4.28 pN nm, ℓ = 8 nm, Δμ = 4k_B T, ν_m = 10 Hz, ν_c = 5 Hz, ν_mc = 25 Hz, and ν_cm = 1 Hz. The numerical data were obtained from 10^8 simulations using Gillespie's algorithm.]
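This model is straightforward to simulate. Below is a minimal Gillespie sketch using the parameter values listed in the figure 6 caption, with f_ext = 0.5 pN chosen as an example; X is measured in units of ℓ, so the identity S(t) = A_m X(t)/ℓ + A_c Y(t) holds exactly along each trajectory.

```python
import numpy as np

rng = np.random.default_rng(3)
kBT, ell = 4.28, 8.0                      # pN nm and nm, as in figure 6
f_ext, dmu = 0.5, 4.0 * kBT               # example force (pN) and Delta mu
A_m, A_c = f_ext * ell / kBT, dmu / kBT   # mechanical and chemical affinities

affinity = {'m': A_m, 'c': A_c, 'mc': A_m + A_c, 'cm': -A_m + A_c}
nu = {'m': 10.0, 'c': 5.0, 'mc': 25.0, 'cm': 1.0}               # Hz
disp = {'m': (1, 0), 'c': (0, 1), 'mc': (1, 1), 'cm': (-1, 1)}  # (dx, dy)

rates, moves, dS = [], [], []
for a in affinity:
    for sign in (+1, -1):                 # forward and reverse jumps
        rates.append(nu[a] * np.exp(sign * affinity[a] / 2))
        moves.append((sign * disp[a][0], sign * disp[a][1]))
        dS.append(sign * affinity[a])     # entropy step of this transition
rates = np.array(rates)
ktot = rates.sum()
probs = rates / ktot

t, x, y, S = 0.0, 0, 0, 0.0
while t < 10.0:
    t += rng.exponential(1.0 / ktot)      # Gillespie waiting time
    i = rng.choice(len(rates), p=probs)   # which of the 8 transitions fires
    x += moves[i][0]
    y += moves[i][1]
    S += dS[i]
```

Collecting the running extrema of S over many such trajectories reproduces the statistics shown in figures 6 and 7.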
The effective affinity A_eff and rate ν_eff (equations (18) and (19)) involve sums over the four types of transitions, indexed by α = m, c, mc and cm. This approximation based on effective parameters follows from considering effective 1D models with corresponding jump rates. Furthermore, we find for this model that the symmetry relations (10) and (11) between the distributions and the means of finite-time minima and maxima of entropy production are satisfied with high accuracy in our numerical simulations, see figure 6. Notably, the distributions of entropy-production maxima and minima coincide despite having irregular shapes for irrational ratios of the mechanical and chemical affinities.
The average extrema of the mechanical and chemical currents are shown in figure 7 as a function of the external force. The behaviour of these currents is known exactly from a mapping to 1D biased random walks X(t) and Y(t) with effective forward and backward hopping rates k_x/y^± given by equation (20), yielding effective affinities A_x/y and rates ν_x/y defined as in equation (2). This allows us to calculate the extreme-value statistics of the net numbers of steps X(t) and Y(t), as illustrated by the solid lines in figure 7(a). Note that A_x = βℓ(f_ext − f_stall) is related to the mechanical affinity A_m = βf_ext ℓ and the stall force f_stall; in the examples shown in figures 6 and 7, f_stall ≈ −1.2 pN in the simulations. A_y and the chemical affinity A_c obey a similar relationship. We observe in figure 7(b) that the largest average extreme values of entropy production occur when the external force applied to the motor is near the stall force. Interestingly, the minima and maxima of the displacement, X_min and X_max, show diverging averages when the stall force is approached from above or below, respectively. Such behaviour could be measured in experiments on the statistics of the stepping of molecular motors near stall forces.

[Figure 7: (a) Average extrema of the mechanical and chemical currents as a function of the external force, together with the predictions of equation (4) with effective affinities A_x/y = ln(k_x/y^+/k_x/y^−). For comparison, we also show the mechanical net stepping rate k_x^+ − k_x^− (dashed grey line) and the chemical net stepping rate k_y^+ − k_y^− (dotted grey line); the rates k_x/y^± are defined in equation (20). (b) Average long-time minima (blue circles), maxima minus final value (red filled circles) and rate of total entropy production (grey line) as a function of the external force. The black line is an estimate obtained using equation (4) with the effective parameters A_eff and ν_eff given by equations (18) and (19). The values of the fixed parameters are the same as in figure 6. The averages at long times in (a) and (b) were obtained at t = 0.5 s from 10^6 simulations using Gillespie's algorithm.]
Note that in figure 7 the long time limit of the average extrema is already reached at 0.1 s, see figure 6.

Discussion
We have derived analytical expressions for the distribution and moments of the finite-time minimum and maximum values of continuous-time biased random walks. Such stochastic processes provide minimal models to describe the fluctuating motion of molecular motors and cyclic enzymatic reactions that take place in a thermal reservoir and under non-equilibrium conditions induced by e.g. external forces and/or chemical reactions.
Our key results are: (i) exact statistics of the extrema of the position and the entropy production of a biased random walk; (ii) simple expressions at finite time in terms of an explicit relaxation-time spectrum; (iii) symmetry relations between distributions of extrema of stochastic entropy production; (iv) a novel connection between extreme-value statistics of biased random walks and the Marčenko-Pastur distribution of random matrix theory.
For biased random walks, our results provide insights beyond the infimum law for non-equilibrium steady states, ⟨S_min(t)⟩ ≥ −1, which states that the entropy production of a mesoscopic system plus its environment cannot be reduced on average by more than the Boltzmann constant. Furthermore, in continuous systems this bound is approached at large times, lim_{t→∞} ⟨S_min(t)⟩ = −1. Here we have shown that the effects of discreteness are very important: at large times, a model-dependent bound above −1 is reached, see equation (4) and figure 2. We find that for Markovian biased random walks with a homogeneous stationary distribution, the difference between the minimum of entropy production and its initial value has the same statistics as the difference between its maximum and its final value, for any given time interval [0, t], see equations (10) and (11). For random walks, such a 'min-max' symmetry has been noticed before [48]. This result further reveals a supremum law for entropy production, which bounds the average of the difference between the maximum of entropy production and its value at a fixed time t ≥ 0:

⟨S_max(t)⟩ − ⟨S(t)⟩ ≤ 1.     (22)

The inequality (22) applies to any Markovian non-equilibrium stationary process. It can be obtained by applying the results of reference [31] to the process R(t) = P̃(X̃_[0,t])/P(X_[0,t]), exploiting the martingale property of R(t). We have shown that the inequalities for the average extrema of entropy production and displacement of a biased random walk saturate in the limit of a small affinity A ≪ 1. This limit corresponds to systems that exchange a small amount of heat with their environment, below the thermal energy k_B T, in each forward or backward step of the walker. For larger values of A, our analytical expressions reveal that details of the discreteness of the walker's motion have a strong influence on the extrema statistics.
The time-asymmetric parameter A fully determines the distribution of extrema at large times, as well as the shape of the relaxation time spectrum at finite times, whereas the time-symmetric rate constant ν (see equation (2)) only scales the time dependence.
Relaxation or retardation spectra describe the dynamics of systems governed by a continuum of characteristic times, such as those found in fractional rheology [71]. In this spirit, we expressed in equations (12) and (14) the extreme value statistics for a 1D biased continuous time random walk as a relaxation process, whose spectra of relaxation times follow the Marčenko-Pastur distribution (13).
A relation between statistical properties of stochastic processes and random matrices has also been found in different contexts [72][73][74][75][76]. In quantum-mechanical scattering problems, retardation rates have been described using spectra of Laguerre random matrices [74]. For classical Markovian relaxation processes, relaxation time spectra of the extreme value statistics have been discussed [53][54][55]. Path counting combinatorial problems have been shown to be described in terms of random matrix spectra [77]. These works suggest that large random matrices may be used as a combinatorial shortcut to tackle the relaxation of the statistics of extreme values.
Here we have shown that the finite-time extreme-value statistics of stochastic transport can be approximated by drawing suitable Wishart and Laguerre random matrices of finite size. The resulting random-matrix estimates can outperform the accuracy and convergence of Monte Carlo simulations that determine extrema statistics. As shown in figure 4, only a small number of eigenvalues are needed for an accurate description of the extreme value statistics.
The first-passage times of non-Gaussian stochastic transport processes are well described by a universal distribution involving their first three cumulants [64]. This distribution is similar to the distribution of first-passage times given in equation (A19), which we obtain exactly. Note that the extreme-value statistics of Gaussian transport, such as a diffusion process, is recovered in the small-bias limit A ≪ 1 of our 1D model (see (A7)). In particular, the velocity and diffusion coefficient for small bias are v ≈ νA and D ≈ ν. In the limit A → 0, the characteristic relaxation time of extrema is τ̄ ≈ 1/(νA²) ≈ D/v², and the relaxation spectrum (13) reads:

ρ(λ) = (1/2π) √((4 − λ)/λ),  0 ≤ λ ≤ 4.     (23)

In other words, √λ follows Wigner's semicircle distribution [69]. From this result, we recover the finite-time average minimum of entropy production in the diffusive limit (equation (24)), where s = v²t/D, cf. reference [31].

Our work has important consequences for the theory of non-equilibrium fluctuations of active molecular processes and biomolecules. For example, the statistics of the maximum excursion of a motor against its net motion along a track provides insight into the physical limits of the pernicious effects of fluctuations at finite times, which can be relevant to, e.g., the finite-time efficiency of enzymatic reactions responsible for polymerization processes, muscle contraction by molecular motors, etc. We have shown that the displacement of motors with small cycle affinity exhibits large extreme values on average. However, the associated extreme entropy flows are on average always bounded in absolute value by the Boltzmann constant; an improved bound can be estimated from our results for 1D biased random walks, as shown in our application to a homogeneous 2D biased random walk describing the motion of molecular motors. Insights from our theory could also be discussed in the context of more complex biomolecular stochastic processes (e.g. microtubule growth [7,78] and transport in actin networks [79]).
It will be interesting to extend our theory to Markovian and non-Markovian processes with time-dependent driving [80][81][82][83][84][85], stochastic processes with hidden degrees of freedom [86,87], and also to explore whether extrema statistics from single-molecule experimental data reveal relaxation spectra described by the Marčenko-Pastur distribution.

Acknowledgments
We acknowledge enlightening discussions with Pierpaolo Vivo and fruitful discussion with Izaak Neri, Simone Pigolotti, Ken Sekimoto, and Carlos Mejía-Monasterio. AG acknowledges MPIPKS and ICTP for their hospitality, Bertrand Fourcade for his guidance towards MPIPKS, and EUR Light S & T for funding.

Appendix A. First-passage times, large deviations, and absorption probabilities of biased random walks
In this section we review elements of random-walk theory (e.g. first-passage statistics [61]) to derive equation (6) of the main text, i.e. the exact formula for the first-passage-time distribution of a 1D continuous-time biased random walk. We also discuss large-deviation properties of this model and derive absorption probabilities using martingale theory.

A.1. Model and solution of the Master equation
We consider a continuous-time biased random walk on a discrete one-dimensional lattice, where X(t) ∈ {0, ±1, ±2, . . .} denotes the position of the walker at time t ≥ 0. We assume that X(0) = 0 and that the walker can jump from state X(t) = x to x + 1 (x − 1) at a rate k_+ (k_−).
The waiting time at any site is exponentially distributed with rate parameter k_+ + k_−. The probability P_x(t) = P(X(t) = x) to find the walker at lattice site x at time t obeys the master equation

∂_t P_x(t) = k_+ P_{x−1}(t) + k_− P_{x+1}(t) − (k_+ + k_−) P_x(t),     (A1)

with initial condition P_x(0) = δ_{x,0}. From this evolution equation one reads off a velocity v = k_+ − k_− and a diffusion coefficient D = (k_+ + k_−)/2. The solution of equation (A1) is

P_x(t) = e^{Ax/2} e^{−(k_+ + k_−)t} I_x(2νt),     (A2)

where I_x(y) denotes the xth-order modified Bessel function of the first kind. Equation (A2) follows from (A1) and the exact expression for the generating function of the modified Bessel function of the first kind, Σ_{x=−∞}^{∞} z^x I_x(y) = e^{y(z+z^{−1})/2}.
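The solution (A2) can be checked numerically. A short sketch using the exponentially scaled Bessel function ive (equal to iv(x, z) e^{−z}, which avoids overflow at large arguments) verifies the normalization and the mean displacement ⟨X(t)⟩ = vt:

```python
import numpy as np
from scipy.special import ive

k_plus, k_minus = 1.05, 0.95
nu, A = np.sqrt(k_plus * k_minus), np.log(k_plus / k_minus)
t = 10.0

x = np.arange(-200, 201)   # truncation ample for this time horizon
# P_x(t) = e^{Ax/2 - (k+ + k-)t} I_x(2 nu t), with ive(n, z) = iv(n, z) e^{-z}
P = np.exp(A * x / 2 - (k_plus + k_minus) * t + 2 * nu * t) * ive(np.abs(x), 2 * nu * t)

norm = P.sum()          # normalization: should equal 1
mean = (x * P).sum()    # mean displacement: should equal v t = (k+ - k-) t
```

The same truncated array can be reused to evaluate higher moments, e.g. the variance 2Dt.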

A.2. Large deviation and diffusion limit
We now discuss and review large-deviation properties of the 1D biased random walk [89,90] and relate them to the statistics of a 1D drift-diffusion process. For this purpose we consider the scaling limit of P_x(t), given by equation (A2), for large x ~ vt, with v = k_+ − k_− the net velocity of the walker. We assume a large-deviation principle for P_x(t) of the form P_x(t) ≍ e^{−tJ(x/(vt))} (equation (A3)). In order to derive an analytical expression for the rate function J, we first approximate the modified Bessel function in equation (A2) using a saddle-point approximation (equation (A5)), where the saddle point θ_0 satisfies i sin θ_0 = z, i.e. iθ_0 = sinh^{−1}(z), and we have used cos(iθ_0) = cosh(θ_0). The rate function J(u), with u = x/(vt), can then be evaluated from the leading term of equation (A2), which is found using equation (A5) (equation (A7)). Interestingly, the ratio between the second and the leading term of (A7), given by tanh²(A/2)(u − 1)/3, vanishes for small deviations u ≈ 1, but also for large deviations in the limit of a small bias A ≪ 1. In the continuum limit for small A, the biased random walk reduces to drifted Brownian motion, P_x(t) = e^{−(x−vt)²/(4Dt)}/√(4πDt), where the prefactor is recovered by normalization, and we have used the expressions for the velocity v = 2ν sinh(A/2) and the diffusion coefficient D = ν cosh(A/2). In this regard, the 1D biased random walk can be seen as a generalization of drifted Brownian motion to arbitrary bias. A finite bias modifies the occurrence of extreme large deviations with respect to those of the drift-diffusion process. Consequently, the bias A is expected to affect the extreme-value statistics of the process, as shown in the main text.

A.3. Martingales and absorption probabilities
In this subsection we employ martingale theory to derive an analytical expression of the absorption probability P abs (−x) for a 1D biased random walk starting at x = 0 to ever reach an absorbing boundary located at −x < 0.
We first show explicitly that e^{−S(t)} is a martingale process with respect to X(t), i.e.

⟨e^{−S(t)} | X_[0,t′]⟩ = e^{−S(t′)},     (A9)

for t ≥ t′. In words, the average of e^{−S(t)} over all trajectories with common history X_[0,t′] up to time t′ ≤ t equals its value at the last time of the conditioning, e^{−S(t′)}. In the proof, the first and second steps use the additive property and the Markov property of entropy production, respectively; the third step uses the definitions Δt ≡ t − t′ and S(t) = X(t) ln(k_+/k_−); and the fourth step uses the identity Σ_{x=−∞}^{∞} z^x I_x(y) = e^{y(z+z^{−1})/2}. We remark that the proof sketched above can be shortened using the integral fluctuation relation ⟨e^{−S(t−t′)}⟩ = 1 in the second step, which holds for any t ≥ t′ [91,92]. It has been shown [31,93] that the martingality of e^{−S(t)} implies a set of integral fluctuation relations at stopping times,

⟨e^{−S(T)}⟩ = 1,     (A10)

where T is any bounded stopping time, i.e. a stochastic time at which the process X(t) satisfies for the first time a certain criterion. In particular, equation (A10) holds for the first-passage time T_2 of X(t) to reach either of two absorbing barriers located at −x_− and x_+, with x_+ and x_− arbitrary positive integers. When applying equation (A10) to this particular stopping time, we can unfold the average on the left-hand side using the absorption probabilities (equation (A11)): e^{−S(T_2)} = e^{−Ax_+} with probability P_abs(x_+) and e^{−S(T_2)} = e^{Ax_−} with probability P_abs(−x_−), with P_abs(x_+) + P_abs(−x_−) = 1. Solving equation (A11) for the absorption probability, we obtain

P_abs(−x_−) = (1 − e^{−Ax_+})/(e^{Ax_−} − e^{−Ax_+}).     (A12)

Taking the limit x_+ → ∞ in equation (A12) we obtain the well-known analytical expression for the absorption probability, P_abs(−x) = e^{−Ax} (equation (A13)), which we used to derive the analytical expressions, equations (3) and (4), for the distribution and mean of the global minimum of entropy production of the biased random walk.
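The two-barrier absorption probability derived above is easy to verify by Monte Carlo. A sketch, where the barrier positions and the sample size are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(4)
k_plus, k_minus = 1.05, 0.95
A = np.log(k_plus / k_minus)
p_plus = k_plus / (k_plus + k_minus)
x_minus, x_plus = 5, 5               # absorbing barriers at -x_minus and +x_plus

n_traj, hits_minus = 20000, 0
for _ in range(n_traj):
    x = 0
    while -x_minus < x < x_plus:     # embedded discrete-time walk until absorption
        x += 1 if rng.random() < p_plus else -1
    hits_minus += (x == -x_minus)
P_minus_mc = hits_minus / n_traj

# prediction from the stopping-time fluctuation relation <e^{-S(T_2)}> = 1
P_minus = (1.0 - np.exp(-A * x_plus)) / (np.exp(A * x_minus) - np.exp(-A * x_plus))
```

In the limit A → 0, the prediction reduces to the classical gambler's-ruin result x_+/(x_+ + x_−).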

A.4. First-passage-time distribution
The first-passage-time density P_fpt(t; x) can be derived from the solution of the Master equation (A1) with an absorbing boundary at site x = 0, with x an integer [61]. It can also be derived from the distribution of the walker using Laplace transforms through the renewal equation

P_x(t) = ∫_0^t dt′ P_fpt(t′; x) P_0(t − t′), (A15)

where P_0(t) is the probability to be at a given state at time t when the system was at the same state at t = 0. This convolution integral becomes a product in the Laplace domain, for any x ≠ 0. We thus obtain, using equations (A16) and (A17) in (A15),

P̂_fpt(s; x) = P̂_x(s) / P̂_0(s). (A18)

In the above equations and in the following we use the variables ν = (k_+ k_−)^{1/2} and A = ln(k_+/k_−); see equation (2) in the main text. The inverse Laplace transform of equation (A18) implies P_fpt(t; x) = (|x|/t) P_x(t), i.e.

P_fpt(t; x) = (|x|/t) e^{−(k_+ + k_−)t} e^{A x/2} I_{|x|}(2νt), (A19)

which is equation (6) in the main text.
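As a numerical sanity check of the relation P_fpt(t; x) = (|x|/t)P_x(t), the sketch below assumes the standard propagator of the continuous-time biased random walk, P_x(t) = e^{−(k_+ + k_−)t} e^{Ax/2} I_{|x|}(2νt) (an assumption of this sketch, together with the example rates): integrating the density over time for a site against the bias should give the absorption probability e^{−A|x|}:

```python
import numpy as np
from scipy.special import iv  # modified Bessel function of the first kind I_x

k_plus, k_minus = 2.0, 1.0           # example rates (assumed values)
nu = np.sqrt(k_plus * k_minus)       # nu = (k_+ k_-)^(1/2)
A = np.log(k_plus / k_minus)         # A = ln(k_+/k_-)

def p_fpt(t, x):
    """First-passage-time density (|x|/t) P_x(t), with the standard
    biased-random-walk propagator P_x(t) assumed for this sketch."""
    p_x = np.exp(-(k_plus + k_minus) * t + A * x / 2) * iv(abs(x), 2 * nu * t)
    return (abs(x) / t) * p_x

# Total mass for x = -1 (against the bias): should equal e^{-A} = k_-/k_+ = 0.5
t = np.linspace(1e-6, 80.0, 400_000)
dt = t[1] - t[0]
mass = np.sum(p_fpt(t, -1)) * dt
```

The truncation at t = 80 is safe here because the integrand decays as e^{−(√k_+ − √k_−)² t}.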

B.2. Integral representations of extreme value statistics
We start from the first-passage-time density formula (A19) and exploit two properties of the modified Bessel function of the first kind. This allows us to rewrite the absorption probability as a definite integral of trigonometric and hyperbolic functions of the parameters A and ν:

Appendix C. Extrema statistics and Marčenko-Pastur distribution
In this section, we derive analytical expressions for the extrema statistics of 1D biased random walks in terms of the Marčenko-Pastur distribution (13) of random-matrix theory, copied here for convenience:

ρ(λ) = √((λ_+ − λ)(λ − λ_−)) / (2πδλ),  λ_− ≤ λ ≤ λ_+, (C1)

where λ is a positive random variable, δ ≤ 1 a parameter, and λ_± = (1 ± √δ)². Note that this distribution is normalized and has unit mean.
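For concreteness, the Marčenko-Pastur density can be coded directly and its normalization and unit mean verified numerically; δ = e^{−1} below is just an example value (corresponding to a bias A = 1), and the function name is ours:

```python
import numpy as np

def mp_density(lam, delta):
    """Marchenko-Pastur density with ratio parameter delta <= 1,
    supported on [lam_minus, lam_plus] with lam_pm = (1 +/- sqrt(delta))^2."""
    lam_m, lam_p = (1 - np.sqrt(delta)) ** 2, (1 + np.sqrt(delta)) ** 2
    rho = np.zeros_like(lam)
    inside = (lam > lam_m) & (lam < lam_p)
    rho[inside] = np.sqrt((lam_p - lam[inside]) * (lam[inside] - lam_m)) \
                  / (2 * np.pi * delta * lam[inside])
    return rho

delta = np.exp(-1.0)             # example: delta = e^{-A} with A = 1
lam = np.linspace(0.0, 4.0, 1_000_000)
dlam = lam[1] - lam[0]
rho = mp_density(lam, delta)
norm = np.sum(rho) * dlam        # should be ~ 1 (normalization)
mean = np.sum(lam * rho) * dlam  # should be ~ 1 (unit mean)
```

The square-root edges of the density are integrable, so the simple Riemann sum converges quickly on a fine grid.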

C.1. Average value of the finite-time minimum of entropy production
Performing the changes of variable k = 2ν(cosh(A/2) − y), as well as τ = 1/k, in equation (B16), we obtain equations (C6) and (C7). Thus, the average entropy-production minimum can be expressed as an exponential relaxation process with a spectrum of relaxation rates (or, equivalently, times) distributed according to Marčenko-Pastur distributions (C1), with the scales k̄ = k_+ and τ̄ = k_+/(k_+ − k_−)². Note that equation (C7) provides equation (12) of the main text.

C.2. Generating functions of the absorption probability and the minimum
The generating function of the absorption probability can also be expressed using Marčenko-Pastur distributions. Using the same method as described above for equation (B14), we find equations (C11) and (C12). Using equations (C12) and (B3), we obtain the generating function of the distribution of minima given by equation (14) in the main text.

C.3. Laplace transforms
Taking the Laplace transform of equation (C11), we obtain equations (C15) and (C16); similar relations hold for moments of any order. Notably, equations (C15) and (C16) have a mathematical structure similar to that of the Laplace transform of the first-passage-time density of Markovian stochastic processes found in [55], where instead P̂_fpt(s; x) is expressed as a weighted discrete sum of relaxation modes.

Appendix D. Random-matrix estimates of extreme-value statistics
In this section we discuss the connection between the relaxation spectrum of first-passage and extrema statistics of the 1D biased random walk and random-matrix theory. We describe how one can estimate finite-time statistics of the minimum of entropy production from the spectrum of suitable random matrices.
For this purpose, we use a celebrated result by Marčenko and Pastur [66]. Consider a real m × m Wishart matrix defined as

W = R R^T / n, (D1)

where R is an m × n random matrix (R^T its transpose), with n ≥ m. The random matrix R is filled with independent identically distributed (i.i.d.) random variables drawn from a normal distribution of zero mean and unit variance, i.e. R_ij ∼ N(0, 1) for all i ≤ m, j ≤ n. The resulting positive-definite random matrix follows the Wishart distribution with n degrees of freedom and density c_{n,m} (det W)^{(n−m−1)/2} e^{−n Tr W/2} (where c_{n,m} is a normalization factor). Following Marčenko and Pastur, the eigenvalues λ of the Wishart random matrix W are asymptotically distributed according to the distribution (C1) in the limit n, m → ∞ with finite rectangularity m/n → δ < 1. It has been shown that this asymptotic result also holds when all R_ij are i.i.d. random variables drawn from any distribution of zero mean and unit variance [69]. We now put Marčenko and Pastur's result into practice: we find random matrices whose spectral density matches the relaxation spectrum of the average minimum of entropy production. This can be achieved e.g. by using a Wishart random matrix of rectangularity m/n identified as δ = k_−/k_+ = e^{−A} in terms of the bias A of the walker, i.e. we draw a real m × m Wishart random matrix W from an m × n matrix R with m, n ≫ 1 and m/n ≈ e^{−A} (for instance n = ⌈e^A m⌉). We then evaluate the m eigenvalues λ_i of the matrix W and give them dimensions using equations (C8) and (C9), performing the changes of variables k = λk̄ in equation (C6) and τ = λτ̄ in equation (C7), respectively. We thus obtain two estimates, ⟨S_min(t)⟩_k and ⟨S_min(t)⟩_τ, of the average entropy-production minimum, where k̄ = k_+ and τ̄ = k_+/(k_+ − k_−)² as identified previously. These estimates converge to the exact result in the limit of large matrix size.
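The Wishart construction can be sketched in a few lines. The values A = 1 and m = 512 below are arbitrary examples; the check confirms that the empirical spectrum reproduces the two MP(e^{−A}) averages entering the estimates, namely ⟨λ⟩ = 1 and ⟨1/λ⟩_ρ = 1/(1 − e^{−A}):

```python
import numpy as np

rng = np.random.default_rng(1)

A = 1.0                        # bias of the walker (example value)
m = 512                        # matrix size (example value)
n = int(round(m * np.exp(A)))  # rectangularity m/n ~ e^{-A}

R = rng.standard_normal((m, n))
W = R @ R.T / n                # Wishart matrix; eigenvalues follow MP(e^{-A})
lam = np.linalg.eigvalsh(W)    # m eigenvalues of the symmetric matrix W

mean_lam = lam.mean()          # ~ 1, the MP unit mean
inv_mean = (1.0 / lam).mean()  # ~ 1/(1 - e^{-A})
```

Since e^A m is generally not an integer, the rounding of n is exactly the rational-rectangularity limitation that motivates the β-Laguerre construction discussed below.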
Using equations (C11) and (C12), the same procedure can be applied to estimate the generating function and any order moment of the distribution of entropy production extrema.
To test the convergence of these estimates, we define their relative errors ε_k(t) and ε_τ(t) as the relative difference between estimate and exact value, which is a random real quantity for both the k and τ estimates. Their limiting values are related and can be calculated analytically; they vanish in the limit of a large random matrix because ⟨1/λ⟩_ρ = 1/(1 − e^{−A}), with ⟨·⟩_ρ denoting an average over the Marčenko-Pastur distribution (C1). From the limits (D5), we conclude that the estimate (D2) is advantageous to study the short-time behaviour, whereas (D3) is best suited for large-time asymptotics. Our numerical results show that |ε_k(t)| ≤ |ε_max| and |ε_τ(t)| ≤ |ε_max| for all tested parameter values and all times t. Therefore we use ε_max, given by equation (D6), as a conservative bound for the relative error of the random-matrix estimates at any time t. The estimates introduced above rely on the fact that one can achieve a rectangularity m/n ≈ e^{−A} with large enough random matrices. Because e^{−A} is in general not a rational number, it is desirable to develop random-matrix estimates that achieve the Marčenko-Pastur distribution accurately for any value of A. Following [67], the β-Laguerre matrices are an alternative ensemble whose spectral density tends asymptotically to the distribution MP(e^{−A}) in the large-size limit. A β-Laguerre m × m random matrix L is defined in terms of a matrix R with n = mβe^A, where β is the Dyson index of the ensemble. Here, R is an m × m random matrix with all entries equal to zero except the m diagonal and the m − 1 sub-diagonal elements. The non-zero entries R_ij are drawn from χ(d_ij) distributions with d_ij degrees of freedom: the random variable Z_ij = (Σ_{k=1}^{d_ij} X_k²)^{1/2}, where the X_k ∼ N(0, 1) are i.i.d., follows the χ(d_ij) distribution. Equivalently, one can obtain a random variable that follows the χ(d_ij) distribution by taking the square root of a random variable drawn from a chi-square distribution χ²(d_ij).
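The equivalence between the two χ(d) constructions mentioned above can be checked directly; d = 5 and the sample size are arbitrary choices for this sketch:

```python
import numpy as np
from math import gamma, sqrt

rng = np.random.default_rng(2)
d, n_samples = 5, 400_000

# Construction 1: square root of a sum of d squared standard normals
X = rng.standard_normal((n_samples, d))
chi_direct = np.sqrt((X ** 2).sum(axis=1))

# Construction 2: square root of a chi-square draw with d degrees of freedom
chi_sqrt = np.sqrt(rng.chisquare(d, size=n_samples))

# Both samples should reproduce the chi(d) mean:
# E[chi(d)] = sqrt(2) * Gamma((d+1)/2) / Gamma(d/2)
chi_mean = sqrt(2) * gamma((d + 1) / 2) / gamma(d / 2)
```

Both empirical means agree with the analytical χ(d) mean to within sampling error, confirming that the two constructions are interchangeable when filling the β-Laguerre matrix.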
The degrees of freedom d_ij of the χ distributions in the β-Laguerre random matrix are: We recall that in this case n = mβe^A is not the dimension of the matrix R, which is an m × m square matrix, but a positive real number. Therefore, the rectangularity parameter does not need to be approximated in this method. Figure 8 shows numerical results of the random-matrix estimates of the average entropy-production minimum for the 1D biased random walk with bias A = 1. We draw m × m random matrices from the β = 1, 2, 1000 Laguerre ensembles. Note that Laguerre ensembles with Dyson indices β = 1, 2, 4 are equivalent to Wishart random matrices with R_ij given respectively by real, complex and quaternionic normal random variables, using the appropriate conjugate transpose of R [70]. We plot the maximum relative error ε_max (D6) as a function of the random-matrix size m (figure 8(a)). The Wishart (1-Laguerre) ensemble provides a biased overestimate of the exact value, with a maximum relative error of 2.3% for random matrices of size 64 × 64 or larger. We observe that the 2-Laguerre ensemble provides an estimate that is practically unbiased, even for small matrices (except in the limit A ≪ 1). Furthermore, β-Laguerre matrices with large values of β (e.g. β = 1000) yield small dispersion of the relative difference but a bias (underestimation) for small matrix sizes. The mean and the standard deviation of ε_max obtained from a large population of computer-generated random matrices are observed to converge to zero with the matrix size m as ∼1/m. This fast convergence (compared with the usual 1/√m) is a consequence of the correlations between the m eigenvalues of the β-Laguerre random matrices. The convergence of the estimates is revealed in the difference between the spectral density of the random matrices and the Marčenko-Pastur distribution for m = 2^6 (figure 8(b)) and m = 2^12 (figure 8(c)).
Remarkably, even though for m = 2^6 the eigenvalue distribution is a rough approximation to the Marčenko-Pastur distribution, the relative error of the estimate is smaller than ±2.3% for a single 1-Laguerre and ±1.5% for a single 2-Laguerre random matrix.
Finally, we note that all expressions involving a sum over the eigenvalues can be recast as random-matrix traces. For instance, the two minimum-entropy estimates and their associated maximum relative error read: