Excited States, Symmetry Breaking, and Unphysical Solutions in State-Specific CASSCF Theory

State-specific electronic structure theory provides a route toward balanced excited-state wave functions by exploiting higher-energy stationary points of the electronic energy. Multiconfigurational wave function approximations can describe both closed- and open-shell excited states and avoid the issues associated with state-averaged approaches. We investigate the existence of higher-energy solutions in complete active space self-consistent field (CASSCF) theory and characterize their topological properties. We demonstrate that state-specific approximations can provide accurate higher-energy excited states in H2 (6-31G) with more compact active spaces than would be required in a state-averaged formalism. We then elucidate the unphysical stationary points, demonstrating that they arise from redundant orbitals when the active space is too large or symmetry breaking when the active space is too small. Furthermore, we investigate the singlet–triplet crossing in CH2 (6-31G) and the avoided crossing in LiF (6-31G), revealing the severity of root flipping and demonstrating that state-specific solutions can behave quasi-diabatically or adiabatically. These results elucidate the complexity of the CASSCF energy landscape, highlighting the advantages and challenges of practical state-specific calculations.

S1. Gradient and Hessian of the CASSCF energy

Here, we provide explicit expressions for the gradient and Hessian of the CASSCF energy, which are required for quasi-Newton optimisation algorithms. These expressions are derived in detail in Refs. 1 and 2, but we include this pedagogical discussion for completeness. In what follows, the indices m, n, p, q, r, s correspond to arbitrary spatial orbitals, i, j, k, l correspond to occupied orbitals, a, b, c, d correspond to virtual orbitals, and t, v, x, y correspond to active orbitals.

S1.1. Density matrices

We employ the ground-state one- and two-body reduced density matrices and the corresponding transition density matrices. Explicit expressions for the non-zero matrix elements can be simplified depending on the types of orbitals involved, giving simplified expressions for the ground-state density matrices in which the purely active components γ_tv and Γ_tvxy cannot be reduced beyond Eq. S1. The only non-zero component of the one-body transition density matrix is γ^K_tv. In addition to the Γ^K_tvxy components, the non-zero terms of the two-body transition density matrix include mixed core–active contributions.

S1.2. Gradient terms
The gradient can be divided into the orbital components g^o and the CI components g^c that are defined in the main text. The CI components are given by the Hamiltonian matrix elements in the configuration space, while the orbital components are expressed in terms of the anti-symmetrized excitation operator Ê⁻_mn = Ê_mn − Ê_nm.

S1.3. Hessian terms
Similarly, we can divide the Hessian Q into three components: Q^cc, Q^oo, and Q^oc (where Q^co = (Q^oc)†). The CI–CI components are given by the Hamiltonian matrix elements within the active configuration space, shifted by the current energy E_0. The off-diagonal components corresponding to the orbital–CI matrix elements are built from transition matrix elements computed using the transition density matrices defined above. The final term is the orbital–orbital block Q^oo. While a full derivation using the results of Ref. 2 is lengthy, the non-redundant blocks are:
• virtual–core, virtual–core
• virtual–core, virtual–active
• virtual–core, active–core
• virtual–active, virtual–active
• active–core, virtual–active
• active–core, active–core [Eq. (S19)]
Here, we have introduced the inactive (core) and active Fock matrices of Ref. 2 [Eq. (S20)].
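For real orbital and CI coefficients the conjugate-transpose block reduces to a transpose, so the full Hessian can be assembled from the three independent blocks. A minimal NumPy sketch (with an illustrative function name and the orbital rotations ordered before the CI rotations, which is a presentational choice rather than the implementation used in this work):

```python
import numpy as np

def assemble_hessian(q_oo, q_oc, q_cc):
    """Assemble the full Hessian from its orbital-orbital (Q_oo),
    orbital-CI (Q_oc), and CI-CI (Q_cc) blocks, using Q_co = (Q_oc)^T
    for real wave functions."""
    q_co = q_oc.T
    return np.block([[q_oo, q_oc],
                     [q_co, q_cc]])
```

Since only Q^oo, Q^oc, and Q^cc are computed, the symmetry of the assembled matrix follows automatically from the symmetry of the diagonal blocks.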

S2. Eigenvector-following for saddle point optimisation
We employ the eigenvector-following technique to target stationary points with an arbitrary Hessian index. While Section 6.2.1 of Ref. 3, and references therein, describe this method in detail, here we summarise the salient points for completeness and provide the details of our particular implementation.
Eigenvector-following works in the eigenbasis of the Hessian matrix Q with eigenvalues µ. In this basis, the components of the Newton–Raphson step x = −Q⁻¹g, with gradient g, determine the corresponding change in energy ∆V: contributions with µ > 0 or µ < 0 lower or raise the energy, respectively. To drive the optimisation towards a stationary point with a particular Hessian index, eigenvector-following artificially modifies the sign of these components, such that a positive or negative component gives an uphill or downhill step, respectively. This quasi-Newton eigenvector-following step reduces to the Newton–Raphson step in the g_µ → 0 limit and can be understood by imposing constraints on each component using Lagrange multipliers.3 When an index-n saddle point is targeted in this work, the components corresponding to the lowest n eigenvalues of the Hessian are chosen to be uphill directions, with all remaining components taken downhill.

Furthermore, we employ a dogleg trust-radius technique to control the step length and improve the local convergence behaviour. A trust-radius method works by defining a region with radius ρ around the current point in which a quadratic approximation to the objective function is considered to be accurate. The next step is chosen by optimising the objective function within this trust region, known as the sub-problem. In practice, approximate solutions to the sub-problem are required, and we employ the dogleg method (see Section 4.1 of Ref. 4). This method requires an analogue of the steepest-descent direction for saddle-point optimisation, in which the positive or negative sign of each component is chosen in the same way as for the quasi-Newton step [Eq. (S22)].
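As an illustration, a minimal NumPy sketch of this step is given below, assuming the standard eigenvector-following form of the step components (cf. Ref. 3), with the uphill branch applied to the lowest n_target modes; the function name and interface are illustrative rather than the implementation used in this work.

```python
import numpy as np

def eigenvector_following_step(grad, hess, n_target):
    """Quasi-Newton eigenvector-following step targeting an index-n saddle.

    Takes the uphill branch along the lowest ``n_target`` Hessian modes and
    the downhill branch along all others; each component reduces to the
    Newton-Raphson step in the small-gradient limit.
    """
    mu, vecs = np.linalg.eigh(hess)      # eigenvalues in ascending order
    g = vecs.T @ grad                    # gradient in the Hessian eigenbasis
    # +1 (uphill) for the lowest n_target modes, -1 (downhill) otherwise
    sign = np.where(np.arange(len(mu)) < n_target, 1.0, -1.0)
    # Standard eigenvector-following step component (cf. Ref. 3)
    denom = np.abs(mu) * (1.0 + np.sqrt(1.0 + 4.0 * (g / mu) ** 2))
    step = sign * 2.0 * g / denom
    return vecs @ step                   # transform back to the original basis
```

Iterating x ← x + eigenvector_following_step(g(x), Q(x), n) then drives the optimisation towards an index-n stationary point.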
Defining an unconstrained step x_SD that optimises the energy along this steepest-descent direction, and using the eigenvector-following quasi-Newton step x_QN, the optimal dogleg step x_DL is obtained by taking the full quasi-Newton step when it lies within the trust region, and otherwise following the dogleg path from x_SD towards x_QN until it meets the trust-region boundary. The trust radius is then updated by comparing the ratio of the actual energy change ∆E to that predicted by the quadratic model, ∆E_model. The trust radius is halved for ∆E/∆E_model < 0.25, and doubled if ∆E/∆E_model > 0.75 and |x_DL| = ρ. Otherwise, the trust radius is not deemed to be interfering with the optimisation and is left unchanged. Finally, a step is rejected if ∆E and ∆E_model have different signs. We find that an initial trust radius of 0.15 provides adequate optimisation behaviour.
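The dogleg selection and trust-radius logic described above can be sketched as follows (function names are illustrative; x_sd and x_qn denote the steepest-descent analogue and quasi-Newton steps defined in the text):

```python
import numpy as np

def dogleg_step(x_sd, x_qn, rho):
    """Dogleg solution of the trust-region sub-problem (cf. Section 4.1 of
    Nocedal and Wright), given the step x_sd along the steepest-descent
    analogue and the eigenvector-following quasi-Newton step x_qn."""
    if np.linalg.norm(x_qn) <= rho:
        return x_qn                                # full quasi-Newton step fits
    if np.linalg.norm(x_sd) >= rho:
        return rho * x_sd / np.linalg.norm(x_sd)   # scale back along x_sd
    # Walk from x_sd towards x_qn until the trust-region boundary is reached
    d = x_qn - x_sd
    a, b, c = d @ d, 2.0 * (x_sd @ d), x_sd @ x_sd - rho**2
    tau = (-b + np.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)
    return x_sd + tau * d

def update_trust_radius(rho, dE, dE_model, step_norm, tol=1e-12):
    """Halve the radius for poor model agreement, double it when the model is
    good and the step was boundary-limited, and reject steps where the actual
    and predicted energy changes have different signs."""
    ratio = dE / dE_model
    accept = np.sign(dE) == np.sign(dE_model)
    if ratio < 0.25:
        rho *= 0.5
    elif ratio > 0.75 and abs(step_norm - rho) < tol:
        rho *= 2.0
    return rho, accept
```

The interpolation parameter tau is the positive root of the quadratic |x_sd + τ(x_qn − x_sd)|² = ρ², which places the step exactly on the trust-region boundary.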

S3. Mode-controlled Newton-Raphson optimisation
Once a stationary point has been identified, it can be followed along a potential energy surface by using the old orbital and CI coefficients as a guess at the new geometry. Since the Hessian index can change at different geometries, we use a standard Newton–Raphson method for this optimisation. The Newton–Raphson step is identified as the stationary point of a quadratic Taylor series expansion of the energy around the current point, giving x_NR = −Q⁻¹g for the gradient g and Hessian matrix of second derivatives Q. The dogleg method (described in Section S2) cannot be used, as there is no way of defining a "steepest descent" step for saddle-point optimisation without knowing the target Hessian index. Instead, we simply truncate the components of x_NR according to their magnitude along each eigen-direction of Q, giving x_µ = min(|x_NR,µ|, ρ) · x_NR,µ / |x_NR,µ|.
The trust radius ρ is then updated using the same strategy described in Section S2. Similar mode-controlled Newton-Raphson optimisation algorithms have been reported multiple times in the past. 1,5-7
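This truncation can be sketched as follows (illustrative function name; the sketch assumes a non-singular Hessian):

```python
import numpy as np

def mode_controlled_nr_step(grad, hess, rho):
    """Newton-Raphson step with each component capped at the trust radius
    rho along every eigen-direction of the Hessian."""
    mu, vecs = np.linalg.eigh(hess)
    x_nr = -(vecs.T @ grad) / mu                 # NR step in the eigenbasis
    x = np.sign(x_nr) * np.minimum(np.abs(x_nr), rho)   # truncate per mode
    return vecs @ x                              # back to the original basis
```

Unlike the dogleg approach, this truncation limits each mode independently, so the overall step length may exceed ρ while no single eigen-direction component does.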