Efficient 3d finite element modeling of cyclic elasto-plastic rolling contact

Abstract Railway rails accumulate large plastic deformations due to cyclic rolling contact loading. The plastic deformations alter the rail geometry, affect material behavior, and cause crack formation and growth. The complex interactions between these phenomena require high fidelity simulations to be understood. 3d finite element simulations are accurate, but their computational cost limits the possible number of simulated cycles. We propose a cyclic finite element simulation in which the wheel and rail remain in contact throughout the simulation. It uses periodic boundary conditions, shadow elements, and model reductions. Compared to previous work, it is 25 times faster. The method is available as an open-source plugin to Abaqus, enabling other researchers to study rolling contact loading coupled with large plastic deformations.


Introduction
The high energy efficiency of railway transportation implies that increased utilization can reduce the transportation sector's environmental impact. However, this increase pushes for higher loads, faster speeds, and more frequent operations. These changes put higher demands on safety and maintenance. The latter also reduces the available time for maintenance. Therefore, the industry requires optimized maintenance operations to minimize service disturbances and ensure safe operations. Numerical modeling techniques that predict rail deterioration can enable such optimization. Additionally, numerical modeling can aid in the design of new systems to reduce future maintenance needs.
Much work in the literature has dealt with finite element simulation of deterioration due to rolling contact loading. Examples include crack propagation [3,5,23], white-etching layers [2,12], and laser dispersed quenching [22]. However, the repeated rolling contact loading causes large plastic deformations in the surface layer of railway rails. Close to the surface, shear strains of 13 [1] and 6.3 [16] have been reported in the literature. This strain accumulation affects crack initiation [cf. 10], fracture toughness anisotropy [9], and plastic anisotropy [14]. Unfortunately, very few studies on rolling contact fatigue simulations account for these effects. One exception is Larijani et al. [11], who considered an anisotropic initial state. Still, the simulation did not account for the evolution of anisotropy during rolling contact loading. Pletz et al. [19] simulated the accumulation of plastic strains for 1400 cycles. One issue with these simulations is the long simulation time of approximately 20 min per cycle (on a computational cluster, see Section 3.4 for further details). Hence, more efficient simulations are required to evaluate crack growth, white-etching layers, and other phenomena. Such efficiency gains are the topic of the present paper.
All of the above studies of rolling contact loading apply the load to only a portion of the considered rail. Fig. 1a illustrates this simulation setup, for which the wheel cannot roll across the entire rail, as this would cause it to roll over the edge. Therefore, the rolling is restricted to the region indicated by the blue line. Additionally, when considering strain accumulation, results can only be evaluated in a smaller part of the rail, denoted the Region Of Interest (ROI) in Fig. 1a. Outside this region, the surrounding unloaded material will influence the results. Fig. 1a illustrates this by the variation of some Quantity Of Interest (QOI). The grey region outside the ROI is thus an auxiliary region with inaccurate results. It is only required to obtain accurate results inside the ROI. The simulations in [19] required a large auxiliary region to avoid affecting the plastic deformations in the ROI. This additional region has a detrimental effect on the computational time. Firstly, more degrees of freedom are present, causing a longer solution time for each increment. Secondly, it requires a longer rolling distance, resulting in more increments per rolling cycle. Hence, reducing both the rolling length and the auxiliary region size is essential to achieving efficient simulations. Fig. 1b illustrates a continuous simulation setup. It does not require an auxiliary region along the rail, only under the ROI. The research questions are which boundary conditions to apply to the left and right faces and how to map back the wheel in-between cycles.
A very efficient simulation method was proposed by [21] in which the steady-state contact problem is simulated. Their solution considers the rail material flowing through the rail. This method has the advantage that each cycle is solved as a single load increment, resulting in very low computational costs. However, this approach has two major drawbacks: Firstly, the material flow requires material state interpolation. For advanced constitutive models, this is challenging. Secondly, and more importantly, local defects such as cracks cannot be studied with that framework.
In this work, we propose to apply periodic boundary conditions. The idea comes from both computational homogenization and molecular dynamics simulations. It achieves a semi-continuous simulation that allows plastic deformation accumulation while considering embedded discontinuities such as cracks. Effectively, the periodic boundary conditions imply that the simulation considers an infinite number of wheels one rolling length apart, rolling over the rail. This setup is illustrated in the top right corner of Fig. 2b. Between each cycle, the wheel is mapped back one periodic spacing.
The paper is organized as follows: Section 2 summarizes the old methodology [19], before describing the new methodology in detail. Section 3 compares the required rolling length to the old simulation setup in both 2d and 3d. Furthermore, a simulation for more realistic rail-wheel contact conditions is presented. Finally, we evaluate the computational efficiency of the simulation. The open-source plugin [17] to Abaqus [4] is described briefly in Appendix B.   bottom rail face is fixed, while the remaining faces are free. Note that the nomenclatures "face" and "surface" refer to edges and curves in 2-dimensional analyses. A reference point in the wheel center controls the elastic wheel's motion. Each cycle consists of 5 steps: First, the reference point displacements are controlled, moving the wheel into contact with the rail. Second, the prescribed vertical wheel load is applied, further pressing the wheel into the rail. Thirdly, the wheel is moved in the z-direction and rotated around the x-axis to simulate the rolling with slip. After the rolling step, the wheel is lifted and loses contact. In the fifth step, the wheel is translated and rotated back to its starting position. Fig. 2a shows these steps, followed by an illustration of the accumulated longitudinal displacement, u z . This displacement field is affected by the rolling contact start and stop and the material outside the contact region. To minimize these so-called boundary effects, the auxiliary region (cf. Fig. 1) must extend far in the longitudinal direction.

Method
The goal is to find a simulation method that reduces the boundary effects, approaching the continuous setup in Fig. 1b. However, it must also permit variations along the rail, such as inclusions and cracks. In the real case, a new wheel rolls over the considered rail segment each time. During one train passage, these wheels have a given spacing. If sufficiently far apart, it is reasonable to assume that the neighboring wheels do not influence the local contact stresses. As a representation of this case, we propose to use Periodic Boundary Conditions (PBC). An important consideration, emphasized in the present work, is the wheel spacing required to obtain accurate results. Hence, the wheel spacing does not represent the wheel spacing in a real bogie but a spacing chosen in the simulation based on a convergence study. As shown in Fig. 2b, this inter-wheel spacing corresponds to the rolling length, which is the simulated rail segment's length.
PBC are well known to reduce the effect of the unit cell boundaries in computational homogenization [20]. This reduced boundary effect gives a higher accuracy for a smaller computational domain. In computational homogenization of finite element simulations, no material leaves the computational domain. But in rolling contact, the wheel passes over the contact surface. This scenario is common in Molecular Dynamics (MD) simulations, in which PBC are the standard choice [6]. When a molecule leaves the domain on one side, the same molecule effectively enters from the other side. The present simulation method applies these periodicity conditions in the longitudinal rail direction (i.e. along the z-axis).
The MD simulations are Eulerian and consider a fixed control volume. Computational homogenization with finite elements typically applies Lagrangian meshes. These two approaches can be combined with the Arbitrary Lagrangian-Eulerian (ALE) technique [8]. ALE does not, however, solve the challenge with how the contact is computed as the wheel exits the computational domain (and enters on the other side). MD simulations typically treat each particle as a point. But when using surface-to-surface contact formulations in finite element simulations, it is not sufficient to only consider the node positions for the contact. This makes the periodicity in the Eulerian or ALE approaches more challenging to implement.
A standard finite element solver cannot simulate the contact when considering an Eulerian (or ALE) wheel exiting the domain. Simulation of this setup requires a custom code. However, such special codes are challenging to develop to be suitable for a broad range of users. Therefore, a method for realizing the proposed periodic boundary conditions in the commercial finite element code Abaqus [4] is presented.
The method uses a Lagrangian mesh and an implicit time integration. Inertial effects are neglected (quasi-static analysis). PBC are applied to the rail ends. Shadow contact surfaces extend the rail contact surface to allow contact simulation as the wheel passes the rail ends. These surfaces have no intrinsic stiffness but their nodes' motions are coupled to the motions of the matching nodes in the rail. These constraints, including the periodic boundary conditions, are illustrated in Fig. 2b by the pairs of nodes with the same color. The equations for these constraints are described in Section 2.1.
Once the wheel has rolled a full rolling length, the deformed wheel is mapped back to its starting position. Modeling the wheel with a linear elastic material enables a super-element to represent the wheel. The super-element only retains nodes within an angular segment to reduce the number of degrees of freedom. Hence, the wheel also rotates when moving back. This motion is described in Subsection 2.2.

Rail constraints
To mathematically define the constraints, the nodal coordinates and deformations are defined first: Initial nodal coordinates are denoted X = [X, Y, Z]. The deformed nodal positions, x = [x, y, z], are given as x = X + u, where u = [u x , u y , u z ] are the nodal displacements. An additional time-dependent control variable, Δz(t), denotes the rail elongation. This addition makes it possible to include the effects of thermal expansion/contraction. Such a uniform stress field can also be used to approximate the stress in the rail surface due to rail bending. However, no gradient can be included, so additional stresses deeper in the rail will be incorrect.
In the bottom set, B (see Fig. 3), the nodal displacements are prescribed to where L is the rail length. In the most common case, with no rail elongation, Eq. (1) reduces to fixed boundary conditions on the bottom face.
In the prescribed right side, Ŝ , the displacements are connected to the displacements in S on the left side. The length vector L = [0, 0, L] describes the rail length such that X being in Ŝ is equivalent with X − L being in S. The same applies to sets T L and T L , respectively. Here, the subscript L indicates that the retained set in the rail is on the left side (note, however, that the constraint set, T L , is to the right of the rail). This gives the linear constraints . Similarly, the displacements in T R are connected to the displacements in the top right set T R . However, here, X ∈T R , is equivalent to X + L being in T R . Hence, the linear constraints The motivation for choosing T R as the constrained set, as opposed to T R , is to eliminate the degrees of freedom in the shadow surfaces. This choice does not affect the results in the present work. The extents of T R and T L must be large enough to avoid that the outermost nodes in T R and T L come into contact. Hence, both the contact patch size and the deformation accumulation must be accounted for. Note that T L and T R may overlap as the degrees of freedom within these sets are retained.

Modeling the wheel
As previously mentioned, a super-element represents the elastic wheel. When considering a certain rail segment, a new wheel will roll over each time. Hence, the accumulation of plastic strains can be neglected for the wheel. Starting from a full solid wheel with a central hole, an angularly symmetric mesh is applied. All nodes on the inside of the hole are tied rigidly to a reference point in the center. The linear motion of the reference point is described by u w and the rotation by A linear super-element is created, retaining only the reference point and the expected contact nodes. The stiffness matrix and node coordinates are extracted and constitute the super element. Appendix A describes the finite deformation formulation used for the wheel. Membrane elements with negligible thickness and stiffness model the contact surface by sharing nodes with the wheel super element. Such elements enable surface-to-surface contact formulations that give more accurate contact stresses compared to node-to-surface contact formulations. An example of a wheel contact surface for linear elements, symmetric about the yz-plane, is shown in Fig. 4.
When moving back the wheel, the displacements and rotations of the wheel's control point (see Fig. 3) are prescribed to where is the closest integer number of wheel elements rolled and Δα is the angular increment of one membrane element (see Fig. 4).
The motions of the wheel's surface nodes are prescribed while moving back the wheel. The node displacements are such that the wheel deformation is the same but shifted along the wheel circumference: The deformations in C 0 are mapped to Ĉ 1 , see Fig. 3. To formalize these constraints, Fig. 4 defines node indices. A node on the wheel surface is thus denoted n ij for i ∈ [1, N α ] and j ∈ [1, N x ]. Here, N α denotes the number of nodes in the circumferential direction, and N x denotes the number of nodes in the x-direction. The initial and deformed coordinates of node n ij are denoted X ij and x ij , respectively.
The new positions of nodes in set Ĉ 1 in Fig. 3 are prescribed to the old positions of the nodes in set C 0 , shifted by the wheel translation: The nodes n ij for i ≤ N roll (set F 1 in Fig. 3) will not be in contact after moving back and the external forces on these nodes are thus zero. Hence, their displacements can be calculated directly by using the super element's stiffness matrix. Thus, all the wheel nodes are prescribed during the moving back phase.

Summary of methodology
The constraints in Eqs. (1)-(3) are active throughout the simulation. The simulation consists of the following steps: 1) Initiate: Control u w and θ w to lower the wheel.

2)
Apply load: Release u w y and apply the wheel load, F w y .

3)
Rolling: Control u w z and θ w x to move the wheel.

4)
Move back: Apply Eqs. (4) and (5), and calculate displacements of nodes in F 1 . Fix all rail contact nodes during this step.

5)
Reapply load: Release u w y and reapply the load, F w y .

6)
Release: Release rail and wheel contact nodes.

Continue:
Go to step 3 to start the next rollover cycle.
The default penalty stiffness for the contact modeling in Abaqus is 10 times the underlying elastic element stiffness. Exactly how this stiffness is defined is not clearly stated in the manual [4]. However, the shadow elements have negligible stiffness, and thus a fixed linear penalty stiffness, k p , is used as a simulation parameter. In the included simulations, k p = 1000 kN∕mm 3 for both solid and shadow elements. This value is similar to the default contact stiffness for a cubical 1 mm 3 element with Young's modulus 200 GPa and Poisson's ratio 0.3 (evaluated by numerical tests).

Rolling length convergence (2d)
In [19], the rolling length convergence was evaluated, but only for a few cycles. The rolling length required for accurate results depended on the number of cycles. For large shear deformations in 2d simulations, long rolling lengths were required. For the new method presented herein, a shorter converged rolling length is expected due to the semi-continuous simulation setup. This hypothesis is evaluated by considering the longitudinal rail displacement for 200 rolling cycles.
The finite strain plasticity material model "AF2" from [13] is used. This model was also used in [19] and is chosen in the present study due to its simplicity while still accounting for finite strains and being thermodynamically consistent. The same parameter values as in [19] are used, which were calibrated for R260 rail steel in [13]. Fig. 5 defines the geometrical parameters for the rolling length study. Following [19], a 2 mm mesh with 4-node, fully integrated, plane strain elements are used (Abaqus element code CPE4) for both the rail and the wheel. Table 1 summarizes the geometry and load parameters. F w y is the vertical wheel load, while μ is the rail-wheel friction coefficient. The creepage, c, from [19] is used, where c > 0 implies an accelerating wheel. It is defined such that the wheel angular velocity, θ x , becomes where v z is the z-component of the wheel center velocity and R o the outer wheel radius (see Fig. 5). Note that for a non-cylindrical 3-dimensional wheel, R o is not uniquely defined, and must be set as an input parameter. The values of F w y and R o were chosen in [19] to obtain similar contact patch size and maximum pressure as in 3d.
Simulations with the old methodology are used as reference. Hence, the same mesh and geometry are used but additional auxiliary regions are required, with details given in [19]. The displacements are measured in the middle of the rail's rolling surface while the wheel is moving back. Hence, the rail is not loaded at this time in the old methodology. For the new methodology, the wheel remains in contact with the rail. Therefore, the displacements are evaluated while the rail is loaded. Displacements are evaluated at one half rolling length from the wheel at the rail surface. This choice minimizes the influence from the local strain field around the contact. For a 40 mm rail length, the elastic shear deformation when the full traction is applied as a uniform shear load on top of the rail is approximately 0.2 mm. Fig. 6 shows the horizontal displacement accumulation with the number of rollover cycles. The new methodology is rather insensitive to the rolling length. The old methodology, on the other hand, shows a strong rolling length dependence. Hence, this finding strengthens the hypothesis that a shorter rolling length is accurate with the new methodology. It was expected that the two methodologies should give the same results for sufficiently long rolling lengths. It was, therefore, surprising that the old results did not converge towards the new results. This discrepancy can be understood by considering the variation of the displacement along the rail surface. Fig. 7a shows that for the old methodology with 60 mm rolling length, no region with uniform deformation remains after 20 cycles. For the longest evaluated rolling length of 200 mm in Fig. 7b, this zone is not present after 80 cycles. At this point, the displacement accumulation is also deviating from the new methodology's converged results in Fig. 6. The displacement variations for the new methodology are shown in Fig. 7c and 7d. Slight fluctuations can be observed along the rail. These are, however, much smaller than the accumulated displacements.
The present section's purpose was to evaluate the new simulation methodology's dependence on the rolling length. However, a few comments on the actual results are also warranted. We observe that the displacements continue to accumulate in each cycle. Meyer et al. [16] showed that large shear strains accumulate in the surface layer of rails. However, to capture these strains, much smaller elements will be required. Currently, the deformation does not result in ill-conditioned elements. See the deformed mesh in Fig. 8. For smaller elements and larger deformations, this issue may occur and thus require re-meshing. This issue is further addressed in Section 4. In the next subsection, we will see that the predicted deformations in a 2d simulation are much larger than in 3d.

Rolling length convergence (3d)
Based on the promising results from the rolling length convergence for the 2d simulation, a similar study is conducted in 3d. As found in Pletz et al. [19], the accumulated deformations are significantly lower in a 3d simulation. Therefore, the convergence study in 3d considers 1200 cycles. The geometry and mesh were taken from [19]: A cylindrical wheel (diameter 920 mm) is rolled over a part of a UIC60 railhead, see Fig. 9. The same material model, creep, and friction coefficient as in the 2d simulations are used. Only half the wheel load, F w y = 150kN, is applied as only the half rail cross-section is simulated utilizing symmetry (i.e. 29.4-ton axle load). The elements are fully integrated linear elements (Abaqus element codes C3D6, C3D8, M3D4) for both the rail and the wheel, and have a size of approximately 2 mm in the contact region.
In Fig. 10, the displacement accumulation for the 3d-simulation is shown for four different rolling lengths. The detailed view shows that the shorter rolling lengths have higher deformations initially. The differences are quite small and of the same magnitude as the differences in elastic shear stiffness (cf. discussion in Section 3.1). At higher cycle numbers, the longer rail sections accumulate more deformations. The "old" results from [19] show a lower accumulation after many cycles.   Those simulations were conducted for 60 mm rolling length. The first 100 cycles match the new results rather well. After additional cycles, the surrounding material seems to affect the results. Fig. 11 shows that the region of uniform displacement becomes smaller with increased displacements. The center of the rail is not within this region after 400 cycles.
For the new methodology, it was unexpected that longer rails accumulate more deformation. They have more material to prevent local plastic deformations in the contact zone. However, two hypotheses could explain this phenomenon: 1. The wheel slip is controlled. Shorter rails are more compliant in shear, causing a slower buildup of shear stresses in the contact patch. 2. Due to the periodic boundary conditions, the stress from neighboring cells affects the strain accumulation. The first hypothesis can be investigated by considering the wheel torque during a cycle. For simplicity, we use a linear elastic material. Fig. 12 shows the wheel torque variation with the normalized rolling distance for two rolling lengths. In the first cycle, the torque gradually builds up to the stationary value. When mapping back the wheel, the torque drops slightly. Direct manipulation of the contact iterations is not, to the authors' knowledge, possible in Abaqus. Hence, this variation cannot be avoided without custom user contact formulations. This is outside the scope of the present paper but may be considered for future work, see Section 4.
In the first cycle, the torque increases a bit faster for the longer rolling length. While the torque drops slightly more during reloading, roughly the same rolling distance (6 mm) is required to reach the full torque. Hence, the portion of the rollover cycle with lower torque is higher for the shorter rolling length. Thus, the torque variation contributes to the lower displacement accumulation for the shorter rolling lengths. The authors could not determine if this effect alone can explain the lack of rolling length convergence. Fig. 13 shows the variation of displacement along the rail after the 1201 cycles that were simulated in Fig. 10. No clear conclusion can be drawn for the short rolling length, as the main variation is due to the local contact stresses. A slightly lower accumulated displacement is observed at Z ≈ − 35 mm for the long rolling length. This finding could indicate that lower deformations accumulate during the torque buildup phase. However, the variations are rather small compared to the noise in the results.
The second hypothesis can be investigated by considering the stress cycle for two rolling lengths. Elastic material response is used for simplicity. Fig. 14 shows how the stress components in an element in the  When the wheel is close to the investigated point, all stress components, except σ zz , are nearly the same for both rolling lengths. Two differences can be observed: A slightly higher compressive longitudinal stress, σ zz , occurs for the longer rolling length. However, this does not affect the peak von Mises stress (σ vM ). Still, due to the non-proportional loading, the difference in σ zz could affect the material ratcheting. Even so, it is not clear whether this will increase or decrease the plastic deformations. Secondly, for the 30 mm rolling length, several stress components do not return to zero before the next cycle begins. This artifact could affect the plastic material response due to kinematic hardening. However, due to the complex interaction with the surrounding material over multiple cycles, we could not isolate the effect of σ zz on the plastic ratcheting.

Realistic numerical example
Thus far, the numerical examples have dealt with rolling length convergence in terms of accumulated plastic shear deformations. These are interesting to investigate the behavior of the new simulation methodology. However, it is also important to show how the simulation framework can consider various rolling contact fatigue phenomena with realistic wheel profiles. As an example, a wheel with an S1002 profile rolling on a 60 mm long UIC60 railhead is presented. The wheel crosssection was taken from [18], has a running band radius of approximately 500 mm, and is meshed with quadratic elements (Abaqus element codes C3D15, C3D20, and M3D8), see Fig. 15. Following the methodology presented, a super-element is created that retains the nodes indicated by the red line. The railhead is meshed with linear elements (Abaqus element codes C3D4 and M3D3) and has an embedded elastic inclusion in the middle of the rolling length, inside the contact patch. The green region in Fig. 15 shows its location in the cross-section, and its location in the middle of the rolling length is visible in Fig. 16. The depth is 3 mm and the surface-breaking circle has a 12 mm diameter and a center 19 mm from the center plane of the rail (-x direction). Such an inclusion can, e.g., represent a martensitic white etching layer. This simulation is not possible in other continuous simulation frameworks, cf. Van and Maitournam [21]. A wheel load, F w z = 150 kN, creepage c = 0.015, and friction coefficient, μ = 0.5 are used. The high loading friction coefficient, creepage, and load result in high strain accumulation per cycle. These choices are more challenging to solve due to the higher non-linearity, and are thus suitable for evaluating the methodology. Finally, the rail plasticity is modeled using two different finite strain material models. The first is the "AF2" model [13] used for the rolling length convergence studies. The second is a more advanced model, accounting for the evolution of anisotropy [15], therein denoted "H r = 0". Herein, this model is denoted the MM2021 model and it is available as an UMAT at https://github.com/KnutAM/MaterialModels/. The parameters from [15] were calibrated for the same R260 steel as in [13], but with further experiments evaluating the yield surface evolution. Such advanced material models are also challenging to use with Eulerian meshes due to state variable interpolation. Fig. 16a and 16b show the accumulation of longitudinal displacements. The elastic inclusion becomes visible as it disrupts the continuous deformation field along the rail. Its deformation is more uniform in the transverse direction than for the adjacent plastic rail regions. To simulate this more realistic geometry requires more degrees of freedom, naturally increasing the simulation time. However, the disruption of the displacement field is limited to a small region around the inclusion. This observation indicates that inclusions do not severely increase the required rolling length. Another observation is the difference in plastic deformations between the two material models. As in [19], it is clear that accurate material models are a key component in simulating the plastic deformations during rolling contact loading. Finally, it is noted that the high wheel load, creepage, and friction cause high plastic deformations to accumulate much faster than what is expected under normal operating conditions. Fig. 16c and 16d show the von Mises stress at the end of the 50th rolling cycle. As previously mentioned, the elastic inclusion can represent a martensitic white etching layer. Even though martensite has a significantly higher yield limit than the initial pearlitic material, it will also sustain plastic deformations in reality. Hence, the present example is an extreme case to highlight the effect of a strong discontinuity. Von Mises stresses up to 5.5 GPa occur in the simulation. In the color scale, all stresses above 750 MPa are white to improve the visualization of stresses outside the elastic inclusion. In the contact patch (at the end of the domain), the von Mises stress is higher for the AF2 model. This finding fits well with the lower displacement accumulation, indicating higher resistance to plastic deformations. Outside the running band, the MM2021 model produces higher von Mises stresses. These stresses seem to be caused by the accumulated deformations in the running band, as the surrounding material must adapt to this deformation.

Computational speed
The main motivation for improving the simulation methodology was to increase computational speed. For comparison, the 3d rolling length convergence study was conducted with the same settings and computational cluster as for the 3d simulations in [19]. In that study, a 60 mm rolling length was used, yielding a calculation time of 20 min per cycle with 10 cores (CPU: Intel 2650v3 (Haswell), RAM: 6.4 GB/core). However, as seen in Section 3.2, that rolling length appears insufficient for convergence. Table 2 gives the computation time per cycle for different rolling lengths in the present study. The results show that shortening the rolling length reduces the cycle time. For the same rolling length as in Pletz et al. [19], the new methodology is more than seven times faster. However, it is more meaningful to compare the simulation time for equally accurate simulations. Up to 200 cycles, all simulations (new and old methodologies), except for L = 30 mm, produce very accurate results. Thereafter, the L = 40 mm (new methodology) and the L = 60 mm (old methodology) begin deviating. The old methodology deviates faster, and after 600 cycles, it is less accurate than L = 30 mm for the new methodology. The new methodology with L = 30 mm was almost 28 times faster than the old (L = 60 mm). As a conservative statement, we conclude that for similar accuracy at many cycles, the new method is more than 25 times faster than the old methodology. It is important to note that this speedup factor depends on the simulation parameters: the number of cycles, the geometry, the load, and the material model.

Future improvements
As discussed in Section 3.2, a torque jump occurs after mapping back the wheel. Ideally, the contact conditions (and thus the torque) should not be affected by this mapping. When moving back the wheel between cycles a small influence from numerical inaccuracies in the nodal positions is expected. However, the tangential contact formulation relies on the amount of slip to calculate the shear forces. Therefore, mapping back the wheel, even within a numerically short time, causes a change in the traction stress. It might be possible to circumvent this issue with a custom contact implementation. While this improvement is outside the scope of the present paper, it is interesting for future work. Especially if it yields better accuracy for the shorter rolling lengths, and hence obtaining the speed gains discussed in Section 3.4.
In the present study, the mesh size sensitivity has been left outside the scope. When using smaller elements in the contact surface, the high shear strains will cause severe element distortion. Preventing this is one benefit of ALE simulations. However, as previously discussed, the frequent state interpolation in ALE simulations can result in significant errors with history-dependent material models. A possible strategy is to re-mesh only when the element distortion becomes too large. This approach would allow smaller elements to capture the behavior closer to the surface. In Sections 3.1 and 3.2, the deformations continue to accumulate with more cycles. However, investigating if the deformation accumulation saturates requires both a finer mesh and more cycles. Also, the wheel loads, friction coefficients, and creepages from the field or experiment must be applied. An efficient re-meshing procedure would therefore be a natural extension of the present work.
To further speed up the simulations, a substructure could replace the rail part expected to behave elastically. However, using a substructure in Abaqus prevents parallel element assembly. Therefore, it is likely more efficient to provide a user element, similar to the wheel. The main difference to the wheel is that no large rotations will occur. Hence, a standard linear super-element can be used.
Finally, an important approximation in the proposed methodology is that the wheel is fully elastic and has a constant profile throughout the simulation. An issue is that the plastic rail profile adapts to the wheel profile after many rolling cycles. Currently, the wheel can move laterally during a rolling cycle to reduce this shape adaptation. However, in a more general framework, the wheel profile could morph during the simulation. Furthermore, by combining the co-rotational formulation in Appendix A with numerical model reduction (cf. [7]), a plastic wheel behavior could be approximated.

Conclusions
A new methodology for finite element simulations of elasto-plastic rolling contact loading has been developed. Its main advantage is a higher accuracy for a given rolling length due to periodic boundary conditions. Additionally, less surrounding material is required. In total, the improvements yielded more than 25 times faster 3d-simulations compared to the old method [19]. Furthermore, for 2d-simulations, the accuracy improvements were even more significant. However, the previous observations that 2d simulations cannot quantitatively predict the same plastic accumulation as 3d simulations are confirmed. This finding further motivates the need to focus on developing efficient 3d rolling contact simulation methods.
The methodology has been applied to an example with realistic geometry, advanced material models, and an elastic inclusion in the rail. The inclusion is an extreme example of a martensitic white etching layer. In addition to demonstrating the proposed methodology's versatility, the simulations confirm previous findings that accurate material modeling is essential for accurate rolling contact simulations. The present paper emphasizes an elastic wheel rolling on a rail. However, the method can be applied to any rolling and sliding contact between an elastic and an inelastic body. To facilitate for other researchers, an opensource plugin is provided [17].

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
This work is part of the activities within the CHARMEC Centre of Excellence at Chalmers University of Technology. Parts of the study have been funded within the European Union's Horizon 2020 research and innovation programme in the project In2Track2 under grant agreement no. 826255. The simulations were performed on resources at Chalmers Centre for Computational Science and Engineering (C3SE) provided by the Swedish National Infrastructure for Computing (SNIC).

Appendix A. Wheel super element
While Abaqus support substructures, there are three reasons for writing a user element instead of using a substructure: • Access to the wheel stiffness matrix when moving back the wheel • Faster performance as Abaqus does not support parallelized element assembly when substructures are used.
• The Abaqus manual does not describe in detail how finite rotations are dealt with for substructures. Due to these reasons, a user element routine has been developed for the wheel. It considers a linear elastic structure, controlled by a reference point. The linear elastic behavior is considered in a rotated frame, defined by the x-rotation of the reference point. Fig. 17 defines the vectors and positions used in the transformation of displacements due to a large x-rotation.
The element internal force vector {F}({u}) is a non-linear function of the element displacement vector {u}. However, as the material is assumed linear elastic, the function becomes a linear function in the rotated (prime (')) coordinate system, where K ′ is a constant stiffness matrix. This model is accurate because the strains and rotations are small in the rotated system. The displacements of node i are given as the vector u i = [u x,i , u y,i , u z,i ], where i = 0 for the node at the wheel center. The full degree of freedom vector, {u}, is and corresponds to the full load vector {F}, where F x,i , F y,i , and F z,i are the nodal forces in the x, y, and z directions, respectively. M w x , M w y , and M w z are the moments about the wheel center. The nodal forces at node i are denoted The quantities in the primed coordinate system are defined equivalently.
Considering the nodal level, we can express the nodal forces in the global coordinate system (GCS).
where Q is the rotation matrix from the GCS to the rotated coordinate system. Due to the wheel translations, the conversion of nodal displacements in the GCS to the rotated coordinate system is more involved. The displacements are given as the difference between the initial and deformed coordinates, However, u i includes large rotations, and only the small displacement relative to an undeformed wheel, û i , is of interest where X i is the position on an undeformed wheel, translated by u 0 and rotated by θ w x around the x-axis of the initial point X i . This point is found by considering the initial vector from the wheel center to node i, rotating it, and adding it to the displaced position on the wheel center However, û i is measured in the GCS, and must be rotated to the prime coordinate system From these deformations, the nodal forces in the primed coordinate system can be calculated with Eq. (7). Finally, the nodal forces in the GCS are calculated by Eq. (10). As the rotation of the reference point is always displacement controlled, this rotation does not need to be accounted for in the iterations. Therefore, the stiffness matrix can be calculated simply by considering a standard rotation: where 1 is the 3×3 identity matrix and 0 = 1 − 1 the 3×3 zero matrix.

Appendix B. Description of plugin
The three main Abaqus plugins can create a rail, create a wheel, and set up a simulation. Additionally, a set of user subroutines are required for the simulation to run. In this section, we give an overview of the three main plugins and the user subroutines. For an in-depth documentation, please see [17].

B.1. "Create rail" plugin
Given an Abaqus sketch of a rail's cross-section, the plugin creates a rail part with the necessary sets and surfaces. The user can modify this rail by changing the mesh and geometric features and adding sections and materials. However, a requirement is that the mesh is periodic, i.e., the surface mesh on the rail end faces are equal.

B.2. "Create wheel" plugin
As for the rail, the starting point is an Abaqus sketch of a wheel cross-section. The 2d cross-section mesh has small elements at the contact surface. Revolving the 2d-mesh around the x-axis creates a 3d-mesh. The angular spacing is such that the element size furthest out is equal to the fine mesh in the 2d cross-section. A tie constraint connects the inner ring of the wheel to the central reference point. User-given intervals in x and rotation about x determine the retained nodes. Abaqus substructure generation calculates the stiffness matrix for the unrotated wheel, i.e., K ′ in Eq. (7). The plugin saves this stiffness matrix to a file that the user element routine later uses to read in the wheel stiffness. Furthermore, the plugin saves the node coordinates for use by the "Setup simulation" plugin.

B.3. "Setup simulation" plugin
The plugin requires an Abaqus Model Database (.cae file) for the rail and a folder created by the "Create wheel" plugin. It adds shadow membrane elements to the rail by extending the contact surface. The plugin creates a wheel surface mesh from zero stiffness membrane elements. After that, it adds contact conditions, loads, and step definitions.
Some Abaqus settings are not available via the GUI/Scripting interface. Therefore, the plugin edits the keyword database to include the wheel super element and the output to Abaqus' results file (.fil). The user subroutines require that output for moving back the wheel between cycles.

B.4. User subroutines
The rollover simulation requires three subroutines; UEL, URDFIL, and DISP. UEL is the user element routine described in Appendix A. URDFIL and DISP are used to move the wheel back and map its deformation: URDFIL reads the wheel node coordinates and displacements from the Abaqus result file. These results are saved in a Fortran module, making them accessible by the DISP subroutine. The DISP subroutine is then used to apply displacement controlled boundary conditions, calculated according to Eqs. (4) and (5), as well as based on the user element stiffness (see Section 2.2).