The ISSC 2022 committee III.1-Ultimate strength benchmark study on the ultimate limit state analysis of a stiffened plate structure subjected to uniaxial compressive loads

This paper presents a benchmark study on the ultimate limit state analysis of a stiffened plate structure subjected to uniaxial compressive loads, initiated and coordinated by the ISSC 2022 technical committee III.1-Ultimate Strength. The overall objective of the benchmark is to establish predictions of the buckling collapse and ultimate strength of stiffened plate structures subjected to compressive loads. Participants were asked to perform ultimate strength predictions for a full-scale reference experiment on a stiffened steel plate structure utilizing any combination of class rules, guidelines, numerical approaches and simulation procedures as they saw fit. The benchmark study was carried out blind and divided into three phases. In the first phase, only descriptions of the experimental setup, the geometry of the reference structure, and the nominal material specifications were distributed. In the second phase, the actual properties of the reference structure were included. In the third and final phase, all available information on the reference structure and measured properties were distributed, including the material properties and laser-scanned geometry. This paper presents the results obtained from seventeen submitted FE simulations as well as details on the experiment. It also presents comparisons of the force versus the displacement curve, failure modes and locations for each phase, among others, and a discussion on the participants' ability to predict the characteristics of the reference experiment with the information that is available for the phase.
The outcome of the study is a discussion and recommendations regarding the design of finite element models for the ultimate state analysis of stiffened plate structures, with emphasis on the prediction of the ultimate capacity, force-displacement curve, and failure mode and location related to access to data, uncertainties and modeling of the material properties, geometric imperfections and distortions, and residual stresses.

Introduction
Ships and offshore structures are built with stiffened panels that have been designed to fulfil requirements and criteria for the ultimate limit state (ULS) according to rules and guidelines from classification societies. Despite decades of use, it can still be challenging to accurately predict the buckling (failure) modes, ultimate strength and post-buckling behavior of stiffened plate structures. The development of new shipbuilding steel materials and welding and production technologies, together with the larger dimensions of these structures and the fact that they are subjected to multiple loads (such as external loads and temperature) that may trigger a emphasized the identification of possible sources of uncertainties and errors with reference to previous discussions on the correct modeling and assessment approach for the buckling tests presented in the former committee report [14]. The outline of the study in the following chapters is as follows. Chapter 2 presents the objectives and a detailed description of the benchmark study. The reference experiment is presented in Chapter 3, followed by a summary of the results from the experiments in Chapter 4. Chapter 5 presents the participants' FE models and analyses of the experiment, followed by a discussion and comparison of the results from the reference experiment and the FE analyses in Chapter 6. The conclusions of the study are presented in Chapter 7.

Objective and description of the benchmark study
Benchmark studies are important for comparing the skills, best practices, assumptions and "traditions" of researchers and practitioners of a given technical area. They are helpful in establishing how to design numerical models and how to simulate and analyze aspects such as the ultimate strength response of a complex ship or offshore structure subjected to compressive loads. Even if modeling guidelines and best practices are available, there are always sources of errors and uncertainties that lead to scatter in the simulation results. Benchmark studies help to compare and learn from all participants while systematically identifying issues that require improvements and sometimes new guidelines. Some important questions that these studies discuss and try to answer include the following: How large is the scatter in the results that can be accepted? Which indicators or criteria should be used in the assessment? Most importantly, how can the lessons learnt be communicated to provide improvements and revised best practices?
The overall purpose of the benchmark study presented in this investigation was, with reference to the results from a reference experiment on a stiffened steel plate structure, to compare different class rules and guidelines, the participants' skills and experiences, numerical approaches and simulation methods, in their "ability" to make trustworthy predictions of the buckling collapse and ultimate strength of stiffened plate structures subjected to compressive loads. The influences from uncertainties in the modeling procedure, solver, material properties, geometrical initial imperfections, residual stresses, assumptions made by the modeler/analyst, etc., are incorporated in the study and form the basis for discussions, conclusions and recommendations for stricter/well-defined guidelines for the ultimate strength analysis of stiffened plate structures.

Description of the benchmark study
The benchmark study was led and coordinated by the two leading authors who are members of the ISSC 2022 technical committee III.1-Ultimate strength. Seventeen groups participated worldwide: 9 of the groups were members of the committee, and 8 accepted an invitation that was sent to a larger group of experts outside the committee. All groups (participants) were active in researching the ultimate strength of ships and offshore structures, and they have published numerous scientific papers on the topic. Fig. 1 presents the physical model in the reference experiment carried out on a large full-scale steel grillage representative of a typical ship structure in warships and as secondary structure in larger ships. Design and testing of the grillage were performed by the US Navy's Naval Surface Warfare Center, Carderock Division (NSWCCD), while fabrication of the grillage was conducted by Aberdeen Test Center's (ATC) weld fabrication shop; see Chapter 3 for details. The grillage consisted of three full sections and two partial sections and was longitudinally stiffened by three identical stiffeners and a single large girder. The plating consisted of two plates of different thicknesses that were butt-welded in Section 3 according to Fig. 1. The experiment was displacement-controlled with clamped longitudinal end conditions except for the one end where longitudinal loading was applied in the moving direction; tie-downs were mounted on the sides along the length of the model to prevent vertical motions.
Model, physical and human error-related uncertainties that affect the prediction of the ultimate strength capacity and failure mode characteristics of the stiffened steel plate structure, compared to the results from the experiment, were investigated by dividing the benchmark study into three phases; see Fig. 2 for a schematic. This is a novelty compared to former ISSC technical committee III.1-Ultimate Strength benchmark studies. The multiphase validation procedure was defined to reflect the amount of information available from the early design of a new stiffened plate structure with limited confirmed or measured data (Phase 1) to a very detailed design where the majority of the information that defines the structure has been thoroughly measured (Phase 3).

Description of the three phases
The benchmark study was carried out as a blind study whereby participants did not have access to the test results from the experiment prior to the study. Results were not shared or discussed amongst participants prior to submission to the study coordinators. This ensured that each participating group exhibited their best performance, making use of their own best practices and preferred reference literature to design and build numerical models that should replicate the physical model test of the reference stiffened plate structure.
The benchmark study was run from June 2019 to September 2020. The participants received the same information, instructions, data and files needed to carry out the modeling and analyses for each phase; all files can be found as public research data in Ref. [16]. They were free to make their own assumptions, but these assumptions should have been motivated and referenced in the technical report. All participants carried out FEAs, and some also presented results for Phases 1 and 2 using other numerical codes (referred to as practical codes in the former chapter); the latter is presented in the committee's full technical report of the benchmark study [17].
To meet the objective of the study, the phases were carried out in sequential order, i.e., only the information and data needed to accomplish the purpose of each phase were made available. When a participant had completed a phase, the results and the technical report were submitted to the coordinators of the benchmark study, access was granted to the next phase, and more information and data were made available. Note that a participant was not allowed to revise or update the results from a former phase. Insights of errors from assumptions or modeling issues that led to a revision of the methodology or models between the phases were documented in the technical report from each phase. Hence, it was possible to trace and analyze human error-related uncertainties.
The technical report followed a template provided to the participants to ensure consistency in the information and data reported. The technical report from each phase described in detail how the FE model and its related components were developed from assumptions based on experience, convergence studies, references, class rules or guidelines, e.g., the mesh type and element size, solver, modeling of the material by interpretation of material data and choice of constitutive material model, geometric imperfection model, and residual stress model; see Chapter 5 for details. One model Excel file was also filled in for each phase to clearly document and follow up the FE model used in each phase. In a results Excel file, the participants presented the force versus displacement curve and the failure mode and location in their model, together with clarifying figures for each phase. In the following, a brief overview of each phase is presented (see Table 1). Specific details referring to the data and FE modeling are presented in Chapter 5.

Phase 1 - nominal properties
The participants predicted and assessed the ultimate strength capacity of the reference structure based on its nominal properties. Printed 2D drawings and a 3D CAD file of the nominal geometry were shared together with a file that presented the nominal data and material selections for the different parts of the structure. A description of how the experiment was carried out was provided, including information on loading and boundary conditions. In this phase, it was stressed that each participant should clearly express their assumptions on how geometric imperfections were considered in the model and which reference, class rule or guideline was followed.

Phase 2 - nominal properties, actual properties, and measured geometrical imperfections
The information shared in Phase 1 was complemented with new information from thorough laser scanning/tracking of the geometry of the as-built reference structure as well as thickness measurements in some locations of different parts of the structure (referred to as the "actual thickness"). The participants were asked to repeat their analysis from Phase 1 considering the measured geometrical imperfections, distortions and deflections of the as-built reference model. Phases 2-3 included material information in the form of vendor-supplied material certification sheets that typify the level of information that can be expected at a shipyard. The phase was divided into three subtasks and reports to enable thorough assessment of each factor's influence on the results.

Phase 3 - actual properties, measured properties, and residual stresses
In this third and final phase, the participants were provided with the remaining data available from measurements and testing, including the exact thickness measurements made in many locations on the reference structure (referred to as the "measured thickness") and tensile test results, including full stress-strain curves, for all structural members. The participants were asked to repeat their analyses from the former phases but with revised/updated models based on the new data. This phase was also divided into three subtasks, of which the first two (Phases 3-1 and 3-2) were mandatory. The third subtask was an optional add-on that included modeling of the residual stresses. The decision to make this subtask optional was primarily due to the lack of residual stress measurements on the reference structure.

Report and summary of the results
Each participant was required to submit one technical report and two Excel files (model and results) for each phase as described in Chapter 2.2. Considering the large number of participants in the study, a vast amount of material was collected that formed the total documentation. The coordinators of the benchmark study extracted the information and results presented in this investigation and asked each participant to review the summary and propose corrections if any. Note that it was not permitted during the process to modify the results. This step served as a final check to ensure that the information that was submitted was correctly interpreted and analyzed and that there were no typos in the technical reports. Additionally, participants who had provided more than one force versus displacement curve from a phase were asked to select one of the curves and explain why it should be included in the summary.

Description of the reference experiment
The reference ultimate strength testing experiment was performed utilizing a newly fabricated tee-stiffened plate steel structure (hereafter referred to as the "structure"). The structure was designed by the NSWCCD and fabricated by the US Army's Aberdeen Test Center in accordance with standard shipbuilding practices. Detailed surveys of the as-built structure were conducted prior to testing, including laser scanning/tracking, ultrasonic thickness measurements, and tensile tests of the material coupons. The test results consist of overall load and displacement measurements, strain gauge data used to determine the failure sequence, and video data. Residual stress measurements were not conducted due to practical limitations in conducting these measurements.

Table 1
Overview of the three phases and what was considered (green cells) in each phase.

Description of the structure
The structure was fabricated from grade A36 shipbuilding steel (ASTM A36 [18]) plate and rolled steel shapes. The overall length of the structure was 7315 mm and the width was 2438 mm. The longitudinal frame spacing was 1829 mm such that the structure consisted of three full sections and two partial sections. Longitudinal stiffening consisted of three identical 124.6 mm depth tee shapes prepared from AISC W12 × 14 beams (Longitudinal A, C, D) and a single AISC W12 × 19 I-T longitudinal girder (Longitudinal B), all spaced 610 mm center-on-center. Transverse stiffeners consisted of tee shapes with a depth of 177.8 mm prepared from AISC W10 × 17 I-T beams. Each stiffener was fashioned from a single beam to avoid any stiffener-to-stiffener welds and ensure consistent properties from one stiffener to the next. Furthermore, all material for a given beam size was from a single heat to minimize material variability. Stiffener intersections were fabricated using a slotted construction method that avoids collars, with the exception of the large girder, which uses conventional collar details where it intersects transverse stiffeners.
The plating consisted of two full-breadth strakes: 6.35 mm thick plating with a length of 3352 mm and 7.94 mm thick plating with a length of 3962 mm, joined in the middle section by a transverse butt weld that spans the width of the panel. The structure was outfitted with side plates, transverse frame cap plates, and heavy end plates. The side plates and cap plates were 9.53 mm thick except at the ends of the transverse stiffeners, where the plates were 19.05 mm thick. An array of holes was located on these pieces to facilitate the connection between the plate structure fixture and vertical tie-downs. The structure end plates were fabricated from a 38.10 mm plate with a matrix of bolt holes to connect the fixture loading to the reaction ends. All welding was performed using 70S-1 welding wire through a pulsed gas metal arc welding (GMAW-P) process and visually inspected at the time of construction. A summary of the nominal and actual properties of the stiffened plate structure and geometric imperfections is provided in Table 2. Note that imperial units were used to designate beam sizes and that all metric dimensions have been converted from imperial units.
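The note on unit conversion can be illustrated with a short sketch. The inch fractions below are assumptions inferred from the metric values quoted above; they are not stated in the text, and the dictionary keys are illustrative labels only.

```python
from fractions import Fraction

MM_PER_INCH = 25.4  # exact by definition of the inch

# ASSUMED imperial thicknesses, inferred from the metric values in the text.
assumed_thickness_in = {
    "thin plating": Fraction(1, 4),
    "thick plating": Fraction(5, 16),
    "side and cap plates": Fraction(3, 8),
    "side and cap plates at transverse stiffener ends": Fraction(3, 4),
    "end plates": Fraction(3, 2),
}


def inch_to_mm(value_in):
    """Convert a length in inches to millimetres."""
    return float(value_in) * MM_PER_INCH


thickness_mm = {name: inch_to_mm(t) for name, t in assumed_thickness_in.items()}

for name, t_mm in thickness_mm.items():
    print(f"{name}: {t_mm:.4f} mm")
```

Under these assumed fractions, the converted values agree with the quoted 6.35, 7.94, 9.53, 19.05 and 38.10 mm to within rounding.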
The properties of the structure in Table 2 show that the magnitudes of the imperfections in the structure exceed typical tolerance levels given in codes. The maximum out-of-plane plate amplitude corresponds to D_s/96 and the maximum side displacement corresponds to D_f/219, whereas typical magnitudes of the imperfections implemented in buckling strength requirements [20,21] are D_s/200 and D_f/1000 or D_f/667; D_f and D_s are the frame and stiffener spacings, respectively. For the local plate amplitudes (see Section 3.3 and Fig. 5), 6.4 mm corresponds to 0.07β²t_p, which lies between the slight (0.025β²t_p) and average (0.1β²t_p) levels defined by Smith et al. [19]. The somewhat larger imperfections in the structure compared to Refs. [20,21] may be due to the high slenderness of the plates and stiffener webs. On the one hand, the ultimate strength is expected to be overpredicted if the smaller code-based imperfections are used; on the other hand, the actual imperfection shapes (see Section 3.3) differ from the buckling eigenmodes and could thus increase the strength if they push the deformation response into a higher eigenmode.
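As a rough numerical illustration of the comparison above, the following sketch converts the quoted fractions into millimetres using the stiffener spacing (610 mm) and frame spacing (1829 mm) given in the description of the structure. The pairing of the side displacement with the D_f-based code values follows the order in which they are quoted and is an assumption; the dictionary keys are illustrative.

```python
# Spacings from the description of the structure (mm).
D_S = 610.0    # stiffener spacing
D_F = 1829.0   # frame spacing

# Measured maxima, expressed through the fractions quoted in the text.
measured_mm = {
    "plate out-of-plane": D_S / 96,
    "side displacement": D_F / 219,
}

# Typical code-based magnitudes quoted in the text (pairing is ASSUMED).
code_mm = {
    "plate out-of-plane": [D_S / 200],
    "side displacement": [D_F / 1000, D_F / 667],
}


def exceedance_ratios(measured, code):
    """Return measured/code amplitude ratios for each imperfection type."""
    return {
        name: [measured[name] / limit for limit in limits]
        for name, limits in code.items()
    }


ratios = exceedance_ratios(measured_mm, code_mm)
for name, values in ratios.items():
    formatted = ", ".join(f"{v:.2f}" for v in values)
    print(f"{name}: measured {measured_mm[name]:.2f} mm, ratio to code {formatted}")
```

With these inputs the measured plate amplitude is roughly twice the code-based value, and the side displacement roughly three to four and a half times.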

Material data and characteristics
Tensile testing was conducted on various structural components to establish material properties. All tests were conducted following ASTM E8 [22] utilizing a 50.8 mm gauge length. Coupons were extracted in the rolling and transverse directions for both plate thicknesses as well as in the rolling direction for stiffener and girder elements. A total of 24 coupons were tested with three repeat tests for each configuration, and full stress-strain curves were collected through peak loading. The results, presented in Table 5 in Chapter 5.6, indicate that the material yield strength is significantly greater than the yield strength of 250 MPa specified in ASTM A36 [18] for all structural elements. Furthermore, the rolled shapes that were purchased met the specifications for both ASTM A36 [18] and ASTM A992 [23] with a higher minimum yield strength of 345 MPa.

Geometric imperfection data
Measurements of the initial shape and thickness were obtained prior to testing using a combination of laser tracking on the stiffener edges, laser scanning of the plating, and caliper and ultrasonic thickness measurements. Out-of-plane plate distortions were established using a best-fit plane of the plate data as shown in Fig. 3. Stiffener measurements were used to establish the rotation (distortion, stiffener tilt) from the plate plane, an example of which is shown in Fig. 4. A summary of imperfection data is provided in Table 2. It is evident that out-of-plane plate distortions are moderate in Sections 2 and 3 and low in Section 4. A similar trend exists in the measurements of stiffener tilt. However, the cuts at x = 1500 mm (Section 2) and at x = 6000 mm (Section 4) in Fig. 5 show that local out-of-plane plate distortions are of the hungry-horse type and have their highest amplitudes in Section 4.
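The best-fit plane step mentioned above can be sketched as an ordinary least-squares fit. This is only an illustration under stated assumptions: the point cloud is synthetic, the function name is hypothetical, and the distortion is taken as the vertical residual (a reasonable approximation of the normal distance only for a nearly horizontal plate). The actual processing used for the scan data is not described in the text.

```python
import numpy as np


def fit_plane_and_residuals(points):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c) and z-residuals."""
    pts = np.asarray(points, dtype=float)
    design = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    coeffs, *_ = np.linalg.lstsq(design, pts[:, 2], rcond=None)
    residuals = pts[:, 2] - design @ coeffs
    return coeffs, residuals


# SYNTHETIC scan of one plate field (610 mm x 1829 mm): a tilted plane plus
# a half-sine bump with a 6 mm amplitude. None of these numbers is measured data.
x, y = np.meshgrid(np.linspace(0.0, 610.0, 21), np.linspace(0.0, 1829.0, 41))
bump = 6.0 * np.sin(np.pi * x / 610.0) * np.sin(np.pi * y / 1829.0)
z = 0.002 * x + 0.001 * y + 5.0 + bump
points = np.column_stack([x.ravel(), y.ravel(), z.ravel()])

coeffs, residuals = fit_plane_and_residuals(points)
peak_to_valley = residuals.max() - residuals.min()
print(f"fitted slopes: {coeffs[0]:.4f}, {coeffs[1]:.4f}")
print(f"peak-to-valley distortion: {peak_to_valley:.3f} mm")
```

The fit removes the rigid tilt and offset of the plate, so what remains in the residuals is the distortion pattern itself.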

Test procedure and measurements
Structural testing was conducted using NSWCCD's Grillage Test Fixture (see Fig. 6). This fixture can accommodate large-scale stiffened plate structures of up to 7320 mm in length and 2440 mm in width and is capable of exerting longitudinal loads of up to 22 MN with a maximum travel in either the tension or compression direction of 150 mm of displacement. Longitudinal loads are applied through two rows of five 2.2 MN actuators located at one end of the fixture that are connected to a single heavy steel header onto which the plate structure is bolted. The structure was carefully shimmed prior to testing and strain gauges monitored to ensure that any eccentric loading was minimized, though loading eccentricity cannot be avoided when the neutral axis of the section shifts. Previous efforts have determined that the fixture compliance is 0.268 mm/MN, and the load-end shortening curves (hereafter referred to as force versus displacement curves) shown herein are corrected utilizing this value.
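A minimal sketch of the compliance correction is given below. It assumes that the correction is a simple linear subtraction of the fixture deflection (compliance times load) from the measured displacement; the text gives the compliance value but not the formula, so this form and the function name are assumptions.

```python
FIXTURE_COMPLIANCE_MM_PER_MN = 0.268  # value quoted in the text


def correct_end_shortening(measured_displacement_mm, force_mn,
                           compliance_mm_per_mn=FIXTURE_COMPLIANCE_MM_PER_MN):
    """Remove the ASSUMED linear fixture deflection from a measured displacement."""
    return measured_displacement_mm - compliance_mm_per_mn * force_mn


# Illustration: a hypothetical raw reading of 10.0 mm at a load of 6.59 MN.
fixture_deflection = FIXTURE_COMPLIANCE_MM_PER_MN * 6.59
corrected = correct_end_shortening(10.0, 6.59)
print(f"fixture deflection: {fixture_deflection:.3f} mm")
print(f"corrected end shortening: {corrected:.3f} mm")
```

Under this assumption, the fixture accounts for roughly 1.8 mm of displacement at a load of 6.59 MN, which is not negligible relative to the end shortening of the structure.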
The boundary conditions were provided at the loading and fixed ends (see Fig. 1). At these ends, all motions (displacements and rotational movements) were fixed except for the axial displacement at the loading end. Vertical motions along the long edges were restrained by a series of 27 tie-downs installed on each side of the structure at a spacing between 254 mm and 279.4 mm. The tie-downs, each of which consists of a 31.8 mm diameter threaded rod screwed into a 63.5 mm outer diameter cylinder, were hand-tightened prior to testing.
The test procedure was performed as follows. First, two low load-level compressive cycles were performed to capture the linear response of the structure: the structure was loaded to 1.1 MN (approximately 25% of the calculated anticipated peak load), unloaded to zero load, and subsequently loaded to 2.2 MN (approximately 50% of the calculated anticipated peak load) before being unloaded again. After the completion of the compression cycles at 25% and 50% of the anticipated peak load, an ultimate strength collapse run was performed in which the structure was compressed well into the post-buckling range. The test was terminated once the load carried by the structure had dropped to 70% of the peak load.
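The load levels and the termination criterion described above can be expressed as a short sketch. The force history used here is synthetic and the function name is hypothetical; only the 25%/50% cycle levels and the 70% termination fraction come from the text.

```python
def termination_index(forces, drop_fraction=0.70):
    """Index of the first post-peak sample at or below drop_fraction * peak, else None."""
    peak_idx = max(range(len(forces)), key=forces.__getitem__)
    threshold = drop_fraction * forces[peak_idx]
    for i in range(peak_idx + 1, len(forces)):
        if forces[i] <= threshold:
            return i
    return None


# Anticipated peak load implied by the two elastic cycles (MN).
anticipated_from_25 = 1.1 / 0.25
anticipated_from_50 = 2.2 / 0.50

# SYNTHETIC force history in MN, for illustration only.
history = [0.0, 2.0, 4.0, 5.0, 4.5, 3.8, 3.4, 3.0]
stop = termination_index(history)
print(f"anticipated peak load: {anticipated_from_25:.1f} MN")
print(f"terminate at sample {stop} (force {history[stop]} MN)")
```

Both elastic cycles imply the same anticipated peak load of about 4.4 MN.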

Test results
A plot of the overall force-displacement behavior is shown in Fig. 7. The response for the two low load cycles of 25% and 50% is linear with no permanent set or change in structure compliance. The collapse run follows the same loading path as the low-load cycles with nonlinearity observed in the overall load-shortening curve above 3.34 MN. The peak ultimate load achieved during collapse loading was 6.59 MN. Compression of the structure was halted at a load of 4.68 MN, and the structure was returned to zero load.
Fig. 3. Results from out-of-plane measurements of the plate; the coordinate system refers to Fig. 1 (z = 0 at the bottom of the plate).
The structure contains three full-length sections in which failure could occur. The primary failure zone occurred in Section 2, where the thicker plating was located, as seen in Fig. 1. However, peak strength is observed in Section 3 and significant post-peak nonlinear response occurs in Section 4. The closely spaced buckling modes suggest that minor differences in geometry, material properties, or residual stresses drive failure to Section 2 despite the thicker plating.
The dominant failure modes in Section 2 appear to be tripping of the girder and local flange buckling of the longitudinal stiffeners. The failure sequence during the collapse cycle in Section 2, obtained through a combination of video and strain gauge analysis, is as follows. First, the plating between the stiffeners starts to buckle downwards elastically, increasing compression in the longitudinal stiffener flanges during the increase in loading from approximately 0.9 to 4.5 MN. Tripping, or lateral-torsional buckling, of the longitudinal girder initiates at a load of approximately 4.5 MN. Finally, local flange buckling of the remaining stiffeners along with column-type buckling occurred, as seen in Fig. 8. A post-test view of the structure is shown in Fig. 9.
Fig. 5. Cuts at x = 1500 mm (Section 2) and at x = 6000 mm (Section 4) that present the local out-of-plane plate distortions, which are of the hungry-horse type.
Section 3, where the plate weld and thickness transition are located, behaves in a similar manner to Section 2. Analysis of strain gauge data indicates that peak load has been attained and that significant inelasticity is present; several strain gauges failed due to excessive strains. In Section 4, where the plating is thin but distortions are noticeably lower (see Figs. 3 and 4), plastic deformation is not visibly observed. The response of the stiffeners appears to be linear, though nonlinear behavior of the plating is observed.

Finite element models and analyses of the reference experiment
Traditionally, the ultimate strength assessment of ship and offshore structures is based on closed-form methods (empirical methods) given in design standards or reduced-order progressive collapse codes. The closed-form methods included in design standards are based on statistical evaluation from a large number of full-scale tests. Reduced-order tools provide a higher level of sophistication but ultimately rely on many assumptions regarding boundary conditions as well as material behavior. In contrast, the FEA method provides the most comprehensive manner in which the ultimate strength capacity can be assessed, and it allows the impact of various assumptions to be studied. However, FEA remains very time-consuming in the context of ship and offshore construction when assessing the overall hull girder capacity of the ship. FEA requires a large amount of resources in terms of modeling, load application, calculation time, solution iteration, and results evaluation. Expanding the model extent to include the entire hull girder introduces additional numerical challenges. Therefore, ship and offshore construction standards almost always rely on closed form/empirical methods. The current study emphasizes FEA. Other methods (closed form/empirical methods) used to simulate the test are presented in the committee's full report of the benchmark study [17].

Geometry and FE models
The participants of the benchmark study created their FE model using either 2D drawings or the 3D CAD file, both of which were provided by NSWCCD in Phase 1. The geometry of the FE model includes all the longitudinal stiffeners, the longitudinal girder, the transverse girders, transverse end caps, side plates and end plates. The dimensions of the stiffened plate structure are presented in Table 2.
The end plates were included in the FE model by most participants. Six participants disregarded the end plates, and they were reproduced by applying fixed boundary conditions. Taking into account that the plates are essentially rigid, not considering them in the FE model is a correct approach that leads to a reduction in the number of degrees of freedom (DOF). Additionally, the scallop holes and collar plates in the longitudinal girder were treated differently in the FE model by the participants. Some participants included the scallop holes in the model while others ignored them. The geometry of the structure was modeled by all the participants by using either 4-noded or 8-noded shell elements (reduced or fully integrated) with five, seven or eleven integration points through the thickness.

Mesh sensitivity check
FE models need to be able to capture all the important failure modes and locations in order to perform a correct assessment of ultimate strength capacity. A main aspect of this is obtaining sufficient mesh resolution. The recommended practice for obtaining the structural capacity by nonlinear FEA methods [24] prescribes a mesh density of (minimum) 3-6 first-order elements per expected buckling half-wave to capture all the relevant buckling deformations and localized plastic collapse behavior in the structure. For the reference structure, this mesh density corresponds to a mesh size between 50 and 100 mm. However, running a mesh sensitivity study to verify that the element density has no influence on the FE results (stress distribution) is always recommended. Half of the participants performed mesh sensitivity analyses to find the optimal element size. After the sensitivity test, each of these participants found an optimal mesh size between 12.5 × 12.5 mm and 50 × 50 mm, which fulfils the mesh size requirement given in recommended practices. The other half of the participants relied on their own knowledge and experience in this field and chose to use a mesh size between 25 × 25 mm and 56 × 70 mm for their analyses without performing any mesh sensitivity studies. Table 3 presents a summary of the participants' different FE model definitions and parameters, including the mesh size used by each participant. The total DOFs in the FE models used in the benchmark study varied between 63,000 and 1,000,000.
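The relation between the 3-6 elements rule and an element size range can be sketched as follows. The half-wave length of 300 mm is an assumption, chosen only because it reproduces the 50-100 mm range quoted above; the text does not state the half-wave length that was used. The function names are likewise illustrative.

```python
def element_size_range(half_wave_length_mm, min_elements=3, max_elements=6):
    """Return (finest, coarsest) element edge lengths for the given element counts."""
    return (half_wave_length_mm / max_elements,
            half_wave_length_mm / min_elements)


def elements_per_half_wave(half_wave_length_mm, element_edge_mm):
    """Number of elements that span one buckling half-wave."""
    return half_wave_length_mm / element_edge_mm


ASSUMED_HALF_WAVE_MM = 300.0  # ASSUMPTION, not given in the text

finest, coarsest = element_size_range(ASSUMED_HALF_WAVE_MM)
print(f"element size range: {finest:.0f}-{coarsest:.0f} mm")

# Largest element edges reported by the participants (mm), taken from the text.
reported_edges_mm = [12.5, 25.0, 50.0, 70.0]
counts = {edge: elements_per_half_wave(ASSUMED_HALF_WAVE_MM, edge)
          for edge in reported_edges_mm}
for edge, n in counts.items():
    print(f"{edge:5.1f} mm edge -> {n:.1f} elements per half-wave")
```

Under this assumed half-wave length, even the coarsest reported mesh (56 × 70 mm) gives more than three elements per half-wave.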

Boundary conditions
The applied boundary conditions in the FE model need to closely represent the reference experiment. All participants used the same boundary conditions as those described in Chapter 3.4 in their models. The motion and rotation of both structure ends are fixed except for axial displacement at the loading end. The vertical displacement was constrained at the side plates to simulate the 27 tie-downs installed on each side of the structure. An axial displacement (longitudinal compression) was incrementally applied on the loading end (displacement control) to the point of post collapse of the structure. It should be noted that the two initial compressive load cycles performed in the physical experiment to capture the linear response of the structure (see Fig. 7) were not included in the FEA by the participants. The physical experiment was carried out at a slow loading rate so that dynamic effects were not important and buckling was well controlled; therefore, the majority of the FEAs were carried out as static analyses.

Initial geometric imperfections
Ship and offshore structures always exhibit out-of-plane deviations from the perfect form due to welding, cutting and production. These patterns of imperfections are quite unpredictable, and detailed information is not always available. The buckling and ultimate strength capacity is sensitive to initial imperfections. In practice, the classification society standards define maximum tolerance limits for out-of-plane deviations, without any restriction on the shape of the imperfection; see Section 3.1. Nonlinear analyses typically consider the most critical imperfection patterns (shape and amplitude), resulting in a slightly conservative estimate of the ultimate capacity.
Phase 1 of the benchmark study started without any information regarding geometric imperfections. Most of the participants, however, modeled initial imperfections in their FE models using different assumptions and procedures. In accordance with the recommendations given by classification societies or in the literature, most of the FEAs were performed with initial deflection shapes based on eigenmodes associated with the applied loads. The amplitude of the imperfections used in the analyses varies considerably between participants. The initial geometrical imperfections were applied on the plates, the longitudinal stiffeners and the longitudinal girder. No initial imperfections were applied to the transverse frames. The maximum imperfection amplitudes on the different structural members are presented in Table 4.
From Table 4, it can be noted that the plate amplitude assumed by the participants varies from zero to D_s/87, but more than 2/3 of the participants used approximately D_s/200 (i.e., 3 mm). For the stiffeners and girders, the amplitude varies from 0 to 0.05 × D_f. Half of the participants adopted a value of approximately 0.001 × D_f (i.e., 1.8 mm) or 0.0015 × D_f (i.e., 2.7 mm). It is noted that scaling one eigenmode to a desired level for the plate or the stiffener/girder may have yielded an unintentionally small or large amplitude for the other component, e.g. for ID 4 and 12. It is also noted that the assumed imperfection levels are generally significantly smaller than the actual values measured; see Table 2 in Section 3.1.
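The amplitude levels quoted from Table 4 can be checked with a few lines of arithmetic. In the sketch below, the reference dimensions are not taken from the structural drawings; they are back-calculated from the equivalences stated in the text (D_s/200 = 3 mm and 0.001 × D_f = 1.8 mm) and should be read as assumptions. The function names are hypothetical.

```python
# Reference dimensions inferred from the text, not from drawings:
D_S_MM = 600.0    # from D_s / 200 = 3 mm
D_F_MM = 1800.0   # from 0.001 x D_f = 1.8 mm


def plate_amplitude_mm(divisor):
    """Plate imperfection amplitude expressed as D_s / divisor."""
    return D_S_MM / divisor


def stiffener_amplitude_mm(factor):
    """Stiffener/girder imperfection amplitude expressed as factor x D_f."""
    return factor * D_F_MM


print(f"D_s/200     = {plate_amplitude_mm(200):.1f} mm")
print(f"D_s/87      = {plate_amplitude_mm(87):.1f} mm")
print(f"0.0015 D_f  = {stiffener_amplitude_mm(0.0015):.1f} mm")
print(f"0.05 D_f    = {stiffener_amplitude_mm(0.05):.1f} mm")
```

Under these assumed dimensions, the upper bounds of the ranges are several times larger than the most commonly used values, which illustrates how widely the participants' assumptions differed.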
In Phase 2-1, the initial geometrical imperfections of the structure obtained by laser scanning were provided to all the participants. The measurement data were included in the FE model by all the participants in Phases 2 and 3 and are shown in Figs. 3 and 4. An example of the measured geometric imperfections mapped on an FE model, plate and stiffeners, is presented in Fig. 11.

Welding-induced residual stresses
Typically, in ship and offshore structures, the welding-induced residual stresses are either ignored or implicitly assumed when assessing the ultimate strength capacity. This topic was not addressed at the outset of the benchmark study, and all FEAs were free of welding-induced residual stresses. After completion of Phase 3-2, however, the mismatch of the slope of the force-displacement between the FEA and the reference experiment was relatively large. Therefore, it was decided to investigate the effect of welding-induced residual stresses in Phase 3-3. These additional FEAs were essential to validate the experimental load-displacement curve from the NSWCCD reference experiment. Welding-induced residual stresses were included by the participants by using one of the following two approaches; see Chapter 6.3 for more details:
• assuming a uniform tensile region in the heat-affected zone (HAZ) of all the fillet welds and a matching opposing compression zone in the remainder of the plate.
• pre-stressing the structure through a thermal analysis in the HAZ and using the results as an initial condition for the collapse analysis.
Table 4 Summary of the amplitude of the geometric imperfections applied in Phase 1 and the guidance/approach used in the model. Harmonic imperfections according to [26].
a This is a value estimated by the majority of the participants from the provided data, but it is unrealistically high. The majority used this value, but some participants reduced it to 210 GPa. A comparison of the results between participants showed that either choice of this value for the side plate did not influence the results from this benchmark study.
The magnitude of the tensile stress levels applied on the HAZ varies considerably between participants and ranges between 20% and 100% of the yield stress. The presence of welding-induced transverse residual stresses acting in the direction perpendicular to stiffeners or girder was neglected in this study. An example of a welding residual stress distribution obtained by a sequential thermal-mechanical FE analysis is presented in Fig. 12.
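The first approach, a uniform tensile block balanced by compression in the remainder of the plate, can be illustrated with a simple force balance. The sketch below assumes uniform plate thickness (so that thickness cancels), a nominal yield stress of 250 MPa for ASTM A36, and purely illustrative widths; none of these widths are taken from Table 8, and the function name is hypothetical.

```python
def balancing_compression(sigma_t, tensile_width, plate_width):
    """Uniform compressive stress that balances a tensile block.

    Force balance per unit thickness:
        sigma_t * tensile_width = sigma_c * (plate_width - tensile_width)
    Widths in mm, stresses in MPa; returns sigma_c as a positive magnitude.
    """
    if not 0.0 < tensile_width < plate_width:
        raise ValueError("tensile_width must lie between 0 and plate_width")
    return sigma_t * tensile_width / (plate_width - tensile_width)


SIGMA_Y = 250.0       # MPa, nominal yield stress of ASTM A36
PLATE_WIDTH = 600.0   # mm, assumed plate width between stiffeners
TENSILE_WIDTH = 60.0  # mm, assumed total width of the tensile zones

# Span the 20%-100% range of tensile stress levels reported in the text.
for fraction in (0.2, 0.6, 1.0):
    sigma_t = fraction * SIGMA_Y
    sigma_c = balancing_compression(sigma_t, TENSILE_WIDTH, PLATE_WIDTH)
    print(f"tension {sigma_t:6.1f} MPa -> compression {sigma_c:5.1f} MPa")
```

The balancing compression scales linearly with the assumed tensile level, so the spread in tensile stress reported by the participants translates directly into a spread in the initial compressive pre-stress of the plating.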

Constitutive material models
The material data made available to the participants in the different phases is presented in Table 5. "Nominal material" refers to material data as specified in ASTM A36 [18], "Actual material" refers to the material specification provided by the supplier of the material of the reference structure, and "Measured material" refers to tensile tests carried out as described in Chapter 3.2. "Measured material" data included full stress-strain curves for each structural member.
The material was represented by different constitutive material models for each phase. These ranged from elastic-perfectly plastic models to bilinear stress-strain curves with tangential modulus E_T and multilinear stress-strain curves. Table 6 presents a summary of the constitutive material models used by each participant in each phase; see Ref. [17] for more details.
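To make the difference between the three model families concrete, the following minimal sketch evaluates the uniaxial stress under monotonic tension for each of them. All numerical values (E = 200 GPa, yield stress 250 MPa, E_T = 1000 MPa and the multilinear points) are illustrative assumptions, not the values used by any participant, and the function names are hypothetical.

```python
from bisect import bisect_right


def stress_bilinear(strain, E, sigma_y, E_t=0.0):
    """Monotonic uniaxial stress for strain >= 0.

    E_t = 0 gives the elastic-perfectly plastic (EP) model;
    E_t > 0 gives the bilinear (BL) model with tangential modulus E_t.
    """
    eps_y = sigma_y / E
    if strain <= eps_y:
        return E * strain
    return sigma_y + E_t * (strain - eps_y)


def stress_multilinear(strain, points):
    """Multilinear (ML) model: linear interpolation through (strain, stress)
    points sorted by strictly increasing strain; clamped outside the range."""
    strains = [p[0] for p in points]
    stresses = [p[1] for p in points]
    if strain <= strains[0]:
        return stresses[0]
    if strain >= strains[-1]:
        return stresses[-1]
    i = bisect_right(strains, strain)
    e0, e1 = strains[i - 1], strains[i]
    s0, s1 = stresses[i - 1], stresses[i]
    return s0 + (s1 - s0) * (strain - e0) / (e1 - e0)


E, SIGMA_Y, E_T = 200000.0, 250.0, 1000.0  # MPa, assumed values
# Assumed ML curve: elastic branch, yield plateau, then linear hardening.
ML_POINTS = [(0.0, 0.0), (0.00125, 250.0), (0.02, 250.0), (0.15, 400.0)]

for eps in (0.001, 0.01, 0.05):
    ep = stress_bilinear(eps, E, SIGMA_Y)
    bl = stress_bilinear(eps, E, SIGMA_Y, E_T)
    ml = stress_multilinear(eps, ML_POINTS)
    print(f"strain {eps:.3f}: EP {ep:6.1f}  BL {bl:6.1f}  ML {ml:6.1f} MPa")
```

The three models coincide in the elastic range and diverge only after yield, which is where the choice of post-yield stiffness or yield plateau enters the collapse analysis.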
Measured material data provided to participants during Phase 3-2 consisted of raw engineering stress-strain curves. FE software packages typically require the true stress-strain curve rather than the engineering stress-strain curve; this conversion was left to the participants and is a potential source of analyst error. An example of the measured engineering stress-strain curve for the structure plate is shown in Fig. 13 along with the corresponding true stress-strain curve. The true stress-strain curve was calculated based on engineering stress-strain data. An additional source of differences in material modeling is the manner in which participants accounted for either a yield plateau (Plate A, stiffeners, girder) or lack of a clearly defined yield point (Plate B). An example for the yield plateau of longitudinal B (girder web) is presented in Fig. 13, illustrating the need for simplifying assumptions regarding details of the yield plateau region. For the materials that showed a yield plateau, almost all participants consistently used the stress at which 0.2% plastic strain occurs as the plateau yield stress. In addition, as the provided stress-strain curves contained high-density data, most participants simplified the curve in their FE models with a multilinear stress-strain curve (see Table 6). Since the physical experiment was performed on the reference structure at low speed, strain-rate effects were considered negligible, i.e., they were disregarded in the analyses.
Table 6 Summary of the participants' choice of constitutive material in each phase: EP = elastic-perfectly plastic; BL = bilinear stress-strain curve with tangential modulus E_T; ML = multilinear stress-strain curve (n.a. = not available).
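The conversion from engineering to true stress-strain data can be sketched with the standard relations for uniform deformation, which are valid only up to the onset of necking. The paper does not state how each participant performed this step, so the code below is a generic illustration; the function names and sample data are hypothetical. Some FE packages additionally expect the plastic part of the true strain, and the form a given package expects should be checked in its documentation.

```python
import math


def engineering_to_true(eng_strain, eng_stress):
    """Convert one engineering (strain, stress) pair to true values.

    true strain = ln(1 + eng_strain)
    true stress = eng_stress * (1 + eng_strain)
    Valid for uniform deformation, i.e. before necking starts.
    """
    true_strain = math.log1p(eng_strain)
    true_stress = eng_stress * (1.0 + eng_strain)
    return true_strain, true_stress


def true_plastic_strain(true_strain, true_stress, E):
    """Plastic part of the true strain: total minus elastic (never negative)."""
    return max(true_strain - true_stress / E, 0.0)


E = 200000.0  # MPa, assumed elastic modulus
# Hypothetical engineering data points (strain [-], stress [MPa]).
eng_curve = [(0.00125, 250.0), (0.02, 255.0), (0.10, 400.0)]

for eps_e, sig_e in eng_curve:
    eps_t, sig_t = engineering_to_true(eps_e, sig_e)
    eps_p = true_plastic_strain(eps_t, sig_t, E)
    print(f"eng ({eps_e:.5f}, {sig_e:.1f}) -> true ({eps_t:.5f}, {sig_t:.1f}), "
          f"plastic strain {eps_p:.5f}")
```

At small strains the two descriptions are nearly identical, so an omitted conversion mainly affects the hardening range of the curve.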

Solver and solution control
The FE software programs used by the participants to perform the ultimate strength assessment according to the benchmark study include Abaqus [27], ADINA [28], Ansys [29], LS-DYNA [30] and MSC Marc [31]; see Table 3. All the details of these commercial programs can be found in the corresponding references.
The tests were conducted at a low speed; therefore, the dynamic effects in the FEA should be ignored. The most common FE method used by the participants was a nonlinear implicit solver. Fourteen participants used an implicit equation solver for their FE analyses (see Table 3). For moderately nonlinear problems, e.g., progressive collapse analysis, the implicit solver becomes unstable (convergence issues). To overcome these issues, the participants used the modified Riks method, other numerical damping controls in their FE analyses, or inclusion of inertial effects. In contrast, explicit equation solvers do not require matrix inversion or iteration (no convergence problems), and the computational effectiveness is higher than that of the implicit equation solver. However, to obtain good results, the explicit equation solver requires special attention and a great amount of experience on the part of the user. Three participants used an explicit FE solver as implemented in the commercial programs Abaqus and LS-DYNA.

Results and discussion
The results of the FE simulations of the reference experiment are presented, compared and discussed in three chapters. Chapter 6.1 presents the force-displacement curves, failure modes and locations for all the phases. In Chapter 6.2, a statistical analysis of the results is presented together with an assessment and discussion of which factors and uncertainties contribute the most to the deviation between each phase's result and the experimental result. Chapter 6.3 presents the concluding remarks on the outcome of the benchmark study in relation to class rules, guidelines and recommended practices for ultimate limit state analyses of stiffened plate structures subjected to uniaxial compressive loads.

Force versus displacement curve, failure mode and location of failure
Fig. 14 presents the force-displacement curves for each phase together with the bar diagrams of the ultimate capacity. Overall, the numerical (FEA) predictions show relatively low scatter between the participants up to the ultimate load for each of the phases; note that analysis of the post-buckling behavior and residual strength were not incorporated in the current study. There are some differences between the participants' models and results, which are discussed in more detail in Chapters 6.2 and 6.3.
The Phase 1 ultimate capacity predictions using nominal material values and geometry uniformly underestimated values from the reference experiment. This was an expected outcome as use of nominal material properties introduces conservatism relative to the actual tensile and ultimate strength properties of the material. This underestimation is satisfactory or even desirable in the case of an initial design and for maintaining a margin of safety from the perspective of class rules and guidelines.
Analysis of the results of successive phases shows that the largest change in the ultimate capacity is between Phase 2-2 and Phase 2-3. The difference between these phases is due to the introduction of the "actual material" properties instead of the "nominal material" properties. The FE models used in all phases except Phase 3-3 could not fully capture the nonlinear behavior seen in the test data curve even when the peak predicted strength was close to the experimental value. By introducing welding-induced residual stresses in the FE models in Phase 3-3, a better match between the FEA and test data curves is obtained. Nevertheless, all predictions underestimated the displacement and slightly underestimated the ultimate capacity compared to the test result. This indicates that the compliance and strength of the reference structure were greater than in the submitted predictions.
The results in Fig. 14 show that the force-displacement curves predicted by the participants in Phase 1 were generally softer than the test curve, but they were closer to the test curve in Phases 2-1 to 3-1. The access to true (measured) stress-strain data (Phase 3-2) and inclusion of welding-induced residual stresses (Phase 3-3) in the FE models gave less agreement with the test curve. In all phases one or two participants had a substantially stiffer force-displacement curve than the majority of the participants. It is also interesting to observe that the participants actually predicted a relatively good match of the stiffness behavior before Phase 3-2, when the true (measured) stress-strain data became available.
The failure modes at the ultimate strength and corresponding locations predicted in each phase are presented in Fig. 15; see Fig. 16 for two examples from Phases 2-3 and 3-3 from ID2 of the equivalent plastic strain distribution in Section 4 directly after the maximum load capacity is reached. The change in the failure modes and corresponding locations for the phases and for each participant are shown, and there is no consensus between them. Fig. 15 shows that although the participants' analyses performed very well on average and had low scatter in the predicted ultimate capacity, the participants' assumptions, modeling approaches and procedures affected the prediction of the failure mode and its location to a larger extent. According to Chapter 4, the primary region of failure of the reference structure was in Section 2 and was governed by tripping of the girder, though post-buckling response is also observed in Section 3. Note that none of the participants predicted the failure mode and location from the reference experiment; see Chapter 6.3 for a discussion.

Discussion on the FE models, modeling procedures and access to data
A statistical analysis of the FEA results is presented in Table 7 in terms of the mean value and the standard deviation of the ultimate capacity; see Fig. 17 for the bar diagram. The results show that the mean values for Phases 1 to 2-2 are relatively far from the test results, while the results from Phases 2-3 to 3-3 show good agreement with the test result. The magnitudes of the standard deviations are generally low. It can be seen that for Phases 2-3 to 3-3, the test result is within one standard deviation of the mean for Phases 2-3, 3-1 and 3-2 and just outside it for Phase 3-3. It was concluded that, despite the differences in the assumptions, modeling approaches and procedures, the participants of the benchmark study were able to satisfactorily predict the ultimate capacity of the reference structure. The following subchapters present more detailed discussions on the results related to the participants' assumptions, modeling approaches and procedures.

FE models and modeling procedures (Phase 1 and 2-1)
The FE models and results from Phase 1 served as a basis for the participants' basic assumptions, modeling approaches and procedures. All participants used the same boundary conditions and nominal material data in the FE model, but the choice of the constitutive material model differed (see Table 6). According to Chapter 5.1, the participants treated the scallop holes and collar plates in the longitudinal girder differently in the FE models; the scallop holes were either included in the FE model or ignored. Other differences between the FE models are summarized in Table 3 with regard to the FE solver and solution control, type of element used, number of integration points in the element plane and thickness, mesh size and total degrees of freedom in the model.
A detailed analysis and comparison of all the participants' results from Phase 1 did not identify any source in Table 3, or any correlation between these sources, that could explain why a given prediction of the ultimate capacity was too low or too high compared with the mean value from all the participants' results. An analysis of the results shows that the different FE model definitions, with or without modeling the scallop holes and collar plates in the longitudinal girder, were not the reason for the different failure modes and locations.
The geometric imperfections that were assumed and modeled in Phase 1 are presented in Table 4. By comparing Tables 3 and 4, IDs with similar definitions in Table 3 but with different geometric imperfections in Table 4 were identified, such as IDs 1, 4, 5 and 16. Since the results from these IDs show different ultimate capacities, failure modes and locations, it was expected that the difference in how the geometric imperfections were modeled was the major factor behind the differences in the results. However, the results from Phase 2-1, where all the participants used the same measured imperfections, ruled out geometric imperfections as the most important factor; the high plate slenderness of the reference structure suppressed the influence of the geometric imperfections. This conclusion is further strengthened by the observation that IDs 1, 3, 7 and 9 used the same imperfection levels, but got very different results. The results for the majority of the participants (except for IDs 3, 6, 7 and 12-14) did not change significantly compared to the Phase 1 results. It is noted that ID 12, who used the largest stiffener/girder imperfection in Phase 1, got a substantial increase of the ultimate capacity in Phase 2-1, while ID 4, who had the second-largest stiffener/girder imperfection (same plate imperfection), did not experience a similar increase. The statistics in Table 7 also show that the mean value of the ultimate capacity and the standard deviation for Phase 2-1 are just slightly lower than those for Phase 1. However, the failure mode and location for some of the IDs changed between Phases 1 and 2-1, where the majority of them were predicted to occur in Section 4 in the reference structure.
A comparison and analysis of the various constitutive material models (which included how the participants interpreted the material properties) used by the participants were performed. The choice of the material model for the nominal material in Phases 1 and 2-1 had a very large influence on the results. The trend is that participants who used a multilinear (ML) material model presented a lower ultimate capacity than the participants who used a bilinear (BL) or elastic-perfectly plastic (EP) material model. The difference between ML and BL results indicates that the finite post-yield stiffness may be important compared to models using a yield plateau. This affirms that despite relatively clear rules and guidelines from various references, the variety of constitutive material models used in Phases 1 and 2-1 together with how the participants handled the provided material properties in those models had the largest influence on the scatter in the results.

Access to material data, thickness measurements and measured distortions
The importance of having access to representative material data for the material in the structure subject to analysis is clearly seen by the FEA results from Phases 1 (nominal material data), 2-3 (actual material data) and 3-2 (measured stress-strain curves). In this benchmark study, the use of nominal data gave a misleading ultimate capacity for the reference structure compared to the test data, while the use of actual material data gave a much better prediction. The access to the stress-strain curves of the material did not affect the ultimate capacity level (see Table 7). With regard to the failure mode and location, Fig. 15 shows that they do not change to a large extent between Phases 2-2 and 3-2, and the changes that are observed are related to the other factors introduced in Phases 2-3 and 3-1, i.e. the measured thickness and distortion.
It should be highlighted that access to representative material data is important, as is how these data are used to represent the material in the FE model with a constitutive material model; see the discussion in Chapter 6.1. The choice of constitutive material model was identified as the model uncertainty that had the largest influence on the predicted ultimate capacity levels and their standard deviation. It also remains unclear what influence assumptions regarding the presence of a yield plateau have on FEA results. For some of the material coupons, the 0.2% yield strength is nearly 8% below the peak stress prior to a yield plateau.
One of the differences between Phases 2-3 and 3-1 is that measured thicknesses and distortions were added to the FE models in Phase 3-1. The possibility of a more accurate representation of these factors in the FE models did not result in a major change in the ultimate capacity (see Table 7). Hence, for the reference structure in this benchmark study, the measured thicknesses and distortions (stiffener tilt angle) had a low influence on the overall results and the model uncertainty.
Table 4 presents the guideline/approach the participants used to model and include geometrical imperfections in their FE models for Phase 1. In the Phase 2-1 FE models, the only change compared with Phase 1 is that all the participants used the measured geometrical imperfections. Fig. 18 presents the relative change in ultimate capacity for Phase 2-1, with Phase 1 as the reference. For the majority of the participants, the change was minor. This was discussed in Chapter 6.1 and presented in Table 7 and Fig. 17, in which the ultimate capacity level and its standard deviation were slightly reduced; there were some changes in the predicted failure modes and locations (see Fig. 15). Overall, this indicates that if the goal of the FEA is to predict the ultimate capacity level, any of the guidelines/approaches presented in Table 4 can be used. However, the present study cannot provide a clear conclusion as to which of the guidelines/approaches presented in Table 4 should be used with regard to imperfection levels. With regard to the failure mode and location, the results from this benchmark study cannot be used to make any recommendations or draw conclusions based on the FEA and test results; see Chapter 6 for a discussion.

Geometrical imperfections and welding-induced residual stresses
The force-displacement curves from the Phase 3-2 FEA did not capture the nonlinear characteristics from the test data curve, i.e. the larger displacement at ultimate strength and more gradual change in the stiffness (see Fig. 14). In Phase 3-3, the majority of the participants included welding-induced residual stresses in their FE models (see Chapter 5.5). Table 8 presents a summary of the residual stress level (longitudinal direction of the weld along longitudinals A to D) in the heat-affected zone (HAZ) and its width.
The results from the Phase 3-3 FEA show that the introduction of residual stresses in the FE models gives better agreement with the nonlinear test data force-displacement curve. Fig. 19 presents the relative change in ultimate capacity for Phase 3-3, with Phase 3-2 as the reference. Note that the ultimate capacity is reduced on average by 2.2% compared to the Phase 3-2 results. Fig. 15 shows that the majority of the participants predicted a different failure mode and location compared with Phase 3-2. Hence, it appears to be beneficial and it is recommended to introduce welding-induced residual stresses in an FE model according to a guideline or recommended practice even if their absolute levels or distributions have not been quantified by measurements.
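The relative change reported between two phases can be computed per participant and then averaged, as sketched below. The capacities are hypothetical placeholders, not values from Table 7 or Fig. 19, and it is an assumption here that the reported average is the mean of the per-participant relative changes rather than the relative change of the phase means.

```python
def relative_change_percent(reference, value):
    """Relative change of value with respect to reference, in percent."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return 100.0 * (value - reference) / reference


# Hypothetical ultimate capacities (arbitrary force unit) keyed by ID.
phase_3_2 = {"A": 10.0, "B": 9.5, "C": 10.4}
phase_3_3 = {"A": 9.8, "B": 9.3, "C": 10.1}

changes = {
    pid: relative_change_percent(phase_3_2[pid], phase_3_3[pid])
    for pid in phase_3_2
}
mean_change = sum(changes.values()) / len(changes)

for pid, change in changes.items():
    print(f"ID {pid}: {change:+.1f}%")
print(f"mean relative change: {mean_change:+.1f}%")
```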

Concluding remarks
The present benchmark examined the ultimate limit state analysis of a realistic stiffened plate structure. Seventeen separate academic, industry, and government lab groups worldwide submitted FEA results in three systematic phases, all conducted blind to the experimental data. The various phases allowed the study organizers to isolate and discuss which data, modeling approaches and procedures have the greatest influence on the uncertainties in the prediction of the ultimate capacity level, failure mode and failure location of the reference structure. Overall, the objective of the study has been accomplished, and the statistics presented in Table 7 show that the results of Phases 3-2 and 3-3 are, on average, very close to the results from the test. If the level of accepted uncertainty of the ultimate capacity level had been defined as 5% prior to the start of the benchmark study, the majority of the participants' FE models for Phase 3-2 would be considered validated FE models (ID 2, 3, 5, 7, 8, 9, 10, 11, 12, 13, 15, 16, and 17). However, none of the FE models managed to predict the failure mode and location that occurred in the physical test. This is not surprising given the concurrent nature of failure in two adjacent sections; see Chapter 4. The reference structure used in this benchmark study is an unconventional structure compared to similar types of structures that have been used in former ISSC benchmark studies; see Chapter 1. As there is only a single reference experiment, it is not possible to quantify the uncertainties related to the experimental setup and the measurements carried out during the test.
There was good agreement between the FEA results and test results with regard to the ultimate capacity level, but the expected failure mode and location were not well predicted by the FEA. The prediction from ID 14 did identify failure that occurred in the adjacent section during Phase 3-3 despite the use of thicker plating in this region. Based on the documentation from the test of the reference structure and the analyses of all the results from the FEAs, it is difficult to conclude why the majority of the FE models could not predict the correct failure mode and location. Possible reasons for this outcome include the inaccurate modeling of residual stresses, solver-specific and material modeling issues regarding the transition from elastic to inelastic behavior (e.g., see Ref. [32]), and the assumption that the test fixture provides perfectly rigid boundaries. In addition, there are measurement uncertainties related to material properties, imperfection levels etc. It is also likely that several failure modes are closely located and small variations in modeling assumptions may trigger different modes. The unsuccessful tracing of the force-displacement relationship at ultimate strength and in the post-collapse region for all participants is a strong indication that there are certain aspects of the tests that have not been sufficiently well captured. To address the statistical aspects of the benchmark, at least two more tests on a similar structure are needed to understand what cannot be captured by the majority of the FEA. Due to practical limitations, these tests could not be conducted in this research.
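The kind of assessment described above (mean, standard deviation, and a 5% acceptance band around the test result) can be sketched as follows. The predictions and the test value are hypothetical placeholders and do not reproduce Table 7; the use of the sample standard deviation is also an assumption, since the paper does not state which definition was used. The function name is hypothetical.

```python
from statistics import mean, stdev


def summarize(predictions, test_value, tolerance=0.05):
    """Summarize predicted ultimate capacities against a test result.

    predictions: dict mapping participant ID -> predicted capacity.
    Returns (mean, sample std, test within one std of mean?, IDs within
    +/- tolerance of the test value).
    """
    values = list(predictions.values())
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumed definition)
    within_one_sd = abs(test_value - mu) <= sd
    validated = sorted(
        pid for pid, value in predictions.items()
        if abs(value - test_value) <= tolerance * test_value
    )
    return mu, sd, within_one_sd, validated


# Hypothetical capacities (arbitrary force unit) and test result.
predictions = {1: 9.2, 2: 10.1, 3: 9.8, 4: 11.0}
test_value = 10.0

mu, sd, within_one_sd, validated = summarize(predictions, test_value)
print(f"mean = {mu:.3f}, std = {sd:.3f}")
print(f"test within one std of mean: {within_one_sd}")
print(f"IDs within 5% of test: {validated}")
```

An acceptance band of this kind only addresses the ultimate capacity level; it says nothing about whether the predicted failure mode and location agree with the experiment.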

Conclusions
This paper presented a benchmark study on the ultimate limit state analysis of a stiffened plate structure subjected to uniaxial compressive loads. It was carried out as a blind study and divided into three phases, where a comparison of the participants' ability, expertise and recommendations on how to perform FEAs of a reference experiment is presented. Seventeen experienced research groups within the field participated worldwide.
The summary of the results from all the FEAs from Phase 3-2, where the FE models were based on the measured material data, geometrical imperfections, distortions, and thicknesses of the structural components, shows very good agreement with the results from the test on the reference structure with regard to the ultimate capacity level. The results of twelve of the seventeen FEAs were within 5% of the ultimate capacity level from the test. In total, the standard deviation of all the FEAs from Phase 3-2 was low. None of the FE models was able to predict the failure mode and location that occurred in the test. Hence, it was concluded that at least two more tests on the same type of structure and experimental setup are needed to conclude if this uncertainty is related to the experiment or to the FE models' ability to mimic the experiment.
All the participants followed recommended practices and guidelines in the design of their FE models. However, attention should be paid to the choice of element and mesh size related to the physical dimensions of the structure and its members, geometrical imperfections, material models and failure modes. These are fundamental issues in ultimate limit state analysis by FEA.
Access to representative material data in the initial design/early prediction of a structure's ultimate capacity was discussed. The FEA that was based on nominal material data for the reference structure largely underestimated the ultimate capacity level of the structure. The actual material data, which defined the elastic modulus, yield strength, ultimate strength and elongation, were necessary for reasonably accurate prediction of the ultimate capacity, but did not allow precise tracing of the force-deformation curve at ultimate strength and in the post-collapse region. The access to specific stress-strain curves for each material did not alter the ultimate capacity level, failure mode or location between the FEAs.
The representation of the material, in combination with the choice of constitutive material model (EP, BL or ML), differed between the participants. This variation was found to be the largest factor related to model uncertainty. It had less influence on the ultimate capacity level than on the predictions of the failure modes and locations.
Modeling of geometrical imperfections and distortions affects the ultimate capacity level, failure mode and location of failure. The modeling approaches and procedures used by the participants varied moderately. The substitution of the assumed geometrical imperfections in Phase 1 with the measured imperfections in Phase 2-1 had only a minor influence on the results from the FEAs.
Welding-induced residual stresses were introduced in the FE models in Phase 3-3. The results from the FEAs showed that the nonlinear force-displacement curve from the test data was better replicated compared to when these residual stresses were disregarded in Phase 3-2. The introduction of these residual stresses also changed the failure mode and location between the two phases. Hence, residual stresses should be included to realistically reproduce the ultimate capacity level, failure mode and location of failure in FEA.
In retrospect, further efforts are needed to define what failure of a structural member means within FEA. Unlike closed-form criteria, neither FEA nor test data readily provide a determination of when "failure" occurs. Strict objective criteria would aid greatly in the comparison of FEA predictions of the ultimate strength.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.