Sex difference in open-water swimming—The Triple Crown of Open Water Swimming 1875-2017

The aim of the present study was to compare swimming performances of successful finishers of the 'Triple Crown of Open Water Swimming' from 1875 to 2017, assessing the effects of sex, the place of event and the nationality of swimmers. Data from 535 finishers in the 'Catalina Channel Swim', 1,606 finishers in the 'English Channel Swim' and 774 finishers in the 'Manhattan Island Marathon Swim' were analysed. We performed different analyses and regression model fittings for all swimmers and annual top-5 finishers. Effects (sex, event, time, nationality) and interaction terms (event-time) were examined through a multi-variable spline mixed regression model. Considering all swimmers, we found that (i) women were approximately 0.06 km/h faster than men (p = 0.011) and (ii) Australians were 0.13 km/h faster than Americans (p = 0.004) and Americans were 0.19 km/h faster than British (p<0.001) and 0.21 km/h faster than Canadians (p = 0.015). When considering annual top-5 finishers, we found that (i) women were 0.07 km/h slower than men (p = 0.042) and (ii) Australians were not faster than Americans (p = 0.149) but Americans were 0.21 km/h faster than British (p<0.001). These findings add to the knowledge about swim performance over time in the three events, considering the effects of sex and the nationality of swimmers.


Introduction
Open-water ultra-distance swimming is of increasing popularity. The number of athletes competing in channel [1,2] and lake [3,4] crossings increased in recent decades and the performance of the athletes improved [1,4]. Especially, women reduced the gap to men considerably in long-distance swimming [3][4][5]. This decrease in sex difference in performance might be attributed to the increased participation of women in open-water swimming, which is associated with improved training and nutrition [4].
This trend was of great scientific interest, as the sex difference in performance is a major field in exercise physiology. In particular, a debate has arisen on whether women will outperform men in the future, considering their larger rate of improvement compared to men [...]
[...] swimmers and multiple crossings of a single swimmer were excluded. Swim times (h:min:s) were converted to swimming speed (km/h) to compare performance across the three distances, although the different distances might themselves affect swimming speed. In addition to swim time, event and year of competition, we collected the name and surname, sex and nationality of each swimmer. We considered all finishers and the annual five fastest finishers separately. Within one calendar year, one person could swim the same distance several times; we deleted duplicate cases in which two observations had the same swim time.
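The conversion from swim time to speed and the removal of duplicate records can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the nominal course lengths quoted in the Discussion (33.7 km, 33 km and 45.8 km), and it assumes that a duplicate is a record sharing swimmer, event, year and swim time. The function names and the sample records are hypothetical.

```python
# Nominal course lengths in km, as quoted in the Discussion of the paper.
DISTANCE_KM = {
    "English Channel Swim": 33.7,
    "Catalina Channel Swim": 33.0,
    "Manhattan Island Marathon Swim": 45.8,
}


def time_to_hours(swim_time: str) -> float:
    """Convert an 'h:min:s' string to decimal hours."""
    hours, minutes, seconds = (int(part) for part in swim_time.split(":"))
    return hours + minutes / 60 + seconds / 3600


def speed_kmh(event: str, swim_time: str) -> float:
    """Average speed over the nominal course length of the event."""
    return DISTANCE_KM[event] / time_to_hours(swim_time)


def drop_duplicates(records):
    """Keep the first of any records sharing swimmer, event, year and time.

    The choice of key is an assumption made for this sketch.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec["name"], rec["event"], rec["year"], rec["time"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical records, for illustration only.
records = [
    {"name": "Swimmer A", "event": "English Channel Swim", "year": 2010, "time": "13:28:48"},
    {"name": "Swimmer A", "event": "English Channel Swim", "year": 2010, "time": "13:28:48"},
    {"name": "Swimmer B", "event": "Manhattan Island Marathon Swim", "year": 2010, "time": "08:00:00"},
]

for rec in drop_duplicates(records):
    print(rec["name"], rec["event"], round(speed_kmh(rec["event"], rec["time"]), 2), "km/h")
```

Because speed is computed over a nominal straight-line distance, currents and tides enter the result directly, which is one reason speeds are not strictly comparable across the three events.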

Statistical analysis
All statistical analyses were performed with the statistical package R (R Core Team, 2016, Vienna, Austria; URL https://www.R-project.org/). Swimming speeds (km/h) were presented as mean (standard deviation) and categorical variables as number and percentage, N (%). Seventy non-missing nationalities were recorded and grouped into nine regions/countries: Africa, Asia, Australia (AUS), Canada, Central-South America, Europe except Great Britain (Rest of Europe), Great Britain (GBR), New Zealand (NZL) and USA. Great Britain was separated from the rest of Europe to better study the relationship between British nationality and performance in the 'English Channel Swim'. Participation in each event, by nationality and over time, was compared between sexes using the chi-square test for frequency distributions. We performed t-tests to compare the average speed between sexes in each event, then for the most prevalent countries and by period of time. In addition, effects (i.e. sex, event, time, nationality) and interactions (i.e. sex-time, event-time, event-sex, event-sex-time) were examined more rigorously through a spline regression model, using basis splines with five degrees of freedom as a function of time. Time was defined as years from the median year. Since a swimmer might finish more than one race, we fitted a mixed model with a random intercept for each unique swimmer. Different regression model specifications, with no interaction and with one-, two- and three-term interactions, were considered. Model selection was performed using both the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).
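The two selection criteria can disagree, because BIC penalises each additional parameter more heavily than AIC once the sample is moderately large. The sketch below uses the standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The log-likelihoods and parameter counts are hypothetical values chosen only to illustrate such a disagreement, and taking n as the number of observations is an assumption for this sketch.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


n_obs = 2915  # total number of finishes reported in the paper

# Hypothetical fits: (log-likelihood, number of estimated parameters).
candidates = {
    "event-time interaction": (-1500.0, 30),
    "sex-event-time interaction": (-1480.0, 48),
}

for name, (log_lik, k) in candidates.items():
    print(f"{name}: AIC = {aic(log_lik, k):.1f}, BIC = {bic(log_lik, k, n_obs):.1f}")
```

With these illustrative numbers the larger model has the lower AIC while the smaller model has the lower BIC. In that situation, preferring the smaller model trades a small loss of fit for easier interpretation.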
The selected model was specified as follows:
$$\text{Speed} \sim [\text{Fixed effects } (X) = \text{Sex} + \text{Event} : \mathrm{BS}(\text{cyear},\, df = 5) + \text{Nationality}] + [\text{Random effects of intercept} = \text{Swimmers}]$$
where BS(cyear, df = 5) denoted the basis splines with 5 degrees of freedom (df) and cyear denoted the year centered on the median. We performed two different analyses and fitted two regression models: one for all swimmers and one for the annual top five finishers. For all tests and regressions, we defined statistical significance at p ≤ 0.05.
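Written out per observation, the specification above could take the following form. This is one possible reading and not a formula taken from the paper. The indices (swimmer i, finish j, event e, basis function k), the coefficient symbols and the normality assumptions are introduced here for illustration only. The results text appears to test a separate main effect for Event; it is omitted here because the printed formula shows only the Event:BS term.

```latex
\begin{aligned}
\text{Speed}_{ij} &= \beta_0
  + \beta_{\text{sex}}\,\text{Sex}_i
  + \sum_{e}\sum_{k=1}^{5} \gamma_{e,k}\, B_k(\text{cyear}_{ij})\,
      \mathbf{1}[\text{Event}_{ij} = e]
  + \beta_{\text{nat}(i)}
  + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Under this reading, sex shifts the whole curve up or down by the constant \(\beta_{\text{sex}}\), while each event has its own time trend through the coefficients \(\gamma_{e,k}\). The random intercept \(u_i\) accounts for repeated finishes by the same swimmer.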

Participation
Between 1875 and 2017, a total of 2,915 observations from 1,875 different finishers were considered, i.e. multiple finishes per swimmer were analyzed. The average number of finishes per swimmer was approximately 1.55 (2,915/1,875), though only 454 (24%) swimmers had more than one record. The numbers of successful female and male solo finishers in the 'Catalina Channel Swim', 'English Channel Swim' and 'Manhattan Island Marathon Swim' were 535, 1,606 and 774, respectively. The number of women was 553 (29% of the total unique swimmers) with 921 finishes (32%) and the number of men was 1,322 (71%) with 1,994 (68%) finishes. The difference in sex distribution was presented in Table 1. Participation in each event and over time differed between females and males. Female participation had been always [...]
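The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
finishes_by_event = {
    "Catalina Channel Swim": 535,
    "English Channel Swim": 1606,
    "Manhattan Island Marathon Swim": 774,
}
swimmers_by_sex = {"women": 553, "men": 1322}
finishes_by_sex = {"women": 921, "men": 1994}

total_finishes = sum(finishes_by_event.values())
total_swimmers = sum(swimmers_by_sex.values())

# The per-event and per-sex totals should describe the same set of finishes.
print("finishes (by event):", total_finishes)
print("finishes (by sex):  ", sum(finishes_by_sex.values()))
print("unique swimmers:    ", total_swimmers)
print("finishes per swimmer:", round(total_finishes / total_swimmers, 2))

for sex in ("women", "men"):
    share_swimmers = 100 * swimmers_by_sex[sex] / total_swimmers
    share_finishes = 100 * finishes_by_sex[sex] / total_finishes
    print(f"{sex}: {share_swimmers:.0f}% of swimmers, {share_finishes:.0f}% of finishes")
```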

Performance considering all swimmers
The (kernel estimate) density curves of the observed swimming speeds by sex and event were presented in S1 Fig. We tested means (t-test) between sexes for each event and for the most prevalent regions/countries ( [...] Table 3 in order to better highlight the trends in the last four decades. In Table 2, model 5, with the three-term (sex-time-event) interaction, had a slightly lower AIC but a higher BIC than the selected model, with the (event-time) interaction. The selected reduced model therefore nearly matched, or by some criteria fitted better than, the full model, which would be difficult to interpret. At the same level of calendar year, nationality and event, men were approximately 0.06 km/h slower than women (estimate = -0.06478, p = 0.011) (Table 2, Fig 1).
Therefore, we observed an effect of sex in the intercept of the model. Event was not statistically significant alone but in interaction with time (the last interaction term, 'Manhattan Island Marathon Swim': BS (Year) 5, had p = 0.023). Accordingly, an effect of event in the slope of the model was shown. As presented in Fig 1, the trend of performance over time was [...]
The range (min, max) of yearly estimated values of swimming speed by sex, for each period from 1980 onward and for each event, was reported in Table 3. That is, the minimum and maximum of the 10 fitted yearly values (7 for the last period) were presented for each period, event and sex. Because the interaction terms (sex-event) and (sex-time) were not included in the model, the range of the estimated difference between sexes (men range - women range) was constant (-0.06, -0.06) across event and time. In the 'English Channel Swim', the differences between the maximum and minimum of the estimated values of each sex were the smallest, and both the minimum and maximum decreased over time. In the 'Catalina Channel Swim', the minimum and maximum increased until 2000 and then decreased. In contrast, the differences between the maximum and minimum of each sex decreased until 2000 and then increased. In the 'Manhattan Island Marathon Swim', the minimum and maximum increased during the first period and then decreased, and the maximum increased again in the last period. The differences between the minimum and maximum, instead, first decreased and then increased. In particular, the range of estimated male performance in period [1990,2000] was equal to the range of estimated female performance one decade earlier, in period [1980,1990].
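The constant difference between sexes follows from the model structure: with no sex-event or sex-time interaction, the fitted curve for men is the fitted curve for women shifted by a fixed amount. The sketch below illustrates this with a hypothetical trend. The function `fitted_women`, the half-open period boundaries and the rounded offset of -0.06 km/h are assumptions for illustration, not the fitted model.

```python
SEX_OFFSET_MEN = -0.06  # men minus women, all-swimmers model (rounded estimate)

# Assumed half-open periods: three of 10 years and a last one of 7 years.
PERIODS = [(1980, 1990), (1990, 2000), (2000, 2010), (2010, 2017)]


def fitted_women(year: int) -> float:
    """Hypothetical smooth trend in km/h; NOT the fitted spline from the paper."""
    return 2.5 + 0.01 * (year - 1980) - 0.0003 * (year - 1980) ** 2


def fitted_men(year: int) -> float:
    """Same trend shifted by the constant sex effect."""
    return fitted_women(year) + SEX_OFFSET_MEN


def period_range(fitted, start: int, end: int):
    """Minimum and maximum of the yearly fitted values in [start, end)."""
    values = [fitted(year) for year in range(start, end)]
    return min(values), max(values)


for start, end in PERIODS:
    w_min, w_max = period_range(fitted_women, start, end)
    m_min, m_max = period_range(fitted_men, start, end)
    print(
        f"[{start},{end}): women ({w_min:.2f}, {w_max:.2f}), "
        f"men ({m_min:.2f}, {m_max:.2f}), "
        f"difference ({m_min - w_min:.2f}, {m_max - w_max:.2f})"
    )
```

Whatever shape the trend takes, the difference of minima and the difference of maxima both equal the sex offset, which is why the same pair of values appears for every event and period.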
To compare estimated values (Table 3) with observed values, the comparison of average swimming speed between sexes by event and period of time was reported (S1 Table). P-values of the t-tests were presented in order to refine the descriptive part of the analysis and to provide an overview of the interaction effects that were not considered in our statistical model. Moreover, details of average performance before 1980 were also provided. In particular, in the 'Catalina Channel Swim', no differences between females and males over time were found. In the 'Manhattan Island Marathon Swim', females were slower than males only in period [1875,1960). Average performance, from 1980 onward, for both sexes, after increasing over [...]

Performance considering the annual top five swimmers
The total number of observations for annual top five swimmers was 1,150 (741 swimmers). The number of women was 299 (40%) with 506 observations (44% of the total observations) and the number of men was 442 (60%) with 644 observations (56% of the total observations). The (kernel estimate) density curves of the observed swimming speeds by sex and event were plotted in S1 Fig. Considering the annual top five swimmers, the distribution for women, in particular with regard to its skewness, was more similar to the distribution for men than it was when all swimmers were considered. Participation and average swimming speeds for each event and for the most prevalent nationalities (USA, AUS, GBR, and rest of Europe) were presented in Table 4. Participation by sex did not change with event (p = 0.286) or with time (p = 0.071), but changed with nationality (p<0.001). No differences in average swimming speed between females and males were found in any event, either overall or for the most prevalent nationalities.
Predicted values (lines) and observed swimming speeds (points) over time, from 1980 onward, by sex, event and nationality were plotted in Fig 2. Predicted values were computed according to the spline regression model whose details, including model selection criteria, were provided in Table 2. In the model, all observations from 1875 were considered, [...] Table 3 in order to better highlight trends in the last four decades. In Table 2, it was shown that the selected model, with the (event-time) interaction, was the best trade-off between the full complex model, with the lowest AIC, and a more parsimonious model with a lower BIC. Men were approximately 0.07 km/h faster than women (estimate = 0.07157, p = 0.042) at the same level of calendar year, nationality and event (Fig 2). Therefore, an effect of sex in the intercept of the model was shown. Event was statistically significant alone and in interaction with time. Therefore, effects of event in both the slope and the intercept of the model were found. The trend over time was increasing overall, but not monotonically, for all three events (Fig 2). Regarding nationality, Australians were not faster than Americans (p = 0.149), but swimmers from Great Britain were approximately 0.21 km/h slower than Americans (estimate = -0.20791, p<0.001).
Table 3. Estimated values of swimming speed by event, sex and year.
The range (min, max) of yearly estimated values of swimming speed by sex for each time period from 1980 onward was shown in Table 3. That is, for each period, event and sex, the minimum and maximum of the 10 fitted yearly values (7 for the last period) were reported. Because the interaction terms (sex-event) and (sex-time) were not included in our model, the range of the estimated difference between sexes (men range - women range) was constant (0.07, 0.07) across event and time. In the 'English Channel Swim', the differences between the maximum and minimum of the estimated values of both sexes were the smallest, and the minimum and maximum increased until 2010 and then decreased. In the 'Catalina Channel Swim', instead, the minimum and maximum decreased until 2000 and then increased. Furthermore, the differences between minimum and maximum increased until 2010 and then increased. In the 'Manhattan Island Marathon Swim', the minimum and maximum increased during the first period, then decreased, and increased again in the last period. The differences between minimum and maximum first decreased and then increased.

Discussion
The main findings of the present study were that (i) the participation of women and men varied by nationality and, for all swimmers, also by event, (ii) the nationality of finishers varied by event, (iii) women were faster than men when considering all swimmers, whereas men were faster than women when considering the annual top five, (iv) swimming speed was the fastest in the 'Manhattan Island Marathon Swim' and, for all swimmers, the slowest in the 'English Channel Swim', and (v) Australians were faster than Americans, who in turn were faster than British and Canadians (all swimmers, mixed model).

The participation of women and men varied by nationality
A first important finding was that female participation varied by nationality. Australian swimmers have a different approach to open-water ultra-distance swimming events than female swimmers from the United States of America and Great Britain. In pool-swimming at world class level, swimmers from Australia were more consistent than those from the United States and other nations when the relationship between world-ranking and performance at the Olympic Games was investigated [12].
In open-water swimming events, the rates of participation of women and men vary by race distance and event. In these solo swims, some of which have a very long history, the participation of women and men varied by nationality. However, no dominance of a particular nationality across all race distances was observed in the FINA (Fédération Internationale de Natation) races over 5 km, 10 km and 25 km held between 2000 and 2016 [13]. In these races, women and men compete together in fields that are sometimes very large and, interestingly, men were always faster than women although the possibility of drafting exists. In solo swims like the open-water swimming events of the 'Triple Crown of Open Water Swimming', drafting is not possible. A potential explanation could be that swimmers competing in FINA World Cup races are elite swimmers who also compete at World Championships and Olympic Games, whereas swimmers competing in the events of the 'Triple Crown of Open Water Swimming' are mostly recreational athletes. The motivation of swimmers competing in the FINA races also seems different, since these races offer prize money [https://swimswam.com/prize-money-2017-fina-worldchampionships-remains-unchanged/] whereas no prize money can be earned in the events of the 'Triple Crown of Open Water Swimming'.

The nationality of finishers varied by event
Considering the variation of the nationality of finishers by event, more US-American swimmers competed in the 'Catalina Channel Swim' and in the 'Manhattan Island Marathon Swim' and more British swimmers in the 'English Channel Swim'. This is most likely because travelling to Europe is more costly for US-Americans than travelling within their own country. Similarly, it would be more affordable for British swimmers to travel to the English Channel than to fly across the Atlantic to the United States of America.
In the FINA races over 5 km, 10 km and 25 km held worldwide between 2000 and 2016, swimmers preferred races held on the continent where they lived. Europeans accounted for most finishers in races held in Europe, and Americans for most finishers in races held in America. Also, relatively more Asians finished in races held in Asia than on the other continents. Nationality played a role not only in performance and participation, but also in the prevalence of non-finishers in the 10 km and 25 km races [13].

Sex differences in swimming speed
Regarding the summary statistics, women were faster than men in the 'English Channel Swim' and in the 'Manhattan Island Marathon Swim' when all swimmers were considered. This could also be influenced by the difference in participation by sex. In fact, the overall men-to-women ratio was the lowest (1.76) in the 'Catalina Channel Swim', compared with 2.23 in the 'English Channel Swim' and 2.37 in the 'Manhattan Island Marathon Swim'. This could be explained by more casual women swimmers having enrolled in the 'Catalina Channel Swim', which lowered the overall average speed for women. When the annual top five were considered, the men-to-women ratio was close to 1 and did not change by event. Moreover, in the 'Catalina Channel Swim' nationality did not vary by sex.
However, after correcting for nationality, repeated measurements within swimmers and the event-time interaction terms, women were faster than men for all swimmers, whereas men were faster than women when considering the annual top five swimmers. These results support recent findings for the 'Catalina Channel Swim' [9] and the 'Manhattan Island Marathon Swim' [10]. However, when all women and men were considered in the 'English Channel Swim', the findings of existing studies, in which men were faster than women, could not be confirmed [1,14].
When the annual five fastest swimmers were considered, we found no differences in swimming speed between women and men in the summary statistics for the three events and all prevalent nationalities. This finding is in contrast to the FINA races, where men were faster than women with a similar sex difference for all three distances [13]; in the analysis of the FINA races, however, all women were compared to all men. After correcting for nationality, repeated measurements within swimmers and the event-time interaction terms, our findings for the annual top five are in line with the FINA races [13].

Swimming speed was the fastest in 'Manhattan Island Marathon Swim'
A further finding was that swimming speed was the fastest in the 'Manhattan Island Marathon Swim', although the event was the longest, with 45.8 km around Manhattan Island in New York compared to 33.7 km across the English Channel between England and France and 33 km across the Catalina Channel in Southern California. The faster swimming speed in the 'Manhattan Island Marathon Swim' compared to the other two events is explained by the current of the Hudson River and the tides (http://blog.marathonswimmers.org/2011/06/tides-areeverything). In the 'English Channel Swim' and in the 'Catalina Channel Swim', swimmers have to swim against currents. In the English Channel, tides can also prevent swimmers from achieving fast swim times (www.bbc.com/news/uk-england-kent-10782301).

Australians were faster than Americans, British and others
An important finding was that both female and male swimmers from Australia were the fastest. Based upon findings for triathletes competing in 'Ironman Hawaii', one might assume that mainly athletes from the region where an event is held would participate in it and would also be the fastest [15]. Indeed, in the 'English Channel Swim' between 1875 and 2013, mainly British swimmers participated and were also among the fastest [2].
However, in the present analysis, female swimmers from Australia were the fastest in all three events although these events were held in the United States of America and in Europe. It is well known that swimmers originating from Australia are among the fastest together with US-American swimmers in pool swimming at world class level events such as the Olympic Games [11,12].
Australia was placed third in the Rio 2016 Olympic Games aquatics medals and second in the swimming events (www.fina.org/event/xxxi-olympic-games/medalsm). Australia was placed eighth in the London 2012 Olympic Games aquatics medals and third according to the total number of medals (www.swimming.org.au). Based upon the present findings, Australians might also be the best in open-water ultra-distance swimming.

Limitations, strength, implications for future research and practical applications
Considering the differences among the three events, it should be highlighted that the findings of the present study were event-specific and should be generalized to other ultra-distance swimming races with caution. The main limitation of this study was that information about the age of each swimmer was not available. For this reason, our current model might not be properly specified. Moreover, the interaction between sex and nationality was not considered in the regression model because, for some regions/countries, the number of observations was small. In addition, a reduced model with only one interaction term (event-time) was used. As a consequence of this specification, the estimated sex difference could not change with time and event. The AIC suggested, as an alternative, a model with the three-way interaction sex-event-time, but interpreting this model and identifying the global effect of each predictor would not have been straightforward. A strength of this study was that, in contrast to previous research that examined a limited sample of swimmers (i.e. top swimmers) [1,9,10], it adopted a novel approach by investigating all finishers. Future studies might investigate the sex difference in performance in the official FINA (Fédération Internationale de Natation) World Cup races held over 5 km, 10 km and 25 km [16][17][18]. Based upon the present findings, women should also be faster than men in these races. The results provided practical information for coaches and swimmers on important aspects of performance so that they could optimize their preparation for such races. This was of great practical value, especially considering that open-water swimming has grown rapidly in popularity [19].

Conclusions
In our statistical modeling framework, women were faster than men, and Australians were faster than Americans, who were faster than British and Canadians. When the annual top five swimmers were considered, men were faster than women, and Americans were faster than British.
Supporting information
S1 Fig. Estimated kernel density of observed swimming speeds by sex and event,