The Gender Outcomes International Group: to Further Well-being Development (GOING-FWD) Methodology on Identification and Inclusion of Gender Factors in Retrospective Cohort Studies

Background: Gender refers to the socially constructed roles, behaviors, expressions, and identities of girls, women, boys, men, and gender diverse people. It influences self-perception, individuals' actions and interactions, as well as the distribution of power and resources in society. Gender-related factors are seldom assessed as determinants of health outcomes, despite their powerful contribution. Methods: Investigators of the GOING-FWD project developed a standard methodology applicable to observational studies to retrospectively identify gender-related factors and assess their relationship to outcomes, and applied this method to selected cohorts of non-communicable chronic diseases from Austria, Canada, Spain, and Sweden. Results: The following multistep process was applied. Step 1 (Identification of Gender-related Variables): Based on the gender framework of the Women Health Research Network (i.e. gender identity, role, relations, and institutionalized gender), and available literature for a certain disease, an optimal “wish-list” of gender-related variables/factors was created and discussed by experts. Step 2 (Definition of Outcomes): Each of the cohort data dictionaries was screened for clinical and patient-relevant outcomes, using the ICHOM framework. Step 3 (Building of Feasible Final List): A cross-validation between gender-related and outcome variables available per database and the “wish-list” was performed. Step 4 (Retrospective Data Harmonization): The harmonization potential of variables was evaluated. Step 5 (Definition of Data Structure and Analysis): Depending on the database data structure, the following analytic strategies were identified: (1) local analysis of data not transferable followed by a meta-analysis combining study-level estimates; (2) centrally performed federated analysis of anonymized data, with the individual-level participant data remaining on local servers; (3) synthesizing the data locally and performing a pooled analysis on the synthetic data; and (4) central analysis of pooled transferable data. Conclusion: The application of the GOING-FWD systematic multistep approach can help guide investigators to analyze gender and its impact on outcomes in previously collected data.


Introduction
The distinction between sex and gender, which is clear and common in social sciences, has largely been neglected in health sciences. Indeed, sex and gender are often erroneously used and/or measured interchangeably. Given that sex and gender are not independent of each other, solely assessing one or the other cannot account for identified variations in health (1,2). Furthermore, although the reasons explaining the increasing incidence of chronic diseases are incompletely understood, changing family, social, and institutional roles and attitudes of men and women in the last decades ultimately play a role. Thus, a wide range of behavioral factors, psychosocial processes, and personal, cultural, and societal factors can create, suppress, or amplify underlying biological health differences (3,4). While differences in health status and outcomes have been attributed to biological sex, it is now increasingly recognized that both sex and gender influence the risk of developing certain diseases, presentation of symptoms, severity of illness, response to drugs or non-pharmacological interventions, and care-seeking behaviors (5). More importantly, gender may also have a bearing on people's access to and uptake of health services and the resulting health outcomes experienced throughout the life-course (6). Consequently, it is now understood that the intersectionality of gender with other social factors such as race, age, ethnicity, culture, and sexual orientation plays a central role in an individual's health. The integration of a gender-based framework in health research is a crucial and long-awaited development (7).
When considering gender aspects in the evaluation of clinical outcomes, the first challenge for scientists originates from the apparent lack of a standardized method to measure the complexities that gender encompasses. Recently, through a Pan-Canadian collaboration of a multi-disciplinary team of scientists, a comprehensive list of gender-related variables was established and collected in the setting of premature cardiovascular disease. Constructed with the aim of exploring the impact of gender on the clinical outcomes of young patients with acute coronary syndrome, the GENESIS-PRAXY (Gender and Sex Determinants of Cardiovascular Disease: From Bench to Beyond) Gender Score (GPGS) was developed (8,9). The GPGS measures a comprehensive group of gender-related factors and offers a pragmatic means to prospectively explore the relationships between sex, gender, and health outcomes. In patients with premature and established cardiovascular disease, gender factors, independent of biological sex, emerged as powerful predictors of the acquisition of risk factors as well as of one-year adverse health outcomes (9,10). Most significantly, regardless of sex, patients who exhibited gender factors most traditionally ascribed to women's identity and roles in society were more likely to have a recurrent cardiac event within the first year. While these results have important direct implications for expanding the measurement of gender determinants of health to other populations, they may also identify novel determinants of health care costs that could be averted.
To facilitate the integration of sex and gender-based analyses, we developed a standard methodology that can be applied to retrospective studies for testing the associations of gender-related factors with clinical and patient-relevant outcomes.

Methods
GOING-FWD (Gender Outcomes INternational Group: to Further Well-being Development) is a personalized medicine project that uses a big data approach. It was recently co-funded by the Canadian Institutes of Health Research and GENDER-NET plus, which is part of the EU H2020 initiative (http://gender-net-plus.eu/joint-call/fundedprojects/going-fwd/).
For the GOING-FWD project, around thirty accessible databases of observational studies and registries that include non-communicable chronic diseases (NCDs) among a four-country transatlantic network (i.e. Austria, Canada, Spain, Sweden) were identified. The overarching aims of the GOING-FWD project were 1) to integrate sex and gender dimensions in applied health research, and 2) to evaluate their impact on clinical cost-sensitive outcomes and patient-reported outcomes related to quality of life in NCDs including cardiovascular disease, metabolic disease, chronic kidney disease, and neurodegenerative disease. Each partner of the Consortium provided the data dictionary of the retrospective cohort studies conducted in their respective countries.
The GOING-FWD Consortium is composed of investigators with multidisciplinary expertise in gender dimension, psychosocial science, computer science, epidemiology, endocrinology, internal medicine, renal and cardiovascular medicine, reproductive health, neuroscience, preclinical and clinical experimental research, health outcome research, nursing, and biostatistics. The investigators were assigned to one of the 3 work packages. The GOING-FWD methodology proposed herein is the result of the integrated activities carried out by the GOING-FWD investigators from March 2019 to December 2019. A 5-step procedure was developed that can be applied to pre-existing observational cohorts to integrate gender-related factors and assess their association with selected health outcomes.
GOING-FWD also has a patient partner advocate group. All interactions with patient partners are based on inclusiveness, support, mutual respect, and co-building. For example, patient partners can assist in knowledge dissemination (e.g., summer institutes, on-line educational materials, trainee journal club meetings, public forum presentations), may co-author manuscripts and provide feedback on draft manuscripts during development, and participate in teleconferences. A patient partner representative also attends monthly GOING-FWD steering meetings. Ethics approval for the project was obtained from the coordinating center at McGill University, Canada (2020-5452).

Results
A multistep methodology was developed as summarized in Figure 1.
Step 1 - Identification of Gender-related Variables
Based on the data dictionaries provided by all participating centers, a preliminary list of the gender-related factors available in selected datasets was compiled by the coordinating center. The template including the identified sex and gendered factors was presented and discussed at the first consortium meeting (Montréal, April 2019) by all investigators and stakeholders. Guided by the gender framework of the Women Health Research Network (i.e. gender identity, gender role, gender relations and institutionalized gender) (11) (Figure 2), and available literature in the four NCD areas, the investigators created an optimal "wish list" of gender-related variables/factors: the definitions and validity of the proposed variables were discussed and expert consensus reached.
Investigators considered as 'gender-related' those variables that differ between men and women in terms of prevalence and/or that have been identified in the published literature as exerting different effects on the outcomes of men and women. A revised draft template including additional gendered variables was created (Table 1).
Step 2 - Definition of Outcomes
Each of the cohort data dictionaries was screened for outcomes of interest (including clinical/survival and patient-reported outcomes) by the coordinating center. As with the gendered variables, a second working group was tasked with developing a list of outcome variables, using the International Consortium for Health Outcome Measurement (ICHOM) framework (12) and the cost-sensitive variables and/or patient-reported outcome measures (PROMs) collected in all databases as the basic 'outcomes variable list'.
The ICHOM Standard Sets are standardized outcomes, measurement tools, time points, and risk adjustment factors for a given condition (e.g. chronic kidney disease [CKD], diabetes, etc.). Developed by a consortium of experts and patients in the field of outcomes research, the ICHOM Standard Sets focus on clinical and patient-centered outcomes. By creating a standardized list of the outcomes based on the patient's priorities, the ICHOM framework ensures that the patient remains at the center of care (12). A pre-specified list of potential outcomes was created by all GOING-FWD investigators.
Depending on the study population and type of dataset (e.g. administrative, observational cohort study), we identified the following cost-sensitive outcomes as relevant: (a) in-patient outcomes, including in-hospital length of stay, in-hospital complications and/or death, and readmission within 30 days of discharge; (b) out-patient outcomes, including access and/or numbers of visits and procedures, admissions, death, progression of disease, and disability. We also looked for the availability of any PROMs, including symptoms (e.g. pain), functioning, health-related quality of life, depression, and stress. The ICHOM disease-specific outcomes were considered for each of the 4 main clinical areas of interest. The revised draft template with outcomes, compiled by the investigators, was discussed and approved by all (Table 2).

Step 3 - Building of Feasible Final List
The two lists were sent to each participating center to re-screen their datasets for the presence of the identified sex and gender-related and outcome variables. A cross-validation between the gender-related and outcome variables available per database was performed both locally and centrally. In case of disagreement or discordant definitions of variables between the wish list and the actual list, the coordinating center and local PIs held a discussion to reach consensus.
In principle, a more inclusive approach was pursued for the definition of both gender-related variables and outcomes.
After the double check of the wish list against the local actual list, the final feasible list of variables (core dictionary) was built, and each country partner used the lists to apply to their respective research ethics boards according to the country regulations.
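The cross-validation in Step 3 can be pictured as a set comparison between the wish list and each data dictionary. The following is only a minimal sketch under stated assumptions, not the consortium's actual tooling: the function name `build_feasible_list`, the variable names, and the cohort labels are all hypothetical, and the `min_databases` threshold of 1 is one possible way to express the "more inclusive approach" described above.

```python
def build_feasible_list(wish_list, dictionaries, min_databases=1):
    """Map each wish-list variable to the databases whose dictionary contains it.

    wish_list: ordered list of desired variable names (hypothetical names).
    dictionaries: dict of database label -> set of variable names it records.
    min_databases: keep a variable only if at least this many databases hold it
                   (1 is the most inclusive choice; an assumption, not from the paper).
    """
    feasible = {}
    for variable in wish_list:
        holders = sorted(
            name for name, variables in dictionaries.items() if variable in variables
        )
        if len(holders) >= min_databases:
            feasible[variable] = holders
    return feasible


# Hypothetical wish list and data dictionaries, for illustration only.
wish_list = ["marital_status", "education", "household_income", "caregiver_role"]
dictionaries = {
    "cohort_a": {"marital_status", "education", "smoking"},
    "cohort_b": {"education", "household_income"},
}

feasible = build_feasible_list(wish_list, dictionaries)
for variable, holders in feasible.items():
    print(variable, "->", ", ".join(holders))
```

In this toy input, `caregiver_role` appears in no dictionary and is therefore absent from the feasible list. Variables that are present but defined differently across cohorts would still need the consensus discussion described above.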
Step 4 - Retrospective Data Harmonization
Once the final list was compiled, the harmonization potential of the gender-related and outcome variables was assessed using the Maelström Research guidelines for rigorous retrospective data harmonization, and variables were merged when possible (13) (Core Dataset).
Harmonization across the different databases is a prerequisite for assessing the feasibility of big data analysis, as well as for minimizing deviations in data measurement across independently recruited databases. Data harmonization methodology consists of assessing the presence and definitions of common variables across the different databases, followed by the creation of a harmonized dataset and subsequent extraction of information from study-specific datasets into the harmonized dataset. For example, while many datasets may record smoking status, the exact definition of this variable may differ between datasets: one may define this variable as dichotomous, while others may quantify the number of cigarettes smoked by the participant. Through data harmonization, a new variable definition is created to include the information from each of these datasets, which in this case would be reduced to smoking status as a dichotomous variable.
Throughout this process, the harmonized definitions that are created are scrutinized until a consensus is reached.
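The smoking example above can be expressed as a small mapping rule. The sketch below is illustrative only: the field names `smoker` and `cigarettes_per_day` are hypothetical, and treating any positive cigarette count as "smoker" is an assumed rule that a real consortium would settle by consensus.

```python
from typing import Optional


def harmonize_smoking(record: dict) -> Optional[bool]:
    """Reduce study-specific smoking data to a dichotomous smoking status.

    Accepts either a dichotomous field ("smoker") or a quantitative field
    ("cigarettes_per_day"); both field names are assumptions for this sketch.
    Returns None when neither field carries usable information.
    """
    smoker = record.get("smoker")
    if smoker is not None:
        return bool(smoker)

    cigarettes = record.get("cigarettes_per_day")
    if cigarettes is not None:
        # Assumed rule: any positive count is coded as smoker.
        return cigarettes > 0

    return None


# One record per hypothetical source dataset.
records = [
    {"smoker": True},             # dataset with a dichotomous variable
    {"cigarettes_per_day": 12},   # dataset with a quantitative variable
    {"cigarettes_per_day": 0},
    {},                           # variable not collected
]
harmonized = [harmonize_smoking(r) for r in records]
```

Returning `None` for records without usable information keeps missing data distinct from "non-smoker", which matters when harmonized variables are later pooled.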
Step 5 - Definition of Data Structure
Beyond harmonization, the structure and country-specific management of health data were recognized as crucial to planning and conducting the final analysis addressing the relationships between the gender-related factors and outcomes. The analysis plans for each country will be based on the following options: 1) if the data are not transferable even when anonymized, study-specific data analyses will be done locally, followed by a meta-analysis combining study-level estimates; 2) for multiple cohorts in different countries, analyses will be done centrally, but the individual-level participant data will remain on local servers using a federated analysis approach; 3) the local data are synthesized and a pooled analysis of the synthesized data is then performed; or 4) if the data are transferable, data will be pooled and analyzed at a central location (Figure 3).
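As an illustration of option 1, study-level estimates computed locally can be combined without moving individual-level data. The paper does not specify a meta-analytic model, so the sketch below assumes a fixed-effect inverse-variance model purely for illustration; the function name and the input numbers are hypothetical and are not results from any GOING-FWD cohort.

```python
import math


def fixed_effect_meta(estimates, standard_errors):
    """Inverse-variance fixed-effect pooling of study-level estimates.

    estimates: per-study effect estimates on an additive scale
               (for example log hazard ratios).
    standard_errors: matching per-study standard errors (all > 0).
    Returns (pooled_estimate, pooled_standard_error).
    """
    if len(estimates) != len(standard_errors) or not estimates:
        raise ValueError("need one standard error per estimate")

    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical study-level estimates, for illustration only.
pooled, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
lower = pooled - 1.96 * pooled_se
upper = pooled + 1.96 * pooled_se
print(f"pooled estimate {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Only the estimate and its standard error leave each site in this scheme. A random-effects model may be more appropriate when cohorts from different countries are expected to be heterogeneous.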

Discussion
The GOING-FWD methodology is a multistep process that provides insights on how to incorporate a measure of gender-related factors when variables have already been collected. The Big Data paradigm shift is significantly transforming healthcare and biomedical research (14). Massive volumes of aggregated biomedical data often display different levels of granularity, fostering the capability to explore, on large international scales, the effect of variables such as gender and/or sex. Big data allows researchers to overcome sample size issues and perform types of analysis, such as interaction or mediation, that would not be feasible and reliable in small cohorts/studies. Nevertheless, there are some important issues related to data privacy and the merging of different databases when a cross-country comparison is planned, especially where issues related to the General Data Protection Regulation (GDPR) need to be addressed in detail.

Strengths
The GOING-FWD framework is a feasible methodology to foster the assessment of the impact of gender on outcomes in retrospective studies. The screening of each dataset is a step that not only allows investigators to identify the gender-related variables but also provides the rationale for selecting psycho-social factors that could be collected prospectively in the same cohort. The effort of investigating how sex and gender-related factors impact clinical and patient-related outcomes in NCDs is essential as it provides evidence for sex- and gender-tailored interventions.
We have learned that a multidisciplinary team, including gender experts and patient partners, is a prerequisite for developing such a methodology. Patients with lived experience can contribute to understanding what is really important for a specific disease, which further strengthens the concept of patient empowerment in clinical practice.
The international nature of the GOING-FWD methodology highlights important considerations on the complexity of gender. Gender norms, identities, and relations vary by culture, historical era, ethnicity, socioeconomic status, geographic location, and other factors. We expect that the gender behaviours and attitudes captured by our variables may differ among women, men, and gender-diverse individuals as well as between these groups. Gender norms also change over time and across countries. Therefore, as researchers, we need to recognize the dynamic nature of gender when we integrate it in clinical research questions.
We also envisioned our multi-country analyses as an opportunity to capture institutional gender by including some country-specific variables that are commonly available, like the Gender Inequality Index (GII) developed by the United Nations Development Programme (UNDP) (15). The GII is a composite measure to quantify gender inequality within a country and measures opportunity costs, reproductive health, empowerment, and labor market participation. Another similar measure of gender is the European Institute for Gender Equality (EIGE)'s Gender Equality Index, which includes additional details about country-specific domains of health, violence against women, work, money, knowledge, time, and power (16). The idea is to re-examine and rethink how we can gain the most from data on gender that are already available.
Finally, the GOING-FWD approach is timely and might foster inclusion of gender in understanding the coronavirus disease 2019 (COVID-19) pandemic. In fact, the global COVID-19 economic and medical crisis could be the first outbreak where sex and gender differences are recorded and taken into account by researchers and policy makers. The GOING-FWD methodology will be instrumental in exploring the impact of various gender domains on outcomes across countries.

Challenges
In developing the GOING-FWD methodology, we have faced practical challenges. Firstly, the lack of a standardized definition of gender-related factors is perceived as an obstacle by researchers, even those interested in the topic. The low availability of gender-related factors in retrospective studies is not surprising, but this should not preclude analyses. We strongly encourage clinical and even preclinical researchers to start from what they have, even if only one gender-related factor is available. Merging more datasets allows us to perform analyses that incorporate interaction and mediation, given large sample sizes. Secondly, in the current era, data accessibility and data protection issues in international networks can represent a deal-breaker in pursuing this kind of research approach. Increasingly strict data protection regulations in many jurisdictions limit the ability to share sensitive health information. This requires the application of privacy-enhancing technologies to enable the necessary analyses to be performed without the transfer of personal health information. Finally, harmonization is a necessary step to allow big data analysis, but it is a time-consuming process and susceptible to pitfalls related to the quality of the process and difficulties of maintenance when several databases from different countries are merged. Personnel with explicit knowledge and skills are required to perform data harmonization from both technical (i.e. computing science, mathematics) and clinical (i.e. life science) perspectives.
We believe that our example of a derived "wish list" based on selected variables offers a standardized tool that can be widely used to explore the consistency of associations with health behaviours and outcomes.

Perspective and Significance
The GOING-FWD Consortium, a multidisciplinary network of Canadian and European researchers and patient partners, provides a framework that will support clinical researchers in integrating gender-relevant factors in their research questions when using already collected databases, hence providing solutions for the challenges that such an approach poses.

The application of a systematic multistep approach defining gender-related variables, together with the use of data harmonization and country-specific data structure models, informs the identification and inclusion of gender factors in retrospective cohort studies. Gleaning important information on gender will not only strengthen current clinical practice but will also provide a stepping stone for sex- and gender-tailored interventions and care.

Declarations
Ethics approval and consent to participate: "Not applicable" Consent for publication: "Not applicable" Availability of data and materials: "Not applicable"

Roles
Institutionalized Gender
Abbreviations: BSRI, Bem Sex-Role Inventory; HAD, Hospital Anxiety and Depression Scale; SES, socioeconomic status; GII, Gender Inequality Index; hx, history; PPAQ, Pregnancy Physical Activity Questionnaire.

The GOING-FWD Multistep Methodology on Identification and Inclusion of Gender Factors in Retrospective Cohort Studies.

Figure 2