The Avon Longitudinal Study of Parents and Children (ALSPAC) G0 Partners: A cohort profile [version 2; peer review: 1 approved with reservations]

ALSPAC is an ongoing population-based, observational study designed to investigate how genetic and environmental characteristics might influence the health and development of children and their parents. It has evolved to facilitate the measurement of many outcomes in the parental cohort. Pregnant women resident in Bristol, UK with expected dates of delivery between April 1991 and December 1992 were eligible. 14,541 pregnancies were originally enrolled. Partners of the pregnant women were initially invited to take part by the women, with formal enrolment of individuals taking place since 2010. Data has been collected from 12,113 partners, with 3,807 formally enrolled. Data collected to date comprise 21 questionnaires, a clinical follow-up in 2012 (mean age: 53 years) and a family-based clinical follow-up that is currently ongoing (mean age: 63 years). Questionnaires have asked about a wide range of environmental measures, physical and mental health and other phenotypic details at regular timepoints up to 2005, once in 2012 and regularly again since 2018, including six questionnaires completed during the COVID-19 pandemic. Clinical measures include anthropometrics, blood pressure, body composition, cardiovascular health and a fasting blood sample. DNA has been extracted, with genome-wide data available on >3,000 partners and exomes on ~1,500 trios. The data contributes to one of the most deeply phenotyped birth cohorts in the world, providing trios of data, allowing comparison between parents and offering multi-generational information, and is fully accessible through a managed access process.


Amendments from Version 1
The second version of this manuscript takes into account the useful comments from the reviewer (detailed in our response to the reviewer). The primary updates are summarised as follows: - Updates to the abstract regarding when questionnaires were collected and emphasising the benefit of trios of data.

Introduction
Why was the cohort set up?
The Avon Longitudinal Study of Parents and Children (ALSPAC) was originally designed to determine the ways in which genetic and environmental characteristics influence the health and development of children and their parents (Golding et al., 2001). Substantial follow-up has focussed primarily on the mothers (G0: generation 0), the children (G1: generation 1) and more recently the children's children (G2: generation 2), with cohort profiles available for these groups of participants (Boyd et al., 2013) (Fraser et al., 2013) (Northstone et al., 2019) (Lawlor et al., 2019). However, as noted in the mothers' cohort profile, substantial information has also been collected on the partners of the mothers, as one aspect of the child's environment. Data has primarily been collected through self-completion questionnaires completed by the partners themselves, but information has also been collected from the mother about her partner(s). More recent data collection has shifted focus to the G0 cohort of both mothers and their partners as a population of interest in their own right. This facilitates study across the life-course and further endorses the multi-generational nature of the study, with data collected about the parents' parents and grandparents as well as their grandchildren. At the time of writing, the study is inviting all G0 and G1 participants to attend a face-to-face clinical assessment, thereby marking the true family nature of the study and further facilitating a wide range of health and related research both across and between generations.
Who is in the cohort?
The original study design enrolled pregnant women who were resident in a defined geographical area in and around Bristol, South-West England with an expected date of delivery between 1 April 1991 and 31 December 1992. Of the estimated 20,248 eligible pregnancies (Lawlor et al., 2019) in the area over the time period, 14,541 were enrolled into the study (known as core pregnancies). From the age of 7 years, children from eligible pregnancies who were not originally enrolled were encouraged to join the study, resulting in a further enrolled 906 (non-core) children.
During the initial enrolment period and once they had consented to join, women were asked to complete postal questionnaires at various timepoints during pregnancy and beyond (Fraser et al., 2013). At this time, following advice from the ALSPAC Ethics and Law Advisory Committee (ALEC), it was decided not to enrol the study "fathers" directly (Birmingham, 2018); instead, mothers invited their partners to take part if they wanted to. Partner questionnaires were usually timed to coincide with those sent to the mother: she would be sent both, along with instructions to pass one on to her partner if she chose. Study staff did not define 'partner' and it was left to the mother to pass on the questionnaires to whoever she felt appropriate (Golding et al., 2001). In the vast majority of cases this was the biological father or father figure, but not always; mothers could pass the questionnaire on to a same-sex partner or, in a very small number of cases, to her own mother, if she deemed it appropriate. This process meant that the study never held contact details for partners (as we will refer to them from now on).
In 2010, the focus shifted following feedback from G0 partners that they wished to be involved in the study 'on an equal footing' with the mothers. By this time, we had already collected blood from some partners opportunistically at child-based clinics (ages 13, 15 and 17 years), if they accompanied their G1 offspring. We therefore wanted to approach partners more directly, encouraging them to formally enrol. We still had to send information via the mothers, as they remained our key contacts at the time. As a result, we recruited ~2000 partners, 730 of whom (37%) attended a clinic held in 2010-2011, which had the primary aim of collecting blood samples in order to genotype the partners and obtain data on trios (both biological parents and their offspring). We also measured height, weight and blood pressure.
Shortly after, a larger "Focus on Fathers" clinic was planned. In order to increase the numbers available to invite to this clinic, we sought ethical approval to further recruit partners via the G1 cohort. As a result (and following open enrolment of partners from that point on) we currently have 3,807 partners formally enrolled (66 of whom have never provided data; described in detail in the data dictionary) and data obtained from a further 8,296 partners (who have completed at least one questionnaire given to them by the mothers but who have not formally enrolled in ALSPAC). We therefore consider 12,113 partners to be the baseline number of partners who have had contact with the study (Figure 1). However, it should be noted that 3,807 is the number of recruited G0 partners at the time of writing as we have contact details for these participants. They will form the baseline for all future data collection sweeps, but we keep enrolment open such that this number will increase over time.
Characteristics of those recruited
Table 1 presents descriptive statistics for key baseline characteristics in all G0 partners who have ever provided data and for the subset of G0 partners who are now formally enrolled, and compares them to G0 mothers. It can be seen that partners who have provided data are 2.7 years older on average than enrolled G0 mothers (p<0.0001) and are more likely to be educated to degree level (19.5% versus 12.9%; p<0.0001). The proportions in the different housing and ethnic groups are very similar to G0 mothers. However, G0 partners who formally enrolled in the study are 1.3 years older on average than G0 partners who have given data only (p<0.0001) and are more likely to have a degree level education (29.4% versus 19.5%; p<0.0001) and to own their homes (86.4% vs 75.8%; p<0.0001).
Given the way in which G0 partners were initially invited to take part in the study, it was possible for the responding partner to change over time (i.e. the mother may have changed who she gave the partner questionnaire to). Date of birth, provided in all data collection sweeps apart from two, was the only piece of identifying information available to the study for determining likely changes in partner. Whilst questionnaires also asked who had completed them, this was asked inconsistently and proved not to be useful for determining real change. After cleaning all dates of birth (please see the data dictionary for further details),  For a further 251 (2.1%) cases it was not possible to determine any changes due to incomplete dates of birth. Researchers should consider taking these changes into account, especially when analysing G0 partner data longitudinally: variables are available on the partners' cohort profile dataset to indicate changes and the time point at which data were collected before the change occurred.
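The idea of using date of birth to flag a likely change in respondent can be illustrated with a minimal sketch. This is not the ALSPAC cleaning procedure (which is described in the data dictionary); the function name `flag_partner_changes` and the input layout (one cleaned date of birth per sweep, with `None` where it was missing or incomplete) are assumptions made purely for illustration.

```python
from datetime import date
from typing import List, Optional


def flag_partner_changes(dobs_by_sweep: List[Optional[date]]) -> List[Optional[bool]]:
    """Flag sweeps where the reported date of birth differs from the last one seen.

    Hypothetical helper: returns, per sweep, True if the DOB differs from the
    most recent previously reported DOB (a likely change of respondent),
    False if it matches or is the first DOB seen, and None if the DOB is missing.
    """
    flags: List[Optional[bool]] = []
    last_seen: Optional[date] = None
    for dob in dobs_by_sweep:
        if dob is None:
            # Missing or incomplete DOB: a change cannot be determined here.
            flags.append(None)
            continue
        flags.append(False if last_seen is None else dob != last_seen)
        last_seen = dob
    return flags


# Illustrative (made-up) data: same DOB twice, one missing sweep, then a new DOB.
example = [date(1960, 5, 1), None, date(1960, 5, 1), date(1958, 2, 3)]
print(flag_partner_changes(example))
```

A simple comparison of this kind will miss a change of respondent whenever the dates of birth happen to coincide or are recorded with errors, which is why the indicator variables supplied in the partners' cohort profile dataset should be preferred in real analyses.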
How often have they been followed up?
The timeline on the study website summarises all data collection activities to date. G0 partners have completed up to 15 self-completion questionnaires that were provided via the G0 mother prior to formal enrolment (two during pregnancy, two in the first year of the child's life and approximately annually thereafter up to the age of 12); these were all paper-based. Three general questionnaires (paper and online versions) have been sent since formal enrolment, in 2012, 2018 and 2020 (i.e. to those for whom we have contact details), plus a further six COVID-specific questionnaires; these were all hosted fully online for G0, with the exception of the fifth COVID questionnaire. Figure 2 presents the response rates to the primary questionnaires.
There have been three opportunities for face-to-face data collection (in purple on the timeline): the partners' clinic (2010-11), the "Focus on Fathers" clinic (2011 to 2013) and the ongoing "@30" clinic (started in 2021 and due to finish in 2024). Blood samples have been collected from G0 partners at all of these dedicated clinics, or opportunistically if the G0 partner happened to attend a Focus clinic with their G1 child. Opportunistic blood samples were obtained at clinics when the children were 13, 15 and 17 years of age. Saliva was also collected as part of a sub-study (the "Before Breakfast Study"; Ong et al., 2004) from both parents of study children who were in the "Children in Focus" sub-cohort. Cell lines were generated for a small number of G0 partners if no other DNA sample was available (Ring et al., 2001).
Those G0 partners who have formally enrolled, and who now form the basis for all current and future contacts, differ from those who have ever taken part since the start of the study (Table 1). As reported earlier, they are more likely to have higher educational attainment and to own their own home, and are slightly older on average. It should be noted that we will always welcome enrolment of eligible G0 partners, but it is unlikely that this overall bias will change.

Methods
What has been measured?
The study website contains details of all the data that is available through a fully searchable data dictionary and variable search tool.
Questionnaires. The over-arching topics collected via questionnaire are summarised in Table 2. A particular advantage of the overall ALSPAC design is that many of these topics have also been reported on by the G0 mothers (see Table 2), enabling comparison within relationships. Table 3 summarises the measures that have been collected in previous face-to-face clinic settings and in the "@30" clinic which started in September 2022 and will be ongoing until the end of 2023.

Clinic Assessments.
Biological samples and measurements. The different biological samples (Blood, saliva, urine, hair and nails) that have been   DNA is currently available from 3,300 G0 partners; lymphoblastoid cell lines are available from 2,400 and exomes in approximately 1,500 trios of mother, partner and study child.
Record linkage. G0 partners were flagged for mortality and cancer with the Office for National Statistics. This has now moved to NHS Digital, and we are currently in the process of obtaining full consent from G0 partners for access to health and other administrative records, as detailed on the study website.

Results
What has it found? Key findings and publications
ALSPAC is first and foremost a pregnancy cohort and has contributed to a substantial number of areas related to pregnancy and child health and development (see the website for a full list of all publications). Examples of key areas that the G0 partners have contributed to are summarised below.
Men before, during and after pregnancy. Detailed work in ALSPAC has shown that postnatal depression is as important in men as it is in women in terms of the impact it can have on offspring wellbeing (Ramchandani et al., 2005). As with women, the study has revealed the possibility of a biological clock affecting fertility in men: older men were shown to take longer to conceive, even after taking into account their partner's age (Ford et al., 2000). A comparison of self-reported drinking and smoking showed that G0 mothers' and partners' reports of the G0 partner's drinking/smoking status were in almost perfect agreement (Passaro et al., 1997). However, when quantifying the amounts, women tended to report lower amounts than their partners did. This suggests that proxy reports can be used for drinking/smoking status, but that more detailed proxy information should be used with caution.

The role of fathers in growing up.
Changes over time in the involvement of fathers/partners in bringing up the child have been reported by the Fatherhood Institute. A recent report stated that 28% of ALSPAC fathers were noted as playing no part in infant care, compared to virtually none in the Millennium Cohort Study, which investigates children born a decade later (Burgess & Goldman, 2022). Negative responses to parenthood in fathers when the child was 21 months of age were associated with increased offspring depressive symptoms at 16 years of age (Scourfield et al., 2016). Conversely, positive psychological aspects of father involvement have shown a protective effect against depressive symptoms at the same age (Opondo et al., 2017). A more recent paper has reported an effect of father absence in childhood on offspring wellbeing, such that depressive symptoms reported at 24 years are higher in those whose father was absent, particularly during childhood (Culpin et al., 2022).
Relationships. Improvements in marital relationships over time have been shown to have a positive association with a number of cardiovascular risk factors (Bennett-Britton et al., 2017). On average, higher levels of marital conflict have been reported by both mothers and partners after the birth of the study child compared to pre-birth with both mothers and fathers reporting very similar levels at each timepoint (Hanington et al., 2012).
Epidemiological methods to help establish causality. As described in the mothers' cohort profile, researchers have used data reported by the partner as well as by the mother to assess whether there is a direct biological effect of intrauterine exposures via the latter on the developing offspring (Lawlor et al., 2019). This is an example of negative control analysis (Lipsitch et al., 2010). More recent examples of this negative control work include testing the association between pre-pregnancy body mass index (BMI) and offspring intelligence quotient (IQ) (Coo et al., 2019) and the association between prenatal alcohol exposure and offspring depression (Easey et al., 2020).
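The logic of a partner negative control comparison can be illustrated with a small simulation. This is a sketch on entirely simulated data under stated assumptions, not an analysis of ALSPAC data or the method of any cited paper: a shared familial confounder `u` influences the mother's exposure, the partner's exposure and the offspring outcome; the names `simulate` and `ols_slope` and all parameter values are hypothetical. Under these assumptions, when there is no intrauterine effect the maternal and partner associations are expected to be similar (both about 0.5), whereas a true maternal effect of 0.5 is expected to give a stronger maternal association (about 1.0) than partner association (about 0.75).

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(causal_effect, n=20000, seed=1):
    """Return (maternal slope, partner slope) for simulated families.

    Assumed data-generating model (illustrative only):
      u        ~ N(0, 1)   shared familial confounder
      mother   = u + noise (maternal exposure)
      partner  = u + noise (partner exposure, the negative control)
      outcome  = causal_effect * mother + u + noise
    """
    rng = random.Random(seed)
    mother, partner, outcome = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        m = u + rng.gauss(0, 1)
        p = u + rng.gauss(0, 1)
        y = causal_effect * m + u + rng.gauss(0, 1)
        mother.append(m)
        partner.append(p)
        outcome.append(y)
    return ols_slope(mother, outcome), ols_slope(partner, outcome)


# Confounding only: maternal and partner associations should be similar.
print(simulate(causal_effect=0.0))
# True maternal effect: the maternal association should be clearly stronger.
print(simulate(causal_effect=0.5))
```

In practice such comparisons are usually made with mutual adjustment for the other parent's exposure and for measured confounders; the unadjusted slopes are used here only to keep the sketch short.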
Transgenerational effects. ALSPAC has demonstrated small but consistent associations between the age at which fathers started smoking and increased body fat in their sons but not daughters (Northstone et al., 2014). Girls whose fathers had problem gambling during their childhood have been shown to be more likely to become problem gamblers themselves (there was no father-son transmission), although the absolute numbers with problem gambling at 20 years of age are small (Forrest & McHale, 2021).
What are the main strengths and weaknesses?
The primary strengths of this cohort are the duration of follow-up, the depth of phenotypic data available and the familial base within which the participants are set. Over the last 30 years, ALSPAC has collected a substantial amount of data on this group of participants, with many repeat measures. The data facilitates a number of different study designs, including familial designs, with trios available in the bioresource, and intergenerational designs, with data collected from the partners about their parents and grandparents and detailed data on their children and, as we move forward, their grandchildren. An additional strength is the "Symmetrical data on a variety of measures around relationship quality and childcare actions and beliefs" (Burgess & Goldman, 2022). Data linked to administrative sources (such as health and education) will provide opportunities for additional research as it becomes available.
Key limitations include the fact that partners were not recruited in their own right at the start of the study. This has resulted in 1) reduced response rates, as reminders could not be sent directly to the participants, 2) changes in respondents over time and 3) a biased and much smaller group who did actually enrol in later years and are now available for follow-up. Changes in respondents have been identified on the basis of date of birth (after substantial cleaning), as the study did not always ask explicitly or consistently who had completed each data collection, and the simple methods used to identify changes may have missed changes masked by errors in the data. Finally, the study is pregnancy-based; this is not the best starting point for recruiting a male cohort, as it does not represent those men who did not have children. This makes it difficult to identify a target population and therefore to assess the generalisability of any results. The cohort is also likely to be biased towards men in stable relationships (since they were initially invited to take part by the mothers).

Ethical approval
Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time.
Consent for biological samples has been collected in accordance with the Human Tissue Act (2004).

Data availability
ALSPAC data access is through a system of managed open access. Please see the ALSPAC data management plan, which describes the policy regarding data sharing (http://www.bristol.ac.uk/alspac/researchers/data-access/documents/alspac-data-management-plan.pdf). The steps below highlight how to apply for access to the data included in this study and all other ALSPAC data:
1. Please read the ALSPAC access policy, which describes the process of accessing the data and samples in detail and outlines the costs associated with doing so.
2. You may also find it useful to browse our fully searchable research proposals database, which lists all research projects that have been approved since April 2011.
3. Please submit your research proposal for consideration by the ALSPAC Executive Committee. You will receive a response within 10 working days to advise you whether your proposal has been approved.
○ In relation to this sentence "Three general questionnaires (paper and online versions) have been sent since formal enrolment (i.e. to those for whom we have contact details)," it would be helpful to indicate at what time points (age of the child) this data was collected.
○ What does COVID5 (page 7) refer to, and why is it important to mention it here?
○ In relation to the face-to-face data collection, it would be helpful to indicate at what age of the child this data collection for the partners occurred; this is helpful for understanding what data is available at what time point.