Study protocol: The INTERMAP China Prospective (ICP) study

Background: Unfavourable blood pressure (BP) level is an established risk factor for cardiovascular diseases (CVD), while the exact underlying reasons for unfavourable BP are poorly understood. The INTERMAP China Prospective (ICP) Study is a prospective cohort to investigate the relationship of environmental and nutritional risk factors with key indicators of vascular function including BP, arterial stiffness, and carotid intima-media thickness.
Methods: A total of 839 Chinese participants aged 40-59 years from three diverse regions of China were enrolled in INTERMAP in 1997/98; data collection included repeated BP measurements, 24-hour urine specimens, and 24-hour dietary recalls. In 2015/16, 574 of these 839 persons were re-enrolled along with 208 new participants aged 40-59 years who were randomly selected from the same study villages. Participants' environmental and dietary exposures and health outcomes were assessed in this open cohort study, including BP, 24-hour dietary recalls, personal exposures to air pollution, grip strength, arterial stiffness, carotid intima-media thickness and plaques, cognitive function, and sleep patterns. Serum and plasma specimens were collected with 24-hour urine specimens.
Discussion: Winter and summer assessments of a comprehensive set of


Introduction
Unfavourable blood pressure (BP) level is an established risk factor for cardiovascular disease (CVD) 1 , and is the single largest risk factor for disease burden globally 2 . Associations between BP level and CVD are positive, independent, strong, graded, and continuous with no apparent threshold 1,3,4 . Although adverse levels of BP are highly prevalent in middle and older-aged individuals worldwide 5 , the exact underlying reasons are poorly understood. Environmental and dietary risk factors such as air pollution and sodium intake have emerged as potentially important contributors that warrant further, coordinated investigation [6][7][8][9] . CVD is the leading cause of death in China with 3.97 million deaths attributed to it in 2016 10 ; previous studies in China show large regional differences in the incidence of CVD 11-14 and in the prevalence of environmental and dietary risk factors 6,15-18 . Studies that rigorously quantify exposures to these modifiable risk factors and evaluate their associations with BP and other cardiovascular outcomes may help to explain these geographical variations and inform the development of region-specific interventions to prevent CVD.
The International Study of Macro/Micronutrients and Blood Pressure (INTERMAP) 19 has produced extensive findings on cross-sectional associations of multiple dietary factors (e.g., macro- and micronutrients and food groups) 9,20,21 and urinary metabolites [22][23][24][25] with BP in individuals. The findings from INTERMAP indicate that the cumulative effects of individual nutrients may be sizable 20 and important in accounting for the high prevalence of adverse BP patterns in populations. However, because INTERMAP was cross-sectional, it could neither support causal inference about the role of dietary factors in the development of adverse BP nor assess the influence of nutrient intakes on BP change over time. Prospective follow-up of the INTERMAP participants will enable investigation of patterns in nutrients, food groups, metabolites and diet, and their associations with BP over time. It should also improve understanding of the mechanisms by which diets lead to adverse BP levels.
Since the inception of INTERMAP, substantial epidemiologic and toxicological evidence has linked particle air pollution exposure (e.g., particulate matter with aerodynamic diameters less than 2.5 µm [PM2.5], black carbon) to higher BP and increased risk of CVDs and related mortality. Air pollution from household biomass and coal (i.e., solid fuel) stoves, which over 600 million Chinese use for cooking, heating, and other energy needs 26 , has emerged as a potentially important environmental determinant of adverse BP levels and central haemodynamics 7,8 , but is far less studied than urban and traffic-related air pollution. Adding detailed measurements of exposure to air pollution in settings of household solid fuel use to the existing profile of the INTERMAP cohort could provide deeper insights into the aetiology of adverse BP patterns by studying their environmental and nutritional risk factors and related interactions.

Methods
The aims of the ICP Study are to: 1) extend the observations of profiles of CVD risk factors and health outcomes from INTERMAP to the present day; 2) produce evidence for region-specific environmental and nutritional interventions to address adverse BP profiles; and 3) leverage advances for identifying biomarkers of air pollution exposure and nutritional risk factors and for understanding the mechanisms and pathways of their effects on CVD. Overall, this cohort study will investigate the transitions in nutrients, food groups, dietary patterns, household energy use, urinary sodium, and vascular indicators between baseline and follow-up visits. Results and findings from the ICP Study will provide better insight into the evolution of raised BP and CVD in China, and its underlying environmental and nutritional risk factors.

Recruitment and enrolment
The ICP enrols an open cohort to study the risk factors for CVDs.
In 1997-1998, participants of the INTERMAP study were randomly selected from the target population at each of three rural sites in China. Recruitment at each site was stratified by age and gender, and approximately 65 persons were randomly selected from the population lists for each of four age-gender subgroups: men and women, aged 40-49 and 50-59 years. Selected participants were then contacted and invited to participate by staff in local collaborating centres. Only one person per household was to be included. Extensive efforts were made to recruit the participants who were initially randomly selected. For the ICP Study data collection, if substitute participants were needed at follow-up to replace persons who refused to participate or did not satisfactorily complete full data collection, they were invited to participate in a similar manner. A follow-up of the ICP cohort is planned for 2020-2021 (Figure 1).
All measurements and biological sample collections follow the same standard operating procedures (SOP) in both INTERMAP and the ICP Study so that data are comparable and can be aggregated; all research staff must attend a compulsory, hands-on training programme and be certified before data collection. The measurements conducted in INTERMAP and the ICP Study are summarized in Table 2. Trained staff collected information on household demographics, education, employment, tobacco smoking, alcohol use, medical history, family history, medication use, and physical activity using a structured questionnaire (Extended data: Main questionnaire 27 ) 19 .

Other measurements for ICP Study
We assessed physical activity levels of participants during ICP campaigns using a pedometer (Omron HJ-328). We also collected historical (INTERMAP) and current (first follow-up of ICP Study) household energy use including fuel and stove types (to assess historical exposure to air pollution) using structured questionnaires 36 (Extended data: Fuel and stove use questionnaire 27 ). Sleep patterns were assessed using structured questionnaires (Extended data: Main questionnaire 27 ). We measured cognitive function in the Beijing and Shanxi study populations using the Montreal Cognitive Assessment 37 . For the second follow-up, data on household energy use and cognitive function will be collected.

Causes of mortality
In addition to active follow-up, passive follow-ups will be carried out as well. Medical information of participants is being collected via linkages with their medical records in county hospitals using the health insurance database of the

Statistical analysis and sample size
The goal of analyses of nutrient/metabolite/environmental exposure-BP associations in INTERMAP and ICP is to estimate associations between average daily nutrient/metabolite/environmental exposure levels of individuals and their average systolic BP and diastolic BP.
For descriptive statistics, means, standard deviations, and medians were computed for nutrients, metabolites, environmental exposure levels and other variables for the three population samples, by gender and age. BP of each individual was the average of eight measurements from the four visits (INTERMAP) or four measurements from two visits in each season (ICP). To examine relationships of dietary, environmental, and other variables to BP, quantile and multiple regression analyses were used, first with control for age and gender, then with control for other potential confounding variables in sequential linear regression models.
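As an illustration of the BP averaging rule described above, the following minimal sketch pools all readings for one participant across visits and takes their mean. The function name, the list-of-visits layout, and the readings are hypothetical and are not taken from the study database; two readings per visit is inferred from the stated totals (eight readings over four visits in INTERMAP, four readings over two visits per season in ICP).

```python
from statistics import mean


def participant_mean_bp(visits):
    """Average all BP readings (mmHg) for one participant across visits.

    `visits` is a list of visits, each a list of readings. This layout is
    an assumption made for illustration only.
    """
    readings = [reading for visit in visits for reading in visit]
    if not readings:
        raise ValueError("no BP readings supplied")
    return mean(readings)


# Made-up systolic readings, not study data.
intermap_visits = [[128, 126], [130, 124], [122, 126], [124, 120]]  # 8 readings
icp_season_visits = [[132, 130], [128, 126]]                        # 4 readings

intermap_sbp = participant_mean_bp(intermap_visits)
icp_season_sbp = participant_mean_bp(icp_season_visits)
```

The same helper could be applied separately to winter and summer visits in ICP, giving one average per season per participant.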
Estimates of power are based on coefficients uncorrected for regression dilution bias, corresponding to uncorrected partial correlations of nutrients/metabolites with BP. INTERMAP was designed to detect partial correlations of 0.06 or greater; with regression dilution bias resulting from day-to-day variability in nutrient/metabolite levels, an observed partial correlation of 0.06 can be expected to correspond to a true correlation of 0.10 or greater. The ICP Study may not have an adequate sample size to detect associations between nutrients and BP in the three Chinese populations; however, it should have an adequate sample to test other hypotheses, as the study generates a considerable amount of both environmental exposure and CVD outcome data. For examining the association between personal exposure to PM2.5 and BP with the cross-sectional data collected in 2015-16, we assumed a difference in PM2.5 exposure of 80 µg/m³ between the higher and lower exposure groups, corresponding to a systolic BP difference of 3.1 mmHg on the predicted dose-response curve from a previous study 7 ; under these assumptions, nearly 90% power is expected for the ICP Study (standard deviation=14 mmHg, α=0.05). For comparing BP and urinary sodium between the north and south sites, we assumed differences of 3.5 mmHg for systolic BP and 80 mmol for urinary sodium 6 ; with the sample size in this study, power of nearly 91% and 99%, respectively, would be achieved (standard deviation=14 mmHg and 80 mmol, α=0.05).
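The power figures above can be approximated with a standard two-sided, two-sample normal (z) test for a difference in means. The sketch below is one plausible way to do such a calculation and is not necessarily the method the authors used; the function name and the group sizes in the example call are assumptions, since the text does not report how participants were split between comparison groups.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    delta : assumed true difference in means (e.g. mmHg)
    sd    : common standard deviation assumed in both groups
    n1,n2 : group sizes (hypothetical inputs; not reported in the text)
    """
    std_normal = NormalDist()
    standard_error = sd * sqrt(1.0 / n1 + 1.0 / n2)
    z_effect = abs(delta) / standard_error
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Probability of rejecting in either tail under the assumed difference.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Example: 3.1 mmHg difference, SD 14 mmHg, with an assumed equal split
# of 391 participants per exposure group (illustrative only).
power_pm25 = two_sample_power(delta=3.1, sd=14.0, n1=391, n2=391)
```

Because power depends strongly on how the sample divides between groups, the value returned for any particular split should be read as indicative rather than as a reproduction of the published figures.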
The demographic and health characteristics of ICP participants that completed at least one season of data collection are presented in Table 3.

Discussion
For both INTERMAP and the ICP Study, data collection procedures have been extensively tested and validated to minimise systematic and random error 20,31,40 . For example, a random 10% of urine samples were split locally and sent to the laboratory with different identification numbers to evaluate the precision of urine analysis 20 ; all dietary recall data were assessed independently by trained staff 31 .
The main strengths of this cohort include:
1. Populations from three different regions enable investigation of the environmental and nutritional risk factors for higher BP and other cardiovascular outcomes across China.
2. Data collection followed the same SOP at baseline and follow-up visits to generate high-quality data, and thus enables comparisons of patterns in diet, household energy use, and other lifestyle factors over time.
3. Comprehensive assessment of environmental and nutritional risk factors in the winter and summer seasons with high precision: ambient air temperature; standardised multi-pass 24-hour dietary recalls; timed 24-hour urine collection; questionnaire-based assessment of current and historical stove and fuel use; multiple 24-hour measurements of personal exposure to air pollution; and blood specimens collected for biochemical analyses.
4. Although the sample size of this cohort is relatively small compared with other biobank studies, our high precision in measurement of exposure and key cardiovascular risk factors may lead to less biased associations, and thus contribute new insights in the emerging field of exposome research.
The ICP Study is observational by design and is thus subject to several biases that we sought to mitigate. To minimize the potential for selection bias, our participants were randomly selected from village rosters and we achieved high participation rates at baseline. We maintained the cohort through home visits by dedicated staff at each site and were able to follow up 84% of living INTERMAP participants. We also collected verbal autopsy reports for deceased participants to investigate the influence of survivor bias. To reduce the impact of information bias arising from measurement error, we conducted repeated measurements of cardiovascular outcomes (e.g., BP), dietary and environmental exposure variables, and urinary/blood biomarkers using validated instruments and adhered to strict quality assurance and quality control practices during data collection. We also collected detailed information on the

Abu Mohammed Naser Titu
Emory Global Diabetes Research Center, Rollins School of Public Health, Emory University, Atlanta, GA, USA
This is a protocol paper that describes the rationale and methodologies of the ICP study. The authors have described the study population, objectives, numbers of visits, and data collection methods. They have listed the exposures and outcomes of the study. I have several comments and suggestions:
The mean age of the study population seems high, particularly among those who were enrolled in the INTERMAP study. Age is an important biological risk factor for many CVD outcomes in this study. The study findings should not be generalized to other age groups. The authors may explicitly mention age in the study title and in papers published from this study.
In the "Introduction" and "Methods" sections, the authors have mentioned environmental factors. I think they are considering only air pollution as the environmental exposure, mainly due to household energy consumption. I was wondering whether the authors should also consider variables such as temperature, humidity, and rainfall as environmental exposures, since they can collect these data from secondary sources.
I think the statistical analysis plans need more elaboration. How will the repeated measures be accounted for? What are the strategies for adjusting p-values for multiple hypotheses testing to avoid false discovery? Should we adjust the models for seasonality or ambient temperature? Evidence from China suggests there is a higher mean blood pressure of the population in winter compared to mean blood pressure in summer.
In the first follow-up, spot urine samples will be collected. There are many controversies regarding the utility of spot urine samples to evaluate daily sodium intake. The authors may consider collecting 24-hour urine sodium. Even a single 24-hour measurement is not sufficient.
For outcomes (e.g., blood pressure), I feel there is a huge gap between the two follow-up visits. Exposure might change over that period.
The authors may list which lifestyle variables will be collected.
In the text, the second follow-up was described as one important component whereas it was not described in the abstract. This component should be added in the abstract.

2) Exposome Research
Exposome research is one key term. However, very little was described or discussed in the text. Add more description/discussion on the exposome.

3) Study aim
The third aim of the study was described as 'leverage advances for identifying biomarkers of air pollution exposure and nutritional risk factors and for understanding the mechanisms and pathways of their effects on CVD.' Since the study was not designed and powered to monitor CVD morbidity and mortality, this should be effects on BP?

4) Air pollution exposure measurements
The study collected two consecutive 24-hour personal measurements of PM2.5 and black carbon. Two questions arise. (1) China has an air pollution monitoring system that is zip-code based, so the investigators should be able to calculate each participant's past zip-code-level exposure to air pollution from the existing database. Air pollution has seasonality and large day-to-day variation, so an estimate of past cumulative exposure appears to represent exposure to air pollution better than two consecutive 24-hour personal measurements.
(2) There are many pollutants other than PM2.5 and black carbon that may potentially be associated with blood pressure.

5) Pulse wave velocity measurements
The study evaluated arterial stiffness using brachial-femoral PWV. Measurement of arterial stiffness is important in hypertension research partly because some studies show that arterial stiffness precedes incident hypertension whereas other studies show the opposite. These studies typically use either carotid-femoral PWV (regarded as the gold standard) or brachial-ankle PWV. More recently, some studies have started to report the cardio-ankle vascular index. Given this background, it would be better to provide some references for brachial-femoral PWV covering (1) how it differs from other widely used techniques, and (2) how quality control of the measurements is managed throughout the study period and across the different sites. Although the reviewer has read the Supplemental material, this was not described well there.

6) Intima-media thickness measurements
In the protocol or in supplemental material, quality control procedures for the measurements should be clearly described: how intra-examiner and inter-examiner variation will be monitored throughout the study period and across study sites, and how much variation will be permitted.

7) Cardiovascular outcome/CVD outcome
Throughout the text, the term 'cardiovascular outcome' or 'CVD outcome' is used five times. In some places it refers to BP and in other places it refers to outcomes other than BP. Please clarify.

8) The sample size of this cohort is relatively small compared with other biobank studies
'Biobank' was used only once in the text. In the references, UK Biobank was cited. China has many biobank studies, e.g., Kadoorie Biobank, Shanghai Zhangjiang Biobank. Either drop the term 'biobank' or define biobanks with some examples in China.

9) Kidney function
The initial INTERMAP study recruited subjects aged 40-59 where the kidney function can be assumed normal. The mean age of the follow-up cohort is 65 years old, thus mention on kidney function (whether it