The Polish Panel Survey (POLPAN) dataset: Capturing the impact of socio-economic change on population health and well-being in Poland, 1988–2018

The Polish Panel Survey, POLPAN, one of the longest continuously run panel studies in Europe, is designed to facilitate research on the socio-economic structure, inequalities and the individual life course under conditions of social change in Poland. POLPAN is well suited for studying how women's and men's health and wellbeing are influenced by their life conditions, such as financial and social resources, which were affected by Poland's profound post-1989 socio-economic transformations, and how health outcomes further shape individuals' attitudes and behaviours. Initiated in 1987-88, POLPAN has been fielded in five-year intervals, most recently in 2018, with wave-specific samples representative of the Polish adult population and response rates for full panelists consistently above 70%. In POLPAN, health assessment measures are collected in all waves, as part of respondents' multi-dimensional and life course inequality profile. Data on self-rated physical and psychological health, collected since 1998 (Wave Three), are complemented with respondents' Nottingham Health Profile and core anthropometric information about personal weight and height (Wave Five onwards); health and wellbeing-related reasons for work interruptions (since Wave Four); information on extended hospital stays (Wave Six onwards); respondents' chronic or protracted illnesses (in Wave Six); and respondents' disability status (all waves). The newly released integrated 1988-2018 POLPAN dataset is available on Harvard Dataverse, or upon request, via e-mail: polpan@ifispan.waw.pl.

POLPAN has been fielded every five years, with the most recent wave in 2018. Since Wave Three, the sample of panelists has been complemented with randomly drawn renewal samples of young adults. Waves maintain a representative age distribution for the country at the time of a given POLPAN survey. As a rule, respondent substitution is not allowed. Non-response follow-ups average four returns. Response rates for full panelists are consistently above 70%, but vary for intermittent panelists and the young.
Data are collected, processed and stored in line with national and international regulations on data protection and privacy, including respondents' participation consent.

Description of data collection
POLPAN data are collected by the Centre of Sociological Research at the Institute of Philosophy and Sociology, the Polish Academy of Sciences, using paper-and-pencil personal interviews (PAPI) on panelists and renewal samples of young adult residents of Poland. Quality standards are implemented and monitored throughout the survey process. The interviewer training is conducted both in Warsaw and in the 22 network units around Poland responsible for data collection; questionnaires for all waves are pretested; a minimum of 10% of interviews conducted in each wave are back-checked.

This analysis is based on data from the Fifth (2008) and Sixth (2013) waves of POLPAN. These waves can be found within the integrated POLPAN dataset on Harvard Dataverse or as separate surveys at the repositories, as specified above in the "Data accessibility" section.

Value of the Data
• As a panel study whose separate waves are representative of Poland's age structure at the time of the surveys, POLPAN both provides a dynamic picture of changes in the country's social structure and makes it possible to understand these changes under different socio-economic and political systems. Health outcomes, a major dimension in which inequality manifests, are measured in all waves, as part of respondents' multi-dimensional and life course inequality profile.
• The POLPAN panel survey can be used by sociologists and by public health and social epidemiology scholars to study how women's and men's health and wellbeing are influenced by both individual and structural characteristics that were affected by Poland's systemic transition and profound socio-economic transformations, and how health outcomes further shape individuals' social position, attitudes and behaviours.
• POLPAN's longitudinal data offer researchers the opportunity to reach deeply into a panelist's life history and search for factors determining health many years later. This enables research on the causes and consequences of health outcomes and inequalities, including cohort-specific analysis.

Poland, it has been fielded six more times, most recently in 2018. A strong foundation of social science theories informs POLPAN's operational definitions of concepts inherent to research on social stratification. Methodologically, POLPAN meets quality standards proposed in the specialized literature. All POLPAN data and their documentation are available in English, free of charge.

POLPAN waves
POLPAN started in 1987-88 as a government-funded research project on social class and stratification in socialist Poland, called "Social Structure II" [1]. The study, housed at the Institute of Philosophy and Sociology, the Polish Academy of Sciences (IFiS PAN), targeted the core segment of the Polish labour force (adults aged 21 to 65) using a cross-sectional survey design. To ensure national coverage, the project team constructed the sample using a country-wide register of households prepared in 1985-86 by the Centre for Public Opinion Research (CBOS) [2]. The survey, currently known as POLPAN's Wave One (1988), was conducted on a nationally representative sample of 5817 residents of Poland using paper-and-pencil personal interviews (PAPI). Table 1 (source: [9, 10]) provides the details on the fieldwork period, sampling frames and the number of completed interviews for 1988 and the six subsequent POLPAN waves. Soon after Social Structure II was completed, Poland embarked on the road to systemic transformation from an authoritarian political regime to a market-oriented democratic mode of governance. In the early 1990s, the study's core team decided to conduct the survey again with the same respondents to observe the dynamic aspects of the social structure, thus laying the foundation for a panel study. However, the hyperinflation of 1989-90 and the economic recession dramatically affected the budgets of academic institutions and made survey implementation a challenging task. Thanks to international financial support [3], the survey was repeated in 1993: 2500 participants were randomly selected from the 1988 sample, and 2259 were successfully interviewed.
Since 1993, POLPAN has been fielded every five years (for details on fieldwork dates, sampling frames and number of interviews, see Table 1). The core of the sample consists of panel respondents. Starting from the third wave (1998), to ensure the balance of different cohorts and age groups in the study, POLPAN adds subsamples of individuals aged 21-25. In 2013 (Wave Six), sufficient funding allowed POLPAN to recontact participants who had previously dropped out of the study; hence, this wave had a larger number of panelists.

Health measures and main correlates
Table 2 maps the availability of health assessment measures in POLPAN. Questions on self-rated physical and mental health have been asked since 1998. All respondents are asked, using 4-point scales, "How do you evaluate the state of your physical health in comparison with the state of physical health of other (the majority) people your age?" and "How would you evaluate your psychological state, your mood?"
Starting with Wave Five (2008), POLPAN adds questions of the Nottingham Health Profile (NHP) [4][5][6]. Developed in the UK as a population survey tool, NHP is considered a valid and reliable indicator of individuals' health status [4], is easily administered, contains unambiguous response categories, and consists of two parts. Part one of the profile contains 38 questions, grouped into six areas of health and wellbeing referring to physical mobility (eight items), pain (eight items), sleep (five items), social relationships (five items), emotional life (nine items) and energy level (three items). The statements are presented in random order to avoid contamination of assessment. Part two of the profile contains information on seven areas of life affected by health: employment, work around the house, social life, personal relationships, sex life, hobbies and holidays [5]. For the analysis of NHP, responses are weighted and summed, and a total score for each area is calculated. The score stands for "perceived dysfunction in different domains" [5].
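The scoring step described above (weight the responses, sum them, report one score per area) can be sketched as follows. This is a minimal illustration, not POLPAN's or the NHP authors' scoring code: the function name, the domain labels and, in particular, the item weights are hypothetical placeholders, and the published NHP weights should be used in any real analysis.

```python
# Number of part-one NHP items per area, as listed in the text.
NHP_DOMAINS = {
    "physical_mobility": 8,
    "pain": 8,
    "sleep": 5,
    "social_relationships": 5,
    "emotional_life": 9,
    "energy_level": 3,
}


def nhp_domain_score(answers, weights):
    """Return the sum of the weights of the items a respondent affirmed.

    answers: list of booleans, True if the respondent agreed with the item.
    weights: list of item weights, in the same order as the answers.
    """
    if len(answers) != len(weights):
        raise ValueError("answers and weights must have the same length")
    return sum(w for a, w in zip(answers, weights) if a)


# Hypothetical weights for the three energy-level items (illustration only).
energy_weights = [40.0, 35.0, 25.0]
# A respondent who affirmed the first and third items.
energy_answers = [True, False, True]
energy_score = nhp_domain_score(energy_answers, energy_weights)
```

A higher score then indicates more perceived dysfunction in that area; a respondent who affirms no item scores zero.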
In 2008, the original 38-category test was administered to all respondents. Since then, POLPAN selectively uses parts one and two of the NHP, depending on interviewees' age (see Table 2, columns 6-7). This decision was informed by analyses of the 2008 data showing little variability in health problems among younger respondents.
From Wave Five (2008) onward, all POLPAN respondents are asked to provide information about their height (in centimetres) and weight (in kilograms). Researchers can thus calculate the body mass index and investigate its contemporary and life course predictors. Since 2003, POLPAN has collected information on health and wellbeing-related work interruptions of at least three months, including maternity leave and illness or poor health [7].
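As a small illustration of the body mass index calculation mentioned above, the sketch below applies the standard formula (weight in kilograms divided by the square of height in metres) to inputs in the units POLPAN records. The function and argument names are assumptions made for this example and are not POLPAN variable names.

```python
def body_mass_index(weight_kg, height_cm):
    """Return BMI in kg/m^2 from weight in kilograms and height in centimetres."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # POLPAN records height in centimetres
    return weight_kg / height_m ** 2


# Hypothetical respondent: 81 kg, 180 cm.
example_bmi = body_mass_index(81, 180)
```

Because height and weight are self-reported, it is worth screening for implausible values before computing the index.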
Respondents' disability status, an important health indicator, can be traced as early as POLPAN Wave One (1988). In all POLPAN waves, a disability pension can be reported among respondents' sources of income. In Waves Four (2003), Five (2008), Six (2013) and Seven (2018), disability is among the reasons for work interruptions (see above). In 2003 and 2018, respondents were also asked directly about their disability status, when it occurred, and whether they required special care.
POLPAN's Sixth Wave (2013) asked respondents if they suffered from chronic or protracted illnesses, such as asthma, allergy, circulatory system ailments, cancer, diabetes or depression (yes/no questions). Additionally, the two most recent waves (2013 and 2018) collect retrospective data on extended hospital stays. All respondents are asked whether, in the last five years, they were hospitalised for longer than seven days.
In POLPAN, health assessment indicators are part of the study's set of measures intended for analyses of social inequality across multiple dimensions. All seven POLPAN waves provide detailed socio-demographic information on respondents (e.g. gender, age, education), their family life (e.g. marital status, having children) and family history (e.g. father's education and social position). Work and life situation indicators, such as respondents' employment and occupational history, income data at both the individual and household levels, and information on the composition of the household, living conditions and possession of durable goods, are collected in each POLPAN wave and for all age groups. Information about time spent on house chores is available selectively. Respondents' social capital can be evaluated based on the questions related to the number of friends, organisational membership and religious affiliation. Thanks to POLPAN's longitudinal structure, researchers can "track" causes of respondents' good or bad health using

Sampling and weights
Wave-specific POLPAN samples can be considered representative of the Polish population aged 21 and older at the time of the survey. The upper limit of respondents' age depends on the wave and has increased progressively, from 65 in 1988 to 95 in 2018. Because young respondents are oversampled, cross-sectional analyses of data from Waves Five (2008) through Seven (2018) require weight variables that correct the shares of the various age groups. Post-stratification weights for 2008, 2013 and 2018 are available in the POLPAN dataset.
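To illustrate why the weights matter for cross-sectional estimates, the sketch below computes a weighted mean in plain Python. The values and weights are made up for the example and the function name is an assumption; in practice the weights would come from the post-stratification weight variable of the relevant wave.

```python
def weighted_mean(values, weights):
    """Return the mean of values, with each value counted in proportion to its weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up self-rated health scores on a 4-point scale. The first two
# respondents stand for oversampled young adults, so they receive weights
# below 1; the third respondent is weighted up.
scores = [1, 2, 4]
weights = [0.5, 0.5, 2.0]

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)
```

In this toy case the unweighted mean is pulled towards the oversampled young respondents, while the weighted mean restores the intended age-group shares.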

Data quality
Data quality is a major consideration in POLPAN. The study considers recommendations from the specialised literature on minimizing errors in representation and measurement [11][12][13], and maximizing the study's responsiveness to users' needs [14]. To strengthen representation, POLPAN relies on random sampling when appropriate (e.g. all renewal samples), and on non-response follow-ups (on average, four returns). Beyond the first two waves, respondent substitution is not permitted. Response rates (RR) for full panelists (i.e. participants in all waves) are consistently above 70%, but vary for intermittent panelists (i.e. participants in at least two but not all seven POLPAN waves) and the young. For example, in 2018 RR for the young was 54%, while in 2013 it was 69% [10].
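The participation categories defined above can be expressed as a simple rule over the set of waves in which a respondent took part. The sketch below is only an illustration: the function name is hypothetical, and the label for respondents observed in a single wave is an assumption, since the text defines only full and intermittent panelists.

```python
# Survey years of the seven POLPAN waves, as given in the text.
WAVES = (1988, 1993, 1998, 2003, 2008, 2013, 2018)


def panel_status(participated_waves):
    """Classify a respondent by the waves in which they were interviewed."""
    taken = set(participated_waves) & set(WAVES)
    if len(taken) == len(WAVES):
        return "full panelist"  # interviewed in all seven waves
    if len(taken) >= 2:
        return "intermittent panelist"  # at least two, but not all, waves
    # Label assumed for this sketch; not defined in the text.
    return "single-wave respondent"


status_all = panel_status(WAVES)
status_some = panel_status([1988, 1993, 2013])
status_one = panel_status([2018])
```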
Continuity in POLPAN's research team, and the strong collaboration with the data collection organisation, the Centre of Sociological Research (ORBS) at IFiS PAN, ensure that quality standards are implemented and monitored throughout the survey process. For example, interviewer training is conducted both in Warsaw and in the 22 network units around Poland responsible for data collection; questionnaires for all waves are pretested; a minimum of 10% of interviews conducted in each wave are back-checked [15].

Funding
POLPAN waves are funded separately. Funding for 2018 and 2013 came from the Polish National Science Centre. Past funding sources highlight the extent of international involvement and include not only Polish institutions, namely the State Committee for Scientific Research, the Ministry of Science and Higher Education, the Central Fund for Research and Development, and the Polish Academy of Sciences (PAN), but also the United States Information Agency, The Ohio State University (OSU), the US National Council for Eurasian and East European Research, the Norwegian Research Council and IREX. The project receives organisational support from IFiS PAN and CONSIRT (Cross-National Studies: Research and Training Program at OSU and PAN) [16].

All POLPAN questionnaires feature a common, stable core of questions asked of all participants. At the same time, there is some variation in questionnaire content, within and between survey waves. First, to account for differences in participation status among POLPAN respondents (i.e. panelists, intermittent panelists and new recruits), the questionnaires include questions that specifically target new recruits and respondents who skipped waves. This produces within-wave variation in questionnaire content. Second, in line with the study's theoretical and methodological needs, certain questions were added or dropped. This is the source of between-wave variation in questionnaires. Information about wave and questionnaire version is preserved in POLPAN variable names.

Ethics Statement
POLPAN data are collected, processed and stored in line with national and international regulations on data protection and privacy, e.g. the General Data Protection Regulation (GDPR, EUR-Lex Document 32016R0679). This includes securing all participants' informed consent. All publicly available information on POLPAN participants is anonymised.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.