Perceived values and climate change resilience dataset in Siaya County, Kenya

This dataset presents perceived values and socioeconomic indicators collected in 2022 in Siaya, a rural county in Kenya. The data was obtained from 300 household surveys and group interviews conducted across eleven villages in six sub-counties. Socioeconomic data were collected with a special focus on climate change vulnerability. The survey questions covered housing, health, water accessibility and usage, electricity accessibility and usage, extreme weather events, community service, and information accessibility. The user-perceived value (UPV) game, a perception-based surveying approach, was used to elicit local communities’ needs and perceptions of climate change challenges. The UPV game involves asking interviewees to select which graphically depicted items would be most necessary in different situations and probing them for the reasons behind their choices (why-probing). The data was collected in two languages (Dholuo and English) and then translated into English. These surveys and interviews were conducted to better understand the needs of rural Kenyan communities and their perceptions of climate change, with the aim of identifying ways to build resilience. Kenyan policymakers can use the dataset to inform county-level energy and development plans, while researchers and development practitioners can use it to design research and programmes that better reflect local needs and values.

• The role of energy services within community resilience can be examined in detail and used to enhance community resilience against climatic and social risks and vulnerabilities.
• Academics can use the cleaned dataset directly for analysis, or it can be used to guide the collection of further sub-national data which reflects the needs of rural communities.

Objective
Sustainable development initiatives in the 21st century must increase well-being under the constraints of climate change. However, the perception of well-being among citizens themselves remains generally under-studied in developing regions. To develop effective strategies for reducing poverty and improving living standards in the face of climate change, surveys and interviews are required to provide information on what people value and need.
This dataset gathers insights from rural communities in Siaya County, Kenya, regarding their values, needs, and perceptions of climate change challenges, with a focus on energy services. This data can be used to identify ways to build community resilience and to understand the role of energy in climate change adaptation measures. The data was collected through a perception-based surveying approach called the user-perceived value (UPV) game. In this game, interviewees select the items they value most from a contextualised set of graphically depicted items and are then asked the reasons behind their selection (i.e., why-probing). For more details on the UPV methodology, please refer to Hirmer [1]. A socioeconomic survey, which included questions on demographics, climate events, communication, household decision-making and farming practices, was conducted to contextualize the UPV data.
This dataset is expected to contribute to Kenya's ongoing county-level energy planning and provide indicators to better understand local resilience and community needs. The data can provide insights as to how energy services can be strategically planned to strengthen community resilience against existing vulnerabilities and climate change risks. The dataset may also be useful in understanding the role of particular items and appliances in enhancing community resilience and energy security. We hope that the data can be used to provide insights which can help to reduce energy insecurity, improve energy access decisions, and support rural communities' resilience in both daily life and climate emergencies. Such work can help to mitigate the effects of climate change and support the agricultural sector. This dataset can furthermore help address data gaps pertaining to the impacts of appliances, as outlined in the "Off- and Weak-Grid Appliances Impact Assessment Framework" by Rural Senses et al. [2].

Data Description
This dataset presents socioeconomic surveys and UPV interviews collected in Siaya County, Kenya. Siaya has undergone significant development over the past decade, particularly in the areas of health and education.
Based on data from the 2019 census [3], Siaya County's population is estimated at 993,183 individuals, with an annual growth rate of 1.7% between 2009 and 2019. The county comprises 250,698 households, with an average household size of 3.9. Females make up 52.5% of the total population, and the majority of residents (91.4%) live in rural areas. By conducting these surveys and interviews, we hope to support county-level planning in Siaya and in Kenya more broadly. We aim to identify gaps in existing services and infrastructure and provide insight to inform how resources can be allocated to improve overall community well-being. Based on the dataset, we performed a further analysis to identify households' intersectional needs [4].
The dataset is formed of two parts: (1) household characteristics, obtained from socioeconomic surveys; and (2) personal values, gathered through interviews with community members using the UPV game. Surveys and interviews were conducted in six sub-counties of Siaya (i.e., Ugenya, Ugunja, Alego Usonga, Gem, Bondo and Rarieda). They provide insights into the challenges faced by local residents, such as their living situation, infrastructure, household shocks, access to healthcare, education, employment, and subjective well-being.
The dataset is accompanied by a codebook that describes the values that were used in the annotation of the UPV game transcriptions, and by the questionnaires that were used for data collection. All identifying variables, including names, GPS coordinates, and village names, were removed, and the dataset is thereby anonymized.
The sample construction process for the dataset is documented in Table 1. A total of 300 household representatives participated in the surveys and interviews. The recruitment venues were specifically selected to engage participants from different social classes and with diverse economic needs. To better reflect the needs of marginalized groups, we aimed to include a 5% representation of people with disabilities within each sub-county sample. In the final sample, 28 participants (9%) reported disabilities. This strategy helps to ensure that the gathered information can support development efforts that are inclusive and considerate of the needs of all members of the population, particularly those who may be marginalized or disadvantaged.

Methodology
Trained and experienced annotators, primarily from Kenya, conducted sentence-level annotations. Each utterance was annotated with value labels following the UPV game framework. To ensure reliability, each sentence was annotated five times by different annotators. Only labels on which at least three of the five annotators agreed were retained. This methodology helps ensure consistency and quality in the annotations.
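The three-out-of-five agreement rule described under Methodology can be illustrated with a minimal sketch. The label names below are made up for illustration (the real value labels are defined in the codebook), and the assumption that one annotator may assign several labels to a sentence is ours, not something the description confirms.

```python
from collections import Counter


def agreed_labels(annotations, min_votes=3):
    """Return the labels chosen by at least `min_votes` annotators.

    `annotations` holds one collection of labels per annotator,
    all referring to the same sentence.
    """
    votes = Counter()
    for labels in annotations:
        # set() so that an annotator repeating a label still casts one vote
        votes.update(set(labels))
    return sorted(label for label, count in votes.items() if count >= min_votes)


# Hypothetical labels from five annotators for a single utterance
example = [
    ["health", "income"],
    ["health"],
    ["health", "safety"],
    ["income"],
    ["income", "health"],
]
print(agreed_labels(example))
```

In this toy case "safety" receives a single vote and is dropped, while labels with three or more votes are kept.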

Survey
The socio-economic survey data was first collected orally in the preferred language of the participants. The collected information was then translated into English. The questionnaire used for the survey is provided as Questionnaire_Siaya.pdf.
The results are presented in a CSV file (Siaya_UPV_Survey.csv) where the questions are represented as column headers and the responses of each individual speaker are recorded in the rows. Each row contains a unique interview ID which can be used to inter-relate the data to the other files in this dataset. Note that the CSV file has also been converted to JSON for ease of analysis (Siaya_UPV_Survey.json).
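As a rough sketch of how the interview ID can link the survey file to the other files, the snippet below indexes survey rows by ID and merges them into records from a second file. The column name `interview_id` and all values are hypothetical; the actual header should be taken from the codebook. Small in-memory strings stand in for the real files so the example runs on its own.

```python
import csv
import io

ID_COLUMN = "interview_id"  # assumption: replace with the real ID header


def load_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle))


def index_by_id(rows, id_column=ID_COLUMN):
    """Map each interview ID to its survey row (one row per respondent)."""
    return {row[id_column]: row for row in rows}


def attach_survey(records, survey_index, id_column=ID_COLUMN):
    """Merge survey answers into records from another file sharing the ID."""
    return [{**survey_index.get(r[id_column], {}), **r} for r in records]


# Made-up stand-ins for the survey file and one of the interview files
survey_csv = io.StringIO("interview_id,gender\nA1,Female\nA2,Male\n")
other_csv = io.StringIO("interview_id,text\nA1,We need water.\nA1,It saves time.\n")

survey_index = index_by_id(load_rows(survey_csv))
merged = attach_survey(load_rows(other_csv), survey_index)
print(merged[0])
```

With the real data, `load_rows` would be given a handle from `open(path, newline="", encoding="utf-8")` instead of the in-memory strings.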
The survey questions were designed to gather data on a wide range of factors that could affect the participants' perspectives. They fall under the following eight categories: demographic data, housing status, accessibility of healthcare services, water usage and availability, energy consumption, experience of extreme weather events, community service, and access to information.
Fig. 1 illustrates the demographic characteristics of the 300 participants in the study, obtained from the survey data. The sample consisted of 184 females and 116 males, with ages ranging from 16 to 92 years (Fig. 1a). A greater proportion of females reported no education or primary education only, as indicated by female-to-male ratios of 20 and 1.76, respectively, which exceeded the overall sample ratio of 1.59 (Fig. 1b). Conversely, the trend was inverted for secondary education (1.11), tertiary (0.43), and higher education (0.6), suggesting that males were more likely than females to have had access to higher levels of education. Of the participants, 21% held multiple occupations, as shown in Fig. 1c. The largest occupational group in the sample is farmers: 50% (150) of the participants primarily cultivate crops, 14% (43) are livestock farmers, and 35 individuals engage in both crop and animal farming (Fig. 1d).
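The female/male figures quoted above are ratios of counts; for instance, 184 females to 116 males gives the overall value of about 1.59. The sketch below reproduces that figure and shows how per-group ratios could be computed from survey rows. The keys `gender` and `education` and the toy rows are assumptions for illustration, not the dataset's actual headers or values.

```python
from collections import Counter


def female_male_ratio(females, males):
    """Ratio of female to male respondents; undefined when there are no males."""
    if males == 0:
        raise ValueError("ratio undefined for zero male respondents")
    return females / males


def ratios_by_group(rows, group_key, gender_key="gender"):
    """Female-to-male ratio within each value of `group_key`."""
    counts = Counter((row[group_key], row[gender_key]) for row in rows)
    groups = sorted({group for group, _ in counts})
    return {
        g: female_male_ratio(counts[(g, "Female")], counts[(g, "Male")])
        for g in groups
        if counts[(g, "Male")] > 0  # skip groups with no male respondents
    }


# Overall ratio from the counts reported in the text
print(round(female_male_ratio(184, 116), 2))

# Toy rows with hypothetical column names
toy_rows = [
    {"education": "primary", "gender": "Female"},
    {"education": "primary", "gender": "Female"},
    {"education": "primary", "gender": "Male"},
    {"education": "tertiary", "gender": "Female"},
    {"education": "tertiary", "gender": "Male"},
    {"education": "tertiary", "gender": "Male"},
]
print(ratios_by_group(toy_rows, "education"))
```

A ratio above the overall value means that women are over-represented in that group relative to the sample as a whole.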

Interview
The perceived values data, which were derived from the UPV game, are presented in three separate CSV files to facilitate efficient and effective analysis. Note that each CSV file has also been converted to a JSON file with the same name for ease of analysis.
• Siaya_UPV_Paragraphs.csv: During the UPV game, interviewees were asked to select ten items they valued and discuss the reasons behind each selection, which were recorded. In addition, thirteen open-ended questions on attitudes and perceptions regarding climate change were asked. Responses to each question were extracted as a separate paragraph. This file therefore contains each answer to any of these open-ended questions in a separate row (i.e., 3942 rows with one row per paragraph). Table 2 provides a description of the paragraph specifications (i.e., items in the table beginning with P). This dataset also contains interview IDs for interrelation with the survey questions.
• Siaya_UPV_Utterances.csv: This file contains each utterance (i.e., sentence) from the open-ended answers in a separate row (i.e., 16,650 rows with one row per utterance). Additional columns tag the sentiment and value annotations for each sentence, as described in Table 2. This dataset also contains the paragraph specifications and interview IDs for interrelation with the other files.
Natural language processing (NLP) methods were employed to extract and analyse the sentences from each paragraph, as discussed by [5].
Fig. 2 presents the distribution of utterances based on demographic characteristics. Single females with higher education who respond to questions in English exhibit a lower number of utterances in their answers. Divorced individuals tend to be more talkative when answering questions, whereas singles are generally less talkative in their responses. The impact of education on the number of utterances follows an inverted U pattern, with individuals who have received a moderate level of education exhibiting the highest level of talkativeness. Conversely, individuals with no education and those with higher education use fewer utterances in their responses.
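A comparison of this kind could be approximated by counting utterance rows per interview and averaging the counts within each demographic group. The following is only a sketch under assumed column names (`interview_id`, `marital_status`) and toy values; it is not the procedure used to produce Fig. 2.

```python
from collections import Counter, defaultdict
from statistics import mean


def mean_utterances_by_group(utterance_rows, survey_rows, group_key,
                             id_key="interview_id"):
    """Average number of utterances per interview within each group."""
    # One row per utterance, so counting IDs gives utterances per interview
    per_interview = Counter(row[id_key] for row in utterance_rows)
    grouped = defaultdict(list)
    for row in survey_rows:
        grouped[row[group_key]].append(per_interview.get(row[id_key], 0))
    return {group: mean(counts) for group, counts in grouped.items()}


# Toy data with hypothetical headers and values
utterances = [
    {"interview_id": "A1"}, {"interview_id": "A1"}, {"interview_id": "A1"},
    {"interview_id": "A2"},
    {"interview_id": "A3"}, {"interview_id": "A3"},
]
survey = [
    {"interview_id": "A1", "marital_status": "Divorced"},
    {"interview_id": "A2", "marital_status": "Single"},
    {"interview_id": "A3", "marital_status": "Single"},
]
result = mean_utterances_by_group(utterances, survey, "marital_status")
print(result)
```

Interviews with no utterance rows are counted as zero here; whether that is appropriate depends on how missing responses are recorded in the files.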

Table 2
Data specifications in the paragraph- and extract-level UPV datasets. The specifications starting with P are present in both the paragraph and extract file. The specifications starting with E are only present in the extract file.


Case selection
This project took Siaya County in Kenya as a case study. Kenya has made significant economic and political reforms in the past decade that have contributed to sustained economic growth, social development, and political stability. In spite of this, ongoing drought, as well as the increase in the cost of living, has adversely affected households throughout the country. The agricultural sector, which directly contributes 33% to the Gross Domestic Product (GDP) and indirectly contributes an additional 27% to the GDP through linkages with other sectors such as manufacturing and distribution [6], plays a critical role in the country's economy.
Over 40% of the country's population is employed in the agriculture sector, with a significant portion of that employment being in rural areas, where 70% of the population is employed in agriculture [7]. Siaya County, located in western Kenya, was selected as a case study due to its high levels of poverty (48% live below the poverty line) and food insecurity. This makes it an interesting location to study climate resilience, as these are risk factors in the case of climate shocks. The agricultural sector, which is particularly important to Siaya's economy and livelihoods, is susceptible to instability due to droughts, floods, etc.

Sampling strategy
The participant selection was guided by the National Commission for Science, Technology, and Innovation (NACOSTI), and was facilitated by the Siaya County Commissioner and local area chiefs. Over a two-week period, 300 individuals from across the county were interviewed. To ensure a diverse representation of the population with varying social and economic backgrounds, a targeted community sampling approach was employed to recruit participants. Participants were selected from four distinct settings: coastal/port towns, urban village towns, rural villages, and special interest villages. Eleven villages, three from each setting, were randomly selected out of the 75 villages in Siaya County [8]. Participants were sourced from diverse socio-economic backgrounds and age ranges, with the aim of achieving a balanced representation of gender, disability, income, age, education, and employment. The sample was intended to include 5% people with disabilities; the final sample included 9%. Because recruitment depended on availability and willingness to participate, some demographics were not fully represented.

Preparation
To ensure the study's integrity, a risk and ethics assessment for research involving human participants was approved by the NACOSTI ethics board under license No: NACOSTI/P/22/21,652. As per the guidance provided by [9], interviewees were compensated for their time. To protect the interviewees' identities, all names were removed from transcripts in line with NACOSTI ethical procedures.

User perceived value game
The User-Perceived Value (UPV) approach is a data collection method that utilizes a pictorial game-based approach. Participants are asked to select the items they value most from a set of 40 everyday items found in case study communities. Among these items, eight are considered to be energy appliances: a fan, fridge, TV, pump (for irrigation and water), motorbike, pressure cooker, solar PV, and grain mill. The UPV game involves probing the reasoning behind participants' item selections using three or more rounds of why-probing, as previously described by [1,9]. Specifically, the UPV game used in this dataset aims to identify:
1. The five items that communities perceive as important (General format).
2. The appliance that interviewees perceive as most useful and least useful (Appliance-specific format).
3. The three items that are most relevant to the interviewee given the climate event that they are most concerned about (Climate format).
The full list of survey and UPV questions is provided in Table 3. The UPV questions are numbers 4 to 7.

Post-processing
The dataset was originally collected in two languages (Dholuo and English) and translated into English by local translators. The data were then extracted into paragraphs and utterances.
Copyright © 2024 Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

The data collection process involved targeted community sampling to recruit participants who represent the different social and economic needs of residents in Siaya County. County coordinators recruited participants from venues such as restaurants, places of worship, markets and community groups within each community. Grassroots authorities, organizations, and societies were also approached to ensure the inclusion of marginalised groups, for example, the elderly, youth and people with disabilities. Consent forms were signed by all participants.

Data source location
Siaya (County of Kenya)

Data accessibility
Repository name: Zenodo
Data identification number: 10.5281/zenodo.10160737
Direct URL to data: https://zenodo.org/records/10160737

1. Value of the Data
• The data can be used to understand what is important to rural communities in Kenya and to identify potential factors influencing decision-making for different demographic groups (e.g. women, youth, elderly and low-income).

Table 1
Timeline and logistics of dataset construction.