Data for understanding trust in varied information sources, use of news media, and perception of misinformation regarding COVID-19 in Pakistan

This dataset, collected from 537 Pakistani millennials, describes their trust in different information sources, their use of news media, and their perception of misinformation regarding COVID-19 in Pakistan. The dataset includes variables such as age, marital status, gender, social class, residential area, trust in the source of information, use of news media for coronavirus information, and perception of misinformation regarding COVID-19 in Pakistan. We fielded a survey from April 24 to May 12, 2020, via Qualtrics to obtain a convenience sample of younger and older adults in Pakistan. During this time, the number of confirmed cases increased from 12,733 to 34,336. The surge took place despite a strict nationwide lockdown, during which the government persistently sought public support for its policies. These data may help scholars understand how people in Pakistan interacted with different information sources, and how this compares with other countries.


Specifications
Subject: Infectious diseases, media and communication, information science
Specific subject area: Social science theories will be applied to the infectious disease data to examine the use of news media, trust in the source of information, and perception of misinformation regarding COVID-19.
Type of data: Tables
How data were collected: Online survey
Data format: To obtain the descriptive statistics, the raw data were analyzed with R and RStudio. The authors have provided a raw version of the data file in Comma Separated Values (.CSV) format.
Parameters for data collection: No parameters were used for data collection. Anyone with an Internet connection could participate in the survey.
Description of data collection: An online survey was administered through Qualtrics (one of the leading online sites to collect data). We used social networking sites, personal connections, and emails to collect representative data from Pakistani nationals currently living in Pakistan. Anyone who had access to the Internet could take the survey.

Data description
COVID-19 has proven to be a crisis with far-reaching global implications. The World Health Organization (WHO) declared it a pandemic and an international emergency in March 2020. Since the beginning of the outbreak, scholars from various fields such as public health, information and communication technology, risk perception, and social psychology have been researching COVID-19 related issues. However, much of the focus has remained on Western countries where adequate research infrastructure already exists. Consequently, countries such as Pakistan have not been able to curate a dataset that can disentangle the complex human behavior regarding their interaction with COVID-19 related information [2]. Hence, this dataset, in addition to demographic variables, allows scholars to draw insights on trust in information sources, use of news media, and the perception of misinformation regarding COVID-19 in Pakistan.
Table 1 presents the socio-demographic variables. Among them, age and social status (0 = low, 10 = high) were continuous variables, whereas gender (1 = male, 2 = female), marital status (1 = in relationship, 2 = married, 3 = single), and urban/rural residence (1 = don't know, 2 = rural, 3 = urban) were categorical variables that were subsequently recoded. Table 2 shows the descriptive statistics for the first measured variable, for which respondents were asked to rate, on a scale from 1 (no trust) to 10 (complete trust), the trustworthiness of each listed source of information regarding COVID-19.
To understand how people used different sources, respondents were asked to indicate which of the listed media they had used in the last month to get news related to the coronavirus; the responses are summarized in Table 3. Table 4 presents the descriptive statistics on people's perceptions of misinformation that they may have encountered regarding COVID-19 from multiple sources. Note: The respondents could skip questions they did not want to answer. Therefore, the number of observations varies for each question.
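Because respondents could skip items, each descriptive statistic has to be computed over the answers actually given for that item. The following Python sketch illustrates one way to do this with the standard library only. It is not the authors' analysis, which was carried out in R and RStudio. The column names (`gender`, `trust_scientists`, `trust_politicians`) and the four sample rows are hypothetical stand-ins for the published CSV file; only the value labels are taken from the description above.

```python
import csv
import io
import math
import statistics

# Value labels as given in the data description.
# The column names used below are hypothetical, not the real CSV headers.
GENDER = {"1": "male", "2": "female"}
MARITAL = {"1": "in relationship", "2": "married", "3": "single"}
AREA = {"1": "don't know", "2": "rural", "3": "urban"}


def describe(rows, column):
    """Return (n, mean, sd) for one item, ignoring skipped (blank) answers."""
    values = [
        float(row[column])
        for row in rows
        if (row.get(column) or "").strip() != ""
    ]
    n = len(values)
    if n == 0:
        return 0, math.nan, math.nan
    mean = statistics.mean(values)
    # The sample standard deviation needs at least two observations.
    sd = statistics.stdev(values) if n > 1 else math.nan
    return n, mean, sd


# Made-up rows standing in for the raw data file; blanks are skipped items.
SAMPLE = """gender,trust_scientists,trust_politicians
1,8,3
2,,5
2,6,
1,10,4
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Decode the numeric gender codes into their labels.
labels = [GENDER.get(row["gender"], "missing") for row in rows]

# Each item gets its own n because skipped answers are dropped per item.
n_sci, mean_sci, sd_sci = describe(rows, "trust_scientists")
n_pol, mean_pol, sd_pol = describe(rows, "trust_politicians")

print(labels)
print("scientists:", n_sci, round(mean_sci, 2), round(sd_sci, 2))
print("politicians:", n_pol, round(mean_pol, 2), round(sd_pol, 2))
```

Dropping blanks per item rather than per respondent keeps every available answer, which is why the number of observations differs from question to question in the tables.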

Table 5
Codebook and related questions.

Gender
What is your gender?

Age
How old are you?

Marital status
What is your current marital status?

Employment
How would you describe your current employment status?

Social class
Please think of a ladder at top of which stands best off people and at bottom stands worst off people in Pakistan. Where do you see yourself on that ladder?

Area of living
Which of the following best describes the area you live in?

Media use
From 0 to 10, which, if any, of the following (online websites, TV, newspaper and so forth) have you used in the last month as a source to get coronavirus related news?

Trust in sources
From 0 to 10, how trustworthy would you say news and information about coronavirus from the following sources (scientists, politicians, TV, and so forth).

Misinformation
From 0 to 10, how much false or misleading information about coronavirus, if any, do you think you have seen or heard from each of the following sources (scientists, TV, Facebook and so forth) within the last month?

Data collection
Data collection took place between April 24 and May 12, 2020, via an online survey. In total, 537 Pakistani nationals aged 16-68 participated in the survey. The survey took 5-7 minutes on average to complete. Because our aim was to collect data that were representative across age and gender only, we tried to distribute the survey across all regions of Pakistan. We shared the questionnaire through online channels such as WhatsApp, Facebook, and email. We used Qualtrics software provided by the University of Kansas, Lawrence, United States, to organize and distribute our survey.
Although the data collection aimed to be representative across age and gender, most of the respondents were between 18 and 39 years old because of a large group of younger participants [3]. This attribute increases the significance of the data, because past literature shows that young adults (24-39 years of age) are more likely than the older population to engage with multiple media platforms and to use online media for consuming information, which is a relevant measure in this dataset [4][5][6]. Furthermore, males and females are equally represented in the dataset. We used R and RStudio to perform data analysis.

Ethics statement
Before data collection, the instrument was reviewed and approved by the Institutional Review Board of the first author's university. The authors received informed consent from participants. Participation was voluntary, and participants could withdraw from the survey at any point. As an ethical research team, we value the privacy rights of human subjects. Therefore, the data we submitted do not identify participants based on their responses. The survey did not collect any identifiable information from the participants.

Credit author statement
Waqas Ejaz conceptualized and created the survey. Muhammad Ittefaq added the survey to Qualtrics and distributed it among participants. Both authors were involved in the data collection process. Muhammad Ittefaq wrote the paper. Waqas Ejaz reviewed and edited the final version.

Declaration of Competing Interest
The research team did not receive financial support from any institutions. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.