Survey dataset on citizens’ perspective regarding government's use of social media for provision of quality information and citizens’ online political participation in Pakistan

This article describes a survey-based dataset collected to assess citizens’ perspective regarding the government's provision of quality information on social media and related factors, including transparency, trust, responsiveness of the agency, and citizens’ online political participation. The survey was conducted online through Google Forms, and 388 responses were collected from the social media followers of a government agency in Pakistan. The questionnaire consisted of thirty-three pre-validated items derived from the existing literature. The data was collected using a five-point Likert scale and relates to the extensive model tested in [1], where the analysis was performed using AMOS 20. This article specifically highlights data relating demographic variables (gender, age, education level, employment status, and city) to citizens’ online political participation, analysed using SPSS. The data is valuable as it provides empirical evidence regarding the state of the social media presence of a government agency in Pakistan and can be used by researchers to make inter-agency or cross-national comparisons. The dataset is available from the Mendeley Data repository (https://data.mendeley.com/datasets/3mhk2jv94m/2).

Subject: Social Sciences (General)
Specific subject area: E-governance: Social media in government
Type of data: Raw data in spreadsheet format (.xlsx); Tables
How data were acquired: The data was acquired by means of an e-survey on Google Forms, using both English and Urdu versions of the questionnaire. All questions in the e-survey were marked as mandatory to avoid missing values. The link to the English version of the survey is as follows: https://forms.gle/Zx7stguXSg1DYU9a9
Data format: Raw; Analysed
Parameters for data collection: The data for testing the relationship among constructs was collected using a five-point Likert scale. For all constructs, the scale followed the pattern 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree, 5 = strongly agree, except for one construct (citizens' online political participation), for which the scale followed the pattern 1 = never, 2 = rarely, 3 = sometimes, 4 = often, 5 = always. The questionnaire also included questions on the demographics of the participants, including gender, age, education level, employment status, and city.
Description of data collection: The Facebook and Twitter followers of a government agency in Pakistan (Punjab Food Authority) were contacted at random with the assistance of the social media team of the agency.
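The two response scales described above can be written as lookup tables for labelling the numeric codes in the spreadsheet. This is only a minimal sketch: the function name `label_for` is hypothetical, and it assumes that the participation items are the ones whose code starts with "PP" (PP1 to PP12, as listed in the Data Description section).

```python
# Numeric codes and labels as stated in the parameters for data collection.
AGREEMENT_SCALE = {
    1: "strongly disagree", 2: "disagree", 3: "neutral",
    4: "agree", 5: "strongly agree",
}
FREQUENCY_SCALE = {
    1: "never", 2: "rarely", 3: "sometimes", 4: "often", 5: "always",
}


def label_for(item_code, value):
    """Return the response label for a numeric answer to one item.

    Assumption: only items coded "PP" followed by a number use the
    frequency scale; every other item uses the agreement scale.
    """
    is_participation = item_code.startswith("PP") and item_code[2:].isdigit()
    scale = FREQUENCY_SCALE if is_participation else AGREEMENT_SCALE
    return scale[value]
```

For example, `label_for("PP3", 4)` gives "often", whereas `label_for("PR3", 4)` gives "agree".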

Value of the Data
• This dataset is useful as it provides citizens' views regarding the quality information provided by government on social media and its influence on factors like transparency, trust, responsiveness and citizens' online political participation in a developing-country context, where e-government is at an initial stage and research in this area is therefore scarce.
• Government institutions in developing countries can benefit from this data by examining the impact of the provision of quality information on important governance indicators such as transparency, responsiveness and citizens' trust, as well as online political participation, and can thus base their communication policies and decisions on this empirical evidence.
• Researchers can use this dataset to compare social media usage across different government institutions in developing countries, or to compare government institutions in developing and developed countries.
• This dataset represents primary data collected from a public agency on multiple relevant constructs measured through existing adapted tools, which future researchers can use for comparison with the results of other studies using similar tools.
• Future studies based on this data might help to better understand how governments' utilization of social media (especially in developing countries) affects important constructs including transparency, responsiveness, trust, and online political participation.

Data Description
The survey data was collected using a self-completion online questionnaire consisting of thirty-three items measured on a five-point Likert scale. The Likert scale ranged from 1 = strongly disagree to 5 = strongly agree for all constructs except online political participation, for which it ranged from 1 = never to 5 = always. In addition, five questions were asked about the demographic characteristics of the respondents: gender (male, female), age (18-21, 22-25, 26-29, 30-33, 34-37, or above 37), education level (matriculation/O-Levels, intermediate/A-Levels, graduation, post-graduation, or PhD), employment status (student, unemployed, employed/self-employed, or retired) and city. IBM SPSS version 20 was used to compile and analyse the data.
In the Excel data file, the seven items coded as GPQI (GPQI1 to GPQI7) belong to the construct government's provision of quality information on social media, the four items coded as PGT (PGT1 to PGT4) belong to perceived government transparency, the four items coded as TGA (TGA1 to TGA4) belong to trust in government agency, the six items coded as PR (PR1 to PR6) belong to perceived government responsiveness, and the twelve items coded as PP (PP1 to PP12) belong to citizens' online political participation. PP was the dependent variable of the study.
Table 1 shows the descriptive statistics of PP across the two categories of gender. Table 2 shows the difference in the mean of PP across the two categories of gender. A t-test was used because gender had only two categories. Levene's test indicated homogeneity of variance (p > 0.05), so the assumption for running a t-test was fulfilled. Table 2 shows that there was no significant difference between male and female participants in terms of PP.
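The item coding described above can be sketched as a small mapping from code prefix to item count, which also confirms that the five constructs add up to thirty-three items. The scoring rule is an assumption: the article does not state how the composite PP score was formed, so the sketch simply averages the items of each construct. The function names and the example respondent are hypothetical.

```python
from statistics import mean

# Code prefixes and item counts as listed for the Excel data file.
CONSTRUCT_ITEMS = {
    "GPQI": 7,  # government's provision of quality information on social media
    "PGT": 4,   # perceived government transparency
    "TGA": 4,   # trust in government agency
    "PR": 6,    # perceived government responsiveness
    "PP": 12,   # citizens' online political participation (dependent variable)
}


def item_columns(prefix, count):
    """Return the column names of one construct, e.g. PP1 ... PP12."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]


def construct_scores(row):
    """Average the 1-5 item codes of each construct for one respondent.

    Assumption: a construct score is the mean of its items.
    """
    return {
        prefix: mean(row[col] for col in item_columns(prefix, count))
        for prefix, count in CONSTRUCT_ITEMS.items()
    }


# Made-up respondent who answered 3 everywhere except two PP items.
example_row = {
    col: 3
    for prefix, count in CONSTRUCT_ITEMS.items()
    for col in item_columns(prefix, count)
}
example_row.update({"PP1": 5, "PP2": 4})
scores = construct_scores(example_row)
total_items = sum(CONSTRUCT_ITEMS.values())
```

With real data, each row of the spreadsheet would take the place of `example_row`.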
Table 3 shows the descriptive statistics of PP across the six categories of age. Table 4 shows the mean difference between the six categories of age in terms of PP. Levene's test revealed the absence of homogeneity of variance (p < 0.05), which means that the assumption for running a one-way analysis of variance (ANOVA) was violated. Therefore, instead of one-way ANOVA, the more robust Welch and Brown-Forsythe tests of equality of means were performed [2], which showed a statistically significant difference between age groups in terms of PP (p < 0.05). To identify which groups differed, the Games-Howell post-hoc test for multiple comparisons was performed, as shown in Table 5. It revealed that none of the pairwise differences between age groups in terms of PP was statistically significant.
Table 6 shows the descriptive statistics of PP across the five education levels. Table 7 shows the mean difference between the five levels of education in terms of PP. Levene's test showed homogeneity of variance (p > 0.05), so the assumption for performing one-way ANOVA was fulfilled. The results of one-way ANOVA revealed no statistically significant difference between groups of different education levels in terms of PP (p > 0.05).
Table 8 shows the descriptive statistics of PP across the four categories of employment status. Table 9 shows the mean difference between the four categories of employment status in terms of PP. Levene's test revealed that the assumption of homogeneity of variances was fulfilled (p > 0.05); therefore, one-way ANOVA was performed. The results of one-way ANOVA show no statistically significant difference between the categories of employment status in terms of PP (p > 0.05).
Table 10 shows the descriptive statistics of PP across the two categories of city (i.e. major cities and other, smaller cities of Punjab). Table 11 shows the difference in the means of PP across the two categories of city. Levene's test showed that the assumption of homogeneity of variance was fulfilled (p > 0.05); therefore, a t-test was performed. The result of the t-test reveals a significant difference between the means of the two categories of city in terms of PP (p < 0.05). The mean of PP for participants from the major cities of Punjab was slightly higher than that for participants from the other cities of Punjab.
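The test-selection logic used in this section (Levene's test first, then a t-test or one-way ANOVA when variances are homogeneous, or Welch's test when they are not) can be illustrated with the underlying statistics. The sketch below uses only the Python standard library and is not the SPSS procedure itself. It assumes the mean-centred form of Levene's statistic, returns test statistics and degrees of freedom without p-values, and does not cover the Brown-Forsythe or Games-Howell procedures. The group scores are made up.

```python
from math import sqrt
from statistics import mean, variance


def levene_w(groups):
    """Levene's W statistic, using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    devs = []
    for g in groups:
        m = mean(g)
        devs.append([abs(x - m) for x in g])
    dev_means = [mean(d) for d in devs]
    grand = sum(sum(d) for d in devs) / n_total
    between = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, dev_means))
    within = sum((z - m) ** 2 for d, m in zip(devs, dev_means) for z in d)
    return (n_total - k) / (k - 1) * between / within


def pooled_t(a, b):
    """Student's t statistic and degrees of freedom (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t_stat = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, n1 + n2 - 2


def welch_anova(groups):
    """Welch's F statistic with its two degrees of freedom (unequal variances)."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n / s^2 so that noisier groups count for less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    grand = sum(w * m for w, m in zip(weights, means)) / total_w
    between = sum(w * (m - grand) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    return f_stat, k - 1, (k ** 2 - 1) / (3 * lam)


# Made-up PP scores for three hypothetical groups (not taken from the dataset).
groups = [[2.1, 2.4, 2.0, 2.6], [2.9, 3.8, 1.7, 3.3], [3.0, 3.1, 2.8, 3.2]]
w_stat = levene_w(groups)
f_stat, df1, df2 = welch_anova(groups)
t_stat, t_df = pooled_t(groups[0], groups[1])
```

Turning these statistics into p-values requires the F and t distributions, which the standard library does not provide. Library routines such as `scipy.stats.levene`, `scipy.stats.ttest_ind` and `scipy.stats.f_oneway` return p-values directly.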

Experimental design, materials and methods
A quantitative survey method was used to collect the data, as it allows data to be collected in a time- and cost-efficient manner and reduces the chance of bias and intervention by the researcher. A single government agency, the Punjab Food Authority, was chosen because it was found to be among the agencies actively using social media platforms. The Facebook and Twitter followers of the agency constituted the population frame: 318,454 followers as of April 09, 2019, of which 317,812 were Facebook followers and 642 were Twitter followers. This shows that Facebook was the agency's predominant social media platform. Yamane's formula [3] was used to determine an appropriate sample size with sufficient statistical power to make inferences; according to this formula, with a margin of error of 5%, a sufficient sample size would be 384. Over a four-month period from April to August 2019, the social media followers of the agency were contacted at random with the assistance of the agency's social media team, as the researchers did not have direct access to the list of followers, and 388 responses were collected.
The link to the online survey, which was hosted on Google Forms, was sent to the social media followers of the agency as a private message on Facebook and Twitter. In the initial phase of data collection, the agency's social media team reported that some respondents were unable to understand the survey in English. At the team's suggestion, the researchers therefore translated the survey into Urdu to improve the response rate. A separate Urdu-language survey was created on Google Forms (https://forms.gle/58o8qk5oGUr3ozgN7) to collect data from people who, upon contact, indicated that they wanted to complete the survey in Urdu instead of English.
The data was automatically stored in separate spreadsheets for the English and Urdu questionnaires, which were later compiled and transferred to SPSS 20 for analysis. All the items in the online survey were marked as mandatory to eliminate the chance of missing values.
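The sample-size arithmetic described in this section can be checked with a few lines of code. This is a sketch under stated assumptions: the article does not spell out the formula, so the commonly cited form of Yamane's formula, n = N / (1 + N e^2), is assumed here. With the reported population and a 5% margin of error, that form gives roughly 400, whereas the reported figure of 384 coincides with Cochran's large-population formula at z = 1.96 and p = 0.5. Both are computed so that readers can see which variant matches; the function names are hypothetical.

```python
from math import ceil


def yamane_sample_size(population, margin_of_error):
    """Yamane's formula in its commonly cited form: n = N / (1 + N * e^2)."""
    return population / (1 + population * margin_of_error ** 2)


def cochran_sample_size(margin_of_error, z=1.96, p=0.5):
    """Cochran's large-population formula: n0 = z^2 * p * (1 - p) / e^2."""
    return z ** 2 * p * (1 - p) / margin_of_error ** 2


# Follower counts reported for April 09, 2019.
facebook_followers = 317_812
twitter_followers = 642
population = facebook_followers + twitter_followers

n_yamane = yamane_sample_size(population, 0.05)
n_cochran = cochran_sample_size(0.05)
print(ceil(n_yamane), round(n_cochran))
```

The z and p values in `cochran_sample_size` are conventional defaults for a 95% confidence level and maximum variability; they are not taken from the article.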