Digital competency mapping dataset of pre-service teachers in Indonesia

This dataset used the Digital Competency Scale (DCS) to describe Indonesian pre-service teachers' perceptions of their digital competency. The DCS instrument consists of five constructs/dimensions: 1) data and information literacy, 2) communication and collaboration, 3) digital content creation, 4) safety, and 5) problem-solving, with a total of 36 items rated on a five-point Likert agreement scale. The data was gathered from 23 education and teacher training faculties at Muhammadiyah universities in 14 provinces across Indonesia in the academic year 2021/2022. A total of 1400 students (18 to 23 years old) in their first to fifth years of study were recruited through convenience sampling and completed the survey electronically using Google Forms. The dataset was analysed with the Rasch measurement model using WINSTEPS version 5.2.3 for data cleaning and validation, and for reliability and validity testing of the instrument. Analysis of this dataset can help teacher-training institutions and higher education policymakers design effective programmes to improve pre-service teachers' digital competencies. Furthermore, researchers can compare this dataset with more rigorous data from other countries.


Table
Subject: Social science
Specific subject area: Digital competency, technology in education, teacher's competency
Type of data

Value of the Data
• This dataset is valuable for policymakers in establishing teacher education programmes that help pre-service teachers understand their motives and efforts to enhance their knowledge, abilities, and experience in digital competency.
• Because the Covid-19 pandemic shifted learning processes towards digital technology integration in Indonesia, the data responds to the demand for digital competence among teachers and prospective teachers.
• The data is helpful for researchers who want to compare the results of this study with similar research on digital literacy or digital citizenship frameworks, or who want to apply other statistical analyses when seeking solutions to obstacles in digital technology integration in higher education.
• This dataset on pre-service teachers in Indonesia can provide valuable information for mapping and predicting the progress of information technology-based education in the country with the largest population in Southeast Asia.
• For teaching purposes, lecturers can use this dataset to train their students in data pre-processing with Rasch model measurement.

Objective
Since the Covid-19 pandemic disrupted the way education works, university systems have used Information and Communication Technology (ICT) in many aspects of teaching and learning. Many countries have therefore been developing national and international policies to improve and support digital competencies [1]. In teacher training for higher education, the need for digitally literate teachers has increased significantly because graduates must be able to enhance their students' digital competencies and use digital technologies effectively in the learning-teaching process [2]. This dataset is useful for assessing digital competency in several aspects: data and information literacy, communication and collaboration, digital content creation, safety, and problem-solving. According to the literature, research that maps these variables with respondents from various backgrounds at the national level remains limited. Higher education policymakers also need concise information to design courses or training programmes that meet the demand for digitally competent teachers. Previous studies emphasised the implementation and importance of this competency in several developing countries [3,4]. However, only a few published studies investigate how to map pre-service teachers' digital skill potential in a vast country. Furthermore, the data can inform the formulation of action plans, decision-making considerations, or interventions that best support teacher training and education programmes.

Data Description
This paper presents a dataset that describes the competency map of Indonesian pre-service teachers from the first to fifth years of study (aged 18 to 23) using the Digital Competency Scale (DCS), which follows the DigComp framework [5], with demographic information added. The data is divided into two groups: (1) demographic information, including gender, region, year of study, and department; and (2) the determinant competencies, which consist of the dimensions of data and information literacy, communication and collaboration, digital content creation, safety, and problem-solving. A total of 36 items on a 5-point Likert-type scale were used to measure the knowledgeability of respondents as a form of digital competency mapping. Originally, 1400 pre-service teachers from 14 provinces on six large islands in Indonesia participated in the study electronically using Google Forms.
The Rasch model is the analysis model used in this study. It has been widely employed in education, commerce, psychology, healthcare, and other disciplines within the social sciences. The model is suitable for measuring latent traits when assessing human opinions, perceptions, and attitudes [6]. The Rasch analysis provides several statistics: descriptive analysis, chi-square (χ²), unidimensionality of the rating scale, person and item reliability, and the Cronbach Alpha index. The model also locates item difficulty levels through precise measurement, commonly referred to as item calibration [7]. A conjoint-measurement technique calibrates a measurement model that relates an individual's ability to the difficulty of an item, using the logit (log-odds unit) as a standardised unit of measurement, as outlined by Linacre [8]. The first stage of analysis was data cleaning and validation using WINSTEPS, a Rasch measurement model software, to detect outliers (responses with extreme maximum or minimum values) and misfits (responses with an Outfit MNSQ index larger than 2.0) [5]. In the end, 1264 responses were analysed further, a number well beyond the minimum sample size required for stable estimates. Table 1 (1400 respondents) and Table 2 (1264 respondents) summarise the respondent statistics before and after data cleaning.
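To make the logit scale more concrete, the following is a minimal sketch of category probabilities under the Andrich rating scale model, the Rasch variant for Likert-type items. The function name, the threshold values, and the example person and item measures are illustrative assumptions; they are not estimates produced by WINSTEPS for this dataset.

```python
import math


def rsm_category_probabilities(theta, delta, thresholds):
    """Return the probability of each rating category under the rating scale model.

    theta      -- person measure in logits
    delta      -- item difficulty in logits
    thresholds -- step parameters shared by all items (hypothetical values here)
    """
    # Category 0 has an empty sum (0.0); each further category adds
    # (theta - delta - tau_j) to the running total.
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(value - peak) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical thresholds for a five-category scale (four steps).
example_thresholds = [-1.5, -0.5, 0.5, 1.5]

# A person located exactly at the item difficulty: symmetric probabilities.
balanced = rsm_category_probabilities(0.0, 0.0, example_thresholds)

# A person 1.95 logits above an item of average difficulty (0.00 logit).
able = rsm_category_probabilities(1.95, 0.0, example_thresholds)

print([round(p, 3) for p in balanced])
print([round(p, 3) for p in able])
```

In this sketch, a larger gap between the person measure and the item difficulty shifts probability towards the higher agreement categories, which is the sense in which persons and items share one logit scale.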
As shown, the person logit mean went from 1.61 logit (standard deviation, SD 1.44) in Table 1 to 1.95 logit (SD 1.29) after data cleaning in Table 2, an increase of 0.34 logit. This shows that respondents tended to perceive themselves as having a high level of digital competency, with a standard deviation higher than 1.0 indicating a very wide dispersion among the respondents. In addition, the data in Table 2 showed better reliability: the person separation index became 4.13, and person reliability rose to 0.94. Furthermore, Table 2 indicates that the data fits the model, since the mean values of Infit and Outfit MNSQ are close to the ideal 1.0; this could not be established from Table 1 because of the outliers.
Table 3 presents the characteristics of the respondents based on the demographic information after data cleaning. The sample was dominated by female students (82.5%). The largest portion of participants were in their second and third years of study (736 students, 58.2%), with the largest share majoring in social studies (42.2%), followed by language (29.8%).
Table 4 summarises the item and person statistics for the 36 Digital Competency Scale (DCS) items and 1264 respondents under the Rasch rating scale model (RSM). The mean measure of the items is 0.00 logit, and the standard deviation is considered good (0.56), suggesting that the item difficulty measures were widely dispersed across the logit scale. For persons, the logit mean was 1.95 logit, showing that respondents tended to perceive themselves as having a high level of digital competency, with a standard deviation of 1.29 indicating a very wide dispersion among the respondents.
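The person separation index and person reliability reported for Table 2 are linked by a standard Rasch identity, reliability = G² / (1 + G²), where G is the separation index. The short sketch below checks that the two reported values are consistent with each other; the function names are illustrative and not part of WINSTEPS.

```python
import math


def reliability_from_separation(separation):
    """Convert a separation index G into a reliability coefficient."""
    return separation ** 2 / (1 + separation ** 2)


def separation_from_reliability(reliability):
    """Inverse conversion; only defined for reliability values below 1."""
    return math.sqrt(reliability / (1 - reliability))


# Person separation reported after data cleaning (Table 2).
person_reliability = reliability_from_separation(4.13)
print(round(person_reliability, 2))

# Change in the person logit mean between Table 1 and Table 2.
mean_shift = round(1.95 - 1.61, 2)
print(mean_shift)
```

A separation index above 3 corresponds to a reliability above 0.9 under this identity, which matches the thresholds quoted for the item and person statistics.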
The standard errors were small, indicating that the measurement of the digital competency variable is precise. The raw variance explained by the measures is higher than 40%, and the chi-square statistic was significant, showing a uniform fit to the model. The separation index (more than three) [6] and reliability (more than 0.9) [7] of the item and person statistics suggest very good reliability.
Table 5 presents the participants' responses for each DCS construct, which indicate their knowledgeability of DigComp. The construct digital content creation had the highest logit (0.505 logit), showing that students perceived themselves as good at it, followed by problem-solving and by communication and collaboration, in which they can be considered competent but not highly so. The last two constructs, safety (-0.321 logit) and data and information literacy (-0.352 logit), had negative mean logit values, indicating that most respondents can be considered less competent in these two areas.
Table 6 gives information about item difficulty by the mean value of each area, which presents the difficulty level of a specific competency area. The items in the DCS instrument were classified into four difficulty levels by dividing the distribution of the item logit scores based on the mean (0.00) and standard deviation (0.56) shown in Table 4. A higher logit value of an item (LVI) indicates that the item is more difficult for respondents to agree with. In total, 7 items (19%) fell into the very difficult category (LVI > 0.56 logit). The second category, difficult (0.56 > LVI > 0.00), contained 13 items (36%), the largest number of items. The next category, easy (0.00 > LVI > -0.56), contained 12 items (33%), which leaves 4 items (11%) below -0.56 logit. Graphically, this is also shown in Fig. 1.
Table 6 (excerpt of items)
C5: I am used to asking permission from the copyright owner before copying or distributing content.
C6: I make use of and create digital content, or am at least able to practise programming in solving problems.
Table 7 groups participants' responses based on their logit scores. The person logits were classified into four levels of knowledgeability by dividing the distribution based on the person logit mean (1.95) and its standard deviation (1.29), as shown in Table 4. The distribution of the pre-service teachers' knowledgeability of digital competency by demographic background is described by the logit value of the person (LVP) in Table 7 and Fig. 2.
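The four-level grouping of items (Table 6) and persons (Table 7) by mean and standard deviation can be sketched as follows. This is an illustration under stated assumptions: the cut points are taken as mean + SD, mean, and mean - SD; the label "very easy" for the lowest item level and all four person labels are hypothetical names; values falling exactly on a cut point are placed in the lower level; and the example measures are invented rather than taken from the dataset.

```python
def classify_by_mean_sd(measure, mean, sd, labels):
    """Assign one of four labels, ordered from the highest to the lowest level."""
    if measure > mean + sd:
        return labels[0]
    if measure > mean:
        return labels[1]
    if measure > mean - sd:
        return labels[2]
    return labels[3]


# Item levels: mean 0.00 and SD 0.56 as reported in Table 4.
# "very easy" is an assumed name for the fourth level.
ITEM_LABELS = ("very difficult", "difficult", "easy", "very easy")

# Person levels: mean 1.95 and SD 1.29 as reported in Table 4.
# All four names below are hypothetical placeholders.
PERSON_LABELS = ("very high", "high", "moderate", "low")

# Invented item measures (logits) used only to exercise the function.
example_items = {"item_a": 0.60, "item_b": 0.10, "item_c": -0.30, "item_d": -0.70}
item_levels = {
    name: classify_by_mean_sd(value, 0.00, 0.56, ITEM_LABELS)
    for name, value in example_items.items()
}

# Invented person measures (logits).
example_persons = [3.50, 2.00, 1.00, 0.20]
person_levels = [
    classify_by_mean_sd(value, 1.95, 1.29, PERSON_LABELS)
    for value in example_persons
]

print(item_levels)
print(person_levels)
```

Using the sample mean and standard deviation as cut points keeps the grouping relative to the observed distribution, so the same function serves both items and persons with different parameters.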

Experimental Design, Materials and Methods
This section describes the cross-sectional quantitative survey method employed for the study. The sample comprised 1400 undergraduate pre-service teachers from various fields of study in teacher training faculties at 23 private universities/higher education institutions (HEIs) on six large islands in Indonesia (Sumatera, Java, Kalimantan, Nusa Tenggara, Sulawesi, and Papua) during the academic year 2021-2022. Regarding ethical considerations, the students' consent to participate in this study was sought before they filled in the questionnaire.
Data was collected through an online questionnaire using the Digital Competency Scale (DCS), developed from the Digital Competency (DigComp) framework for citizens and the digital literacy framework put forth by the Indonesian Minister of Information and Communications [9]. The DCS contained four basic demographic questions (gender, year of study, field of study, and region) and 36 items in five constructs/dimensions that addressed various aspects of pre-service teachers' digital competencies.
All collected data was entered into a Microsoft Excel file and checked with WINSTEPS version 5.2.3, a Rasch measurement model software, for data validation and cleaning. First, 33 respondents were found to have outlier responses (maximum or minimum ratings). Next, data cleaning was performed to identify inconsistency in the respondents' answers, which revealed 103 aberrant responses. These aberrant responses and outliers were removed from the data, so the final number of respondents was 1264. The Rasch measurement model, as implemented in WINSTEPS, mathematically transforms raw ordinal (Likert-type) data by calibrating item difficulties and person abilities. The transformation converts response frequencies into probabilities and then, through the logarithm function, into logits (log-odds units), which also allows the overall fit of the instrument and person fit to be assessed [10]. A measurement model was then calibrated by conjoint measurement to determine the relationship between item difficulty and person ability on the same logit scale [8].
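The cleaning step described above can be sketched as a simple filter. The sketch assumes that an outlier is a respondent who gave the maximum or the minimum rating on every item, and that each record already carries a person Outfit MNSQ value exported from the Rasch software; the record layout, field names, and example records are hypothetical, and the code does not reproduce how WINSTEPS computes fit statistics.

```python
def split_extreme_and_misfit(records, min_rating=1, max_rating=5, outfit_cutoff=2.0):
    """Split records into kept, extreme (outlier), and misfitting respondents."""
    kept, extremes, misfits = [], [], []
    for record in records:
        ratings = record["ratings"]
        all_max = all(r == max_rating for r in ratings)
        all_min = all(r == min_rating for r in ratings)
        if all_max or all_min:
            # Extreme scores are checked first; they may lack a fit statistic.
            extremes.append(record)
        elif record["outfit_mnsq"] > outfit_cutoff:
            # Outfit MNSQ larger than 2.0 marks an aberrant response pattern.
            misfits.append(record)
        else:
            kept.append(record)
    return kept, extremes, misfits


# Hypothetical records with 36 item ratings each.
records = [
    {"id": "P1", "ratings": [5] * 36, "outfit_mnsq": None},
    {"id": "P2", "ratings": [4, 3, 5, 4] * 9, "outfit_mnsq": 0.95},
    {"id": "P3", "ratings": [1, 5, 1, 5] * 9, "outfit_mnsq": 2.40},
    {"id": "P4", "ratings": [1] * 36, "outfit_mnsq": None},
]

kept, extremes, misfits = split_extreme_and_misfit(records)
print([r["id"] for r in kept], [r["id"] for r in extremes], [r["id"] for r in misfits])

# Respondent counts reported for this dataset: 1400 collected, 33 outliers, 103 misfits.
final_count = 1400 - 33 - 103
print(final_count)
```

The final subtraction reproduces the reported sample of 1264 respondents after both groups are removed.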

Ethics Statement
The authors ensured that the respondents' participation was strictly voluntary and anonymous. The Research Ethics Committee of Universitas Muhammadiyah Surakarta reviewed the data collection for this project and granted approval under code 1238/HIT/FKIP/2021. All respondents were invited by email and text message to participate in the study. To address any ethical concerns, the first page of the online questionnaire stated that participation would be strictly anonymous and voluntary. Thus, by completing the questionnaire, the respondents gave their consent. Respondents could leave the study at any time and for any reason, with no penalty or loss of benefits to which they were entitled, if any. The online survey was administered anonymously to guarantee the confidentiality of their personal data.

Declaration of Competing Interests
The authors declare that they have no known competing financial interests or personal relationships that could have influenced the work reported in this paper.

Data Availability
Digital Competencies Mapping Dataset of Pre-service Teachers in Indonesia (Original data) (Mendeley Data).