Item parameters of the space-relations subtest using item response theory

This article describes the results of an item parameter analysis of the space-relations subtest, which is part of the Differential Aptitude Test (DAT) instrument. Item parameters are psychometric characteristics that indicate the quality of an item. The parameters analyzed for this instrument are item model fit, item difficulty, item discrimination, pseudo-guessing, item information curves, and the test information function. The data were collected through a documentation technique from the space-relations test administered at Biro Psikologi (Psychology Bureau) UNY to 1046 students from Yogyakarta, Indonesia. The data were analyzed using item response theory with the assistance of the BILOG program.


© 2018 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Subject area
Psychology

More specific subject area
Psychometry

Type of data

Experimental features
Item parameters of the instrument, consisting of item model fit, item difficulty, item discrimination, pseudo-guessing, item characteristic curve, item information function, and test information function.

Data source location
Special Region of Yogyakarta, Indonesia

Data accessibility
Data are within this article

Value of the data
The data present the item parameters of the space-relations subtest, obtained from item analysis using item response theory.
The findings can be used as a reference by researchers or test developers when selecting items and compiling question sets, particularly for space-relations tests.
These data can be used as a reference for improving items that have inappropriate item parameters.

Data
This research contains information on the psychometric characteristics of the space-relations subtest items, obtained using item response theory. Item response theory is built on two postulates: (1) an examinee's performance on a particular test can be predicted from a set of factors called latent traits, where a trait is an aptitude dimension of a person such as verbal ability or cognitive ability; and (2) the relationship between an examinee's item performance and the traits that influence it follows a monotonically increasing function called the item characteristic curve (1). The reported characteristics consist of item difficulty (b), item discrimination (a), pseudo-guessing (c), the item characteristic curve (ICC), the item information function (IIF), and the test information function (TIF).
Based on the number of item parameters studied for dichotomous data, three logistic models can be used: the one-parameter (1-PL), two-parameter (2-PL), and three-parameter (3-PL) logistic models [1,2,3]. The 1-PL model has only one parameter, item difficulty; the 2-PL model has two, item difficulty and the discrimination index; and the 3-PL model has three, item difficulty, the discrimination index, and the pseudo-guessing parameter. Item difficulty indicates how difficult an item is. The discrimination index is an item's ability to distinguish between examinees of high and low ability. Pseudo-guessing refers to the chance that a low-ability examinee answers an item correctly.
Item difficulty, item discrimination, and pseudo-guessing were calculated using the Bilog program. Person ability under the 1-PL, 2-PL, and 3-PL models was estimated using a Matlab program. The presented data derive from the item analysis in the form of model fit and parameter values. The models are described by the item characteristic curves (ICC) given in formulas 1, 2, and 3.
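As a minimal sketch of how the three logistic models relate to one another, the function below evaluates the standard 3-PL response probability and recovers the 2-PL and 1-PL models as special cases. The function name `icc`, the scaling constant `D` (the text does not state which metric was used), and the example parameter values are assumptions made for illustration; this is not the Bilog or Matlab code used in the study.

```python
import math


def icc(theta, b, a=1.0, c=0.0, D=1.7):
    """Probability of a correct response at ability `theta`.

    b: item difficulty, a: item discrimination, c: pseudo-guessing.
    D is a scaling constant (assumed here; not stated in the text).
    With c=0 this is the 2-PL model; with c=0 and a fixed `a` it is the 1-PL model.
    """
    logistic = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return c + (1.0 - c) * logistic


# Hypothetical item parameters, for illustration only.
p_1pl = icc(0.0, b=0.0)                # ability equals difficulty
p_2pl = icc(0.0, b=0.0, a=1.2)         # discrimination does not matter at theta == b
p_3pl = icc(0.0, b=0.0, a=1.2, c=0.2)  # guessing lifts the lower asymptote

print(round(p_1pl, 3), round(p_2pl, 3), round(p_3pl, 3))
```

When ability equals difficulty, the 1-PL and 2-PL probabilities are 0.5, while under the 3-PL model the probability rises to c + (1 - c)/2 because low-ability examinees can still answer correctly by guessing.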
The 1-PL ICCs for all space-relations items are shown in Fig. 1, the 2-PL ICCs in Fig. 2, and the 3-PL ICCs in Fig. 3. The item information function (IIF) is an IRT method for describing what an item contributes; it shows how accurately the item measures. Birnbaum [1] defined this information in formula 4. The sum of the IIFs is the test information function (TIF), which plays the same role as reliability in classical test theory (4). The standard error of measurement (SEM) of the test can be calculated from the TIF. The TIF and SEM represent the accuracy with which the test measures the latent variable [1,2,6,7]. Table 1 presents the results of the model fit analysis. Table 2 shows the item parameters: item difficulty, item discrimination index, and pseudo-guessing index. Figs. 1-3 present the item characteristic curves, and Figs. 4-6 show the test information function and standard error of measurement for each item response theory model.
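For orientation, the quantities named above can be written in their standard textbook forms. This is a sketch under stated assumptions: the notation may differ from the article's own formulas, the scaling constant D is not specified in the text, and n denotes the number of items. Setting c_i = 0 gives the 2-PL case, and additionally fixing a_i to a common value gives the 1-PL case.

```latex
% Standard 3-PL forms; D is a scaling constant (assumed, not stated in the text).
\begin{align*}
P_i(\theta) &= c_i + (1 - c_i)\,
  \frac{e^{D a_i (\theta - b_i)}}{1 + e^{D a_i (\theta - b_i)}} \\
I_i(\theta) &= D^2 a_i^2 \,\frac{1 - P_i(\theta)}{P_i(\theta)}
  \left[\frac{P_i(\theta) - c_i}{1 - c_i}\right]^2 \\
I(\theta) &= \sum_{i=1}^{n} I_i(\theta) \\
\mathrm{SEM}(\theta) &= \frac{1}{\sqrt{I(\theta)}}
\end{align*}
```

Because SEM is the inverse square root of the test information, the ability levels at which the TIF peaks are also the levels at which the test measures most precisely.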

Experimental design, materials and methods
This study analyzes documented results from a psychological test, namely the space-relations test. The findings can be used to evaluate existing measuring tools and to develop them into a new format. Item parameter analysis of the instrument was carried out using a modern approach, namely item response theory (IRT). IRT is an approach to analyzing instrument characteristics that focuses on item-level information [1,2,3,4].
The research instrument is the space-relations test, part of the Differential Aptitude Test (DAT). The instrument was adapted for use in Indonesia; the original was composed by Bennett, Harold G. Seashore, and Wesman in 1947 [8]. It is a multiple-choice test consisting of 60 items that must be completed within 30 min. The raw data are the responses of 1046 subjects in the Yogyakarta Special Region (DIY). All data were analyzed using item response theory with the help of the BILOG-MG and Matlab programs. The BILOG-MG analysis was run three times because, based on the number of item parameters studied for dichotomous data, there are three logistic models in item response theory: the one-parameter (1-PL), two-parameter (2-PL), and three-parameter (3-PL) logistic models. The 1-PL model contains only one item parameter, item difficulty; the 2-PL model contains two, item difficulty and the item discrimination index; and the 3-PL model adds the pseudo-guessing parameter to the previous two [9,10,11]. An item fit analysis was performed to determine which logistic model best fits the space-relations subtest data (see Table 1). In Table 1, items that fit their logistic model are marked with *. An item is said to fit its logistic model if it has a probability value ≥ 0.01. The significance level (α) = 0.01 is the error threshold adopted, together with the degrees of freedom (df) defined in this study [12]. An item is considered appropriate if it meets the item parameter criteria below:
1. A good level of item difficulty lies between -2 and +2 [3,5]. The results of the difficulty analysis are presented in Table 2. Easy items are marked with * and difficult items with **; unmarked items have a moderate degree of difficulty.
2. A good item discrimination index lies above 0 and below 2 [3]. Table 2 presents the results of the discrimination index analysis.
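The screening rules above (fit probability of at least 0.01, difficulty between -2 and +2, discrimination above 0 and below 2) can be sketched as a small check. The `Item` class, the `review` function, and the parameter values are hypothetical and serve only to illustrate the criteria; they are not taken from Table 1 or Table 2.

```python
from dataclasses import dataclass


@dataclass
class Item:
    number: int
    b: float      # item difficulty
    a: float      # item discrimination index
    fit_p: float  # probability value from the item fit statistic


def review(item, alpha=0.01):
    """Return the list of criteria that an item fails (empty if none)."""
    flags = []
    if item.fit_p < alpha:
        flags.append("misfit")
    if not (-2.0 <= item.b <= 2.0):
        flags.append("difficulty outside [-2, +2]")
    if not (0.0 < item.a < 2.0):
        flags.append("discrimination outside (0, 2)")
    return flags


# Hypothetical values, for illustration only.
items = [
    Item(number=1, b=0.3, a=1.1, fit_p=0.25),
    Item(number=2, b=2.6, a=0.9, fit_p=0.004),
    Item(number=3, b=-0.5, a=2.4, fit_p=0.40),
]
for item in items:
    print(item.number, review(item) or "meets all criteria")
```

Whether the boundary values themselves count as acceptable is an assumption of this sketch; the text gives the ranges but does not say how ties are treated.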
Almost all items have a good discrimination index when analyzed with the 2-PL model. Four items need to be evaluated, namely items 47, 49, 51, and 59. The test information function and the standard error of measurement (SEM) were analyzed using a Matlab program. The information function of the space-relations test was analyzed with the 1-, 2-, and 3-parameter logistic models. The test information function is a curve that starts low, rises to its highest value near the middle of the ability range, and falls again away from the midpoint. The width of the curve shows the range of ability over which the measurement is effective. The TIF is effective where its curve lies above the SEM curve without intersecting it. Figs. 4-6 illustrate the TIF, the SEM, and the interaction between them. All three figures show the TIF curve above the SEM curve with no intersection, meaning that the information obtained from the measurement is accurate across all abilities. The maximum information differs between models. In the 1-PL analysis, the maximum information is 23.7070 at an ability of 0.2054; in the 2-PL analysis it is 31.5433 at an ability of 0.1781; and in the 3-PL analysis it is 46.14 at an ability of 0.15.
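The TIF and SEM comparison described above can be sketched as follows: evaluate the test information on a grid of ability values, locate its maximum, and check whether the TIF stays above the SEM over the whole grid. The function names, the scaling constant `D`, the ability grid, and the three item parameter sets are assumptions for illustration; the study used a Matlab program and 60 items, so the numbers produced here are unrelated to those reported in the text.

```python
import math


def item_information(theta, b, a=1.0, c=0.0, D=1.7):
    """Standard 3-PL item information (D is an assumed scaling constant)."""
    p = c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def tif_at(theta, items, D=1.7):
    """Test information at `theta`: the sum of the item informations."""
    return sum(item_information(theta, D=D, **item) for item in items)


def sem_at(theta, items, D=1.7):
    """Standard error of measurement: inverse square root of the TIF."""
    return 1.0 / math.sqrt(tif_at(theta, items, D))


def summarize(items, lo=-4.0, hi=4.0, steps=801, D=1.7):
    """Return (ability at the TIF maximum, maximum TIF, TIF above SEM everywhere)."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    info = [tif_at(t, items, D) for t in grid]
    k_max = max(range(steps), key=info.__getitem__)
    above = all(i > 1.0 / math.sqrt(i) for i in info)
    return grid[k_max], info[k_max], above


# Hypothetical items, for illustration only.
example_items = [
    {"b": -1.0, "a": 1.0, "c": 0.0},
    {"b": 0.0, "a": 1.0, "c": 0.0},
    {"b": 1.0, "a": 1.0, "c": 0.0},
]
theta_max, info_max, tif_above_sem = summarize(example_items)
print(theta_max, round(info_max, 4), tif_above_sem)
```

Since SEM equals the inverse square root of the TIF, the TIF lies above the SEM exactly where the test information exceeds 1. A three-item example falls below that level at extreme abilities, whereas a longer test can stay above it across the plotted range.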