Developing and implementing an instrument for assessing critical thinking and visual representations in learning Physics materials of Optical Instruments

This study aims to develop an assessment instrument for measuring students' critical thinking and visual representation abilities in optical instrument material. This research and development (R&D) study employed the ADDIE (Analysis, Design, Development, Implementation, and Evaluation) instructional model. Because a test instrument requires more specific development stages, the ADDIE development stage was integrated with the test development stages proposed by Mardapi. The subjects were grade XI students of State Madrasah Aliyah (MAN) 4 Bantul. The study produced a test instrument for measuring students' critical thinking and visual representation abilities. Its content validity was analyzed using Aiken's V, and its items were analyzed using the QUEST program, with the following findings. (1) The instrument validation using Aiken's V yielded average Aiken's V values above 0.4, meeting the moderate and high validity criteria, so all items are valid. (2) In the pre-test analysis, the item estimates had MNSQ Infit values in the range 0.77-1.30 and outfit t values ≤ 2, and the case estimates had MNSQ Infit values in the range 0.77-1.30, so overall the items fit the Rasch model. (3) In the post-test analysis, the item estimates had MNSQ Infit values in the range 0.77-1.30 and outfit t values ≤ 2, and the case estimates had MNSQ Infit values in the range 0.77-1.30, so overall the items fit the Rasch model. (4) The reliability of the pre-test items is 0.92, while the reliability of the post-test items is 0.76. (5) In the pre-test, Question 10 is the most difficult item and Question 3 is the easiest; in the post-test, Question 8 is the most difficult item and Question 6 is the easiest.
his critical thinking skills (Butler et al., 2008), and someone who has low critical thinking skills will find it difficult to compete in a global world (Frijters et al., 2006). Critical thinking skills are needed in the learning process: when students receive knowledge, they should not take it for granted but should classify, analyze, and evaluate it in order to filter the information. The average critical thinking ability of students in geometric optics material is still low, with an average score of 27.20 out of 100.00, a highest score of 71.05, and a lowest score of 20.63 (Pradana et al., 2017).
Physics is a science that studies natural phenomena. These natural phenomena are studied mathematically using various symbols or equations, so the ability of students to represent concepts visually greatly affects their understanding of physics concepts. Most humans are visual learners, and if teaching materials are equipped with lots of visualizations, the information will last longer (Felder & Soloman in Hikmat & Efendi, 2011). Representation is something that represents, describes, or symbolizes an object and/or process (Rosengrant et al., 2007). Visual representation ability is the ability to explain a concept using a model that makes it easier for students to solve problems and find solutions, such as using pictures and graphics. Visual representation is another alternative that is used to correct communication errors when conventional methods fail to convey the concept completely (Sankey, 2005).
An interview with the physics teacher at MAN 4 Bantul showed that the optical instruments material covers many optical devices, but the media available at the school are inadequate, making it difficult for teachers to carry out demonstrations. The laboratory available at the school is incompletely equipped and rarely used. The optical instruments material contains many similarities and images, so understanding it requires students' critical thinking and visual representation abilities. Previous research identified several difficulties experienced by students in this material: difficulties in experimenting with optical devices (38.04%), in drawing ray diagrams for reflection and refraction (77.17%), and in solving mathematical problems (60.87%), as well as difficulties in learning optical instruments inside and outside the classroom (Ainiyah et al., 2020).
Students' critical thinking and visual representation abilities need to be measured with appropriate measuring instruments, one of which is a test. In the development of a critical thinking test on optical device material, qualitative and quantitative analysis led to 38.9% of the test items being accepted and 61.1% being revised, with no items rejected (Nur'asiah et al., 2015). The use of assessment instruments is effective in improving students' critical thinking skills (Asmawati et al., 2018). Students' level of understanding of the optical instruments material can be determined through assessment. Physics teachers usually assess students' understanding using formative and summative tests.
Previous researchers have developed critical thinking instruments on optical material (Pradana et al., 2017), and visual representations of critical thinking skills on optical material (Nur'asiah et al., 2015). In this study, in addition to developing instruments, visualization of students' critical thinking was also carried out on optical material. The critical thinking indicators used in this study are interpreting, formulating problems, analyzing, concluding, evaluating, and developing strategies and tactics. In addition to critical thinking skills, students need visual representation abilities as a medium for visualizing a concept and thus understanding physics material. The visual representation indicators used are representing the equations in the concepts of the eye and the loupe (magnifying glass) in the form of an image.

RESEARCH METHOD
This research was conducted to develop an assessment instrument, so the method used was research and development (R&D), with the aim of developing an instrument for assessing students' critical thinking and visual representation abilities. The subjects of this study are 61 grade XI students of MAN 4 Bantul. The development model used is the ADDIE (analyze, design, develop, implement, and evaluate) model. The ADDIE model is intended primarily for developing instructional products. In this research, however, the product being developed is a test, which has its own specific flow and process, so the ADDIE development stage is integrated with the test development stages proposed by Mardapi: (1) determining instrument specifications, (2) writing the instruments, (3) determining the scale of the instrument, (4) determining the scoring system, (5) examining the instrument, (6) conducting trials, (7) analyzing the instrument, (8) assembling the instrument, (9) carrying out the measurement, and (10) interpreting the measurement results (Mardapi, 2012). Data were collected through observation, interviews, and documents in the form of photos and videos. The research flow design is shown in Figure 1.

Figure 1. Research Design
The instrument validity analysis uses Aiken's V. Aiken formulates Aiken's V to calculate the content-validity coefficient, which is based on the results of an expert panel's assessment of an item in terms of the extent to which the item represents the construct being measured (Hendryadi, 2014). The formula proposed by Aiken is presented in Formula (1).

$$V = \frac{\sum s}{n\,(c - lo)} \tag{1}$$

Notes: n = number of raters; c = the highest validity rating score (i.e., 5); lo = the lowest validity rating score (i.e., 1); r = the score given by a rater; s = r - lo. With lo = 1, the denominator equals n(c - 1).

Aiken's V coefficient values range from 0 to 1. An item is regarded as valid if at least half of the raters judge that it can be used. The criteria for the level of validity are shown in Table 1 (Retnawati, 2016).

Table 1. Criteria for the level of validity
Aiken's V          Level of validity
0.4-0.8            Moderate validity
V > 0.8            High validity
Source: Retnawati (2016)

The items of the instrument for assessing critical thinking and visual representation abilities were analyzed using the QUEST program to determine the validity, reliability, and level of difficulty of the items. The data analyzed were the results of the pre-test and post-test. The items analyzed using the QUEST program are declared valid if the Infit Mean Square (MNSQ Infit) value ranges from 0.77 to 1.30. The results of the analysis are shown in Table 2 and Table 3. Meanwhile, the critical thinking indicators and visual representation indicators used in this study are described in Table 4 and Table 5.

Table 4. Critical thinking indicators
1. Interpreting
2. Formulating the problem: defining terms and identifying assumptions
3. Analyzing: examining ideas, identifying arguments, and identifying reasons and claims
4. Concluding: questioning the evidence, surmising several alternatives, and drawing conclusions deductively or inductively
5. Evaluating: asking for results, justifying procedures, and giving reasons
6. Developing strategies and tactics: self-monitoring and self-correcting

Table 5. Visual representation indicators
1. Students can represent the equations in the eye concept in the form of images: categorizing, explaining the significance, and explaining the meaning
2. Students can represent the equations in the loupe concept in the form of images: categorizing, explaining the significance, and explaining the meaning
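As an illustration of the content-validity calculation described in this section, the following Python sketch computes Aiken's V for a single item. It assumes that Formula (1) takes the standard form V = Σ(r - lo) / (n(c - lo)) implied by the notes. The function names and the example ratings are hypothetical and are not taken from the study's data; the band labels follow the 0.4-0.8 and > 0.8 criteria cited in the Discussion, and values below 0.4 are simply reported as falling below the moderate band.

```python
from typing import Sequence


def aikens_v(ratings: Sequence[int], lo: int = 1, c: int = 5) -> float:
    """Aiken's V for one item: sum(r - lo) / (n * (c - lo))."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < lo or r > c for r in ratings):
        raise ValueError("every rating must lie between lo and c")
    n = len(ratings)                        # number of raters
    s_total = sum(r - lo for r in ratings)  # sum of s = r - lo
    return s_total / (n * (c - lo))


def validity_label(v: float) -> str:
    """Map V to the bands cited in the text (0.4-0.8 moderate, > 0.8 high)."""
    if v > 0.8:
        return "high validity"
    if v >= 0.4:
        return "moderate validity"
    return "below the moderate band"


# Hypothetical ratings from four experts on a 1-5 scale (not the study's data).
example_ratings = [4, 5, 4, 3]
v = aikens_v(example_ratings)
print(f"V = {v:.2f} ({validity_label(v)})")
```

With four raters and a 1-5 scale, the denominator is 4 × 4 = 16, so each additional rating point raises V by 1/16.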

FINDINGS AND DISCUSSION
This research, conducted at MAN 4 Bantul, concerned the development of an instrument for assessing critical thinking and visual representation abilities. The data consist of the results of the test instrument validation conducted by four experts and the pre-test and post-test outputs, which were analyzed using the QUEST program.

Findings
Based on the research that has been done, the results of instrument validation using Aiken's V were obtained. The detailed results are presented in Table 6.

The Output of the Pre-test and Post-test Using the QUEST Program
The instrument for assessing critical thinking and visual representations was analyzed with the QUEST program, based on pre-test and post-test scores, to determine the validity, reliability, and level of difficulty of the items. The results obtained are shown in Table 7 and Table 8. In the pre-test analysis, the infit mean square was in the range 0.77-1.30 and the outfit t value was ≤ 2, so all items fit the Rasch model. In the post-test analysis, the infit mean square was likewise in the range 0.77-1.30 and the outfit t value was ≤ 2, so all items fit the Rasch model.
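The infit criterion used above can be illustrated with a minimal Python sketch for the dichotomous Rasch model. The sketch assumes that each response is scored 0 or 1 and that ability and difficulty estimates (in logits) are already available. The function names and example values are hypothetical. It does not reproduce QUEST's estimation procedure or the transformation of the mean square into the outfit t statistic; it only shows how an infit and an outfit mean square are formed from residuals.

```python
import math
from typing import Sequence, Tuple


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses: Sequence[int], thetas: Sequence[float], b: float) -> Tuple[float, float]:
    """Return (infit MNSQ, outfit MNSQ) for one item.

    infit  = sum of squared residuals / sum of model variances
    outfit = mean of squared standardized residuals
    """
    if len(responses) != len(thetas) or len(responses) == 0:
        raise ValueError("responses and thetas must be non-empty and equal in length")
    squared_residuals = 0.0
    variances = 0.0
    squared_z = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        variance = p * (1.0 - p)              # model variance of the response
        residual_sq = (x - p) ** 2            # squared raw residual
        squared_residuals += residual_sq
        variances += variance
        squared_z += residual_sq / variance   # squared standardized residual
    return squared_residuals / variances, squared_z / len(responses)


def within_infit_range(infit: float, lower: float = 0.77, upper: float = 1.30) -> bool:
    """Apply the 0.77-1.30 acceptance range quoted in the text."""
    return lower <= infit <= upper


# Hypothetical responses and ability estimates (not the study's data).
responses = [1, 0, 1, 1, 0, 0]
thetas = [1.2, -0.5, 0.3, 0.8, -1.0, 0.1]
infit, outfit = item_fit(responses, thetas, b=0.2)
print(f"infit MNSQ = {infit:.2f}, outfit MNSQ = {outfit:.2f}, "
      f"within range: {within_infit_range(infit)}")
```

Because the infit statistic weights each residual by its model variance, responses from students whose ability is close to the item difficulty influence it most, whereas the outfit statistic is more sensitive to unexpected responses from students far from the item difficulty.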

Students' Pre-test Answers
Students' pre-test answers are presented in Figure 8 and Figure 9.

Students' Answers to Post-test Questions
Students' answers to post-test questions are presented in Figure 10 and Figure 11.

Discussion
The assessment instrument was developed using the Rasch model with the Quest application, which is valid and reliable (Sari, 2020). In this study, the assessment instrument was validated by four experts. As presented in Table 6, the average ratings of the instrument met the moderate and high validity criteria. With reference to Table 1, moderate validity corresponds to an Aiken's V in the range 0.4-0.8, and high validity corresponds to an Aiken's V > 0.8. This indicates that the assessment instrument is ready to be tested on students.
The instrument was tested in class XI with a total of 61 students. The pre-test was administered at the beginning of a lesson to determine the students' initial abilities. Then, at the last meeting on optical instrument material, the students were given post-test questions that were similar to the pre-test questions. After that, an analysis was carried out using the QUEST program to measure the validity, reliability, and difficulty level of the questions.
To use the Quest application, all of the students' answers are typed into a Notepad text file and a syntax file is written. The syntax file and the answer file are both saved as Notepad text files in the same folder as the Quest application. Analysis with the QUEST program often fails because of an error in the syntax or in the students' answer file, in which case no output is produced. Therefore, the syntax and the students' answers must be written very carefully so that the analysis does not fail.
One of the Quest outputs is the Reliability of Item Estimate, the reliability value based on item estimation, also known as sample reliability. The higher this value, the more items fit the model being tested; conversely, the lower the value, the more items do not fit the model. For comparison, the critical thinking instrument developed by Mukti and Istiyono (2018) has a high reliability of 0.86 and is suitable for use as a good measuring instrument. Reliability is the level of consistency of an item, so an item with low reliability cannot provide the expected information.
In the QUEST program, the overall fit of the items is determined from the mean Infit Mean Square (MNSQ Infit) value and its standard deviation, while the fit of each individual item with the model is determined from the MNSQ Infit value or the Outfit t-value of the item concerned. These MNSQ Infit and Outfit criteria apply to the Rasch model.
Figure 2 shows the summary of the case (student) estimates, with a Reliability of Case Estimate of 0.32. The higher this value, the more convincing it is that the measurement gives consistent results; a low value means that the students' responses are inconsistent. The mean of the case estimates shows that the students' ability is lower than the item difficulty level. The overall fit of the cases with the model is determined from the MNSQ Infit value and its standard deviation. The QUEST output for the case estimates gave a mean MNSQ Infit of 0.98 with a standard deviation of 0.38, that is, 0.98 ± 0.38, or from 0.98 - 0.38 = 0.60 to 0.98 + 0.38 = 1.36.
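The interval arithmetic above can be checked with a short Python sketch. It takes the mean MNSQ Infit and standard deviation reported for Figure 2, forms the mean ± one standard deviation interval, and compares it with the 0.77-1.30 acceptance range. The function names are hypothetical, and comparing the whole interval against the range is an illustrative assumption rather than a documented QUEST rule.

```python
from typing import Tuple


def infit_interval(mean: float, sd: float) -> Tuple[float, float]:
    """Return (mean - sd, mean + sd), rounded to two decimals."""
    return round(mean - sd, 2), round(mean + sd, 2)


def interval_within(interval: Tuple[float, float],
                    lower: float = 0.77, upper: float = 1.30) -> bool:
    """True only if the whole interval lies inside the acceptance range."""
    low, high = interval
    return lower <= low and high <= upper


# Values reported in the text for the case estimates in Figure 2.
case_interval = infit_interval(0.98, 0.38)
print(case_interval, interval_within(case_interval))

# The mean itself can be checked as a zero-width interval.
print(interval_within((0.98, 0.98)))
```

The sketch makes the distinction explicit: the mean of 0.98 lies inside the acceptance range, whereas the full one-standard-deviation interval extends beyond it at both ends.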
Based on the results of the analysis, the MNSQ Infit interval of 0.60-1.36 extends somewhat beyond the range 0.77-1.30, but the mean of 0.98 lies within that range, so overall the items were judged to be in accordance with the Rasch model.
Figure 3 shows that the Reliability of the Case Estimate is 0.32. The mean of the case estimates shows that the students' ability is lower than the item difficulty level. The overall fit of the cases with the model in the QUEST program is determined from the MNSQ Infit value and its standard deviation. The QUEST output for the case estimates gave a mean MNSQ Infit of 1.00 with a standard deviation of 0.22, that is, 1.00 ± 0.22, or from 1.00 - 0.22 = 0.78 to 1.00 + 0.22 = 1.22, which lies within the range 0.77-1.30.
In Figure 4, each x represents a subject, while the numbers opposite are item numbers. The figure shows that, overall, the subjects' ability is lower than the difficulty level of the questions. Question 10 has the highest item difficulty level because no student could answer it correctly. Question 3 has the lowest item difficulty level (the easiest item), although some students still could not answer it correctly.
The difficulty level should lie in the ability range of -2 to +2 (Hambleton & Swaminathan, 1985), and a developed test instrument with difficulty levels ranging from -2.00 to 2.00 has a good level of difficulty (Mukti & Istiyono, 2018). Similarly, in Figure 5, each x represents a subject, while the numbers opposite are item numbers. The figure shows that, overall, the subjects' ability is lower than the difficulty level of the items, although some subjects have an ability higher than the item difficulty. Question 8 has the highest item difficulty level, but some students could still answer it. Question 6 has the lowest item difficulty level (the easiest item), although some students still could not answer it.
Each item is declared fit if its MNSQ Infit value lies between 0.77 and 1.30 inclusive (Adams & Kho, 1996; Subali & Suyata, 2012). Figure 6 shows that all ten items lie within the lines marking the range 0.77-1.30. Item 3 lies right on the midline, which indicates a very good item. Overall, these items are in accordance with the Rasch model. Figure 7 shows that the items lie within the lines marking the range 0.77-1.30, except for Item 3, which falls outside the line. Overall, the items are in accordance with the Rasch model.
Figure 8 shows a student's pre-test answer to a question about eye defects. The answer is incomplete because it only names the types of eye defects, even though the question asks students to name and explain them. This is a critical thinking question that involves representing and identifying or formulating a problem.
Figure 9 shows a student's pre-test answer about image formation. The answer is correct because it follows the instructions, namely to draw the image formation in a normal eye, a nearsighted eye, and a farsighted eye. This is a visual representation question, in which students must represent a concept in the form of an image.
Figure 10 shows a student's post-test answers about eye defects and image formation. The answers are correct because the student answered all of the questions correctly. This is a visual representation question, in which students must analyze the images presented and then answer questions based on them.
Figure 11 shows a student's post-test answers about the magnification of the loupe. The answers are correct but incomplete, because for one of the questions the student gave only the final answer without writing down the solution steps. These are critical thinking questions that involve analysis, evaluation, and the preparation of strategies and tactics to arrive at the right answers.
Based on these answers, students still show low critical thinking skills on both the pre-test and the post-test. This is in line with the findings of Makhrus et al. (2020), who reported that students' critical thinking skills were initially low and became higher after teaching treatments were given. Other studies have provided learning treatments through animation (Disman et al., 2020) and the application of learning models (Murniati et al., 2020). Learning that fosters critical thinking therefore needs special attention, including the preparation of all supporting components: instruments, media, and appropriate teaching materials.