Students are Not Sure about Their Conceptual Understanding: A Comparative Study of the Level of Conceptual Understanding and the Level of Confidence Using Rasch Modeling

ABSTRACT


I. Introduction
Misconceptions are a classic and persistent issue in education, affecting students across various levels and disciplines, including physics [1]-[7]. These misconceptions hinder students from grasping more complex concepts, creating a snowball effect in which foundational misunderstandings lead to greater difficulties in advanced topics [8]-[11]. Diagnosing and addressing these misconceptions is crucial for effective learning.
Various diagnostic instruments have been developed to identify misconceptions, ranging from open-ended questions [12], [13] to multiple-choice tests [11], [14] and multi-tiered diagnostic tools [15]-[21]. Among these, the 4-Tier diagnostic instrument is particularly effective for large samples, as it separates the confidence levels for the conceptual question and for the reasoning [22]. Despite this, there remains a lack of consensus on the best diagnostic tool, highlighting a gap in the research.
Previous studies have shown that using a combined confidence level in the three-tier method underestimates the proportion of lack of knowledge and overestimates students' scores [23], [24]. However, these studies have not sufficiently explored the comparative effectiveness of the 4-Tier instrument in diagnosing misconceptions and confidence levels. This study aims to fill this gap by comparing the level of students' conceptual understanding and their confidence using a 4-tier diagnostic instrument. By focusing on high school students in Yogyakarta, this research will provide new insights into the alignment (or misalignment) between students' perceived and actual understanding. The findings will offer valuable contributions to educational practices and future research on diagnostic tools.

II. Literature Review

Misconception
Various terms are used to describe the differences in understanding between students and scientists. The most common are misconceptions and alternative conceptions. Misconceptions, or alternative conceptions, are concepts that are believed to be true but contradict the views of experts in a particular field; they show a systematic pattern of errors and are resistant to change [25]. In the context of physics, misconceptions can be interpreted as views on physics concepts that are not in line with the views of current physicists but are believed to be true by students.
Student misconceptions result from a variety of personal experiences. Their sources include physical experience, direct observation, intuition, teaching in and outside school, social environment, culture, language, textbooks or other teaching materials, and teachers [26].

Four-tier diagnostic test
As the name implies, the 4-Tier instrument consists of four levels. Level 1 (T1) is a multiple-choice question. Level 2 (T2) is the level of confidence in T1. Level 3 (T3) is the reason for the T1 answer, and Level 4 (T4) is the level of confidence in T3 [22]. The four-tier instrument improves on the earlier three-tier version by adding a level and modifying the order of the question levels. In the three-tier method, a single confidence level is shared by the two preceding levels, whereas the four-tier method records confidence separately for the conceptual question and for the reasoning. Using one shared confidence level causes two problems: (1) underestimating the proportion of lack of knowledge and (2) overestimating students' scores [23], [24].
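The four-level structure described above can be pictured as a small record type. The following is only an illustrative sketch: the class name, field names, and the example option letters are hypothetical and are not taken from the instrument itself. It shows that T2 and T4 are stored as two separate confidence ratings, which is the feature that distinguishes the four-tier format from the three-tier one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourTierResponse:
    """One student's response to one four-tier item (field names are illustrative)."""
    item: str            # item label, e.g. "S2"
    t1_answer: str       # T1: chosen multiple-choice option
    t2_confidence: int   # T2: confidence in the T1 answer
    t3_reason: str       # T3: chosen reason for the T1 answer
    t4_confidence: int   # T4: confidence in the T3 reason

    def confidences(self) -> tuple[int, int]:
        """Return the two separately recorded confidence ratings (T2, T4)."""
        return (self.t2_confidence, self.t4_confidence)


# Hypothetical response: the student is very sure of the answer but not of the reason.
# A three-tier item could not record this difference, because it has one shared rating.
response = FourTierResponse("S2", "B", 4, "C", 2)
```

Keeping the two ratings apart is what later allows Tier 2 and Tier 4 to be analysed separately.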

Likert Rating Scale
The Likert Rating Scale is a method often used in surveys and research to measure respondents' attitudes, opinions, or perceptions of a subject. This method involves a series of statements for which respondents indicate their level of agreement or disagreement on a multi-point scale [27], [28]. The Likert scale produces ordinal data [29], which means that the intervals between points may not be equal, making it difficult to use parametric statistical methods. Therefore, caution is required in analysing Likert scale data to avoid misinterpretation, and it is often recommended to use non-parametric statistics [30].
The variety of rating scales provides a more in-depth picture of respondents' attitudes and perceptions. In the context of misconception diagnostics, each point on the scale allows the researcher to identify the respondent's level of belief or uncertainty towards each given statement or concept. In addition to the 2-point Likert rating scale [31]-[33], 4-point [34], 5-point [35], and 6-point Likert scales [36], [37] are generally favoured in 4-tier misconception diagnostic instruments.

III. Method
This research is a survey conducted at a state senior high school in Yogyakarta, Indonesia. Fifty-six students (24 boys and 32 girls) were involved as respondents. The number of respondents was chosen with the required measurement accuracy and confidence level in mind: a sample size of 50 is sufficient to achieve an accuracy of ±1.0 logit at a 99% confidence level [39]. Therefore, the involvement of 56 respondents was deemed appropriate. All respondents came from class XI and had studied heat and temperature. In the student codes, "L" denotes a male student and "P" denotes a female student.
Students' conceptual understanding of temperature and heat was measured using a four-tier misconception diagnostic instrument (4T-HTDT) consisting of 21 items. The items are spread over four concept groups: temperature (6 items), expansion (4 items), heat of change of state and temperature (4 items), and heat and displacement (7 items). Each item consists of four levels [2]. Level 1 (T1) is a multiple-choice question with several distractors. Level 2 (T2) is students' confidence in the answer they give at T1. Level 3 (T3) asks for the reason behind the answer given at T1. Level 4 (T4) is students' confidence in the reason given at T3. We used a 4-point Likert rating scale to explore students' confidence in their T1 and T3 responses: 1 for "Just Guessing", 2 for "Not Sure", 3 for "Sure", and 4 for "Very Sure".
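The instrument layout and the confidence coding described above can be written down as two lookup tables. This is a minimal sketch: the labels, codes, and item counts come from the text, while the variable and function names are assumptions made for illustration.

```python
# Confidence labels and codes as stated in the text (used at Tier 2 and Tier 4).
CONFIDENCE_CODES = {
    "Just Guessing": 1,
    "Not Sure": 2,
    "Sure": 3,
    "Very Sure": 4,
}

# Number of items per concept group, as reported for the 4T-HTDT.
CONCEPT_GROUPS = {
    "temperature": 6,
    "expansion": 4,
    "heat of change of state and temperature": 4,
    "heat and displacement": 7,
}


def code_confidence(label: str) -> int:
    """Map a confidence label to its 1-4 code; raise ValueError on unknown labels."""
    try:
        return CONFIDENCE_CODES[label.strip()]
    except KeyError:
        raise ValueError(f"Unknown confidence label: {label!r}") from None


# The four concept groups should account for all items of the instrument.
total_items = sum(CONCEPT_GROUPS.values())
```

Summing the group sizes gives 6 + 4 + 4 + 7 = 21, which matches the stated number of items.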
Student conceptual understanding data were collected online using the 21 4T-HTDT items formatted on the Google Forms platform. The form was distributed to students through their respective class teachers. All student participation in this research was voluntary. Respondents were kept anonymous through the anonymity model in the Google form. The demographic identity of the students involved is in the form of class and gender.
The relationship between conceptual understanding and students' level of confidence in the material on heat and temperature was analyzed using Microsoft Excel and Winsteps 4.6.1 [40]. Excel was used to code and prepare the raw data, while Winsteps was used to assess students' conceptual understanding and confidence levels based on the Rasch model. Conceptual understanding and level of confidence were analyzed through a Wright map combined with the Logit Value of Person (LVP) [7], [41]. Data analysis was carried out in several stages. First, the raw data were coded. Next, the data were prepared in *.prn format. Then the level of conceptual understanding and the level of confidence were analyzed using Winsteps. Both levels are grouped into four categories. This grouping refers to the LVP through its mean and standard deviation, as shown in Table 1.
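The grouping step can be sketched as follows. Because Table 1 is not reproduced in this text, the cut-offs used here (mean + SD, mean, and mean - SD), the category labels, and the use of the sample standard deviation are all assumptions; the logit values are illustrative and are not study data.

```python
from statistics import mean, stdev


def group_by_lvp(lvp: dict[str, float]) -> dict[str, str]:
    """Assign each person to one of four levels from the LVP mean and SD.

    Assumed cut-offs (not taken from Table 1):
      LVP > mean + SD           -> "very high"
      mean <= LVP <= mean + SD  -> "high"
      mean - SD <= LVP < mean   -> "low"
      LVP < mean - SD           -> "very low"
    """
    values = list(lvp.values())
    m = mean(values)
    s = stdev(values)  # sample SD; the source does not say which SD was used

    def label(x: float) -> str:
        if x > m + s:
            return "very high"
        if x >= m:
            return "high"
        if x >= m - s:
            return "low"
        return "very low"

    return {person: label(x) for person, x in lvp.items()}


# Illustrative logits only (hypothetical person labels A-D).
groups = group_by_lvp({"A": 2.0, "B": 0.5, "C": -0.5, "D": -2.0})
```

The same function can be applied to the person logits of any tier, so understanding and confidence are categorised by one rule.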

IV. Results and Discussion
Students' level of conceptual understanding (Tier 1 and Tier 3) and level of confidence (Tier 2 and Tier 4) are visualized using a Wright map, or person-item map. A Wright map shows students' ability and item difficulty simultaneously on the same scale. It is divided into two main sides: the left side describes the distribution of student abilities, and the right side describes the distribution of item difficulty levels. Both distributions are arranged hierarchically from the bottom, where the lowest logits are placed, to the top, where the highest logits are placed [42]. Students with the highest ability or conceptual understanding are placed at the top left, with students of lower ability further down. Likewise, the most difficult items are placed at the top right, with easier items further down. On the confidence-level maps (Tier 2 and Tier 4), the left side visualizes gradations of students' confidence levels, and the right side visualizes gradations of how strongly the answers to each item are believed to be correct.
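Reading a Wright map relies on persons and items sharing one logit scale. In the standard dichotomous Rasch model, the probability of a correct response depends only on the difference between person ability and item difficulty. The sketch below shows that relationship; the function and parameter names are assumptions, and the text does not state which Rasch variant was configured in Winsteps for the polytomous confidence tiers.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    theta: person ability in logits; b: item difficulty in logits.
    P = exp(theta - b) / (1 + exp(theta - b))
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# A person located level with an item on the map has an even chance of success.
p_level = rasch_probability(0.0, 0.0)
# A person one logit above the item succeeds more often than not.
p_above = rasch_probability(1.0, 0.0)
# A person one logit below the item has the complementary probability.
p_below = rasch_probability(-1.0, 0.0)
```

This is why a student plotted above an item on the map is expected to answer it correctly more often than not, and a student plotted below it less often.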

Tier 1 vs Tier 2
A comparison of students' level of conceptual understanding at Tier 1 and their level of confidence at Tier 2 is shown in Figure 1. The comparison shows that some students are inconsistent: their confidence does not match their understanding of the concept they are answering. For example, students 13L and 14L have a very high level of understanding, yet their level of confidence in their answers is in the low category. The same applies to student 49P, who has a fairly high level of understanding but a low level of confidence in that understanding. In contrast, two students (11P and 32P) have a low level of understanding but a high level of confidence, and three students (03L, 43L, and 55L) have a low level of understanding but are very confident about it.
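The mismatches described above can be flagged mechanically once both measures are expressed as ordered categories. In this sketch the student codes and categories are the ones named in the text, whereas the numeric ranking of the labels and the terms "overconfident" and "underconfident" are assumptions introduced for illustration.

```python
# Ordinal ranks for the category labels used in the text (assumed ordering).
RANK = {"low": 1, "fairly high": 2, "high": 3, "very high": 4}

# (understanding at Tier 1, confidence at Tier 2) for the students named in the text.
REPORTED = {
    "13L": ("very high", "low"),
    "14L": ("very high", "low"),
    "49P": ("fairly high", "low"),
    "11P": ("low", "high"),
    "32P": ("low", "high"),
    "03L": ("low", "very high"),
    "43L": ("low", "very high"),
    "55L": ("low", "very high"),
}


def calibration(understanding: str, confidence: str) -> str:
    """Compare the confidence category with the understanding category."""
    gap = RANK[confidence] - RANK[understanding]
    if gap > 0:
        return "overconfident"
    if gap < 0:
        return "underconfident"
    return "aligned"


flags = {student: calibration(u, c) for student, (u, c) in REPORTED.items()}
```

With these inputs, 13L, 14L, and 49P are flagged as underconfident and the remaining five students as overconfident, mirroring the two groups in the paragraph above.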
The difficulty level of the items and the level of confidence in them were also compared. The analysis shows inconsistencies for a number of items. For example, item S2, "Temperature depends on the object's material", was the most difficult item for all students, yet it was identified as the most trusted item. The same applies to item S12, "Materials such as wool have the ability to warm the body". In contrast, item S14, "Heating always results in an increase in temperature", is not in the very difficult category but is the item that students believe in least.

Tier 3 vs Tier 4
A comparison of students' level of conceptual understanding at Tier 3 and their level of confidence at Tier 4 is shown in Figure 2. Figure 2(a) shows the distribution of the correct reasons given by students for the answers they chose at Tier 1. Most students understand why the temperature and heat phenomena at Tier 1 occur. As many as 8 out of 56 students (14.3%) are in the very high category, 21 out of 56 (37.5%) are in the high category, and 22 out of 56 (39.3%) are in the moderately high category. Only 8.9% of students are in the low category. Concepts S9, S19, and S2 have the three reasons that are most difficult for students to understand, whereas three other concepts (S11, S7, and S21) have the easiest. Twelve questions are in the high category (difficult and very difficult), and the other nine are in the low category (easy and very easy).
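The reported shares can be checked with a short calculation. The three counts below are taken from the text; the count for the low category is not stated and is inferred here as the remainder of the 56 students, which is an assumption of this sketch.

```python
TOTAL_STUDENTS = 56

# Counts reported in the text for Tier 3 understanding.
counts = {"very high": 8, "high": 21, "moderately high": 22}

# Inferred, not reported: the low category is whatever remains of the 56 students.
counts["low"] = TOTAL_STUDENTS - sum(counts.values())

# Percentage of students per category, rounded to one decimal place.
percentages = {
    category: round(100 * n / TOTAL_STUDENTS, 1) for category, n in counts.items()
}
```

The remainder is 5 students, and 5 out of 56 rounds to 8.9%, which agrees with the percentage reported for the low category.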
Figure 2(b) shows the distribution of students' level of confidence in the reasons they gave at Tier 3. Only a few students have low confidence in the correctness of their reasons, while most believe the reasons they give. As many as 23.2% (13 out of 56 students) strongly believed the reasons they gave, and only 17.9% were very unsure. The remaining majority (58.9%) have a fairly high or high level of confidence. Figure 2(b) also maps, item by item, students' beliefs about the reasons they gave at Tier 3. A few students had low confidence in their reasons for items S10, S16, and S20. In contrast, three other items (S2, S3, and S21) are believed to be true by most students. The S2 concept, "Temperature depends on the object's material", is the one most strongly believed to be true, whereas the S10 concept, "Substances that expand have a fixed density", is the one least believed to be true.
The comparison between the level of reasoning and the level of confidence shows that some students are inconsistent: the correctness of their reasons contradicts the confidence they have in those reasons. For example, student 14L has a level of understanding in the highest category at Tier 3 but a low level of confidence. Likewise, student 13L belongs to the group with very high understanding but has low confidence in what he has answered. The same applies to students 35P and 49P. Other students have a low understanding but are very sure of the reasons they give. For example, students 28P and 26P have the lowest understanding among all students, but their confidence level is high. Similarly, students 43L and 55L have the highest confidence among all students, but their level of understanding is below that of student 14L.
The difficulty of the reasons and the level of confidence in them were also compared item by item. The analysis shows inconsistencies for a number of items. For example, item S2, "Temperature depends on the object's material", was the most difficult item for all students, yet it is identified as the item whose reasons are most strongly believed to be true. The same applies to item S3, "The temperature of a substance can be transferred". In contrast to S2 and S3, item S10, "Substances that expand have a constant density", is quite difficult but is the item least believed by most students.

V. Conclusion
This study highlights the importance of diagnosing students' misconceptions and their confidence levels in understanding physics concepts. Using a 4-Tier diagnostic instrument, it was found that a number of students have a low understanding of concepts yet show a high level of confidence in their answers, and vice versa. This inconsistency between understanding and confidence suggests that students are not always aware of their true level of understanding. Further analysis revealed that difficult items were often more strongly believed by students than easier items. These findings emphasize the need for a pedagogical approach that focuses not only on clarifying correct concepts but also on strengthening students' confidence and correcting their incorrect understanding. The 4-Tier diagnostic instrument proved to be effective in identifying these discrepancies, which can assist educators in designing more appropriate interventions. This study provides valuable new insights for the development of better diagnostic tools and more effective teaching strategies. In conclusion, greater efforts are needed to bridge the gap between students' conceptual understanding and their confidence in order to improve overall learning outcomes.
Limitations of this study include the use of a sample limited to secondary school students in Yogyakarta, so the results cannot be generalized to a wider population. In addition, the 4-Tier diagnostic instrument used has limitations in detecting the entire spectrum of misconceptions. Future research could expand the sample to different levels of education and different geographical locations to obtain more representative results. The development and testing of more comprehensive and adaptive diagnostic instruments are also needed to identify misconceptions more accurately. Future research could further explore effective pedagogical interventions to correct the mismatch between students' conceptual understanding and their confidence levels. The implementation of advanced educational technologies can also be considered to make data collection and analysis more efficient. Thus, it is hoped that new findings can make a significant contribution to improving the quality of learning in physics and other disciplines.

Indratno, et al. Students are Not Sure about Their Conceptual Understanding …. p-ISSN: 2621-3761 e-ISSN: 2621-2889

Figure 1. Comparison of students' level of understanding of concepts at Tier 1 and their level of confidence at Tier 2
Figure 1(a) shows that some students already understand the concept well. However, the average group showed that students' conceptual understanding was below
Figure 1(b) shows the distribution of students' level of confidence in their responses at Tier 1. Most students believed in the answers they gave. No less than 23.2% of students are Very Confident, and only 17.9% are Less Confident about the answers they gave. Meanwhile, the percentages of students in the two intermediate confidence categories were 26.8% and 32.1%, respectively. Figure 1(b) also maps students' beliefs about the questions they answered. Three questions (S3, S2, S12) were not believed by most students when answering. In comparison, for the other 18 questions, students believed well in what they had answered. The concept most students believed to be true was question S3, on the concept "The temperature of a substance can be transferred", while the concept least believed by most students when answering was S14, on the concept "Heating always increases temperature".

Figure 2. Comparison of students' level of understanding of concepts at Tier 3 and their level of confidence at Tier 4