Investigating and improving student understanding of the expectation values of observables in quantum mechanics

The expectation value of an observable is an important concept in quantum mechanics since measurement outcomes are, in general, probabilistic and we only have information about the probability distribution of measurement outcomes in a given quantum state of a system. However, we find that upper-level undergraduate and PhD students in physics have both conceptual and procedural difficulties when determining the expectation value of a physical observable in a given quantum state in terms of the eigenstates and eigenvalues of the corresponding operator, especially when using Dirac notation. Here we first describe the difficulties that these students have with determining the expectation value of an observable in Dirac notation. We then discuss how the difficulties found via student responses to written surveys and individual interviews were used as a guide in the development of a quantum interactive learning tutorial (QuILT) to help students develop a good grasp of the expectation value. The QuILT strives to help students integrate conceptual understanding and procedural skills to develop a coherent understanding of the expectation value. We discuss the effectiveness of the QuILT in helping students learn this concept from in-class evaluations.


Introduction
Learning quantum mechanics is challenging for introductory students and even for advanced undergraduate and PhD students   2 . Investigations of student difficulties in learning quantum mechanics are important for developing curricula and pedagogies that help students develop a solid grasp of quantum mechanics [24,. Measurement outcomes are, in general, probabilistic. Since we only have information about the probability distribution of measurement outcomes in a given quantum state of a system, the expectation value of an observable is a central concept in quantum mechanics. However, few prior research studies have focused on the conceptual and procedural difficulties upper-level undergraduate and PhD students have with expectation values of physical observables when using Dirac notation, a compact and convenient notation used extensively in quantum mechanics.
Here we discuss an investigation of difficulties that upper-level undergraduate and PhD students have with the expectation values of observables in a generic quantum state when making use of Dirac notation in courses in which this notation was used extensively. The expectation value is the average value of an observable when measurements of that observable are made on a large number of identically prepared quantum systems, and it is used frequently in quantum mechanics since measurement outcomes are probabilistic rather than deterministic. In quantum mechanics, for each observable Q, there is a corresponding Hermitian operator $\hat{Q}$. When the quantum system is in a state $|\Psi\rangle$ and an observable Q is measured in an experiment, one obtains an eigenvalue of $\hat{Q}$. Therefore, the expectation value $\langle\Psi|\hat{Q}|\Psi\rangle$ in a given quantum state $|\Psi\rangle$ can be found by summing, over all possible measurement outcomes, the probability of obtaining a particular eigenvalue of $\hat{Q}$ multiplied by that eigenvalue. Furthermore, expectation values of physical observables obey time evolution equations that are analogous to those in classical mechanics (Ehrenfest's theorem). Ensuring that students in quantum mechanics courses conceptually understand the meaning of expectation value and develop proficiency in calculating it is important.
Here we first focus on an investigation of the difficulties that advanced students have with the expectation value after traditional instruction in quantum mechanics courses. Then, we discuss how the research on students' difficulties was used as a guide to develop a quantum interactive learning tutorial (QuILT) to improve students' understanding of these concepts. The QuILT uses a guided inquiry-based approach to learning and was developed using an iterative approach to development and assessment.
Below, we start with a brief background on the expectation value of an observable that upper-level undergraduate and PhD students learn in a quantum mechanics course. We then describe the methodology for the investigation of students' difficulties and categorise the difficulties found. Next, we discuss the development and assessment of the QuILT including data from upper-level undergraduate and PhD students suggesting that the QuILT was effective in improving students' understanding of the expectation value of an observable.

Background
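If the states $\{|q_n\rangle,\ n = 1, 2, 3, \ldots, \infty\}$ form a complete set of eigenstates of a Hermitian operator $\hat{Q}$ corresponding to an observable Q with non-degenerate discrete eigenvalues $q_n$ (i.e., $\hat{Q}|q_n\rangle = q_n|q_n\rangle$), one can find the expectation value of the observable Q in a generic state $|\Psi\rangle$ in terms of the eigenstates and eigenvalues of $\hat{Q}$ by expanding $|\Psi\rangle$ as a linear superposition of these eigenstates, $|\Psi\rangle = \sum_n \langle q_n|\Psi\rangle\,|q_n\rangle$, which yields
$$\langle\Psi|\hat{Q}|\Psi\rangle = \sum_n q_n\,|\langle q_n|\Psi\rangle|^2 .$$
Moreover, if the states $\{|q\rangle\}$ are a complete set of eigenstates of $\hat{Q}$ with continuous eigenvalues $q$ (i.e., $\hat{Q}|q\rangle = q|q\rangle$) and the identity operator in terms of the eigenstates of $\hat{Q}$ is $\hat{I} = \int_{-\infty}^{+\infty} |q\rangle\langle q|\,dq$, then, using an approach very similar to that used for the discrete case, the expectation value of Q in state $|\Psi\rangle$ in terms of the eigenstates $|q\rangle$ and eigenvalues $q$ is
$$\langle\Psi|\hat{Q}|\Psi\rangle = \int_{-\infty}^{+\infty} q\,|\langle q|\Psi\rangle|^2\,dq .$$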
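The intermediate steps behind the discrete and continuous results can be sketched as follows. This is a standard textbook derivation written out for convenience, assuming orthonormal eigenstates and the completeness relation $\hat{I} = \sum_n |q_n\rangle\langle q_n|$ for the discrete case; it is not reproduced from the course materials discussed here.

```latex
\begin{align}
\langle\Psi|\hat{Q}|\Psi\rangle
  &= \langle\Psi|\hat{Q}\,\hat{I}|\Psi\rangle
   = \sum_n \langle\Psi|\hat{Q}|q_n\rangle\langle q_n|\Psi\rangle \\
  &= \sum_n q_n\,\langle\Psi|q_n\rangle\langle q_n|\Psi\rangle
   = \sum_n q_n\,|\langle q_n|\Psi\rangle|^2 , \\[6pt]
\langle\Psi|\hat{Q}|\Psi\rangle
  &= \int_{-\infty}^{+\infty} \langle\Psi|\hat{Q}|q\rangle\langle q|\Psi\rangle\,dq
   = \int_{-\infty}^{+\infty} q\,|\langle q|\Psi\rangle|^2\,dq .
\end{align}
```

In both cases the second step uses the eigenvalue equation, and the last step uses $\langle\Psi|q_n\rangle = \langle q_n|\Psi\rangle^{*}$, so each term is an eigenvalue weighted by the probability of measuring it.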

Methodology for the investigation of student difficulties
Student difficulties were investigated by administering multiple-choice and open-ended questions to upper-level undergraduate and PhD students in quantum mechanics courses after traditional instruction in relevant concepts. The traditional lecture-based instruction included discussions of the identity operator, probabilities of obtaining outcomes after a measurement of a physical quantity, and derivations of expectation values of physical quantities corresponding to operators with discrete and continuous eigenvalues. We observed difficulties on these questions which were administered on in-class quizzes and exams. The undergraduate students were enrolled in an upper-division junior/senior level undergraduate quantum mechanics course and the PhD students were enrolled in a first year core quantum mechanics course. Table 1 lists the questions that were administered to students as part of this investigation. The multiple-choice question was administered to 184 upper-level undergraduate students after traditional instruction as part of a quiz at four US universities (see question Q1  in table 1). The open-ended quiz and exam questions were administered to undergraduate and PhD students after traditional instruction in quantum mechanics at the University of Pittsburgh over several years (see questions Q2 and Q3 in table 1). The number of students answering the open-ended questions Q2 and Q3 is different in table 1 because, in some of the years, undergraduate students were not given question Q3. Since the performances on quizzes and exams were comparable, we present consolidated data here. The data from different years was combined because student performance and difficulties were similar in different years.
The open-ended questions Q2 and Q3 were graded using rubrics which were developed by the two investigators together. A response received full credit (3 points) if the student obtained the correct final answer by inserting the identity operator, using an expansion of the generic state $|\Psi\rangle$, or reasoning conceptually that the expectation value is the sum over all eigenvalues of $\hat{Q}$, each multiplied by the probability of obtaining that eigenvalue after a measurement of the observable Q corresponding to the operator $\hat{Q}$. Students earned 2.5 points if they wrote a correct final answer with no reasoning or if they forgot to define the coefficients in the expansion of the generic state $|\Psi\rangle$. Students earned 2 points if they used a summation instead of an integral or vice versa. Students earned 1.5 points if they attempted to insert the identity operator or use an expansion of a generic state $|\Psi\rangle$ but did not determine the correct final answer. Students earned 1 point if they inserted a projection operator instead of the identity operator. All other responses were scored as zero. A subset of the open-ended questions was graded separately by the investigators. After comparing the grading of the open-ended questions, the investigators discussed any disagreements in grading and resolved them with a final inter-rater reliability of better than 95%.
Student difficulties were also investigated by conducting individual interviews with 23 upper-level undergraduate and PhD student volunteers enrolled in the quantum mechanics courses (not necessarily the same students who answered the written questions). The individual interviews employed a think-aloud protocol to better understand the rationale for students' written responses. During the semi-structured interviews, we asked students to 'think aloud' while answering the questions. Students first read the questions on their own and answered them without interruptions except that they were prompted to think aloud if they were quiet for a long time. After students had finished answering a question to the best of their ability, we asked them to further clarify issues that they had not clearly addressed earlier while thinking aloud.
Students' reasoning on questions in interviews was used as a guide to generate categories of difficulties, and student responses on open-ended questions were coded into these categories. A subset of student responses on the open-ended questions was coded separately by two of the researchers to determine categories of difficulties. After comparing codes, any disagreements were discussed until full agreement was reached.

Table 1. Questions involving expectation value that were administered to undergraduate students (UG) and PhD students (G) and the number of students (N) answering the questions. The correct answer is bolded.
Q1. $\{|q_n\rangle,\ n = 1, 2, 3, \ldots\}$ forms a complete set of orthonormal eigenstates of an operator $\hat{Q}$ corresponding to a physical observable with non-degenerate eigenvalues $q_n$. $\hat{I}$ is the identity operator. Choose all of the following statements that are correct.
[Answer choices for Q1 and the rows for Q2 and Q3 not recovered.]

Below, we summarise the common conceptual and procedural difficulties involving the expectation value that were observed in written responses and interviews:

Failing to reason about the expectation value conceptually
In interviews, students were asked to determine the expectation value and describe conceptually what the expectation value means. Very few students reasoned conceptually that the expectation value of Q is the average of a large number of measurements on identically prepared systems to determine that $\langle\Psi|\hat{Q}|\Psi\rangle = \sum_n q_n |\langle q_n|\Psi\rangle|^2$. Even when students were able to elaborate on their answers in interviews, they rarely mentioned that the expectation value is an average of a large number of measurements on identically prepared systems and is quantitatively represented as a sum over the probabilities of obtaining outcome $q_n$ multiplied by the eigenvalue $q_n$. This difficulty occurred regardless of whether or not they understood that the probability of obtaining eigenvalue $q_n$ after a measurement of the observable Q corresponding to operator $\hat{Q}$ is $|\langle q_n|\Psi\rangle|^2$. Most students used a formal approach to evaluate the expectation value. While some students followed correct procedures such as inserting the identity operator in terms of the eigenstates of the operator or expanding the generic state $|\Psi\rangle$ as a linear superposition of the eigenstates of the operator, many students who tried to use these methods got lost along the way. The fact that so few students were able to reason conceptually about how to determine the expectation value suggests that even upper-level undergraduate and PhD students often prefer 'plug and chug' methods as opposed to developing a coherent conceptual understanding that can facilitate the use of the simpler conceptual approaches (which are significantly less prone to error). PhD students were more facile in using the identity operator to determine the expectation value than undergraduate students. Therefore, the percentages of undergraduate and PhD students answering Q2 and Q3 correctly in table 2 are very different. However, written responses and interviews with undergraduate and PhD students suggest that many of them did not realise that the expectation value is the average of a large number of measurements on identically prepared systems and that they could have reduced their chances of making a procedural mistake if they had used a conceptual approach to find the expectation value.

Table 2. Distribution of student responses on question Q1 and average scores of undergraduate students (UG) and PhD students (G) on questions Q2 and Q3. Percentages of students providing the correct answers are bolded in the multiple-choice question Q1.

Table 3 should be consulted for the specific percentages of students displaying this difficulty.
When evaluating $\langle\Psi|\hat{Q}|\Psi\rangle$, some students wrote incorrect expressions for the operator $\hat{Q}$ acting on state $|\Psi\rangle$, e.g., $\hat{Q}|\Psi\rangle = q_n|q_n\rangle$ or $\hat{Q}|\Psi\rangle = q_n|\Psi\rangle$, because they incorrectly reasoned that an operator $\hat{Q}$ acting on a generic state $|\Psi\rangle$ will yield an eigenstate and/or eigenvalue of the operator $\hat{Q}$. This confusion was often due to conceptual difficulty with quantum measurement. In particular, interviewed students with this type of response often incorrectly claimed that an operator acting on a generic state ($\hat{Q}|\Psi\rangle$) describes the measurement process and the right-hand side of the equation, e.g., $\hat{Q}|\Psi\rangle = q_n|q_n\rangle$, is the 'outcome' of the measurement process [16]. For example, one student reasoned: '$\hat{Q}|\Psi\rangle = q_n|q_n\rangle$ because by generalised statistical interpretation, an operator acting on a general state will yield an eigenvalue of that operator with probability $|\langle q_n|\Psi\rangle|^2$.'
This type of difficulty has been observed in other contexts as well [16]. Also, some students inappropriately interchanged the states $|\Psi\rangle$ and $|q_n\rangle$ when finding the expectation value. Interviews and written responses suggest that instead of recalling that a generic state $|\Psi\rangle$ can be written as a linear superposition of a complete set of eigenstates of an operator, some students thought that $|\Psi\rangle$ can be written as an eigenstate of $\hat{Q}$ when finding the expectation value of Q. For example, some students correctly wrote $\langle\Psi|\hat{Q}|\Psi\rangle$ and then arbitrarily replaced the generic state $|\Psi\rangle$ with the eigenstate $|q_n\rangle$. One student who stated that $|\Psi\rangle = |q_n\rangle$ in this context wrote: [expression not recovered]. In addition to not understanding how a generic state differs from an eigenstate of $\hat{Q}$ and replacing $|\Psi\rangle$ with $|q_n\rangle$, this student (and many others) made several procedural mistakes. For example, without justification, the student introduced a summation over index i going from 1 to n, but the index i is never used in the expression he was summing over (he summed over i but used the index n when writing the eigenstate $|q_n\rangle$). Moreover, instead of using a Kronecker delta, he used a Dirac delta function, which diverges when $q_n' = q_n$.

Table 3. Percentages of undergraduate (UG) and PhD (G) students (out of those who attempted to answer the question) who displayed various difficulties with the expectation value.

Difficulty                                                                UG     G
Expectation value of an observable Q (the corresponding operator $\hat{Q}$ has discrete eigenvalues $q_n$)
  Incorrect expression for the expansion of $|\Psi\rangle$                2%     5%
  Incorrect expression for expectation value                              8%     2%
  Attempting to use the identity operator but getting lost along the way  2%     4%
  Blank                                                                   11%    3%
Expectation value of an observable Q (the corresponding operator $\hat{Q}$ has continuous eigenvalues $q$)
  Incorrect expression for expectation value                              11%    3%
  Attempting to use the identity operator but getting lost along the way  5%     7%
  Blank                                                                   45%    6%

Another student, who arbitrarily replaced the generic state $|\Psi\rangle$ with $|q_n\rangle$, wrote: [expression not recovered]. This student also introduced a sum (although it is over the index n) but did not justify where it came from. Interviews suggest that at least some students who incorrectly replaced $|\Psi\rangle$ with $|q_n\rangle$ introduced a summation because they remembered that the expectation value of an observable involves a summation.
Students who claimed that the operator $\hat{Q}$ acting on $|\Psi\rangle$ yields, e.g., $\hat{Q}|\Psi\rangle = q_n|q_n\rangle$, sometimes had the same final incorrect answer as students who arbitrarily replaced $|\Psi\rangle$ with $|q_n\rangle$. For example, one student incorrectly reasoned that [expression not recovered]. Another student who claimed that $|\Psi\rangle = |q_n\rangle$ wrote: '[expression not recovered] and $\langle q_n|\Psi\rangle$ is the component $q_n$ in $\Psi$.' These two students had the same final answer despite their different reasoning. In interviews and some written responses, it was clear whether a student claimed that the operator $\hat{Q}$ acting on $|\Psi\rangle$ yields, e.g., $\hat{Q}|\Psi\rangle = q_n|q_n\rangle$, or instead replaced $|\Psi\rangle$ with $|q_n\rangle$. However, since the two difficulties can lead to the same final answer in written responses, it was sometimes unclear which category a response should be coded into. Thus, the two difficulties were combined into one category.

Incorrect expression for the expansion of |Ψ〉
Some students wrote incorrect expansions of $|\Psi\rangle$, e.g., $|\Psi\rangle = \sum_n q_n|q_n\rangle$. Table 3 should be consulted for the specific percentages of students displaying this difficulty. For example, one student stated: '$|\Psi\rangle$ can be expanded as a sum of eigenstates of $\hat{Q}$, $|\Psi\rangle = \sum_n q_n|q_n\rangle$.' This student incorrectly claimed that the eigenvalues $q_n$ of the operator $\hat{Q}$ were the expansion coefficients $c_n$ when $|\Psi\rangle$ is expanded in terms of a complete set of eigenstates $|q_n\rangle$. If this were the case, the expansion coefficients would always be the same regardless of what $|\Psi\rangle$ actually is. Another student reasoned that $|\Psi\rangle = \sum_n |q_n\rangle$ and wrote: [expression not recovered]. Interestingly, this student did not change the generic state $|\Psi\rangle$ in the 'bra' state. We note that this type of reasoning may have led some students to incorrectly select statement 3 in question Q1 in table 1. These types of difficulties demonstrate that students have some correct knowledge; for example, they know that one can write $|\Psi\rangle$ as a superposition of the eigenstates of a generic operator $\hat{Q}$ and use this linear superposition to find the expectation value $\langle\Psi|\hat{Q}|\Psi\rangle$. However, interviews suggest that they often struggle to determine the appropriate expansion of $|\Psi\rangle$ or the coefficients of the expansion, partly because they do not have a conceptual understanding of what the expansion coefficients mean.
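The point that the expansion coefficients depend on the state, whereas the eigenvalues do not, can be checked numerically. The following is a minimal sketch using only the Python standard library; the two-dimensional example (eigenstates of a spin-1/2 $S_x$ operator in the $S_z$ basis, with $\hbar = 1$) and all names in it are illustrative assumptions, not taken from the survey questions.

```python
import math

# Hypothetical example: eigenvalues and eigenstates of S_x for spin-1/2,
# written in the S_z basis with hbar = 1.
s = 1 / math.sqrt(2)
eigenvalues = [0.5, -0.5]
eigenstates = [[s, s], [s, -s]]


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))


def expansion_coefficients(psi):
    """Return c_n = <q_n|psi> for each eigenstate |q_n>."""
    return [inner(q, psi) for q in eigenstates]


def rebuild(coefficients):
    """Return the components of sum_n c_n |q_n>."""
    return [sum(c * q[i] for c, q in zip(coefficients, eigenstates))
            for i in range(2)]


psi_a = [1, 0]          # one normalised state
psi_b = [0.6, 0.8j]     # a different normalised state

coeffs_a = expansion_coefficients(psi_a)
coeffs_b = expansion_coefficients(psi_b)

# Correct expansion: the coefficients depend on the state and rebuild it.
rebuilt_b = rebuild(coeffs_b)

# Incorrect expansion: using the eigenvalues as coefficients produces one
# fixed vector, no matter which state we started from.
wrong = rebuild(eigenvalues)
```

Here `rebuilt_b` reproduces `psi_b`, while `wrong` is the same vector for every choice of state, which is the contradiction noted above.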

Incorrect expression for the expectation value
Some students wrote an incorrect expression for the expectation value, e.g., $\langle q_n|\hat{Q}|\Psi\rangle$, in which the 'bra' and 'ket' states are not the same. Table 3 shows the specific percentages of students displaying this difficulty in the open-ended questions. For example, one student wrote: [expression not recovered]. These types of difficulties indicate that many students are not aware of the fact that the expectation value is found by 'sandwiching' the operator between the 'bra' and 'ket' states in which the expectation value is evaluated, i.e., $\langle\Psi|\hat{Q}|\Psi\rangle$.

Attempting to use the identity operator but getting lost along the way
Some students were aware of the fact that one could find the expectation value by inserting the identity operator in the expression for the expectation value, but they had difficulty with the procedure and/or got lost along the way. Table 3 should be consulted for the specific percentages of students displaying these difficulties. For example, one common difficulty was an inability to distinguish between identity and projection operators. Other students had the correct expression for the identity operator but were unable to determine the expectation value correctly. For example, one student wrote the following: [expression not recovered]. This student was able to correctly insert the identity operator but did not define $\Psi(q_n)$ and left his final answer in terms of the operator $\hat{Q}$. Another student wrote: [expression not recovered]. This student was able to correctly insert the identity operator, but then stated that $\hat{Q}|q\rangle = q$ without the state $|q\rangle$ on the right-hand side. Similar difficulties have been observed in other contexts as well. For example, prior research shows that some students believe that the Hamiltonian operator $\hat{H}$ acting on an energy eigenstate $|\psi_n\rangle$ yields the corresponding eigenvalue $E_n$, i.e., $\hat{H}|\psi_n\rangle = E_n$ [16]. Other students believe that the position operator $\hat{x}$ acting on a position eigenstate $|x'\rangle$ yields the corresponding eigenvalue $x'$, i.e., $\hat{x}|x'\rangle = x'$ [31]. Some students physically justify their incorrect responses of this type by claiming that the Hamiltonian operator acting on its eigenstate corresponds to the measurement of energy and should yield energy on the right-hand side of the equation or that the position operator acting on its eigenstate corresponds to the measurement of position and should yield position on the right-hand side.
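The distinction between the identity operator and a projection operator, mentioned above as a common stumbling block, can be illustrated with a small numerical check. This is a sketch under stated assumptions: the two-dimensional orthonormal basis and the helper names (`outer`, `add`, `matmul`, `close`) are hypothetical and chosen only for illustration.

```python
import math

s = 1 / math.sqrt(2)
# Illustrative orthonormal eigenstates |q_1>, |q_2> in a two-dimensional space.
eigenstates = [[s, s], [s, -s]]


def outer(ket, bra):
    """Matrix of |ket><bra| (the bra components are complex-conjugated)."""
    return [[k * complex(b).conjugate() for b in bra] for k in ket]


def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


identity = [[1, 0], [0, 1]]
projectors = [outer(q, q) for q in eigenstates]

# Completeness: summing |q_n><q_n| over ALL eigenstates gives the identity.
completeness = add(projectors[0], projectors[1])

# A single term is a projection operator: P squared equals P, but P is not I.
p1 = projectors[0]
p1_squared = matmul(p1, p1)
```

Only the full sum over eigenstates can be inserted into $\langle\Psi|\hat{Q}|\Psi\rangle$ without changing its value; inserting a single projector such as `p1` keeps just one term of the sum.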

Development and validation of the QuILT
The difficulties described in the previous section indicate that students struggled with expectation values in the context of Dirac notation after traditional instruction in relevant concepts. Therefore, we developed a QuILT that takes into account the common difficulties found and strives to help students develop a better grasp of these concepts. The research-validated QuILT is inspired by a model of student learning centred on Vygotsky's notion of the 'zone of proximal development' [59]. The 'zone of proximal development' refers to the zone defined by the difference between what a student can do on his/her own and what a student can do with the help of an instructor who is familiar with his/her prior knowledge and skills. Scaffolding support is at the heart of this model: carefully crafted learning tools can be used to stretch students' learning beyond their current knowledge. Furthermore, a cognitive task analysis of the underlying concepts from an expert's perspective [60] was also used as a guide to develop the QuILT. The cognitive task analysis involves a careful analysis of the underlying concepts in the order in which those concepts should be invoked and applied in each situation to accomplish a task (i.e., answer the quantum physics questions in our case). The QuILT actively engages students in the learning process using a guided inquiry-based approach in which various concepts build on each other. It strives to provide appropriate scaffolding support to students in order to help them remain in the 'zone of proximal development'. The QuILT can be used in upper-level undergraduate and PhD-level quantum mechanics courses after students have had instruction in relevant topics.
The development of the QuILT went through a cyclic, iterative process which included the following stages before the in-class implementation: (1) Development of the preliminary version based on a cognitive task analysis of the underlying knowledge and research on student difficulties with relevant concepts. (2) Implementation and evaluation of the QuILT by administering it individually to students and obtaining feedback from faculty members who are experts in these topics. In addition to written free-response and multiple-choice questions administered to students in various classes, we conducted individual interviews with 23 student volunteers. The interviews used a think-aloud protocol to better understand the rationale for their responses throughout the development of various versions of the QuILT and the corresponding pretest and posttest administered to students before and after they engaged in learning via the QuILT. After each individual interview with a particular version of the QuILT (along with the administration of the pretest and posttest), modifications were made based upon the feedback obtained from the interviewed students. For example, if students got stuck at a particular point and could not make progress from one question to the next with the scaffolding already provided to them, suitable modifications were made to the QuILT. Thus, the administration of the QuILT to several PhD students and upper-level undergraduate students individually was used to ensure that the guided inquiry-based approach was effective and the questions were unambiguously interpreted. The QuILT was also iterated several times with two PhD students who conduct physics education research and three faculty members to ensure that the content and wording of the questions were appropriate. Modifications were made based upon their feedback. 
The QuILT strives to provide enough scaffolding to allow students to build a robust knowledge structure while keeping them engaged in the learning process. When we found that the QuILT was working well in individual administration and the posttest performance was significantly improved compared to the pretest performance, it was administered to upper-level undergraduate students in various classes.

Structure of the QuILT
The QuILT includes a pretest to be administered right after traditional instruction on the relevant concepts but before students engage with the QuILT and a posttest to be administered after students work on the QuILT. The pretest is not returned to students but the posttest is returned to them after grading. The questions on the pretest and posttest are the same and in free-response format (not multiple-choice). The free-response format requires that students generate answers based upon a robust understanding of the topics as opposed to via memorisation. The QuILT begins with a 'warm-up' that builds on students' prior knowledge about a vector in a physical three-dimensional vector space they are familiar with from introductory physics and helps them make connections between a force vector $\vec{F}$ in a physical three-dimensional vector space and a quantum state vector $|\Psi\rangle$ in an abstract vector space. Then, students learn about the basics of Dirac notation including scalar products before learning about the expansion of a state using a complete set of eigenstates of an operator corresponding to different observables, probability distributions for measurement of observables and expectation values of observables in a given quantum state, projection operators, and completeness relations. The last section of the QuILT focuses on connecting Dirac notation with position and momentum representations. Here, we will only focus on the development and evaluation of the part of the QuILT focusing on expectation values. Investigations of student difficulties with other aspects of Dirac notation such as inner products, the expansion of a state vector in terms of a complete set of eigenstates, the completeness relation, and probability distribution of measurement outcomes and how the Dirac notation QuILT addresses those difficulties are described in prior work [18, 29-32, 49, 58].
The QuILT strives to help students remain in the 'zone of proximal development' by explicitly bringing out common conceptual difficulties found via investigations and then providing appropriate scaffolding to help them develop a coherent understanding. Throughout the QuILT, students select an answer based on their understanding up to that point and then are given opportunities to check the answer via follow up questions and discussions with a peer. If a student's answer is inconsistent with the correct answer, further scaffolding is provided throughout the QuILT to ensure that students remain in the 'zone of proximal development.' The QuILT is best used in class to give students an opportunity to work together in small groups and discuss their thoughts with peers, which provides peer learning support. However, students can be asked to work on the parts they could not finish in class as homework. Students can also be asked to work on the entire QuILT as a self-paced learning tool as long as the pretest and posttest are administered in class as an incentive for students to engage with it. Below, we give some typical examples of how some of the common difficulties found via research are incorporated and how student learning is scaffolded via the QuILT.

Helping students integrate conceptual and quantitative reasoning about expectation value
In the investigation of student difficulties, most students did not reason conceptually about the fact that the expectation value of an observable is the average of a large number of measurements on identically prepared systems. Some students also wrote incorrect expressions for the expectation value. The following questions are part of a guided inquiry-based learning sequence in the QuILT that strives to help students integrate conceptual understanding and quantitative reasoning when learning about expectation values:
• Consider the following statement from a student: '$|\langle q_n|\Psi\rangle|^2$ is the probability of measuring $q_n$ when you measure observable Q in the state $|\Psi\rangle$. The expectation value is the average value of a large number of measurements performed on identically prepared systems. Since we know the probability of measuring each eigenvalue $q_n$ of the operator $\hat{Q}$, the expectation value is $\langle\Psi|\hat{Q}|\Psi\rangle = \sum_n q_n |\langle q_n|\Psi\rangle|^2$.'
Students are then asked to determine an expression for the expectation value $\langle\Psi|\hat{Q}|\Psi\rangle$ of an operator $\hat{Q}$ with eigenstates $|q\rangle$ (which form a basis in an infinite-dimensional vector space) with continuous eigenvalues $q$ on their own.
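The conceptual statement that the expectation value is the average over many measurements on identically prepared systems can be illustrated with a simple simulation. The sketch below uses only the Python standard library; the two eigenvalues, the amplitudes $\langle q_n|\Psi\rangle$, the number of simulated measurements, and the random seed are all assumptions chosen for illustration.

```python
import random

# Illustrative two-outcome observable; eigenvalues and state are assumptions.
eigenvalues = [1.0, -1.0]
amplitudes = [0.5, 0.5j * 3 ** 0.5]      # <q_n|Psi> for a normalised state

# Probability of each outcome is |<q_n|Psi>|^2.
probabilities = [abs(a) ** 2 for a in amplitudes]

# Conceptual route: sum over outcomes of (probability x eigenvalue).
expectation = sum(p * q for p, q in zip(probabilities, eigenvalues))

# Simulate measuring Q on many identically prepared systems and average.
rng = random.Random(0)                    # fixed seed for reproducibility
outcomes = rng.choices(eigenvalues, weights=probabilities, k=100_000)
sample_mean = sum(outcomes) / len(outcomes)
```

Every simulated measurement returns one of the eigenvalues, never the expectation value itself; only the average over many runs approaches `expectation`.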

Addressing the incorrect claim that an operator $\hat{Q}$ acting on $|\Psi\rangle$ yields, e.g., $\hat{Q}|\Psi\rangle = q_n|q_n\rangle$ or $\hat{Q}|\Psi\rangle = q_n|\Psi\rangle$, or arbitrarily replacing $|\Psi\rangle$ with $|q_n\rangle$ (or $|\Psi\rangle$ with $|q\rangle$)
In the investigation of student difficulties, some students wrote incorrect expressions for the operator $\hat{Q}$ acting on a generic state $|\Psi\rangle$, e.g., $\hat{Q}|\Psi\rangle = q_n|q_n\rangle$ or $\hat{Q}|\Psi\rangle = q_n|\Psi\rangle$, because they incorrectly reasoned that an operator $\hat{Q}$ acting on a generic state $|\Psi\rangle$ will yield an eigenstate and/or eigenvalue of the operator $\hat{Q}$. Other students inappropriately interchanged the generic state $|\Psi\rangle$ with the eigenstates $|q_n\rangle$ of the operator $\hat{Q}$ when finding the expectation value because they thought that $|\Psi\rangle$ can be written as an eigenstate of $\hat{Q}$ when finding the expectation value of Q. Although these students correctly recalled that a generic state $|\Psi\rangle$ can be written as a linear superposition of a complete set of eigenstates of an operator, they then incorrectly assumed that $|\Psi\rangle$ can be written as an eigenstate of $\hat{Q}$ when finding the expectation value of the observable Q. In the QuILT, students work through a guided inquiry-based learning sequence that strives to help them write the operator $\hat{Q}$ in terms of its eigenvalues and eigenstates and verify that [expression not recovered]. After this sequence of questions, the support is reduced when students are asked to determine an expression for a Hermitian operator $\hat{Q}$ corresponding to an observable with eigenstates $|q\rangle$ with continuous eigenvalues $q$ in terms of the eigenstates $|q\rangle$ and eigenvalues $q$. Following these questions and peer discussion, additional questions provide further guidance in concrete contexts.
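One standard way to write an operator in terms of its eigenvalues and eigenstates is the spectral decomposition $\hat{Q} = \sum_n q_n |q_n\rangle\langle q_n|$; whether the QuILT uses exactly this form is an assumption here. The following sketch, with an illustrative two-dimensional basis and hypothetical helper names, builds $\hat{Q}$ this way and shows that acting on a generic state gives neither $q_n|q_n\rangle$ nor $q_n|\Psi\rangle$.

```python
import math

s = 1 / math.sqrt(2)
eigenvalues = [0.5, -0.5]            # illustrative q_n (hbar = 1)
eigenstates = [[s, s], [s, -s]]      # illustrative |q_n>, real components

# Build Q = sum_n q_n |q_n><q_n| (components are real, so no conjugation).
Q = [[sum(q * v[i] * v[j] for q, v in zip(eigenvalues, eigenstates))
      for j in range(2)] for i in range(2)]


def apply(op, ket):
    """Return the components of op acting on ket."""
    return [sum(op[i][j] * ket[j] for j in range(2)) for i in range(2)]


def parallel(u, v, tol=1e-12):
    """True if the two-component vectors u and v are proportional."""
    return abs(u[0] * v[1] - u[1] * v[0]) < tol


psi = [0.6, 0.8]                     # generic normalised state
q_psi = apply(Q, psi)

# Acting on an eigenstate returns the eigenvalue times the same eigenstate...
q_on_eigenstate = apply(Q, eigenstates[0])

# ...but acting on a generic state gives sum_n q_n <q_n|psi> |q_n>, which is
# in general proportional neither to |psi> nor to any single |q_n>.
proportional_to_psi = parallel(q_psi, psi)
proportional_to_eigenstate = any(parallel(q_psi, v) for v in eigenstates)
```

For this choice of state both flags are `False`, which is the numerical counterpart of the statement that $\hat{Q}|\Psi\rangle$ does not describe the outcome of a measurement.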

Helping students write a correct expression for the expansion of $|\Psi\rangle$ using a complete set of eigenstates of the operator $\hat{Q}$
In the investigation of student difficulties, some students wrote incorrect expansions of $|\Psi\rangle$, e.g., $|\Psi\rangle = \sum_n q_n|q_n\rangle$, when writing an expression for the expectation value of the generic observable Q. The following guided inquiry-based sequence in the QuILT helps students learn how to write an expansion of $|\Psi\rangle$ using a complete set of eigenstates of the operator $\hat{Q}$, taking into account the common student difficulties found:
• Consider the following conversation between two students:
○ Student A: If we write $|\Psi\rangle$ as a linear superposition of the eigenstates of $\hat{Q}$, we obtain $|\Psi\rangle = \sum_n a_n|q_n\rangle$, where $a_n$ is a complex coefficient.
○ Student B: I agree with you. But we know that the expansion coefficients, $a_n$, are the eigenvalues $q_n$ of the operator $\hat{Q}$. So we can write $|\Psi\rangle$ as a linear superposition of the eigenstates of $\hat{Q}$ like this: $|\Psi\rangle = \sum_n q_n|q_n\rangle$.
With whom do you agree? Explain your reasoning.
After this question, further scaffolding is provided to help students realise that the expansion coefficients of a state $|\Psi\rangle$ are not the eigenvalues $q_n$ of the operator $\hat{Q}$ and learn how to write the expansion coefficients $a_n$ explicitly in terms of $|\Psi\rangle$ and $|q_n\rangle$.

Helping students use the identity operator to calculate expectation value
Students sometimes attempted to find the expectation value of an observable by inserting the identity operator in the expression for the expectation value, but many of them had difficulty with the procedure. Several guided inquiry-based sequences in the QuILT provide students scaffolding when using the identity operator to calculate expectation values of operators with discrete and continuous eigenvalues. The questions help students integrate conceptual and procedural skills when using the identity operator to find the expectation values of observables.

Evaluation of the QuILT
After the QuILT appeared to be effective in individual administration to students during interviews, it was administered to students (N = 87) over four years in upper-level undergraduate quantum mechanics courses. It was administered in two different types of upper-level quantum mechanics courses. In one of the courses (which we will call Course 1), undergraduate students had traditional lecture-based instruction in relevant concepts, including determining expectation values using Dirac notation in the context of an N-dimensional Hilbert space. In the second course (which we will call Course 2), students had traditional lecture-based instruction in spin-1/2 and spin-one systems. Students also learned about Dirac notation in the context of two-dimensional and three-dimensional Hilbert spaces pertaining to spin, but not N-dimensional Hilbert spaces. Students in this latter course had traditional instruction in how to find expectation values in the context of spin in a generic spin state both via the sum over all possible outcomes times their probabilities and using matrix multiplication (in the context of spin-1/2 and spin-one). However, the students in Course 2 mostly used the matrix multiplication method to find the expectation values of observables in the context of spin systems in their homework. All students in Course 1 and Course 2 were given a pretest that included the questions Q2 and Q3 shown in table 1. All students had sufficient time to work through the pretest. Then, students worked through the QuILT in class and were given one week to work through the rest of the QuILT that they could not finish in class as homework. The pretest and QuILT counted as a small portion of their homework grade for the course. The pretest was not returned to students. The students were then given a posttest in class (all students had sufficient time to take the posttest). The posttests were graded for correctness as a quiz in the quantum mechanics course.
In addition, students were aware that topics discussed in the QuILT could also appear in future exams since the tutorial was part of the course material.
The QuILT was also administered to PhD students (N = 97) who were simultaneously enrolled in the first semester of a graduate-level core quantum mechanics course and a course for training teaching assistants in two consecutive years. In the teaching assistant training class, the PhD students learned about instructional strategies for teaching introductory physics courses (e.g., tutorial-based approaches to learning). They first worked on the pretest (all students had sufficient time to take the pretest). The PhD students worked through the QuILT in the teaching assistant training class to learn about the effectiveness of the tutorial approach to teaching and learning. They were given one week to work through the rest of the QuILT as homework. Then, a posttest was administered to the PhD students in class (all students had sufficient time to take the posttest). The PhD students were given credit for completing the pretest, QuILT, and posttest, but they were not given credit for correctness. The PhD students' scores on the posttest did not contribute to the final grade for the teaching assistant training class (which was a Pass/Fail course).
To evaluate the effectiveness of the QuILT in improving students' ability to correctly identify an expression for expectation value, we compared the scores on question Q1 (shown in table 1) of students who worked on the QuILT versus students who did not work on the QuILT. Table 4 shows the distribution of students' responses to question Q1 for the students who did not work through the QuILT (non-QuILT group) and for those who did (QuILT group). Question Q1 was administered to undergraduate students at four universities in the US who did not work through the QuILT but had at least one semester of upper-level undergraduate quantum mechanics. Question Q1 was also administered to undergraduate and PhD students who worked through the QuILT (at least one month after the students had worked through the QuILT) and can be considered to test the retention of students' learning at least one month after working through the QuILT. The performance of the undergraduate and PhD students was not significantly different on question Q1 so we do not differentiate between the two groups in table 4. As shown in table 4, the QuILT group performed better on question Q1 than the non-QuILT group (which had only traditional lecture-based instruction in the relevant concepts).
We note that concepts involving inner products, expansion of a state vector in terms of a complete set of eigenstates, the completeness relation, the identity operator, and probability distributions for measurement outcomes are prerequisites for understanding and deriving expressions for expectation values. We found that the QuILT helped students with these concepts and that difficulties related to these concepts were reduced [18, 29-32, 49, 58]. Here, we focus on the performance of students on questions related to expectation values. Table 5 shows the average scores of undergraduate students on pretest and posttest questions Q2 and Q3 in Course 1 and Course 2. The number of students on the posttest does not match the pretest because students' scores on the posttest were not counted if they did not work through the entire tutorial. Average normalised gain [61] is commonly used to determine how much the students learned from pretest to posttest and takes into account their initial scores on the pretest. It is defined as $\langle g \rangle = (\%\,\mathrm{post} - \%\,\mathrm{pre})/(100\% - \%\,\mathrm{pre})$. The effect size is the difference between the average posttest and pretest scores in units of a pooled standard deviation formed from $\sigma_1$ and $\sigma_2$, where $\sigma_1$ is the standard deviation of the posttest scores and $\sigma_2$ is the standard deviation of the pretest scores [62]. For undergraduate students in Course 1, the effect size on question Q2 is 1.2 and the effect size on question Q3 is 1.4 (which are considered large effect sizes). For undergraduate students in Course 2, the effect size on question Q2 is 0.76 (which is considered a moderate effect size) and the effect size on question Q3 is 1.4 (which is considered a large effect size).
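The two statistics used in this evaluation can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis script: the function names are hypothetical, the normalised gain follows the usual definition in terms of average percentages, the pooled standard deviation is taken as the root mean square of the two standard deviations (one common convention, assumed here), and the input numbers are made up.

```python
import math


def normalised_gain(pre_percent: float, post_percent: float) -> float:
    """Normalised gain: fraction of the possible improvement that was achieved."""
    if pre_percent >= 100.0:
        raise ValueError("pretest average must be below 100% for the gain to be defined")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


def effect_size(mean_pre: float, mean_post: float, sd_pre: float, sd_post: float) -> float:
    """Difference of means in units of a pooled standard deviation.

    Assumption: pooled SD = sqrt((sd_pre^2 + sd_post^2) / 2).
    """
    pooled = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2.0)
    return (mean_post - mean_pre) / pooled


# Hypothetical class averages and standard deviations (in percent), for illustration only
g = normalised_gain(pre_percent=40.0, post_percent=85.0)
d = effect_size(mean_pre=40.0, mean_post=85.0, sd_pre=30.0, sd_post=40.0)
print(f"normalised gain = {g:.2f}, effect size = {d:.2f}")
```

Because the gain is normalised by the room left for improvement, a class that starts near the ceiling can show a large gain from a small absolute change, which is one reason to report the effect size alongside it.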
The lower performance of students in Course 2 as compared to students in Course 1 may partly be due to the fact that students in Course 2 only had traditional lecture-based instruction about expectation values in the context of two-dimensional and three-dimensional Hilbert spaces involving spin-1/2 and spin-one systems and had not learned about N-dimensional Hilbert spaces. In order for the tutorial to be the most effective, students should learn about expectation values in the context of N-dimensional Hilbert spaces in traditional lecture-based format before engaging with the concept via the tutorial. However, table 5 shows that even students in Course 2 who learned from the tutorial as a self-study tool without having lecture-based traditional instruction in N-dimensional Hilbert spaces had respectable gains from the pretest to posttest. Table 6 shows the average scores and normalised gains of PhD students on pretest and posttest questions Q2 and Q3. For PhD students, the effect size on question Q2 is 0.84 and the effect size on question Q3 is 0.55 (which are considered moderate effect sizes).

Table 4. Distribution of student responses on question Q1 for students who did not work through the QuILT (non-QuILT group) and students who worked through the QuILT (QuILT group).
We also examined whether the difficulties found on the pretest after traditional instruction were reduced after students worked through the QuILT. Table 7 shows the percentages of students who displayed various difficulties on questions Q2 and Q3 on the posttest. In table 7, we combine the difficulties of the undergraduate students in Course 1 and Course 2 since they were similar. Comparing table 7 to table 3, we note that many of the difficulties, e.g., writing $\hat{Q}|\Psi\rangle = q_n|q_n\rangle$ or $\hat{Q}|\Psi\rangle = q_n|\Psi\rangle$, replacing $|\Psi\rangle$ with $|q_n\rangle$, and writing an incorrect expression for expectation value, were significantly reduced after students had worked on the QuILT.
We also investigated how well the students retained what they had learned about expectation values after working through the tutorial. In one of the semesters of Course 1 in which the Dirac notation tutorial was used, undergraduate students were asked question Q2 (see table 1) on their midterm and final exams. The midterm exam was given approximately one month after the students worked through the tutorial and the final exam was given approximately two months after the students had worked through it. Table 8 shows the students' average score on question Q2 on the midterm exam and final exam. It is encouraging to note that students' average scores on question Q2 on the midterm and final exams remained approximately the same as the average tutorial posttest scores for undergraduate students in Course 1, indicating that students retained most of what they had learned about determining expectation value.

Table 6. Average scores and normalised gains of PhD students answering pretest and posttest questions Q2 and Q3.

Table 7. Percentages of students who displayed various difficulties on questions Q2 and Q3 on the posttest.
Expectation value of an observable $Q$ (the corresponding operator $\hat{Q}$ has discrete eigenvalues $q_n$)
Incorrect expansion of $|\Psi\rangle$: 4%, 1%
Incorrect expression for expectation value: 2%, 2%
Inserting the identity operator but getting lost along the way: 2%, 1%
Expectation value of an observable $Q$ (the corresponding operator $\hat{Q}$ has continuous eigenvalues $q$)
Incorrect expression for expectation value: 2%, 2%
Attempting to use the identity operator but getting lost along the way: 2%, 3%

Summary
Many faculty members note that topics such as expectation value and Dirac notation are important to discuss in quantum mechanics courses [63]. However, after traditional lecture-based instruction, upper-level undergraduate and PhD students had many common difficulties with the expectation value of an observable $Q$ in terms of the eigenvalues and eigenstates of the corresponding operator $\hat{Q}$ (e.g., when given that the states $\{|q_n\rangle,\ n = 1, 2, 3, \ldots, \infty\}$ are the eigenstates of an operator $\hat{Q}$ corresponding to an observable $Q$ with discrete eigenvalues $q_n$). Many students had difficulty reasoning conceptually about the expectation value and had many procedural difficulties in determining the expectation value. We developed and evaluated a tutorial that helps students integrate conceptual and quantitative reasoning when learning about expectation values. After the development and validation of the tutorial, its in-class evaluation is encouraging: students performed significantly better on the posttest after working on the QuILT than on the pretest after traditional instruction only. In addition, in an end-of-semester survey in one of the undergraduate quantum mechanics courses in which this QuILT was incorporated, many students reported that they felt that the QuILT was very effective in helping them learn these concepts. The QuILT can also be used as a self-paced learning tool.