Evaluating the performance of university course units using data envelopment analysis

Abstract The technique of data envelopment analysis (DEA) for measuring the relative efficiency has been widely used in the higher education sector. However, measuring the performance of a set of course units or modules that are part of a university curriculum has received little attention. In this article, DEA was used in a visual way to measure the performance of 12 course units that are part of a Photogrammetry curriculum taught at Aalto University. The results pinpointed the weakest performing units, i.e. units where the provided teaching efforts might not be adequately reflected in the students’ marks in the unit. Based on the results, a single unit was considered to offer poor performance with respect to its teaching resources and was selected as a candidate for revision of its contents. Financial resources were not used as such; instead, the performance of students in previous pre-requisite units was used as the inputs. For clarity, a single output covering the overall student performance in the examined unit was used. The technique should be widely applicable assuming the grade point averages of the students who took the course unit are available along with the marks obtained in the evaluated units and their pre-requisites.


AUTHOR BIOGRAPHY
Sami El-Mahgary obtained his licentiate degree in Computer Science in 2013 and is now a doctoral student focusing on the retrieval and analysis of data related to the performance of students in the courses they have taken. Petri Rönnholm is a Senior University Lecturer and is active in both academic and research work, being the author of over 70 publications. Hannu Hyyppä was Research Director at Helsinki University of Technology at the time the research was carried out, and is now Technology Manager at the School of Civil Engineering and Building Services at Helsinki Metropolia University of Applied Sciences and a docent at Aalto University. Henrik Haggrén is a full professor of Photogrammetry. His primary research interest is in developing photogrammetry as a medium for creative and innovative imaging applications. Jenni Koponen received her M.Sc. from Helsinki University of Technology, and is currently working as an educational developer.

PUBLIC INTEREST STATEMENT
This article applies a technique developed for measuring relative efficiency, known as data envelopment analysis (DEA for short), in a novel application. Rather than measuring the efficiency of universities or their departments, DEA was used to measure the performance of a set of university course units that were all taught at an engineering department in a Finnish university. By taking into consideration the students' standings in the prerequisites for the examined units as well as their performance in these same units, an efficiency score (0-100%) was obtained for each course unit. Units with a relatively low score were examined more closely, as they might be proving too difficult for the students and are also likely to make poor use of the teaching resources. We found the results useful as they pinpointed a course unit which required our attention as to how its contents could be improved and its teaching carried out more effectively.

Introduction
The performances of major universities are often monitored, assessed and ranked at regular intervals. Such an assessment is usually based on a set of specific criteria, generally known as performance indicators or PIs (Barnetson & Cutright, 2000). For instance, two common PIs used to measure university research are the number of publications and the number of supervised Ph.D. theses (Tzeremes & Halkos, 2010; Köksal & Nalçaci, 2006; Martín, 2007). Another PI often stressed is the students' job placement upon graduation (Maingot & Zeghal, 2008; Martín, 2007). Job placement may, however, reflect a university's reputation more than teaching quality itself. Nevertheless, the studies by Biggeri and Bini (2001) and Mohamad Ishak, Suhaida, and Yuzainee (2009) confirm that a wealth of PIs has indeed been developed for use in higher education.
Even though the reliability of PIs has been questioned by several sources as shown in a compilation by Harvey (1999), there are simple and widely accepted indicators in higher education such as the grade-point average (GPA) that are not particularly prone to bias or results distortion. The GPA, a basic indicator of student performance, is readily available from a student database system as pointed out by Young (1993) and is easy to compute, being simply the arithmetic mean of the grades obtained from the course units (also known as modules) taken by the student. If the GPA is further weighted according to the number of credits or European Credit Transfer System (ECTS) points awarded for completing each course unit (units for short), we get the Weighted Grade-Point Average or WGPA, often a slightly more accurate reflection of a student's performance.
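The difference between the two averages can be sketched in a few lines of Python. The function names and the three-unit transcript below are hypothetical and serve only to illustrate the arithmetic described above:

```python
def gpa(grades):
    """Unweighted grade-point average: the arithmetic mean of the grades."""
    return sum(grades) / len(grades)


def wgpa(grades, credits):
    """Credit-weighted average: each grade counts in proportion to its ECTS."""
    weighted_sum = sum(g * c for g, c in zip(grades, credits))
    return weighted_sum / sum(credits)


# Hypothetical transcript (not taken from the study's data).
grades = [5, 3, 2]
credits = [6, 4, 2]  # ECTS points of the corresponding units

print(round(gpa(grades), 2), round(wgpa(grades, credits), 2))
```

In this made-up case the weighted average is higher than the plain mean because the best grade was obtained in the unit carrying the most credits.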
The primary purpose of this study is to identify those core curriculum course units with the lowest efficiency scores, as such units will have the weakest return on the allocation of their teaching resources. It is expected that such units would benefit the most from a revision of their contents so as to lead to improved student learning. The relative performance score of each course unit is measured based on the performance of the students who completed the unit, with respect to how well the students were prepared for taking the unit in question.
The technique used here is a simple application of data envelopment analysis (DEA), developed by Charnes, Cooper, and Rhodes (1978) who applied it to measuring the efficiency of similar organizations in the public sector. These organisations whose efficiencies were measured were referred to as decision-making units or units for short, to stress the fact that a unit can, independently, make decisions to try and improve its performance by reducing expenses, for example. The term "units" in DEA thus refers to the collection of units whose efficiency is being measured and which are subject to similar operating or teaching practices. Furthermore, in the DEA technique, each PI is known as a factor and is further classified as an input or as an output. An input is a PI (or factor) which expresses the consumption of a resource or takes into account some qualitative trait; an output is a factor which expresses the transformation of a resource or describes a qualitative trait of that transformed resource. Inputs and outputs are defined so that an increase in an input value does not result in a decrease in any output value.
The DEA technique has been widely applied to measuring the efficiency of many kinds of different units such as universities and their departments: Beasley (1995), Hanke and Leopoldseder (1998), Johnes (2006a), Bobe (2009), Tzeremes and Halkos (2010), and Alwadood, Noor, and Kamarudin (2011). These six studies, summarised in Table 1, differ basically as to whether financial resources are incorporated into one of the inputs. Instead of yearly budgets, in Johnes (2006a) input is measured through the quality of entering students by including their A-level exam scores while the inputs in Alwadood et al. (2011) take into account the faculty to student ratio and the total credit hours offered to students. Interestingly, the study by Johnes (2006a) is based on the premise that females achieve better scores than males, which means that the DEA analysis in question expects more out of females than males. In the study by Beasley (1995), research income has a dual nature: it acts as an output to approximate the number of publications and as an input to emphasize that it is a resource that should be invested wisely.
DEA was chosen for this study because unlike traditional PIs which measure quality or effectiveness on an absolute scale, DEA measures relative efficiency. That is, the efficiency of each unit is not computed against an ideal level of performance that may never be achievable in practice, but rather against the set of other units in the study. This means that the result set will always contain at least one efficient unit with a 100% relative efficiency score. With DEA, the relative efficiency of each unit is obtained from the ratio of multiple outputs to multiple inputs.
When a unit is relatively efficient, its output/input ratio is optimal among the examined units, requiring no further increase in any of its output values or decrease in any of its input values. An inefficient unit, on the other hand, will have a relative efficiency score of less than 100% and will have to reduce all its input values (while keeping its outputs constant) by a factor equal to the efficiency score so as to become efficient. This is illustrated in detail in the section dealing with results.

DEA studies on the efficiency of teaching
Despite the many relative efficiency assessments in higher education, most studies have focused either on an inter-comparison of different universities or on comparing different departments within the same university, as attested by the studies compiled in Bobe (2009) and Johnes (2006b). Even though the use of DEA in higher education was mentioned already in the late 1970s by Lindsay (1982), few studies have hitherto been devoted to an assessment of the syllabus or the curriculum of course units in a university department. This cannot be merely due to the lack of suitable analytical tools for conducting such a study.
Notes to Table 1: (a) Papers in foreign journals were given more weight than conference and discussion papers. (b) Department utilisation includes faculty to student ratio, mean student credit hours and the number of graduate students.
Until the late 1990s, the use of DEA within a classroom and teaching context in higher education was still practically non-existent. As Becker (2004) put it, "DEA could be used to determine whether the teacher and/or student exhibits best practises … unfortunately, no one … in education research has yet used DEA in a meaningful way for classroom teaching practises". According to Ekstrand (2006), it is only following the new millennium that higher education efficiency studies shifted their focus to the efficiency of modules or units within a particular university. The stochastic frontier analysis by Ekstrand (2006) made use of a group of 94 students at a Swedish university who took a macroeconomics unit to test whether this macroeconomics unit exhibited any inefficiency. The score on the final exam was used as the output, and three inputs were used to estimate a student's preparation for the exam. These were: (1) the student's knowledge of the course unit's material at the onset of taking the unit, (2) the time spent by the student in studying the material for the unit, and (3) the student's attendance record for the unit.
To our knowledge, DEA has been used only once in measuring the effectiveness of teaching at the course unit level, in the concise study by Sarkis and Seol (2010). However, that study did not compare the effectiveness of different courses, but rather measured the teaching effectiveness of a single instructor using student evaluation data such as the students' perception of the instructor's ability to clearly present the teaching material and organise the unit. The present study, on the other hand, examines several course units and avoids subjective data by relying solely on the academic achievements of the involved students.

Some Considerations about a DEA Assessment
It was mentioned earlier that DEA measures relative efficiency through a ratio of the outputs to inputs. Actually, this efficiency ratio is a sum of the weighted outputs over the sum of the weighted inputs. In fact, the good news about DEA is that these weights are the unknowns, which means each unit can put more emphasis on those inputs or outputs in which it has fared better, thus allowing that unit to appear in the best possible light (Boussofiane, Dyson, & Thanassoulis, 1991). By letting the DEA technique determine the weights separately and optimally for each unit under consideration, we are guaranteed that the efficiency score obtained for a particular unit is indeed the best possible score given the selected factors and the set of all examined units.
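For reference, the weighted ratio described above is commonly written as the fractional programme of Charnes, Cooper, and Rhodes (1978) in its basic form. The notation is an assumption of this sketch and is not used elsewhere in the article: unit 0 is the unit being evaluated, j runs over the n examined units, y_{rj} and x_{ij} are the output and input values of unit j, and u_r and v_i are the unknown weights.

```latex
\max_{u,\,v}\; h_0 \;=\; \frac{\sum_{r=1}^{s} u_r\, y_{r0}}{\sum_{i=1}^{m} v_i\, x_{i0}}
\quad \text{subject to} \quad
\frac{\sum_{r=1}^{s} u_r\, y_{rj}}{\sum_{i=1}^{m} v_i\, x_{ij}} \;\le\; 1
\quad (j = 1, \dots, n),
\qquad u_r \ge 0,\; v_i \ge 0 .
```

With the single output and two inputs used in this article, s = 1 and m = 2. Because the programme is solved once per unit, each unit receives its own most favourable set of weights.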
The not so good news is that there is no guarantee that all factors will be taken into account when computing a unit's relative efficiency score. In other words, a unit may obtain a high efficiency score at the expense of overlooking certain factors, that is, by assigning zero weights or minimal, near-zero weights to one or more factors. Such a unit's efficiency score, as Sarrico and Dyson (2000) aptly put it, is due to a "judicious choice of weights rather than good performance". There is also the concept of inefficient odd units, or units that are inefficient but are not homogenous with the rest of the units. An odd unit could be due to incorrect data, which as pointed out by Metters, Vargas, and Whybark (2001), can easily distort the DEA results. The low score of an inefficient unit may thus be in part attributable to random data errors rather than to authentic inefficiency (Barros & Weber, 2009). Finally, an odd inefficient unit may also simply imply that the unit is too different from the rest of the units and cannot be thus accurately rated. This is illustrated in the section entitled "Results".
To help avoid such situations, this work restricts the number of factors to three, which minimizes the chances that a certain factor is overlooked, and more importantly, it allows for visualising the results in a simple two-dimensional graph as shown later in the article. Being able to visualise the results using simple two-dimensional charts as shown in El-Mahgary and Lahdelma (1995) improves managerial understanding of DEA and avoids the use of DEA extensions such as those mentioned in Emrouznejad, Parker, and Tavares (2008) that are used to restrict the number of factors that can be overlooked, i.e. assigned a near-zero weight.
The reader desiring a lucid introduction to DEA is referred to Boussofiane et al. (1991) and to the pragmatic book by Norman and Stroker (1991), while a glossary of key DEA terms is given in El-Mahgary (1995). A rigorous analysis of DEA in the framework of economic efficiency can be found in Murillo-Zamorano (2004).
The rest of this article is organised as follows: first we discuss the aims of teaching, then focus on what it is exactly that we aim to measure, followed by an in-depth look at how the technique was applied to our sample. Finally, the results are presented along with conclusions.

Teaching and its assessment
While Ramsden (1991) stresses the difficulties in defining good teaching, an immediate aim of teaching is, of course, to provide the student with a deeper understanding of the contents in a course. However, a poorly planned or carried out curriculum can render learning an inefficient process. Possible negative factors include a difficult subject area, students' or teachers' lack of motivation, poor or out-dated teaching methods or facilities, poorly-written literature or lecture material, or a generally unsatisfactory atmosphere that is not conducive to learning. An immediate consequence of poor teaching is generally poor student performance, leading to weak academic integration, which, as pointed out by Longden (2006), increases the likelihood that the student will eventually interrupt his/her studies.
An interesting Australian study by Taylor (2001) shows how pressure emanating from outside the university (in this case the Ministry of Education) can also force instructors to lower their grading standards without changing any of the teaching methods. As a participant in the study from university X relates: "… the government basically said either pass these [students] or … get a penalty". In response, the failure rate at university X dropped from 23% to 11% in a single year among first year students (Taylor, 2001).
The characteristics of students taking the course units can also affect overall performance. As mentioned earlier, Johnes (2006a) referred to evidence that females would outperform males in their studies for a university degree. However, because our study focuses on a curriculum of units that are mostly taken by the same set of students, factors such as student gender, age and marital status can be expected to play a minor role.
Typically, study programmes or curriculums contain units that can vary significantly in their degree of difficulty as well as in the way the units are conducted. When one examines several units taught by different instructors, one wonders whether it is, in fact, accurate to make an inter-comparison of such modules. As attested by Young (1993), grading standards differ not just between college institutions but even at the level of the individual instructor. There are, of course, simple ways in which the instructor can, at his/her discretion, increase the average grade of the students taking the unit, such as:
• Lowering the requirements for the unit.
• Increasing the number of learning activities other than formal lectures (such as excursions or in-class use of multimedia) and awarding more points for attendance.
• Administering final exams that are easier than expected.
To avoid situations where instructors can improve the scores of PIs without actually intervening in the pedagogy of the course, this study uses a DEA-based indicator to estimate the performance of students in a unit relative to other units of the same curriculum. The use of DEA also avoids the inherent bias that may be present in instruments that are solely based on student evaluation of teaching (SET). Beran and Violato (2005) surveyed some 371,000 student ratings over a three-year period and noticed that with SET, laboratory-type units are rated higher than lectures or tutorials. The comprehensive work by Crumbley, Henry, and Kratchman (2001) unearthed how students may resent and "punish" those teachers who require more effort from the students. The study by Brockx, Spooren, and Mortelmans (2011) suggests that there is a correlation between the grade received by the student and how the student evaluates his/her teacher. In light of this, our study seeks to answer the following two fundamental questions using an objective approach:
(1) How well do the units match the abilities of the students?
(2) Is a particular unit proving to be too difficult/too easy for students?
If a unit is found to have a weak performance score, then it is likely that students are not truly gaining the required knowledge and skills from the unit, the importance of which was raised by Maingot and Zeghal (2008). It is expected that if a unit is a pre-requisite for a more advanced one, then a successful completion of the pre-requisite unit would be reflected in the student's performance on the more advanced units. In other words, the teaching resources are incorporated into the study as measures of the overall student performance in the pre-requisite course units. These resources are then expected to reflect the overall student performance for the unit under evaluation. The better students fare in the pre-requisites of the unit under measurement, the better final marks they are expected to achieve in the unit being evaluated. Constant returns to scale are assumed to exist between the inputs (resources) and the output (overall student performance in the unit).

Measuring Performance in a Course Unit
We need a way of measuring the performance of students in a single unit that is part of the curriculum, before we can say anything about the suitability of the curriculum. A straightforward way to measure this would be to take the average of the students' grades in the course unit. This average is referred to here as the Mean of the Grades in a Unit for a particular course unit. In this article, we adopt a grading system which is prevalent in Nordic universities, based on a six point scale, where the value "5" denotes outstanding achievement and "0" indicates a failure in a course as summarised in Table 2. The interested reader will find more on the equivalence of grading schemes in a report by Duke University (2011).
To help one identify the areas of the curriculum that are proving difficult to students and thus require immediate attention, we need a simple set of analytical tools that nevertheless provide reliable results. Since a direct comparison of the mean of the grades of two different units, even when taught at the same institution, might lead to erroneous conclusions, we use a more robust performance index, PCOURSE, that is based on the WGPA. The value for PCOURSE, for a student i who took course unit c, is obtained as follows: (1) subtract the student's WGPA from his/her grade (G) for unit c, and finally, to avoid negative values, (2) add "5" to the obtained difference. This is summarised in the following simple equation:
$$\mathrm{PCOURSE}_{i,c} = G_{i,c} - \mathrm{WGPA}_{i} + 5 \qquad (1)$$
As "5" is the maximum awarded grade, PCOURSE will have the range [0, 10]. The performance index PCOURSE reaches the maximum theoretical value of "10" when a student with a very low WGPA (we assume a WGPA of zero for practical purposes) obtains the best grade, "5", in a given unit. In such a case, the unit can be considered to be extremely easy with respect to the abilities of the student and there is thus no evidence of difficulties. In the other extreme, when a student with a very high WGPA (now we assume a WGPA of "5") obtains the lowest possible grade, "0" in a given unit, PCOURSE will be zero, indicating maximum difficulty in that particular course unit.
The use of PCOURSE can be illustrated with a simple hypothetical example for three course units C1-C3 (each worth 4 ECTS) taken by only three students as shown in Table 3. The general academic abilities of students are reflected in their WGPA measure. In this case, student "973" is considered to have the best abilities of the group (WGPA = 4.1). Note that in the last row, the mean of the grades for C2 = 3.3, suggesting that students generally performed well in unit C2. However, its PCOURSE = 4.7 which is below average (below 5), indicating a small level of performance difficulty in course unit C2.
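The computation of PCOURSE can be sketched as follows. Only the WGPA of student "973" (4.1) comes from the text; the other two students, their WGPA values and all three grades are hypothetical, chosen so that the rounded means agree with the figures quoted for one unit (3.3 and 4.7). They are not the actual entries of Table 3.

```python
def pcourse(grade, wgpa):
    """PCOURSE for one student in one unit: grade minus WGPA, shifted by 5."""
    return grade - wgpa + 5


# (student id, WGPA, grade in the unit); ids "A" and "B" are made up.
records = [("973", 4.1, 4), ("A", 3.5, 3), ("B", 3.3, 3)]

mean_grade = sum(grade for _, _, grade in records) / len(records)
mean_pcourse = sum(pcourse(grade, w) for _, w, grade in records) / len(records)

print(round(mean_grade, 1), round(mean_pcourse, 1))
```

A mean grade above 3 looks comfortable in isolation, yet the mean PCOURSE falls below 5 because these students do slightly worse in the unit than their overall WGPA would suggest.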
In practice, however, we take the mean of PCOURSE of all students who took the course unit during a certain period. The mean is denoted with PCOURSE* and describes the level of difficulty for the whole unit. The measure PCOURSE* can be additionally improved by taking into account the number of credits awarded for each unit. Assuming that the average credits awarded for a unit is 4 ECTS, and that larger course units are, on the whole, harder for the student, as typically there is a wider range of material to be covered, the following weighting scheme can be used:
$$\mathrm{WPCOURSE}^{*} = k \times \mathrm{PCOURSE}^{*} \qquad (2)$$
where for units with ECTS values of 6 and 5, k = 1.05 and k = 1.025 respectively, and for ECTS values of 3 and 2, k = 0.975 and k = 0.95 respectively. A course that awards 4 ECTS will have k = 1. The number of ECTS points seemed the most straightforward way to adjust the variable k.
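A minimal sketch of the credit adjustment follows. The values of k are those listed above; treating k as a multiplier of PCOURSE* is an assumption of this sketch, and the function and table names are hypothetical.

```python
# k values per ECTS size, as listed in the text.
K_BY_ECTS = {2: 0.95, 3: 0.975, 4: 1.0, 5: 1.025, 6: 1.05}


def wpcourse_star(pcourse_star, ects):
    """Credit-adjusted unit score; assumes k scales PCOURSE* multiplicatively."""
    return K_BY_ECTS[ects] * pcourse_star


# A 4 ECTS unit is left unchanged, while a 6 ECTS unit is scaled up by 5%.
print(wpcourse_star(4.7, 4), wpcourse_star(5.0, 6))
```

Under this reading the adjustment is small, at most 5% in either direction, so it nudges rather than reorders the unit scores.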

The Factors Used
As the introduced measure WPCOURSE* is about the students' achievement in a unit, it is clearly an output. Additionally, we will need two inputs so that the total number of factors amounts to three. With DEA, inputs are factors that get transformed into outputs by undergoing a transformation process that is usually value-adding. In this case, the added value depends on how well students are prepared for taking a particular unit. Prior to taking a certain unit, students should have completed certain pre-requisite units to help them get the most out of teaching and succeed in the unit in question. The better the students complete their pre-requisites, the better the results that can be expected. To approximate how well students are prepared for taking a certain unit, two measures will be used. The pre-requisites for each of the examined units are listed in Table 8 in the Appendix.
The first measure relating to the pre-requisites, called PCMATHS, computes the percentage of students who have completed the required mathematics pre-requisites on time, that is, before taking the unit. The other measure, PTGRADE3, computes the percentage of students who have obtained the grade of "3" ("good") or better in the pre-requisites.
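The two input measures can be sketched as simple percentages over the students of a unit. The record layout below (keys `unit_taken`, `maths_passed`, `prereq_grades`) and the four students are hypothetical. The sketch also assumes that a student counts towards PTGRADE3 only if every pre-requisite grade is 3 or better, which is one possible reading of the definition above.

```python
from datetime import date


def pcmaths(students):
    """Percentage of students who passed the maths pre-requisites before the unit."""
    on_time = sum(
        1 for s in students
        if s["maths_passed"] is not None and s["maths_passed"] < s["unit_taken"]
    )
    return 100.0 * on_time / len(students)


def ptgrade3(students):
    """Percentage of students with grade 3 or better in all pre-requisites (assumed reading)."""
    good = sum(
        1 for s in students
        if s["prereq_grades"] and min(s["prereq_grades"]) >= 3
    )
    return 100.0 * good / len(students)


# Hypothetical students of one unit; None means the maths pre-requisite was never passed.
students = [
    {"unit_taken": date(2008, 5, 20), "maths_passed": date(2007, 12, 15), "prereq_grades": [4, 3]},
    {"unit_taken": date(2008, 5, 20), "maths_passed": date(2009, 1, 10), "prereq_grades": [3, 5]},
    {"unit_taken": date(2008, 5, 20), "maths_passed": None, "prereq_grades": [2, 4]},
    {"unit_taken": date(2008, 5, 20), "maths_passed": date(2008, 1, 30), "prereq_grades": [1, 2]},
]

print(pcmaths(students), ptgrade3(students))
```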
Since these two pre-requisite measures relate to the use of resources, they can be classified as inputs. The factor WPCOURSE*, on the other hand, is clearly an output since it measures student performance. These three factors (summarised in Table 4) will be used to estimate the teaching efficiency of units.

The Data Set
To conduct this study, we selected a set of course units from the Department of Surveying at Aalto University's School of Engineering that are part of the curriculum for Photogrammetry. Only courses taken by degree students admitted between 2003 and 2010 were considered and units with a class size of less than ten students were excluded. Non-degree students were omitted from the study because they typically do not complete any specific curriculum. All in all, the sample set consisted of 343 students of which nearly a third (31.5%) were female as shown in Table 5. The coefficient of variation was also computed due to its usefulness as pointed out by Mahmoudvand and Hassani (2009), and it showed little difference in the relative dispersion between females (cv = 20%) and males (cv = 22%).
In passing, we note that we also tested the premise that females outperform males in their academic results. This was done using the GPA of another sample (not shown in Table 5) that consisted of only those students who had finished their studies and graduated with a major in Geodesy and Photogrammetry or in Real Estate Economics and who had been admitted between 2003 and 2010, for a total of 276 students. There were 112 female graduates with a mean GPA of 3.34 (skew = −0.16, kurtosis = −0.53) and 164 male graduates whose mean GPA was 3.14 (skew = 0.29, kurtosis = −0.21). Normality was checked in each test group through plotting and confirmed with both a Shapiro-Wilk test and a Kolmogorov-Smirnov test. A 2-tailed independent parametric t-test supported the hypothesis that the GPA of female students is indeed higher than that of male students (t = 3.32 and p < 0.005, Cohen's d = 0.40).
Finally, unit GED-310 was included in two forms: the basic unit GED-310, which included students who had passed an older, but nearly equivalent course GED-220, and a separate unit GED-310* which included only students who had taken a new revised GED-310 unit.
These 12 units generated a total of 697 records excluding any units taken as a pre-requisite or as a substitute. Each record was made up of the student's id, the course id, the date when the course unit was passed, the name of the instructor, as well as the grade obtained. In addition to these 697 records making up the core curriculum units, the data also contained the pre-requisites for each unit for determining whether a student had completed the necessary pre-requisites on time. The data-set was collected from the university's student database through a special report generator (El-Mahgary & Soisalon-Soininen, 2007), and was then filtered and re-arranged into a suitable spreadsheet format. In Table 6, the values for the inputs are shown in the second and third columns and the output WPCOURSE* is shown in the fourth column. To compute the value of WPCOURSE* for each of the 12 units, the following procedure was used: first, using Equation 1, the value for PCOURSE was computed for each student who took the unit in question. The mean of the values, PCOURSE*, was then converted into WPCOURSE* using Equation 2 and the ECTS of the unit. The values for the inputs PCMATHS and PTGRADE3 for each unit were computed as explained previously using data for the pre-requisites of the unit in question.
It might appear that since we are examining different units taught by different instructors, these units might not be homogenous, thus increasing the risk for introducing bias into results. However, it should be borne in mind that we are not so much contrasting different course units as measuring how a particular set of students can assimilate the contents of different units that are taught in a single curriculum. Even though different course units are being measured, the set of students taking the units is basically the same, and since nearly all students who took a particular unit took it with the same instructor, we believe that homogeneity is maintained. This issue will be addressed further while examining the results.

Results
A DEA analysis was performed on the data sample of 343 students for whom the values of the three factors (PTGRADE3, PCMATHS, WPCOURSE*) were computed and the results are reported in the last column of Table 6. While special DEA software is generally used for finding the set of optimal weights for each factor and calculating the efficiency scores, an accurately drawn two-dimensional chart can be used to get sufficiently accurate values for the efficiency scores as will be shown next.
Since the factors are made up of a single output and two inputs, we are going to draw the two-dimensional graph by using ratios of inputs over the output. In this way, the single output acts as a denominator for both ratios. Equation 3 gives, for each unit c, its x-coordinate through the ratio of the unit's value for the input PCMATHS over the value of the output WPCOURSE*. Similarly, Equation 4 gives, for each unit c, its y-coordinate through the ratio of the unit's value for the input PTGRADE3 over the value of the output WPCOURSE*. The result is multiplied in both equations by 100 for scaling purposes.
$$x_c = \frac{\mathrm{PCMATHS}_c}{\mathrm{WPCOURSE}^{*}_c} \times 100 \qquad (3)$$
$$y_c = \frac{\mathrm{PTGRADE3}_c}{\mathrm{WPCOURSE}^{*}_c} \times 100 \qquad (4)$$
Using the data from Table 6, each course unit c has been plotted into the graph in Figure 1 by computing its (x_c, y_c) coordinates according to Equations 3 and 4, which include the scaling by 100. So for instance, the coordinates for unit GED-205 are obtained through the following: x_c = (53.3%/4.97) × 100, y_c = (58.6%/4.97) × 100, thus placing the unit at the coordinates (10.72, 11.79) in Figure 1.
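The coordinate calculation for this unit can be reproduced with a short sketch. The function name is hypothetical, and the percentages are passed as fractions (0.533 for 53.3%), which is the reading under which the quoted coordinates are obtained.

```python
def chart_coordinates(pcmaths, ptgrade3, wpcourse_star):
    """Input/output ratios used as chart coordinates, scaled by 100.

    pcmaths and ptgrade3 are given as fractions, e.g. 0.533 for 53.3%.
    """
    x = pcmaths / wpcourse_star * 100
    y = ptgrade3 / wpcourse_star * 100
    return x, y


# Values quoted in the text for the unit at (10.72, 11.79).
x, y = chart_coordinates(0.533, 0.586, 4.97)
print(round(x, 2), round(y, 2))  # prints 10.72 11.79
```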
Because we are using the ratio of input(s) over output(s), the efficient units are going to be the ones lying nearest to the origin, since they will have the smallest ratio for x_c and/or for y_c. In other words, the smaller these two ratios for a given unit are, the fewer resources will have been used up per output in the given unit, and hence, the higher the efficiency (or teaching efficacy) score of the unit must be.
As can be seen from Figure 1, there are two units, GED-310 and GED-311, that can be said to lie nearest to the origin. These units are thus relatively efficient and will have an efficiency score of 100%. It is no surprise that the number of efficient units turns out to be two, for as explained by Bessent, Bessent, Elam, and Clark (1988), when there are three factors as in our case, there should be at least two efficient units, because one of the efficient units (unit GED-310 in this case) has the best ratio for x_c while the other unit (GED-311 in this case) has the best ratio for y_c. Stated differently, unit GED-310 shows the best performance for students while taking into account their completion of their math pre-requisites, and unit GED-311 has the best performance for students with respect to how well they completed their pre-requisites for the unit in question. It should be noted that unit GED-311 is relatively efficient mainly due to the low values in its inputs (Table 6), that is, students come to the unit with poor standings in their pre-requisites, yet as they manage to obtain a relatively high score in the unit itself (WPCOURSE* = 5.12), there is no evidence of teaching resources being misused.
These two efficient units are then connected by drawing a piecewise linear connection between them. Moreover, it is customary to extend the line segment connecting the two efficient units through two lines that extend in parallel along each of the two axes. That is, we draw from unit GED-310 a line that extends in parallel along the y-axis and from GED-311 a line that extends in parallel along the x-axis. These extensions parallel to the axes underline the fact that there are no further efficient units beyond units GED-310 and GED-311 and so an extension along the same level is a basic and arguably fair way of extrapolation. These line segments along with the extensions that have been just drawn constitute what is known as the efficiency frontier. The efficiency frontier for our data-set based on twelve units and three factors is portrayed in Figure 1. The term DEA is due to the fact that this efficiency frontier envelops all the inefficient units within, since only units lying on the frontier itself (units GED-310 and GED-311) are relatively efficient. The further a unit lies from the frontier, the greater that unit's relative inefficiency is (implying a lower efficiency score), since the more inputs it will have consumed per output. The units with the lowest efficiency scores must therefore be units GED-101B and GED-204, the ones lying furthest from the efficiency frontier.
Given that with three factors we need just two efficient units to determine the efficiency frontier, there will be only one so-called reference set (also known as a peer group). A reference set contains only 100% relatively efficient units and it determines at least a part of the efficiency frontier. In this case, the reference set is {GED-310, GED-311} and it forms the basis of the efficiency frontier. Note that it would be perfectly possible to have more than two efficient units in the same reference set; in fact, any unit located on the segment between GED-310 and GED-311 would be efficient.
We can now use the efficiency frontier to compute the scores for the inefficient units. Consider unit GED-205, which is marked in Figure 1 as an upright triangle and denoted as point X. To become relatively efficient, GED-205 would have to lie on the hypothetical point GED-205′ on the frontier, denoted as point X′. Unit GED-205′ is therefore what is known as a hypothetical composite efficient unit (HCE unit) for the following two reasons. First, it is a hypothetical unit since it does not actually exist. Second, it is an efficient composite unit because its output/input values can be obtained by combining the inputs and outputs of its reference set, that is, efficient units GED-310 and GED-311. How this is done is, however, beyond the scope of this article; the interested reader is referred to Boussofiane et al. (1991). For our purposes, it suffices to note that point GED-205′ (point X′) is where the HCE unit for GED-205 is located and therefore represents an efficient point.
Since we denoted the location of unit GED-205 as point X and the location of its HCE unit GED-205′ as point X′, the efficiency score for unit GED-205 can be obtained from Figure 1 as the ratio of two distances, that is, of the lengths of segments OX′ and OX, where OX is the ray from the origin to point X (the unit GED-205) and OX′ is the ray from the origin to the point GED-205′, as shown in Equation 5:
$$\text{Efficiency score of GED-205} = \frac{OX'}{OX} = 64.5\% \tag{5}$$
Returning to units GED-101B and GED-204, we see from Figure 1 that since they are further away from the efficiency frontier than unit GED-205, their efficiency scores must be less than 64.5%, for they require a greater increase in their output and/or decrease in their inputs than unit GED-205 before reaching the efficiency frontier and becoming efficient. In general then, inefficient units can be ranked according to their efficiency scores. However, as the scores are relative, it is worth remembering that some of the scores are likely to change if a new unit that turns out to be efficient is introduced into the set of course units.
Given that unit GED-205 is inefficient, we would like to know what its input values and output value should be in order for it to become efficient, that is, an HCE unit. The reduced input value(s) and augmented output value(s) that turn an inefficient unit into an efficient one are known as target values. To determine these target values, what is needed are the actual coordinates of point X′, the point where HCE unit GED-205′ is located. One pragmatic way of determining these target values is as follows. First, notice that any point lying on the line OX′ has a constant ratio of PTGRADE3 over PCMATHS. From the inputs for unit GED-205, this constant ratio is 58.6%/53.3% = 1.0994, and it is defined in Equation 6:
$$\text{Ratio (PTGRADE3 over PCMATHS)} = 1.0994 = \frac{T_{\mathrm{PTGRADE3}}}{T_{\mathrm{PCMATHS}}} \tag{6}$$
Equation 6 basically states that the ratio of the target values of the inputs for point X′ is a constant. In fact, this constant value 1.0994 is the slope of the line OX′, and since it is nearly unity, it means that in order to increase the output, it is practically speaking as easy (or as difficult, depending on how one puts it) to decrease the input PTGRADE3 as it is to decrease the other input PCMATHS. If the slope were clearly larger than unity, this would imply that achieving a given increase in the output requires a greater proportionate reduction in the input PTGRADE3 than in the input PCMATHS.
As for the x-coordinate (x′_205) and y-coordinate (y′_205) of the target point X′, instead of trying to read these ratios directly from Figure 1, one approach would be to find the intersection of line OX′ and the line segment between GED-310 and GED-311. There is, however, another simple and accurate way of obtaining these ratios. We can make use of the fact that with DEA, an inefficient unit that is surrounded (or enveloped, as the term is known) by efficient units, as is the case with unit GED-205, can be rendered efficient through its efficiency score. Efficiency for the unit can then be achieved through a proportional reduction in its inputs while keeping its output value(s) constant.
So we will use Equations 3 and 4 to find x′_205 and y′_205 respectively, but instead of using unit GED-205's values for PCMATHS and PTGRADE3 directly, we will use the corresponding values reduced in proportion to its efficiency score. The resulting coordinates of point X′ (the HCE unit of GED-205) are (6.92, 7.61), where the x′-coordinate is the ratio of PCMATHS over WPCOURSE* at point X′ and the y′-coordinate is the ratio of PTGRADE3 over WPCOURSE* at point X′. Expressed mathematically, this yields Equations 7 and 8, where for unit GED-205, T_PCMATHS denotes the target value for PCMATHS, T_WPCOURSE* the target value for WPCOURSE* and T_PTGRADE3 the target value for PTGRADE3.
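The scaling step described above can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes that Equations 3 and 4 are simply each input divided by the output (with the inputs expressed in percent), and it uses only the values quoted in the text for unit GED-205. The function name `ratio_point` is hypothetical.

```python
import math

# Values for unit GED-205 as quoted in the text.
PCMATHS = 53.3      # input 1, in percent
PTGRADE3 = 58.6     # input 2, in percent
WPCOURSE = 4.97     # output (WPCOURSE*)
EFFICIENCY = 0.645  # efficiency score of GED-205

def ratio_point(pcmaths, ptgrade3, wpcourse):
    """Assumed form of Equations 3 and 4: each input divided by the output."""
    return (pcmaths / wpcourse, ptgrade3 / wpcourse)

# Point X: the position of GED-205 in the input-per-output plane.
x, y = ratio_point(PCMATHS, PTGRADE3, WPCOURSE)

# Point X': same output, both inputs scaled down by the efficiency score.
x_t, y_t = ratio_point(PCMATHS * EFFICIENCY, PTGRADE3 * EFFICIENCY, WPCOURSE)

# The score is recovered as the ratio of the distances OX' and OX.
score = math.hypot(x_t, y_t) / math.hypot(x, y)

print(f"X  = ({x:.2f}, {y:.2f})")
print(f"X' = ({x_t:.2f}, {y_t:.2f})")
print(f"OX'/OX = {score:.3f}")
```

Because X′ lies on the same ray through the origin as X, scaling both inputs by the score and taking the ratio of the two distances are two views of the same quantity.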
Equations 6-8, which relate the three factors, are based on the efficient point X′ and so allow us to determine the target values for unit GED-205. In other words, these three equations express the requirements on the target values that will render unit GED-205 efficient. If, for instance, we decide that students who take unit GED-205 should be able to raise their pre-requisites PTGRADE3 up to 60%, then by substituting 60% for T_PTGRADE3 into Equation 6 and solving, we find that T_PCMATHS must be 54.6%. Finally, substituting T_PCMATHS = 54.6% into Equation 7 yields T_WPCOURSE* = 7.89. This means that the set of target values T_PCMATHS = 54.6%, T_PTGRADE3 = 60% and T_WPCOURSE* = 7.89 is one example of target values that render unit GED-205 efficient.
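The substitution above can be written as a short calculation. This is an illustrative sketch only: the function name `targets_from_ptgrade3` is hypothetical, and the inputs are assumed to be expressed in percent (54.6 rather than 0.546), so that the factor of 100 in Equation 7 is already absorbed.

```python
SLOPE = 1.0994    # Equation 6: T_PTGRADE3 / T_PCMATHS
X_TARGET = 6.92   # Equation 7: T_PCMATHS / T_WPCOURSE* (x-coordinate of X')

def targets_from_ptgrade3(t_ptgrade3):
    """Return (T_PCMATHS, T_PTGRADE3, T_WPCOURSE*) for a chosen T_PTGRADE3.

    Assumes inputs are given in percent, e.g. 60.0 for 60%.
    """
    t_pcmaths = t_ptgrade3 / SLOPE      # solve Equation 6 for T_PCMATHS
    t_wpcourse = t_pcmaths / X_TARGET   # solve Equation 7 for T_WPCOURSE*
    return t_pcmaths, t_ptgrade3, t_wpcourse

t_pcmaths, t_ptgrade3, t_wpcourse = targets_from_ptgrade3(60.0)
print(f"T_PCMATHS   = {t_pcmaths:.1f}%")
print(f"T_PTGRADE3  = {t_ptgrade3:.1f}%")
print(f"T_WPCOURSE* = {t_wpcourse:.2f}")
```

Any other choice of T_PTGRADE3 gives a different but equally valid set of targets, since every such set lies on the same ray through point X′.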
Earlier, it was mentioned that an inefficient unit can be made efficient by reducing its input values proportionately, as indicated by its efficiency score, while keeping the output(s) constant. Alternatively, an inefficient unit can be made efficient through an increase in its output value(s), as indicated by the inverse of its efficiency score, while keeping the input value(s) constant. Since the inputs used here are not easily reduced in practice, as they reflect how well a student has completed the pre-requisite course units for the unit under measure, it is more useful to find out by how much the output factor needs to be increased. So unit GED-205 can become efficient by increasing its output by a factor equal to the inverse of its efficiency score, that is, 1/0.645 = 1.55, while keeping both its inputs constant. Since the product of 4.97 and 1.55 is 7.7, unit GED-205 would also be considered efficient if its input values were kept constant at 53.3% (for PCMATHS) and 58.6% (for PTGRADE3) while its output value for WPCOURSE* were raised all the way up to 7.7.
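The two routes to efficiency described above can be compared side by side. The sketch below uses the values quoted for unit GED-205; the function names `input_oriented` and `output_oriented` are hypothetical labels chosen for illustration.

```python
def input_oriented(inputs, output, score):
    """Shrink every input by the efficiency score; keep the output fixed."""
    return [value * score for value in inputs], output

def output_oriented(inputs, output, score):
    """Keep the inputs fixed; raise the output by the inverse of the score."""
    return list(inputs), output / score

inputs = [53.3, 58.6]   # PCMATHS and PTGRADE3 for GED-205, in percent
output = 4.97           # WPCOURSE* for GED-205
score = 0.645           # efficiency score of GED-205

in_inputs, in_output = input_oriented(inputs, output, score)
out_inputs, out_output = output_oriented(inputs, output, score)

print("input-oriented targets: ", [round(v, 1) for v in in_inputs], in_output)
print("output-oriented targets:", out_inputs, round(out_output, 1))

# Both routes land on the same input-per-output ratios, i.e. the same point X'.
print([round(v / in_output, 2) for v in in_inputs])
print([round(v / out_output, 2) for v in out_inputs])
```

The output-oriented route is the one pursued in the text, because the pre-requisite results that serve as inputs cannot realistically be lowered.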
Thus, we have uncovered mathematically another set of target values, T_PCMATHS = 53.3%, T_PTGRADE3 = 58.6% and T_WPCOURSE* = 7.7, that renders unit GED-205 efficient. This is in contrast with the previous set of target values (T_PCMATHS = 54.6%, T_PTGRADE3 = 60% and T_WPCOURSE* = 7.89), which consumes more inputs by having higher pre-requisite values for PCMATHS (54.6% as opposed to 53.3%) and PTGRADE3 (60% as opposed to 58.6%) and therefore correspondingly requires a higher value for the output WPCOURSE*. Finally, we can use Equations 6-8 as a check for the newly obtained target values. Substituting 7.7 for the value of WPCOURSE* into Equation 8 gives T_PTGRADE3 = 58.6% which, when substituted into Equation 6, yields T_PCMATHS = 53.3%, thus validating the obtained target values. These two sets of target values are shown in Table 7. Both sets of target values naturally result in the same HCE unit, that is, unit GED-205′, the one denoted as point X′ in Figure 1.
$$\text{Ratio of } T_{\mathrm{PCMATHS}} \text{ over } T_{\mathrm{WPCOURSE}^*} = 6.92 = \frac{T_{\mathrm{PCMATHS}}}{T_{\mathrm{WPCOURSE}^*}} \times 100 \tag{7}$$
$$\text{Ratio of } T_{\mathrm{PTGRADE3}} \text{ over } T_{\mathrm{WPCOURSE}^*} = 7.61 = \frac{T_{\mathrm{PTGRADE3}}}{T_{\mathrm{WPCOURSE}^*}} \times 100 \tag{8}$$
In addition, the efficiency frontier can be used to identify inefficient odd units. Consider unit GED-TEST, which is a totally imaginary inefficient test unit that has been added to Figure 1 for explanatory purposes only. Its HCE unit is GED-TEST′, which lies outside the efficient segment determined by the reference set {GED-310, GED-311}. Unit GED-TEST is not only inefficient, but is also what is known as a non-enveloped unit. Unit GED-TEST is not properly enveloped because GED-TEST′ (its HCE unit) lies on an extension of the frontier and does not fall within the proper efficiency frontier segment defined by GED-310 and GED-311. Thus GED-TEST′ is enveloped only from its right side by unit GED-311, there being no efficient unit on the left side of GED-TEST′.
As unit GED-TEST has only one efficient unit in its reference set {GED-311}, there is a strong possibility that unit GED-TEST is indeed different from the other units, and the reasons for this should be analysed. Of course, the oddity in GED-TEST might also simply be due to incorrect data. It might be useful to note that GED-TEST′ is not considered by DEA to be efficient because it lies on an extension of the frontier. Unit GED-TEST′ is likely to get a high efficiency score (perhaps in the neighbourhood of 0.99), but its efficiency score will remain less than unity.
Ignoring the imaginary unit GED-TEST, we can say that among the actual units in the study, there is no evidence of an inefficient odd unit, as each inefficient unit has the same reference set that consists of two efficient units as required. Figure 1 is therefore valuable also in the sense that it points out inefficient units that are potentially not homogeneous with the rest of the units. Detecting inefficient odd units without a visualizing graph is tricky, because a zero weight associated with an input/output is not always a sign of a non-enveloped unit, as pointed out by Portela and Thanassoulis (2006). The visualized efficiency frontier thus serves a dual purpose: not only does it provide a quick way of detecting a unit's efficiency, it also helps in detecting inefficient units that might be odd. Furthermore, the efficiency frontier can also pinpoint efficient odd units: if GED-TEST were located instead at the location of GED-TEST′, it would be an example of an efficient unit that is very different from the rest of the units. Since unit GED-TEST would then not belong to the reference set of any inefficient unit, it would, however, have no effect on the efficiency scores of the inefficient units.
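The graphical reasoning used in this section (radial projection onto the frontier, and the check for non-enveloped units) can also be sketched in code for the special case of two inputs, one output and exactly two efficient units. The following is a rough illustration under stated assumptions, not the procedure used in the study: the function `radial_projection`, the unit names U1 to U4 and all coordinates are hypothetical, and `a` is assumed to be the efficient unit with the smallest x-ratio while `b` is the one with the smallest y-ratio.

```python
def radial_projection(point, a, b):
    """Project `point` along the ray from the origin onto the frontier.

    The frontier is assumed to consist of the segment between the efficient
    units `a` (smallest x) and `b` (smallest y), a line from `a` parallel to
    the y-axis and a line from `b` parallel to the x-axis.
    Returns (score, part), where `part` names the piece of the frontier hit.
    """
    px, py = point
    ax, ay = a
    bx, by = b
    # Normal of the line through a and b, pointing away from the origin.
    nx, ny = ay - by, bx - ax
    c = nx * ax + ny * ay
    # Scale factor at which the ray crosses each boundary line; the largest
    # one is where the ray actually meets the frontier.
    candidates = {
        "segment": c / (nx * px + ny * py),
        "extension parallel to y-axis": ax / px,
        "extension parallel to x-axis": by / py,
    }
    part = max(candidates, key=candidates.get)
    return candidates[part], part

# Hypothetical input-per-output ratios, for illustration only.
units = {
    "U1": (2.0, 8.0),    # efficient, best x-ratio
    "U2": (6.0, 3.0),    # efficient, best y-ratio
    "U3": (8.0, 10.0),   # inefficient, projected onto the segment
    "U4": (12.0, 3.1),   # inefficient, projected onto an extension
}

for name, point in units.items():
    score, part = radial_projection(point, units["U1"], units["U2"])
    enveloped = part == "segment"
    print(f"{name}: score = {score:.3f}, hits {part}, enveloped = {enveloped}")
```

In this sketch a unit whose projection falls on one of the axis-parallel extensions is flagged as non-enveloped, which mirrors the role played by the imaginary unit GED-TEST in Figure 1.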

Applying the Results to a Course Unit
It is safe to say that courses that obtain a relative efficiency score of 100% (such as GED-310 and GED-311), or nearly so, show no evidence of a poor return on the allocation of teaching resources. It is also unlikely that such units are proving too difficult for the students. Assuming there is a high PCOURSE* value associated with such a course, then on average, students must be performing better than expected based on their GPA, which is evidence of efficient use of teaching resources. Such high scores might also suggest that the course is not challenging enough for the students, but this is likely to be the case only if students with a low GPA are clearly performing as well as those with a higher GPA, which was not the case in this study.
From the two units that had a clearly weaker efficiency score, we selected course unit GED-101AB, which had the largest number of participants, for an in-depth analysis. The unit with the weakest efficiency score, GED-101B, differs from unit GED-101AB only in its student makeup: GED-101B is made up of Real Estate majors, but the content of both units is the same. With the help of a study counsellor and discussions with students, we concluded that students were indeed experiencing difficulties with the unit. Therefore, a thorough revamping of the module was made after it was discovered that lack of motivation for the unit's subject was one of the reasons for poor student performance.
As to effective teaching methods, the comprehensive study by Bjorklund and Fortenberry (2005) geared at engineering students identified some helpful guidelines. Among them were: (1) to encourage student-instructor interaction, (2) to develop a sense of reciprocity and mutual cooperation among the students (as opposed to rivalry), (3) to make it clear to students that there are high expectations for them, (4) to give students feedback promptly, and (5) to promote so-called "active learning techniques". The idea is thus to create an atmosphere where students are guided through difficult concepts and have someone to turn to (either the instructor or their peers) when they need help in understanding key concepts in engineering and problem solving. In light of this, an alternative way to complete the course unit was also arranged. Instead of having the students sit for a final exam which solely determined their grade for a particular unit, new, extended tutorials were introduced. These tutorials covered all the core material and gave students the experience of putting theory into practice while improving student/teacher interaction. This learn-by-doing method proved successful, as students came better prepared to the units that are taken after GED-101AB. Replacing an exam with written assignments and class discussion is supported by a study of university students majoring in educational psychology by Tynjälä (1997), where two different groups of students completed the same course unit in one of two ways. The first group was a control group that followed traditional teaching methods that involved taking an exam, while the second group was a "constructive learning" group that completed the unit through written assignments, an extensive essay and numerous in-class discussions. In the study, the constructive group of students exhibited more critical thinking as opposed to rote learning and memorization of facts.

Conclusions
DEA is useful because it is not based on regression or some imaginary standard of performance that may never be achieved in practice. Instead, using the input and output values of all units under measurement, an efficiency frontier (the equivalent of a production function) is drawn. Based on this frontier, the teaching efficacy of each unit can then be measured. Moreover, DEA can easily incorporate hundreds of course units as long as they are homogeneous, that is, taught at the same institution and belonging to the same curriculum. When only three factors are used, as in this study, it becomes possible to illustrate and check the efficiency scores in a visual graph, which should help management to better understand the technique. The graph can also pinpoint anomalies in the results, such as an efficient unit that effectively neglected the importance of one of its inputs or outputs.
We feel that without this DEA analysis, we would not have thought that unit GED-101AB needed our attention, for the student feedback did not reveal any significant problems in the way the lectures or tutorials were being carried out. Neither had a core contents analysis for the unit (using Bloom's Taxonomy) indicated any problems. The DEA study gave us a more objective perspective in understanding the sources of the difficulties experienced by students and showed the variations in students' performances between different units.
At first, the aim of the study was to use DEA to measure teaching efficiency, but that proved too daunting a task, as measures such as teacher-student contact time or student satisfaction with the unit would have needed to be included. We also wanted to avoid the use of SET which, according to Ramsden (1991), can be subject to "formidable problems". Instead, we opted to conduct a study that showed the relative performance of each unit, that is, the efficient use of teaching resources allocated to the course unit. In essence, the performance scores of each unit act as a guideline: the higher the efficiency score, the less likely the unit is to need a revision of its contents. As no use of financial resources was made, the teachers' efforts are reflected in the final scores, which are based on the relative student performance in the unit while taking into account the students' previous performance in the pre-requisites and their general academic performance. As the grade obtained in a unit is not used as such but rather compared to the student's WGPA, errors due to subjective differences in evaluating students are reduced. This addresses the concerns about using examination results as PIs when they may be non-comparable due to different grading criteria (Kong & Fu, 2012). Moreover, two independent studies have confirmed that the problem of non-comparable grades is not apparent in sets of course units taken from the same syllabus or curriculum in a university (Young, 1993).
It would be interesting to repeat this study after a few years with a new batch of students. If our revision of the contents of course unit GED-101AB was successful, one would expect the relative efficiency of that unit to be higher. This is built on the assumption that in the new batch, students with relatively high values for the inputs PTGRADE3 and PCMATHS would correspondingly obtain better marks in the revised unit (i.e. reflected in WPCOURSE*), even though their abilities may not be significantly higher than those of the currently examined batch.
A final observation regarding these results is the role played by student motivation and goal orientation. It should be remembered that a gifted student might be obtaining poor scores in a course simply due to a lack of motivation in that particular course subject. Whether or not that lack of motivation is due to something that needs to be improved in the course (i.e. the course material or the lectures) is harder to detect from the obtained results since, as pointed out by Pulkka and Niemivirta (2013), students can perceive the course material differently depending on how goal oriented they are. In some cases a student evaluation of the unit, such as a FASI-type (Formative Assessment of Instruction) instrument, as detailed in Adams and Wieman (2011), may shed light on the reasons behind a poor score.
If a university aims to maintain a standard of excellence in its teaching, then there must be continuous curriculum monitoring that allows for wise distribution of education resources as well as an on-going development of the course units, both in their contents and in their teaching methods. The study presented herein can measure the relative efficiency of course units with relative ease, requiring mainly access to the student information database and suitable software to pre-process the data. Since time and personnel resources for developing teaching are typically limited, the results obtained can be used to focus the development efforts onto those units that most urgently need attention.