Abstract
Most of us are familiar with measurement in the physical world, whether it is measuring today’s maximum temperature, the height of a child or the dimensions of a house. In each case, numbers are assigned on some scale to represent “quantities” that convey properties of attributes of interest to us.
References
AERA, APA, NCME (1999) Standards for educational and psychological testing. American Educational Research Association, Washington
Allen MJ, Yen WM (1979) Introduction to measurement theory. Brooks/Cole Publishing Company, Monterey, California
Cronbach LJ (1951) Coefficient alpha and the internal structure of tests. Psychometrika 16(3):297–334
Gulliksen H (1950) Theory of mental tests. Wiley, New York
Khurshid A, Sahai H (1993) Scales of measurements: an introduction and a selected bibliography. Qual Quant 27:303–324
Lissitz RW (ed) (2009) The concept of validity. Revisions, new directions, and applications. Information Age Publishing, Inc., Charlotte
Lord FM, Novick MR (1968) Statistical theories of mental test scores. Addison-Wesley, Reading
Messick S (1989) Validity. In: Linn R (ed) Educational measurement, 3rd edn. American Council on Education/Macmillan, Washington, pp 13–103
Novick MR (1966) The axioms and principal results of classical test theory. J Math Psychol 3(1):1–18
Nunnally JC, Bernstein IH (1994) Psychometric theory. McGraw-Hill Book Company, New York
OECD (2009) PISA 2009 Assessment framework—key competencies in reading, mathematics and science. Retrieved 28 Nov 2012, from http://www.oecd.org/pisa/pisaproducts/44455820.pdf
Samejima F (1973) Homogeneous case of the continuous response model. Psychometrika 38:203–219
Stevens SS (1946) On the theory of scales of measurement. Science 103:667–680
Thomson M (2003) The application of Rasch scaling to wine judging. Int Edu J 4(3):201–223
UNESCO-IIEP (2004) Southern and Eastern Africa Consortium for monitoring educational quality (SACMEQ) Data Archive. See http://www.sacmeq.org/data_archive.htm
Wilson M (2005) Constructing measures: an item response modeling approach. Lawrence Erlbaum Associates, Mahwah
Wright BD, Masters GN (1982) Rating scale analysis: Rasch measurement. Mesa Press, Chicago
Further Reading
Bartholomew DJ (ed) (2006) Measurement. Volume 1. Sage Publications Ltd.
Brennan RL (ed) (2006) Educational measurement, 4th edn. Praeger Publishers, Westport
Furr RM, Bacharach VR (2008) Psychometrics: an introduction. Sage Publications Ltd, Thousand Oaks
Thorndike RM, Thorndike-Christ T (2010) Measurement and evaluation in psychology and education, 8th edn. Pearson Education, Upper Saddle River
Walford G, Tucker E, Viswanathan M (eds) (2010) The SAGE handbook of measurement. SAGE Publications Ltd, Thousand Oaks
Appendices
Discussion Points
1. Discuss whether latent variables should have a meaningful zero and why it may be difficult to define a zero.
2. Given that there could be a meaningful zero for test scores where zero means a student answered all questions incorrectly, are test scores ordinal, interval or ratio variables? If test scores are used as measures of an underlying ability, what level of measurement are test scores?
3. Is the following a “measurement” instrument? If so, what is the construct being measured?
Car Survey
“What characteristics led to your decision for the specific model?”
Tick four categories
Customer 1 | Customer 2 | Customer 3 | Customer 4 | |
---|---|---|---|---|
Economy | ✓ | ✓ | ||
Handling | ✓ | ✓ | ✓ | |
Interior design | ✓ | ✓ | ||
Exterior design | ✓ | ✓ | ✓ | |
Reliability | ✓ | |||
Price | ✓ | ✓ | ||
Comfort | ✓ | |||
Safety | ✓ | ✓ |
4. Is the following a “measurement” instrument? If so, what is the construct being measured?
Taxi Survey
Rating taxi rides
Melbourne Airport to Kew | Taxi 1 | Taxi 2 | Taxi 3 | Taxi 4 |
---|---|---|---|---|
Comfortable temperature | ✓ | ✓ | ✘ | ✓ |
Driver’s certificate displayed | ✓ | ✓ | ✘ | ✘ |
Uniform correct | ✘ | ✓ | ✘ | ✓ |
Driver presentation | ✓ | ✓ | ✘ | ✓ |
Pleasant odour | ✓ | ✘ | ✘ | ✓ |
Internal cleanliness | ✓ | ✓ | ✘ | ✓ |
External cleanliness | ✓ | ✘ | ✘ | ✓ |
Vehicle handling | ✓ | ✓ | ✘ | ✓ |
Driver quality | ✓ | ✘ | ✘ | ✓ |
Correct change | ✘ | ✘ | ✘ | ✓ |
Politeness | ✓ | ✓ | ✘ | ✓ |
Peak time | ✘ | ✘ | ✓ | ✓ |
Metered fare | $47.10 | $48.40 | $50.40 | $51.00 |
5. Messick (1989) provided a definition of validity as “an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment.” Compare and contrast this definition of validity with what we have discussed in this chapter. Do you think this is a good definition of validity? Provide reasons for your answers.
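For the taxi survey in discussion point 4, one way to start the discussion is to tally the ticks each taxi received. The sketch below does this in Python using the values from the table. Treating the eleven tick-box rows other than "Peak time" as indicators of a single construct, and the names `ratings` and `tick_totals`, are assumptions made for illustration; the survey itself does not specify how it should be scored. Whether such a total is a meaningful measure is the question the discussion point asks.

```python
# Tick (1) / cross (0) ratings copied from the taxi survey table, in the
# order Taxi 1..Taxi 4. "Peak time" and "Metered fare" are left out on the
# assumption that they describe the trip rather than the service.
ratings = {
    "Comfortable temperature": [1, 1, 0, 1],
    "Driver's certificate displayed": [1, 1, 0, 0],
    "Uniform correct": [0, 1, 0, 1],
    "Driver presentation": [1, 1, 0, 1],
    "Pleasant odour": [1, 0, 0, 1],
    "Internal cleanliness": [1, 1, 0, 1],
    "External cleanliness": [1, 0, 0, 1],
    "Vehicle handling": [1, 1, 0, 1],
    "Driver quality": [1, 0, 0, 1],
    "Correct change": [0, 0, 0, 1],
    "Politeness": [1, 1, 0, 1],
}


def tick_totals(table):
    """Return the number of ticks per taxi (column sums of the table)."""
    return [sum(column) for column in zip(*table.values())]


totals = tick_totals(ratings)
for taxi, total in enumerate(totals, start=1):
    print(f"Taxi {taxi}: {total} of {len(ratings)} ticks")
```

A different choice of rows, or a weighting of some rows over others, would give different totals, which is part of what makes the construct question worth discussing.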
Exercises
Q1. The following are some data collected in SACMEQ (Southern and Eastern Africa Consortium for Monitoring Educational Quality, UNESCO-IIEP 2004). For each variable, state whether the numerical coding as shown in the boxes provides nominal, ordinal, interval or ratio measures.
1. PENGLISH
Do you speak English outside school?
(Please tick only one box.)
2. XEXPER
How many years altogether have you been teaching?
(Please round to ‘1’ if it is less than 1 year.)
3. PCLASS
Which Standard 6 class are you in this term?
(Please tick only one box.)
4. PSTAY
Where do you stay during the school week?
(Please tick only one box.)
Q2. Which questionnaire titles in the following list would appear to be about “measurement” (as opposed to a survey)?
Q3. On a mathematics test of 40 questions, Jenny got a score of 14. Eric got a score of 28. Mary got a score of 30.
We can be reasonably confident in concluding that (write Yes or No in the space provided):
1. Jenny is not as good in mathematics as Eric and Mary are. [ ]
2. Mary is better at mathematics than Eric is. [ ]
3. Eric got twice as many questions right as Jenny did. [ ]
4. Eric’s mathematics ability is twice Jenny’s ability. [ ]
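Statements 3 and 4 in Q3 differ in what is being compared: counts of correct answers versus an underlying ability. The sketch below, offered only as an illustration, computes the ratio of Eric's to Jenny's raw scores and then re-expresses all three scores as the log-odds (logit) of the proportion correct. The logit is an assumption introduced here as one possible alternative scale, and the names `scores` and `logit` are hypothetical; the exercise does not prescribe any transformation.

```python
import math

MAX_SCORE = 40  # number of questions on the test, from Q3
scores = {"Jenny": 14, "Eric": 28, "Mary": 30}


def logit(raw, max_score=MAX_SCORE):
    """Log-odds of the proportion correct.

    Undefined for a raw score of 0 or a perfect score.
    """
    p = raw / max_score
    return math.log(p / (1 - p))


# Ratio of the raw counts of correct answers.
raw_ratio = scores["Eric"] / scores["Jenny"]

# The same three scores expressed on the (assumed) logit scale.
logits = {name: logit(raw) for name, raw in scores.items()}

print(f"Raw-score ratio Eric/Jenny: {raw_ratio:.1f}")
for name, value in logits.items():
    print(f"{name}: raw score {scores[name]}, logit {value:.2f}")
```

On the raw-score scale Eric's count is exactly twice Jenny's. On the logit scale Jenny's value is negative and Eric's is positive, so a "twice as much" statement does not carry over from one scale to the other, while the ordering of the three students is the same on both.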
Q4. A movie guide rates movies by showing a number of stars. For example, a movie with 3-and-a-half stars is not as good as a movie with 4 stars (★★★★☆).
What is the most likely measurement level provided by this kind of rating?
Q5. In the context of educational testing, the term “measurement error” usually refers to
Q6. In the context of educational testing, test reliability refers to
Q7. A student with limited proficiency in English sat a Year 5 mathematics test and obtained a poor score due to language difficulties. Is this an issue related to test reliability or validity?
Q8. In a Grade 5 spelling test, there are 20 words. This is a very small sample of all the words Grade 5 students should know. If the test is used to measure students’ spelling proficiency in general, which of the following best describes the likely problems with this test?
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Wu, M., Tam, H.P., Jen, TH. (2016). What Is Measurement? In: Educational Measurement for Applied Researchers. Springer, Singapore. https://doi.org/10.1007/978-981-10-3302-5_1
DOI: https://doi.org/10.1007/978-981-10-3302-5_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3300-1
Online ISBN: 978-981-10-3302-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)