Construct Validity of MSRT Reading Comprehension Module in Iranian Context

The validity of language proficiency tests has attracted considerable research interest in recent decades. The present study examines the construct validity of the reading comprehension module of the Ministry of Science, Research, and Technology (MSRT) test in the Iranian context. After a standard language proficiency test (OPT) was administered, 65 intermediate EFL learners were selected. The participants spent 72 hours learning MSRT techniques and the strategies needed in MSRT test sessions. The MSRT test was then administered to these learners in several institutes. Afterwards, an EFL questionnaire on reading was administered, and the learners were asked to check the skills they had applied during the MSRT reading comprehension test. Both qualitative and quantitative approaches were used to collect and analyze the data: qualitatively, the study collected experts' and test takers' judgments; quantitatively, it ran a factor analysis. The results revealed significant agreement among the experts' judgments, and agreement among the test takers, on the skills measured by the MSRT reading module. However, exploratory factor analysis did not yield findings similar to those of the judgmental phase of the study: the items in the MSRT reading comprehension test did not confirm that the MSRT reading parts assess the reading skills in the Iranian context. This study highlights the importance of designing and using more reliable and valid tests for researchers, test designers, and others interested in the use of such tests.


Introduction
Throughout the history of language testing, measurement has almost always dealt with three fundamental questions, namely, how to test, what to test, and why to test (Canale, 1988; Bachman, 1990; Douglas & Chapelle, 1993; cited in Tavakoli, 2004). The ultimate goal of testing is to measure language ability, which is a theoretical construct and cannot be directly assessed. With the emergence of communicative competence, which emphasized the role of different competencies in using the language, the emphasis shifted from linguistic competence to communicative competence. This shift led language institutes and universities to provide standardized tests such as TOEFL, IELTS, FCE, TOLIMO, and MSRT. The concept of test validity has attracted the attention of many language test constructors in the last few decades. The tendency towards validation studies has been more evident in relation to standardized or universal language proficiency tests. The present investigation explores the construct validity of reading, with specific reference to the reading skills that the MSRT reading tests claim to assess. The study considers the use of the MSRT reading tests with EFL learners in the context of Iran.

Material Studies
Test constructors have devised different tests for different purposes; however, they have to be sure about the abilities being measured by the individual items in their tests. In reading comprehension tests, validity is the most important consideration. Several researchers have raised related issues and problems with reliability when discussing the abilities/skills measured by individual items in reading tests (e.g., Alderson, 1990a, 1990b). Language tests such as the MSRT assess the language proficiency of students. The MSRT is quite new to the Iranian context and has been used to assess the abilities of Ph.D. candidates studying majors other than English at different universities. The present study aims to investigate how such a test is viewed by Iranian candidates and whether the items in the reading paper of the MSRT measure the skills claimed by the MSRT development board. The results of this study might open the way to studying the other universal language tests introduced to the Iranian educational context. Reading comprehension tests are used in assessing all languages.

Reliability and Validity
"Reliability refers to the consistency of the students' scores which is received on alternative forms of the same test" (Wells & Wollack, 2003, p. 2). A test is considered reliable if it produces similar results repeatedly. Reliability is an essential characteristic of a good test, because when a test is not reliable, one cannot rely on its scores as an accurate measure of students' achievement. As the standard error of measurement of a test decreases, its reliability increases. Although reliability is necessary, it is not sufficient by itself, because a test must also be valid. According to Bachman (1995), "reliability is a requirement for validity and the investigation of reliability and validity can be viewed as complementary aspects of identifying, estimating and interpreting different sources of variance in test scores" (p. 239). Traditionally, validity was classified into different types (e.g., content validity, face validity, criterion validity, and construct validity), but according to Messick (1980), this view is inadequate (cited in Bachman, 1995 & Wood, 2001). Since this study focuses on construct validity, its nature and background are explained below.

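The link between the standard error of measurement and reliability mentioned above can be illustrated with the classical test theory formula SEM = SD × sqrt(1 − reliability). The sketch below is only an illustration: the function name and the score standard deviation of 10 points are hypothetical and are not taken from the study's data.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical score standard deviation of 10 points (an assumption for illustration).
# For a fixed SD, a higher reliability coefficient gives a smaller SEM.
for r in (0.60, 0.75, 0.95):
    print(r, round(standard_error_of_measurement(10.0, r), 2))
```

For a fixed spread of scores, the printed SEM shrinks as the reliability coefficient grows, which is the relationship the paragraph describes.
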
Construct Validity
Construct validity concerns the extent to which performance on tests is consistent with predictions that we make on the basis of a theory of abilities, or constructs. In reading, the construct is the ability we wish to test. The present study was concerned with investigating the construct validity of the MSRT test.

Reading Comprehension
Reading comprehension involves thinking, problem-solving, and reasoning, and all these mental processes have their roots in past experience. Consequently, success in reading depends on the reader's ability to make a clear connection between the written material and his/her own past experience. Students cannot grasp the meaning of written sentences from their surface form alone.

Standardized Language Proficiency Testing
A standardized test has standard purposes and criteria that are held constant from one form of the test to another. The criteria in large-scale standardized tests are designed to apply to a broad band of competencies that are usually not exclusive to one particular curriculum. The incorrect impression that all standardized tests consist of items with predetermined responses presented in multiple-choice format is fading these days. While it is true that most standardized tests have a multiple-choice format, multiple-choice is by no means a prerequisite characteristic. The multiple-choice format provides the test procedure with an "objective" means of determining correct and incorrect responses, and is therefore the preferred mode for large-scale tests.

MSRT Test
MSRT stands for "Ministry of Science, Research and Technology". It is a standardized test of English language proficiency. It is managed by the Sanjesh Organization and was established in 2013. Its origins go back to 2002, to the first version of an exam called MCHE ("Ministry of Culture and Higher Education"). It was introduced as a way of testing applicants who are to graduate from national universities with Ph.D. degrees. It runs every month. The objective of the test is to assess overall proficiency. The test has only one form, which is paper-based, and its response mode is multiple-choice. MSRT test forms are similar to those of other tests of this kind. They consist of three parts: Listening Comprehension, Structure and Written Expression, and Reading Comprehension. The reading section contains four passages, each usually followed by 10 questions, for a total of 40 questions to be answered in 45 minutes. Many Iranian applicants have to take this test in order to obtain their certificates from their universities. Although the candidates spend a lot of time and money learning English, they still have difficulty answering MSRT questions in the exams. It is therefore useful to know whether or not these tests are suitable for the Iranian context. The primary concern of the reading paper is to focus on various reading passages, following instructions, finding main ideas, identifying underlying concepts, identifying relationships between the main ideas, and drawing logical inferences. As characterized by MSRT (2009), the items are multiple-choice and involve identifying the writer's views, attitudes, or claims; finding the main idea of the passage; identifying true and false sentences; identifying pronoun referents; and finding the best synonyms for particular words.

Area Description
The objective of this research is to investigate the construct validity of the MSRT reading section in the context of Iranian test takers. Only the reading skill was selected, for a number of reasons. First, reading comprehension is one of the most serious methodological concerns in the field of teaching and learning English. Second, it is the main goal of learners in countries like Iran, where people learn English as a foreign language. Third, reading comprehension tests are widely used and more objectively scored than tests of the other skills. This study aims to tackle the problem of construct validity, i.e., the skills measured by the MSRT reading paper, by gathering different types of evidence: (1) EFL experts' judgments, (2) EFL test takers' decisions on the skills assessed by the MSRT reading items, and (3) factor analysis of the items.
The first two approaches are qualitatively oriented, and the third data collection approach is quantitatively oriented.

Research Questions
Given the above problems concerning the nature of the reading construct and reading skills, the present research seeks to answer the following three questions:
1) Do Iranian EFL testing experts believe that different items on the MSRT reading section would actually measure the reading ability of the applicants?
2) Do Iranian EFL learners believe that different items on the MSRT reading section would actually measure their reading ability?
3) Does the result of exploratory factor analysis confirm that the MSRT reading module would assess the L2 reading skill in the Iranian context?

Research Hypotheses
These three research questions can be formulated in the form of the following null hypotheses.
RH1) Iranian EFL experts do not believe that different items on the MSRT reading section would actually measure the reading ability of applicants.
RH2) Iranian EFL students do not believe that different items on the MSRT reading section would actually measure their reading ability.
RH3) The result of exploratory factor analysis does not confirm that the MSRT reading parts assess the reading skills in the Iranian context.

Participants
To collect the required data, a total of one hundred Iranian EFL learners, both male and female and aged 22 to 40, were randomly selected from several English institutes in Isfahan. All the participants were preparing to take a proficiency test such as the MSRT in order to pursue their education.
As regards their educational background, they all had university degrees, at least an M.A., in majors other than English from state or Azad universities all around Iran. The Oxford Placement Test (OPT; Allan, 2005) was administered, since everyone intending to take the test needed a high level of proficiency in English. Based on the OPT scoring system, sixty-five participants at intermediate or upper-intermediate levels of English were chosen to take a special course aimed at helping them improve their general proficiency. They all spent seventy-two hours at the Jahad-e-Daneshgahi Institute learning proficiency test techniques similar to those of the MSRT and the strategies needed in MSRT test sessions. In general, twenty hours were devoted to improving the reading comprehension techniques required for the test.
Twenty-five university lecturers were also asked to participate in the study. These highly educated participants, all with at least five years of teaching experience and mostly holding master's degrees, with a few holding Ph.D. degrees, were selected from state and Azad universities as well as different language institutes.

Instruments
The instruments of this study consist of three different parts.

OPT (Allan, 2005)
The OPT (Allan, 2005; see Appendix A) includes 200 items in total, 100 of which belong to the grammar section. The researchers used only the grammar part for this study. After the test was administered, the results were analyzed, and the students who scored 70 or more were selected as intermediate and upper-intermediate learners.

MSRT Test
The MSRT test was used to elicit the participants' scores. The scoring procedure follows the guidelines provided by the MSRT handbook (1392). Since this study focused on the MSRT reading module, only the MSRT reading scores were considered. All reading sections had 4 texts, each usually followed by 10 questions, for a total of 40 questions (Appendix C).

Questionnaire
Measuring the construct validity of the test is the main goal of this study. Test takers need certain abilities in order to cope with all of the target questions, and these abilities constitute the construct under study.
To this end, a questionnaire based on Weir's checklist (1997), as adopted by Barati (2005), was used to elicit both the participants' and the university lecturers' ideas (Appendix E). The present study used this taxonomy because it is thorough and has been validated qualitatively.

Procedures
First, the OPT was administered to a group of learners (N = 100) from state and Azad universities in Isfahan in order to select 65 intermediate and upper-intermediate learners, whose scores were 70 or more.
The selected participants spent seventy-two hours at the Jahad-e-Daneshgahi Institute learning MSRT techniques and the strategies needed in MSRT test sessions. One third of these hours were allocated entirely to improving the reading comprehension skills required for the test. After these hours of learning MSRT strategies, the MSRT test was administered to the participants in the institutes.
In addition, twenty-five teachers cooperated in this research as expert judges by taking the MSRT reading section, which the learners also took. The experts' judgments are important because they determine the relationship between the skills in the EFL list of reading skills and the individual test items in this test.

Results
Constructing a reliable and valid test has always been a goal for test designers. To answer the first and second research questions, the averages of the experts' and learners' judgments on skill/item correspondence were investigated. Exploratory factor analysis was also used. First, descriptive data analysis was conducted to examine the normality of the distribution of the variables across the MSRT reading parts. Then reliability was calculated using Cronbach's alpha. Finally, the reading skills were identified by running exploratory factor analysis. Note. The whole questionnaire is considered reliable because its Cronbach's alpha is slightly above 0.7.
When the Cronbach's alpha of an instrument is slightly above the minimum of 0.7 suggested by Nunnally (1987), the instrument enjoys an appropriate level of reliability. In this study, Cronbach's alpha was estimated for each level of the questionnaire and for the whole questionnaire. The alpha coefficient of the whole questionnaire showed reasonably high internal consistency. However, the alpha coefficient for levels A and B was 0.6, and for levels C and D it was moderate, close to 0.7.
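As a rough illustration of how a Cronbach's alpha coefficient of this kind can be computed, the sketch below applies the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), to a small table of responses. The function name and the response data are hypothetical and do not reproduce the study's questionnaire; the study itself presumably used a statistics package.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per respondent, one score per item)."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across respondents
    item_variance_sum = sum(variance(column) for column in zip(*responses))
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of five respondents to three items on a 1-3 scale
sample = [
    [3, 3, 2],
    [2, 2, 2],
    [1, 1, 1],
    [3, 2, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(sample)
print(round(alpha, 3))
```

A value at or above the 0.7 threshold cited in the text would conventionally be read as acceptable internal consistency.
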
For each question in the distributed questionnaire, three choices were provided. To analyze the data, a three-point index based on a Likert scale (Appendix E) was used: Yes with a score of 3, No Idea with a score of 2, and No with a score of 1. To determine whether the population agreed on employing each factor, the averages were compared using a one-tailed t-test. When H0 is retained, it means that most of the population did not use that technique or at least had no idea about using it.
On the other hand, when H0 is rejected, it means that most of the students or experts applied that strategy.
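The decision rule described here can be sketched as a one-sample, one-tailed t-test on the 1-3 scores. The text does not state the value the averages were compared against, so the sketch assumes the scale midpoint of 2 ("No Idea"); the response data are hypothetical, and the critical value 1.711 is the tabulated one-tailed value for 24 degrees of freedom at the 0.05 level, matching a group of 25 judges.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, test_value):
    """t statistic for H0: population mean <= test_value."""
    n = len(scores)
    spread = stdev(scores)
    if spread == 0:
        raise ValueError("scores have zero variance; t is undefined")
    return (mean(scores) - test_value) / (spread / sqrt(n))


def h0_rejected(scores, critical_t, test_value=2.0):
    """One-tailed decision: reject H0 when t exceeds the critical value."""
    return one_sample_t(scores, test_value) > critical_t


# Hypothetical answers of 25 judges for one skill (3 = Yes, 2 = No Idea, 1 = No)
judges = [3] * 18 + [2] * 4 + [1] * 3
CRITICAL_T_DF24 = 1.711  # one-tailed, alpha = 0.05, df = 24 (from a t table)

t_value = one_sample_t(judges, 2.0)
print(round(t_value, 3), h0_rejected(judges, CRITICAL_T_DF24))
```

With these invented scores the mean is 2.6 and H0 is rejected, which in the study's terms would mean that most respondents reported applying the skill.
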
Here the experts' and learners' agreement on using the factors is investigated in turn. To test the first null hypothesis, 25 EFL Ph.D. or M.A. holders were asked to check the factors that were beneficial in answering the MSRT reading comprehension test. On this basis, the first null hypothesis was rejected. To test the second null hypothesis, 65 students were asked to take the MSRT test and, after answering the reading comprehension questions, to check the factors that had been useful in answering them. Based on the findings, the second null hypothesis was also rejected. To test the third null hypothesis, a quantitative approach was used: exploratory factor analysis was run on both the experts' and the students' answers to the questionnaire. Based on the total variance explained for the experts, the 13 most prominent components were extracted.
Based on the scree plot, there was a break after the thirteenth component.

Figure 1. Experts' eigenvalues
The rotated component matrix for the students was obtained from their component matrix. Factors with coefficients below 0.4 were eliminated because they lacked internal consistency.
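The 0.4 cut-off applied to the rotated component matrix can be illustrated as follows. This is a minimal sketch: the function name, the skill labels, and the loading values are hypothetical and are not taken from Table 6.

```python
def salient_loadings(loadings, threshold=0.4):
    """For each item, keep only the components whose absolute loading reaches the threshold."""
    kept = {}
    for item, row in loadings.items():
        kept[item] = {
            component: value
            for component, value in enumerate(row, start=1)
            if abs(value) >= threshold
        }
    return kept


# Hypothetical rotated loadings of three questionnaire skills on three components
example = {
    "skill_1": [0.72, 0.10, -0.05],
    "skill_2": [0.35, 0.28, 0.12],
    "skill_3": [0.08, -0.61, 0.44],
}
result = salient_loadings(example)
# Items left with no component at or above the threshold would be eliminated
dropped = [item for item, components in result.items() if not components]
print(result)
print(dropped)
```

In this invented example the second skill has no loading of at least 0.4 on any component, so it is the one that would be dropped.
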
The exploratory factor analysis showed that the students' ideas did not match the categories of the MSRT reading comprehension test. Based on the students' ideas, 14 of the 35 factors had variance above 1. Those 14 factors were found useful in answering the MSRT reading comprehension test. All these factors, which had eigenvalues above 1, are displayed in the scree plot of the students' ideas in Figure 2.
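Retaining the factors whose eigenvalue exceeds 1 is commonly known as the Kaiser criterion. The sketch below shows the idea on a small, hypothetical 4 × 4 correlation matrix using NumPy; it is not the study's 35-item data, and the function name is an assumption made for illustration.

```python
import numpy as np


def kaiser_retained(correlation_matrix):
    """Return eigenvalues (largest first) and how many of them exceed 1."""
    corr = np.asarray(correlation_matrix, dtype=float)
    # eigvalsh handles symmetric matrices and returns eigenvalues in ascending order
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    n_retained = int(np.sum(eigenvalues > 1.0))
    return eigenvalues, n_retained


# Hypothetical correlations: items 1-2 form one cluster, items 3-4 another
corr = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
]
eigenvalues, n_retained = kaiser_retained(corr)
# Share of total variance explained by each component
explained = eigenvalues / eigenvalues.sum()
print(np.round(eigenvalues, 2), n_retained, np.round(explained, 2))
```

Plotting the sorted eigenvalues against their rank gives a scree plot of the kind referred to in the figures; the point where the curve levels off is an alternative, more subjective retention rule.
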

Figure 2. Students' eigenvalues
There were 14 factors out of 35 with variance above 1. Based on the results of the factor analysis, the items in the MSRT reading comprehension test did not load on different components. As Bachman (1995, p. 237) puts it, "validity can be viewed as complementary aspects of identifying, estimating, and interpreting different sources of variance in test scores". The results of the factor analysis indicated that the MSRT reading comprehension module used in this study was not valid enough to be used with confidence to measure EFL learners' reading ability. The moderate Cronbach's alpha coefficient of the questionnaire (0.799 > 0.7) showed that the questionnaire was an appropriate instrument. The findings also showed that, in the level Reading Expeditiously for Global Comprehension, the Ph.D. and M.A. holders agreed most on applying the fourth skill, getting the main idea of the paragraph (mean 2.60 out of 3). The students, on the other hand, agreed most on the first skill in this level, looking at the topic first (mean 2.33). In level B, Reading Expeditiously for Local Comprehension, the experts agreed that the tenth skill, looking for examples, was identified by the MSRT reading comprehension test (mean 2.84), while the learners agreed on the seventh skill, looking for specific information (mean 2.07). In the level Reading Carefully for Global Comprehension, the students found the seventeenth skill most important (mean 2.26): being able to answer questions related to key information in the text, without which comprehension of the text would be difficult. The experts, however, agreed on the importance of the twentieth skill, being able to make an outline of the main points (mean 2.84). Finally, in level D, Reading Carefully for Local Comprehension, the students believed that the 30th skill was most important (mean 2.38), and the experts had likewise applied the 30th skill more than the others (mean 2.88).

Discussion
The aim of this study was to assess the construct validity of the MSRT test, so three research questions and hypotheses were formulated. For the first research question, the normality of the distribution and the Cronbach's alpha coefficient of the questionnaire were estimated. The study used a judgmental approach to examine the identifiability of the skills in the MSRT reading module. The MSRT reading questions were handed to 25 EFL experts, who were asked to decide on the correspondence between the MSRT reading comprehension items and the reading skills in the EFL questionnaire. The experts agreed on using all levels while answering the MSRT reading comprehension test. Their average agreement was 2.568 for level A, and averages out of 3 were likewise obtained for levels B, C, and D.
In summary, the majority of the experts' judgments agree on the reading skills assessed by the items in the MSRT reading module. The answer to the first question (do Iranian EFL testing experts believe that different items on the MSRT reading section would actually measure the reading ability of the applicants?) was yes. Based on their agreement on applying those variables in the MSRT reading comprehension test, the researcher concluded that different items on the MSRT reading section would measure the reading ability of the applicants.
To address the second research question and examine the EFL learners' ideas on the relation between the items in the MSRT reading module and the skills in the EFL questionnaire, a group of 65 Iranian EFL students were asked to identify which skills from the questionnaire they thought they had applied while answering the MSRT reading comprehension test. The second question of this study asked: do Iranian EFL learners believe that different items on the MSRT reading section would actually measure their reading ability? The answer is yes. According to these factors, Iranian learners believed that the MSRT reading comprehension test would actually measure their reading skills. What EFL learners may not be able to agree on is the skills being assessed at the time of the test; there may be a problem in determining what the items are really testing. Their agreement here could therefore be taken as a piece of evidence for the validity of the MSRT. They agreed on using the four variables when answering the test: level A with a mean of 2.2070, level B with a mean of 2.0105, level C with a mean of 2.0877, and level D with a mean of 2.1817, giving a total mean of 2.1217, which indicated that the majority of learners agreed on the skills assessed by the MSRT reading comprehension test. Compared with the experts' ideas, with a total mean of 2.5493 (level A 2.3920, level B 2.6880, level C 2.5200, and level D 2.5971), there seemed to be considerable agreement. The key findings of the present study indicated that the MSRT reading skills assess Iranian EFL learners' ability qualitatively, according to EFL experts' and learners' ideas, but not via factor analysis (FA). In the Iranian context, these findings rejected the first and second null research hypotheses.
To address the third question (does exploratory factor analysis confirm that the MSRT reading module would assess the L2 reading skill in the Iranian context?), factor analysis was run. The experts' responses converged on thirteen factors from the questionnaire as being assessed by the MSRT reading module. That is, nearly all judges agreed on the skills to be assessed by these 13 factors in the questionnaire.
In summary, the answer to the third question was no. As mentioned above, the experts' responses yielded only 13 factors with factor loadings, and the students' responses yielded 14. The items in the reading comprehension test do not test the 13 reading abilities of Iranian MSRT candidates.

Conclusion
To examine the validity of the MSRT reading test, qualitative and quantitative approaches were used. In the quantitative phase of the study, the answers of 65 EFL students and 25 experts on the MSRT reading comprehension test were examined using factor analysis. In the qualitative phase, the EFL expert judges' and the EFL test takers' decisions on the relation between the reading skills in the EFL questionnaire and the items in the MSRT reading comprehension test were examined.
Based on the results of this study, the MSRT reading skills assess Iranian EFL learners' ability qualitatively, according to EFL experts' and learners' ideas, but not via FA. Hence the qualitative findings support the claim that the MSRT reading module assesses the reading ability of test takers. Therefore, the judgmental findings supported the construct validity of the MSRT reading module in the Iranian context and led to the rejection of the first and second null research hypotheses.
The judgmental phase of the present study did not agree with the findings of the quantitative phase (FA). In the qualitative phase, the majority of the EFL expert judges and the test takers agreed on the skills assessed by the MSRT reading module. In other words, the expert judges and the test takers similarly identified the MSRT reading items as assessing a heterogeneous list of skills. However, the skills were not confirmed when factor analysis was run: its results did not show the items in the MSRT reading module to load on different components. Thus the qualitative (subjective) phase confirms that the MSRT reading module assesses EFL learners' reading ability in the Iranian context, whereas the quantitative (objective) phase could not confirm different skills, and the construct of the MSRT reading module was not shown to be unitary in the context of Iranian EFL learners. The answers to the research questions are therefore:
1) Do the majority of Iranian EFL testing experts believe that different items on the MSRT reading section would actually measure the reading ability of the applicants? YES
2) Do the majority of Iranian EFL students believe that different items on the MSRT reading section would actually measure their reading ability? YES
3) Does the result of exploratory factor analysis confirm that the MSRT reading parts assess the reading skills in the Iranian context? NO

Appendix
Some appendices are provided to complete the evidence.

Table 1. Cronbach's Alpha of the questionnaire

Table 2. Investigating experts' agreement on using total factors

Table 3. Investigating students' agreement on using total factors

Table 6. Component matrix for students