Investigating Iraqi EFL University Students’ Lexical Knowledge: A Cross-Sectional Study

ABSTRACT


1. Introduction
Lexical knowledge plays a major role in second/foreign language learning because it is the basis for communication. Grammatical errors often still yield understandable structures, whereas a lack of lexical knowledge disrupts communication as a whole. Schmitt (2000: 55) states that "Lexical knowledge is central to communicative competence and to the acquisition of a second language". Many studies have shown that lexical knowledge is one of the largest obstacles that learners face when learning a new language (Huckin, 1995; Meara, 1980). Its importance is no less than that of grammar or any other aspect of language learning; on the contrary, it may exceed them. Lexical knowledge is like the "building blocks of language", as Read (2000, pp. 1-3) describes it. And, as Krashen puts it (cited in Lewis, 1993, p. 25), "When students travel, they don't carry grammar books, they carry dictionaries".
What makes this problem even worse is that, unlike syntax and phonology, there are no fixed rules for developing vocabulary knowledge or for deciding which vocabulary a learner should focus on first. It is believed that languages, in general, are continually updated, with new words being coined all the time. Furthermore, Nation (2001) argues that the relationship between language use and lexical knowledge is complementary: vocabulary knowledge enables language use and, conversely, language use leads to an increase in lexical knowledge. From this perspective, investigating students' lexical knowledge is necessary in order to determine the knowledge they have acquired through university courses and, at the same time, to provide information about their level of vocabulary knowledge for pedagogical purposes.
Lexical knowledge can be investigated quantitatively (breadth) and qualitatively (depth). The quantitative dimension refers to the amount of lexical knowledge acquired, in other words vocabulary size, while the qualitative dimension refers to the depth of word knowledge, or how well the target word is known in different contexts. For this reason, several kinds of lexical knowledge tests have been developed in the literature, such as those of Heaton (1999), Read (2000), Nation (2007), and Read (2017).

Literature Review
Lexical knowledge refers to the words which have to be known in order to communicate effectively. According to Crystal (1987, pp. 251-253), lexical knowledge can generally be described in terms of oral vocabulary and reading vocabulary. Oral vocabulary refers to words which we use in speaking or recognize in listening, while reading vocabulary refers to words which can be recognized or used in print (Crystal, 1987, pp. 251-253).
The importance of lexical knowledge has been discussed many times in the literature. John Read (2000, pp. 1-3), for instance, argues that lexical knowledge is one of the fundamental issues that a second language learner faces when learning a new language. He describes it as the "basic building blocks of language", since words are the units of meaning that form the larger structures of language such as sentences, paragraphs and texts. Thus, investigating or assessing lexical knowledge is both "necessary and reasonably straightforward".
For native speakers, lexical knowledge develops unconsciously in the early years of life (childhood) and continues to develop gradually through adult life as they encounter new experiences, inventions, concepts, social trends, etc. For a second language learner, the situation is more complicated, since acquiring a new language system (including new vocabulary, grammar, and social trends) starts, in most cases, in adulthood; as Read (2000, p. 1) notes, "the acquisition of vocabulary for a second language learner is typically a more conscious and demanding process". Read (2000, p. 4) also argues that even at an advanced level of lexical knowledge, a second language learner is aware of limitations or a lack of competence concerning vocabulary. Those limitations are gaps in lexical knowledge that a second language learner faces when encountering a concept, word, expression, etc. that he or she cannot recognize. Therefore, many second language learners consider lexical knowledge essential to their second language development, and they spend a lot of time memorising new vocabulary.
In this connection, Schmitt (2000, p. 55) asserts that lexical knowledge is a crucial factor for second language learners, since a limited vocabulary, unlike limited grammar, impedes communication as a whole: "lexical knowledge is central to communicative competence and to the acquisition of a second language…".
Simply put, lexical knowledge can be seen as a fundamental area in language teaching/acquisition, and it requires test scales to observe learners' progress in vocabulary and to investigate how adequate their knowledge is "to meet their communication needs".

Lexical Knowledge Development
The ultimate aim of this study is to investigate lexical knowledge development through university courses. Deighton (1959, p. 3) expounds that there are, in general, four principles of context which may help in the process of lexical knowledge/vocabulary development. They are as follows:
1-The first general principle concerns an unfamiliar word and its context. Context may reveal different meanings that vary from one reader to another according to each reader's experience.
2-The second general principle is that context may reveal only one of the meanings of an unfamiliar word, in other words, one sense of the given word. Most English words have more than one meaning, and dictionaries list them with multiple meanings.
3-Context rarely clarifies the whole of any meaning. Most of the time, context provides synonyms, but synonyms are not exact equivalents of a word in its contextual structure. Context often provides the reader with several clues from which to infer the closest meaning of a given unfamiliar word.

Word Knowledge
Knowing a word involves different aspects of knowledge, and those aspects can be classified according to their strength and detail and to their levels of fluency. These aspects are considered high-priority objectives for language teachers.
As Nation (ibid) states, the principles below are related to the term "knowing a word":
1. "Not all aspects of word knowledge are equally important.
2. Word knowledge can be described in terms of breadth (aspects), depth (strength), and fluency.
3. Word knowledge develops over a period of time.
4. Some knowledge is limited to individual words, while other knowledge is systematic.
5. Some knowledge needs to be learned, while other knowledge is constructed through common sense and knowledge of the world.
6. The difficulty of acquiring knowledge (learning burden) is affected by a variety of factors including regularity of patterning, the learner's L1, other known languages, opportunity and experience, personal commitment, the quality of teaching, and the quality of course design.
7. Vocabulary knowledge is most likely to develop if there is a balance of incidental and deliberate appropriate opportunities for learning.
8. Learned aspects of word knowledge are affected by a small number of psychological learning conditions.
9. Fluency of word knowledge can be a useful learning focus.
10. Testing word knowledge requires careful thought about the purpose of testing, the aspects and strength of knowledge to be tested, the effects of test item type, and the people being tested."
To sum up, Nation (2013, p. 49) refers to the receptive-productive distinction in word knowledge, and he states that this distinction runs through nine aspects:
a-The form of a word, which includes spoken form, written form, and word parts.
b-The meaning of a word, which includes form and meaning, concept and referents, and associations.
c-The use of a word, which includes grammatical functions, collocations, and constraints on use (register, frequency, etc.).
Nation (2013) elaborates that receptive knowledge of a word is the knowledge we need for listening and reading, and it requires recalling the intended meaning of a word when one sees or hears it, whereas productive knowledge of a word concerns speaking and writing. According to Nation (2013), receptive knowledge of a word is acquired more easily than productive knowledge.

Incidental vs. Intentional Learning
Incidental learning is the process of learning new things without having the intention to do so. It can also be considered as learning one or more things while intending to learn another. Incidental learning motivates learners to read extensively and to look for new textbooks in order to find new words and contexts. The ability to guess the meaning of new words from context is what the incidental learner should have (Coady, 2001, p. 3).
The incidental-intentional distinction in vocabulary learning is closely linked to this study. The investigation of lexical knowledge covers the four stages of the Department of English and Literature. The first and second stages study English largely through intentional learning: their materials contain no novels, dramas, or poetry, but rather comprehension, grammar, phonetics, general literature, etc. Extensive reading comes after the second stage; the third and fourth stages are both full of extensive reading, especially the fourth stage. By the nature of their extensive reading materials, the third and fourth stages are much more exposed to incidental learning, since they study three novels, three dramas, poetry, linguistics, essay writing, and phonology, with literary criticism and transformational grammar in the fourth stage only. This wide gap in the nature of the materials between the first and second stages and the third and fourth stages creates the contrast between intentional and incidental vocabulary learning.

Hypotheses Development
Lexical knowledge has a crucial effect on students' communicative abilities and reading comprehension; therefore, a lack of it affects their future careers and hinders their second language use. Acquiring as much vocabulary as possible by reading English literature and other materials provided in courses can help students to develop their lexical knowledge inventory.
Considering the previous discussion, it is hypothesized that:
1-There is an apparent development in learners' lexical knowledge among the four stages, and it develops to a moderate extent across the four stages.
2-Learners' scores in the recognition test items are better than their scores in the production items.

Test Design and Sources of Data
This study adopted an established test design described in Read's (2000, p. 2) Assessing Vocabulary. Those test items are, according to Read (ibid), "easy to write and to score, and they make efficient use of testing time". English students are familiar with such test designs/items since these tests are prevalent in the EFL and ESL fields. For example, multiple-choice items are commonly used in standardized tests. Read (2000) noted that the multiple-choice test is "highly reliable" and that it distinguishes learners according to their level of lexical knowledge.
The word families (items of the test) were chosen from the AWL (Academic Word List) proposed by Coxhead (2000). Coxhead (2000) divided the word families in the list into 10 sub-lists according to their frequency of use in academic writing. Therefore, sub-list 1 includes the most frequent or common word families, whereas sub-list 10 includes infrequent or rarely used word families.

Population and Sample
The sample of the study is taken from the Department of English Language and Literature/College of Arts/Mustansiriyah University in the academic year 2019-2020.
For the sake of the accuracy and reliability of the results, all four stages of the Department of English Language and Literature are involved in this investigation. In this way, the growth of learners' lexical knowledge can be measured and analysed statistically more accurately, since no stage is excluded. The participants are native speakers of Arabic with the same EFL background. The total number of participants is 140 students. The total number of students in each stage ranges between 40 and 50, since the number of students varies from one stage to another.
The study takes 5 participants from each stage (20 in total) for the pilot study in order to determine the validity and reliability of the final test on the one hand, and its difficulty, the time needed, and the resources it consumes on the other.
For the final test, this study takes 30 participants from each stage of morning studies only (120 in total), which is more than 50% of the total number of students in each stage.

Research Models
In order to test the hypotheses, this study adopted the test items provided in Read's (2000, p. 2) Assessing Vocabulary. These test items are as follows:
A-Recognition Items:
1-Multiple-choice (Choose the correct answer) I accepted his resignation with great reluctance.
Applying the KR-20 formula gives ρKR20 = 0.96, which shows that the test has high reliability (see Appendix C).
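The KR-20 coefficient reported above can be computed from a matrix of right/wrong item scores. In its standard form, ρKR20 = (k / (k − 1)) × (1 − Σ p_i q_i / σ²), where k is the number of items, p_i is the proportion of participants answering item i correctly, q_i = 1 − p_i, and σ² is the variance of participants' total scores. The sketch below is a minimal illustration only: the function name kr20, the toy response matrix, and the choice of the population variance (dividing by n) are assumptions, not details taken from the study.

```python
def kr20(responses):
    """Kuder-Richardson 20 reliability for dichotomous (0/1) item scores.

    responses: list of rows, one per participant; each row is a list of
    0/1 scores, one per item. (Hypothetical input format for illustration.)
    """
    n = len(responses)        # number of participants
    k = len(responses[0])     # number of items
    if k < 2:
        raise ValueError("KR-20 needs at least two items")

    # p_i: proportion of participants who answered item i correctly
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)

    # Variance of total scores (assumption: population variance, divide by n)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    variance = sum((t - mean_total) ** 2 for t in totals) / n
    if variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum_pq / variance)


# Toy data: 4 participants, 3 items (not the study's data)
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(toy), 2))
```

Values close to 1, such as the 0.96 reported for this test, indicate high internal consistency. Some treatments divide by n − 1 when computing the variance, which changes the result slightly for small samples.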

Scoring Scheme (Recognition)
The scoring scheme for the first part of the test was divided according to the number of items and the number of chosen words included in the recognition test. The first question of the recognition test included 20 items (one word in each item) and the second question (matching test) included 10 items (5 words in each item). The total number of items in the recognition test is 30 and the total mark for the recognition test is 80. The first question carries 1.5 marks for each item, 30 marks in total. The second question carries 50 marks in total: 1 mark for each word in an item, which means 5 marks for each item. Participants are required to choose only one answer for each item included in the questions.
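The arithmetic of the recognition scoring scheme can be sketched as follows. This is a minimal illustration, assuming that a participant's result is summarised as two counts (correct synonymy items and correctly matched words); the constant and function names are hypothetical.

```python
Q1_ITEMS = 20            # synonymy (multiple-choice) items, one word each
Q1_MARK_PER_ITEM = 1.5   # 20 x 1.5 = 30 marks
Q2_WORDS = 50            # matching test: 10 items x 5 words each
Q2_MARK_PER_WORD = 1.0   # 50 x 1 = 50 marks
MAX_RECOGNITION = Q1_ITEMS * Q1_MARK_PER_ITEM + Q2_WORDS * Q2_MARK_PER_WORD  # 80


def recognition_score(q1_correct, q2_correct):
    """Return the recognition mark (out of 80) from counts of correct answers."""
    if not 0 <= q1_correct <= Q1_ITEMS:
        raise ValueError("q1_correct must be between 0 and 20")
    if not 0 <= q2_correct <= Q2_WORDS:
        raise ValueError("q2_correct must be between 0 and 50")
    return q1_correct * Q1_MARK_PER_ITEM + q2_correct * Q2_MARK_PER_WORD


def as_percentage(score, max_score=MAX_RECOGNITION):
    """Convert a raw mark to a percentage of the maximum mark."""
    return 100.0 * score / max_score


# Hypothetical participant: 10 synonymy items and 8 matched words correct
mark = recognition_score(10, 8)
print(mark, as_percentage(mark))
```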

Scoring Scheme (Production)
The scoring scheme for the second part of the test is divided equally among the items included in the two questions.
Each question of the production test included 20 items, and each item included one word underlined and bolded in a sentence. Two marks are given for each correct answer, 1 mark out of 2 is given for an answer with one spelling mistake, and zero is given if a participant fails to write the correct answer, leaves the item blank, or makes 2 or more spelling mistakes. The production test is designed to investigate participants' writing and spelling abilities with respect to each given word; since it is a written test, it is not concerned with the spoken forms of the words.
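The production marking rule can be expressed as a small function. The sketch below is illustrative only: it assumes that a marker has already judged whether the intended word was produced and counted its spelling mistakes, and the function names are hypothetical. With 2 questions of 20 items at 2 marks each, the maximum production mark works out to 80.

```python
def production_item_score(right_word, spelling_mistakes=0):
    """Mark one production item out of 2.

    right_word: True if the participant produced the intended word
                (False for a wrong answer or a blank).
    spelling_mistakes: number of spelling mistakes in that word.
    """
    if not right_word:
        return 0            # wrong answer or left blank
    if spelling_mistakes == 0:
        return 2            # fully correct
    if spelling_mistakes == 1:
        return 1            # one spelling mistake: half marks
    return 0                # two or more spelling mistakes


def production_score(items):
    """Sum the marks for a list of (right_word, spelling_mistakes) pairs."""
    return sum(production_item_score(r, m) for r, m in items)


# Hypothetical answers to four items
answers = [(True, 0), (True, 1), (True, 2), (False, 0)]
print(production_score(answers))
```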

Pilot Study
According to Thabane (2010, p. 1), a pilot study is "a small scale preliminary study conducted in order to evaluate feasibility, duration, cost, adverse events, and improve upon the study design prior to performance of a full-scale research project." Pilot studies are carried out before the large-scale study in any quantitative research method, and they are usually performed on members of the same population. A pilot study reveals the strengths and weaknesses of the large-scale experiment, which gives an opportunity to amend or adjust it so that it yields a more precise outcome before any further steps are taken. "It is a potentially valuable insight and, should anything be missing in the pilot study, it can be added to the full-scale (and more expensive) experiment to improve the chances of a clear outcome." (ibid)
The pilot study was performed on randomly chosen students from each of the four stages. On the first day (3rd of February 2020), learners took the first part of the test, the recognition part, and on the next day (4th of February 2020), the same participants took the second part, the production part. The estimated total time to answer the two parts of the test was 2 hours, that is, 1 hour for each part. The results of the pilot study yielded significant remarks for the final test. The remarks are as follows:
1-The time needed to answer the first part of the test (recognition) was more than one hour, which means that learners could not answer all the items (see Appendix A).
2-The first question (synonymy test) in the recognition test included 30 items and the second question (matching test) included 20 items (3 words in each item), 60 words in total; most of the answers were wrong and the others were left blank.
3-The whole two-part test ran to 13 pages, 10 for recognition and 3 for production. The total number of items in the recognition test is what participants complained about, citing its difficulty and the time it consumed.
4-The time needed for the production test items is sufficient and their difficulty is moderate.
The procedures taken to amend the recognition test items are as follows:
1-Decreasing the total number of items in the first recognition question (synonymy test) to 20 items.
2-Removing words that had 0% correct answers from the first recognition question (synonymy test).
3-Decreasing the total number of items in the second recognition question from 20 (3 words each) to 10 (5 words each), which means that the second question includes 50 words in total. The "I don't know" option was removed, following the supervisor's recommendation.
4-Decreasing the total number of pages for the whole test to 10 pages, 7 pages for the recognition part and 3 pages for the production part.

Final Administration of the Test
The main test began on the 7th of February 2020 and ran for two weeks from that date. Participants were given two hours to answer the questions, and the test was administered under the supervision of professors from the Department of English Language and Literature during their lectures to avoid bias and cheating, after the necessary permissions were obtained from the head of the department. Participants were given instructions about how to answer and what the questions required. The instructions were given in both their native language and the target language to avoid any misunderstanding of any part of the questions (Olshtain & Cohen 1983, p. 32).
The ultimate aim of this test is to investigate EFL students' lexical knowledge; thus, participants were told that the test had nothing to do with their marks and was for research purposes only. Additionally, participants were asked not to write their names on their response papers, to avoid any embarrassment and to make them feel free and comfortable in answering the questions using their own lexical knowledge, without cellphones or any other translation devices.

Participants' Performance at the Recognition Level
Of the first year participants, 80% scored under 29%, which is to be expected since they are new students who have not yet acquired enough lexical knowledge or spent enough time in university courses.
On the other hand, 77% of the second year participants scored 29% or less in the final test, and 67% of the third year participants scored less than 29%, which is remarkable. Participants of the second and third stages left some of the items in the recognition test empty without making any choice, although the words were not that hard. For the word previous, for example, only 1 participant answered correctly and 29 out of 30 failed. Another example is environment: 100% of the third year students failed to answer this word, while 37% of the 30 second year students answered it correctly.
The lexical knowledge of first and second stage students is limited, and their materials focus on a limited range of vocabulary, unlike those of the third and fourth stage students. Thus, some of the words included in the test might not occur in their materials, and they lacked the relevant knowledge about words and word formation (see Figure 2 below).
The best scores on the recognition test items were recorded by the fourth year participants: 13.3% of them scored more than 70% in the recognition test, which puts them at the top of the list.

Participants' Performance at the Production Level
Participants of the fourth stage performed better than the other stages in the production test items, with a success rate of 13.3%. Only 3% of participants from the first and the third stages succeeded in the production test, which equals one participant out of 30. The second year students recorded a rate of 7% of 30 participants.
Most of the participants left the production test paper blank, especially the first and second year participants, which indicates a lack of knowledge, a lack of reading comprehension, and little experience with words and contexts. Some participants wrote the correct answers but misspelled them, which cost them half of their marks; spelling mistakes are thus another reason for their low performance in the production test.
Guessing the correct answer was another problem for participants, since production test items depend entirely on participants' thinking, their knowledge of a given word in different contexts, and their ability to interpret correctly the context in which the intended word occurs; most of the participants failed to do so.
Another remarkable issue is that the production test items have hints before each blank, firstly to restrict the choices and secondly to stimulate the reader's memory to recall the intended word or choose the right answer. Nevertheless, most of the participants did not take advantage of this, and they guessed words starting with the same letter/hint regardless of the contextual needs. For example, in item No. 6 despite is the correct answer, but most participants wrote days, dislike, etc.
In general, participants scored lower in the production test items than in the recognition test items. The number of participants who scored 0% is much higher than in the recognition test, in which no participant scored 0%. This supports Hypothesis No. 2 (see Figure 3).

Error Analysis
According to Norrish (1983, p. 7), an error is a systematic deviation that happens when a learner lacks knowledge about something or has not learnt something, and keeps doing it wrong.
According to Richard et al. (2002, p. 184), an error can be defined as using a word, a speech act or a grammatical item in an imperfect or incomplete way.
On the other hand, Chomsky (1965, p. 4) argued that there is a distinction between errors and mistakes. He (ibid) stated that "we thus make a fundamental distinction between competence (the speaker-hearer's knowledge of his language) and performance (the actual use of language in concrete situations)". Therefore, errors are an indication of incomplete learning or a lack of knowledge, unlike mistakes, which are considered to be the misuse of knowledge in certain situations.
Grammar:
Omission: We wait ^ the bus all the time. He was ^ clever and has ^ understanding father.
Addition: Students are do their researches every semester. Both the boys and the girls they can study together.
3-Developmental Errors: a kind of error, related to overgeneralization, which occurs when learners have started to develop their linguistic knowledge and fail to produce the correct rules. For example, come = comed.
4-Induced Errors: types of errors caused by misleading teaching examples. They happen when a teacher explains a rule without highlighting or illustrating the intended meaning he wants to convey to the learner.
5-Errors of Avoidance: they occur when learners fail to apply particular rules of the target language because they believe that these rules are difficult to master.
6-Errors of Overproduction: types of errors in which learners frequently repeat certain structures. This happens in the early stages of language acquisition, when learners' knowledge of the language is not sufficient to produce finite structures of the target language.
The following errors are those that most of the participants made:
1. Most of the participants failed to differentiate between words of related meaning in different contexts. For example, for the word transfer in question No. 1, most of the students answered moved or selected, but few of them chose transported, which is the right answer in that context (see Appendix B for more information). This indicates either that they do not know the exact meaning of the word transport, or that they failed to determine the meaning of transfer in that context. Another example is sector and section, for which most answers were area. These errors are most likely overgeneralization errors.
2. In the second recognition question (matching test), some of the participants were not accurate in choosing the right answer; in other words, they guessed the answer by choosing the word closest in pronunciation to the given word. These types of errors are considered to be simplification errors, for example:
a-Item No. 2/D includes the word consequent; some of the participants chose calculate.
b-Item No. 9/C includes the word simulate; participants chose stimulate.
3. Most of the participants' low performance was in the production test items. They failed to figure out the required words from the given hints, which can be classified under simplification errors, for example:
a-Production Q1/item 7, She just had an article pub-----------: answers were publication, public, instead of published, which is the right answer.
b-Production Q1/item 10, She collects first ed---------: answers were edit, edited, etc. instead of editions.
In question No. 2, they failed to give the English equivalents of the chosen words. These errors can be classified either under overproduction errors or under developmental errors, for example:
a-Production Q2/item 8, She accused the party and, by implication, its leader too. Answers were having, invitation, etc. instead of saying something indirectly.
b-Production Q2/item 18, They imposed a five percent levy on alcohol. Participants' answers had no relation to the given word levy, such as material, drink, etc. instead of tax, or an amount of money that has to be paid.

5. Conclusions
The following conclusions are related to the investigation of Iraqi EFL university students' lexical knowledge. The study aims to investigate students' lexical knowledge at the recognition and production levels, and to identify their vocabulary knowledge according to the Academic Word List provided by Coxhead (2000). The test was carried out on 120 university participants at the Department of English Language and Literature/College of Arts/Mustansiriyah University. Based on the
Test (Match each word with its meaning)
B-Production Items:
1-Completion Test (Complete the following missing letters) She's just had an article pub______ in their weekend supplement.
2-Translation (Give the English equivalent of the underlined word) I assumed (that) you knew each other because you went to the same school.
The reliability of the test is obtained through the KR-20 formula proposed by Kuder and Richardson (1937, pp. 151-160). This equation measures the internal consistency of a test.

4. Findings and Discussions
4.1 Participants' Performance in the Test
Participants of the fourth stage performed better than the other stages in the test at both the recognition and production levels. Lexical knowledge develops across the four stages of the department and reaches its peak at the fourth stage. It developed gradually across the four stages, but not at equal rates: the first stage participants scored the lowest, with a 15% average final score, followed by 17.2% for the second year participants and 18% for the third stage; the major development was recorded by the fourth stage participants, with a 34.3% average final score.

Figure 1 below shows participants' average performance at the recognition and production levels, and it also shows the average of their final scores in the test.

Figure 1: Average Percentages of Participants' Scores

Figure 2: Participants' Performance at the Recognition Level

Figure 4: Participants' Performance in the AWL

4-The growth of lexical knowledge is a matter of gradual context revelation, which means that a reader may figure out the meaning of a given unfamiliar word according to the experience and the clues gained from different encounters with it in different contexts. The more encounters a reader has with a word, the more experience he will gain of it.
1-Participants from all stages performed better in sub-list 1, since it is the easiest word list in the AWL and it includes the most familiar words used in everyday activity by any learner/student. The average rate was 34.5% correct answers (see Figure 4 below). Thus, hypothesis No. 3 has been refuted, since it hypothesized that the average rate of correct answers would fall between sub-lists 3 and 4.
2-It is obvious that the fourth year participants have more experience of the words included in the AWL (2000), and this experience could have been gained from university courses, from the nature of their materials, or from an outside source.
3-Third year participants mostly failed to answer the AWL sub-list 3 items, scoring only 9% correct answers. Words like rely, technique, emphasis, partnership, etc. are among those they failed to answer, although these words are supposed to be familiar to advanced learners such as those in the third stage. As another example, the first and the fourth year participants scored the same on the word previous, with only 8 correct answers each. None of the third year participants answered the word family environment correctly, although this word family belongs to sub-list 2.
4-Some word families are commonly used and repeated many times in novels and dramas, such as consequent, feature, evaluate, major, period, interpretation, resolution, resident, author, cycle, debate, imply, perspective, theme, appendix, invoke, levy, etc., but participants did not achieve a high rate of correct answers on these words.
5-Other words are commonly used in composition, comprehension or linguistic materials such as grammar, phonology and phonetics, e.g. procedure, similar, assume, coordinate, emphasis, task, hierarchy, identical, passive, coherence, reluctance, etc., but participants did not answer them correctly either.
Richard et al. (2002, p. 267) classified intralingual errors into 6 categories:
1-Overgeneralizations: types of errors that occur when a learner applies a grammatical rule in cases where it cannot be applied, e.g. tooth = tooths.
2-Simplifications: types of errors that occur when learners try to be linguistically creative and produce their own linguistic rules, sentences, and utterances. In this type of errors, learners may succeed in answering the correct answers by