Distribution of Articles in Written Composition among Malaysian ESL Learners

The study aimed to investigate the distribution patterns of the English articles (a, an, and the), as well as the distribution of their colligation patterns, in the written English compositions of Malaysian ESL learners. This paper reports the results of a corpus-based study of the articles used by these learners. The method used was quantitative content analysis, drawing on data from the Malaysian Corpus of Students’ Argumentative Writing (MCSAW). The findings indicated that all three articles were present throughout the compositions and that they made up 18 percent of all tokens. The findings also showed that the colligation patterns of the articles were unevenly distributed: some patterns were used heavily while others were seldom applied in the compositions. The findings can help language teachers identify areas of the structure that need further emphasis in their teaching.

the language, especially in relation to language produced by second language learners. Studies in this area are still lacking, and it should be considered one of the neglected aspects of the educational field in Malaysia, where much concern is placed on the development of language input. Teachers can therefore benefit from measuring learners' ability to use, and their comprehension of, the English article system in order to plan how to handle the subject matter. Teachers should acknowledge that knowing the heart of the problem is the basis for tackling it.

Objectives of the Study
This study is aimed at investigating the usage patterns of articles in written compositions among Malaysian ESL learners. The specific objectives of the study are:
1) To investigate the distribution of articles in written compositions among Malaysian ESL learners.
2) To investigate the colligation patterns of articles according to word classes in written compositions among Malaysian ESL learners.

Research Questions
Based on the research objectives mentioned, the following research questions were devised:
1) What are the distribution patterns of articles in written compositions among Malaysian ESL learners?
2) What are the colligation patterns of articles according to word classes in written compositions among Malaysian ESL learners?

Literature Review
Corpus-based studies are relatively new in the Malaysian educational arena. Krieger (2003) mentioned that corpus linguistics is one particular area on the computer frontier that is yet to be fully explored. Apart from being greatly beneficial, corpus linguistics is increasingly seen as crucial, as it can be applied to various aspects of language teaching and learning, especially the teaching and learning of English as a second or foreign language. Barlow (2002) listed three areas in which corpus linguistics can be applied in teaching, namely syllabus design, materials development, and classroom activities. However, the use of corpus linguistics is not limited to these.
In Malaysia, very few English language corpora have been compiled and developed. The most recent is MCSAW, a corpus of Malaysian ESL learners' argumentative writing compiled by Mukundan & Rezvani Kalajahi (2013). Another corpus, developed by Mukundan & Anealka (2007), comprises English language textbooks used in Malaysian secondary schools. Additionally, Menon (2009) compiled textbook corpora including English for Science and Technology (EST) and science textbooks. A number of studies have been based on these corpora, but collectively they remain limited. Apart from these, there are the English Language of Malaysian School corpus (Arshad et al., 2002) and the Corpus Archive of Learner English Sabah-Sarawak, called CALES (Botley, De Alwis, Metom & Izza, 2005).
It is said that the English article system is one of the most heavily used in the language, yet it is also among the trickiest to be taught to and learned by non-native English speakers (Yamada & Matsuura, p. 50). As Alimi (2007) pointed out, articles are complex grammatical structures. This statement is particularly true for most Asian students. For example, Yoshii & Milne (1999) noted that almost all Japanese and Chinese students face difficulties in using articles "…since they do not have articles in their languages…and, thus, cannot accurately reproduce them from sentences they hear." They also pointed out that article-related concepts are too ambiguous and too vague to be applied to real-life situations. Other researchers likewise stress that the lack of an article system in the Korean language poses challenges to ESL/EFL learners in Korea, who commonly resort to omitting obligatory articles in both their spoken and written English (Kim, 2006; Park, 2006).

Table 1
Structure            Colligation pattern         Example
[...]
Structure 8 (S8)     the + plural noun           the tables
Structure 9 (S9)     the + noncount noun         the furniture
Structure 10 (S10)   the + proper noun           the Niagara Falls
Structure 11 (S11)   the + superlative           the prettiest / the most expensive
Structure 12 (S12)   the + ordinal               the first / the second
Structure 13 (S13)   the + adjectives            the naughty boy
Structure 14 (S14)   the + 'only'                the only person
Structure 15 (S15)   the + expression of time    in the morning
*Adapted from Mukundan, Leong & Nimehchisalem (2012, p.65)

There are 15 items in the structure list, which define the common colligations of word classes with the articles a, an and the, as illustrated in Table 1 above. The colligation of the articles a, an, and the with word classes was divided into specific categories based on a structured list adapted from Mukundan, Leong & Nimehchisalem (2012), as proposed by Celce-Murcia and Larsen-Freeman (1999).
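A few of the structures in Table 1 can be recognised from surface wording alone, which makes them easy to illustrate. The following is a minimal sketch, assuming hypothetical regular expressions for S11, S12 and S14 only; it is not the coding procedure used in the study, and structures that depend on noun type (plural, noncount, proper) would need part-of-speech information that this sketch does not attempt.

```python
import re

# Hypothetical surface patterns for three structures from Table 1.
# The "-est" rule is deliberately crude and will also match words such
# as "rest" or "test", so real use would need manual checking.
ORDINALS = "first|second|third|fourth|fifth|sixth|seventh|eighth|ninth|tenth"
PATTERNS = [
    ("S14", re.compile(r"\bthe\s+only\b", re.IGNORECASE)),
    ("S12", re.compile(rf"\bthe\s+(?:{ORDINALS})\b", re.IGNORECASE)),
    ("S11", re.compile(r"\bthe\s+(?:most\s+\w+|\w+est)\b", re.IGNORECASE)),
]

def classify(phrase):
    """Return the first matching structure label, or None if no pattern fits."""
    for label, pattern in PATTERNS:
        if pattern.search(phrase):
            return label
    return None

# The example phrases are the ones listed in Table 1.
for example in ["the prettiest", "the most expensive", "the first",
                "the only person", "the furniture"]:
    print(example, "->", classify(example))
```

Phrases that fall outside these three patterns, such as "the furniture", return None here and would have to be assigned to a structure by other means.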
Stubbs (2001) stated that a large amount of language use consists of words occurring in conventional combinations, and identified this as a central characteristic of language in use. This suggests that predictable combinations of lexis constitute a large proportion of normal language use. Colligation is common, and articles, being among the most used grammatical elements in English, are no exception: they are among the items that enter into colligation most regularly. Cowie (1998) argued that words always tend to occur in preferred sequences, and described this as a linguistic paradigm referred to as 'phraseology', which covers phenomena including word combinations, collocations, and prefabricated and formulaic expressions. The term 'colligation' is often used to refer to the combination of lexis and grammar (Tognini-Bonelli, 2001; Hoey, 2005). In earlier work, Sinclair (1992) also noted that different forms of a lemma, or different word classes of a word, have clearly distinct colligational preferences. Hoey (1993), for example, investigated how signalling functions are performed in written text, focusing on one signalling word, reason, and showed how particular collocations and colligations are associated with particular word functions. Yamasaki (2008) later used a large-scale corpus to study how the collocational and colligational behaviour of anaphoric nouns differentiates their discourse functions within specific contexts, adding to the evidence that "types of words and grammatical categories favoured or avoided by a particular word or word sense vary considerably according to contextual usage and language variety".

Methodology
This research employs corpus-based analysis to study the distribution of articles in written compositions among Malaysian ESL learners. It has been proposed that "corpus based analysis is an ideal tool to reevaluate the presentation of linguistic features in textbooks and to make principled decisions about what to prioritize" (Barbieri & Eckhardt, 2007, in Philip, Mukundan & Nimehchisalem, 2012). A computer-aided content analysis method is applied to examine the frequency of articles in the written compositions, their distribution patterns, and the colligation patterns of articles according to word classes in the corpus.

Sample
This study utilizes data from the Malaysian Corpus of Students' Argumentative Writing (MCSAW) Version 1, which comprises 296 essays from Form 4 students, 274 essays from Form 5 students and 440 essays from college students. For the purpose of this study, only 50 essays from each group were sampled, making a total of 150 essays. In terms of size, college students contributed the largest portion with 12807 tokens, followed by Form 4 students with 9932 tokens and Form 5 students with 9340 tokens, out of a total of 32079 running words. Mukundan and Rezvani Kalajahi developed this corpus to establish baseline data on the English language proficiency of Malaysian students in writing, to provide benchmarks of the learners' language proficiency, and to examine the language developmental patterns of the learners across three age ranges and educational levels: Form 4, Form 5 and college (Mukundan & Rezvani Kalajahi, 2013).

www.ccsenet.org/elt English Language Teaching Vol. 6, No. 10;

Instrumentation
This study uses concordance software called Oxford WordSmith Tools version 4.0, developed by Michael Scott (1996, 1997, 1999), as its main instrument to analyze the data from MCSAW. Many researchers deem this software the most appropriate tool for corpus analysis (Mukundan, 2009; Mukundan & Menon, 2006; Mukundan & Roslim, 2009). WordSmith Tools serves the purpose of looking at how words behave in texts.
There are three major components in this software, namely Concord Tools, Keywords Tools and WordList Tools, but only two of them were used in this study. Concord Tools was chosen to retrieve the word clusters from the corpus, while WordList Tools was used to retrieve the frequency of articles used in the written compositions.
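The two retrieval steps described above, a frequency lookup and a keyword-in-context search, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the tokenizer is a simple regular expression, the sample sentence is made up, and none of it reproduces how WordSmith Tools works internally.

```python
import re
from collections import Counter

ARTICLES = ("a", "an", "the")

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def article_frequencies(tokens):
    """Count each article, in the spirit of a word-list lookup."""
    counts = Counter(tokens)
    return {article: counts[article] for article in ARTICLES}

def concordance(tokens, node, width=3):
    """Return (left context, node, right context) for each hit of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Made-up sample text, not taken from MCSAW.
sample = "I have a Facebook account. The account is an easy way to reach the class."
tokens = tokenize(sample)
freqs = article_frequencies(tokens)
kwic = concordance(tokens, "the", width=2)

print(freqs)
for left, node, right in kwic:
    print(f"{left:>20} [{node}] {right}")
```

The right-hand context of each concordance line is where the word class following the article can be inspected, which is the information a colligation count depends on.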

Distribution of Articles
All three articles (a, an, the) occur throughout the selected written compositions obtained from MCSAW. The frequency of each article was determined to establish how many times the students used it in their writing. There were a total of 5783 articles among the 32079 tokens in the compositions, which means that the articles a, an and the made up 18% of all the running words written by the students. Table 2 shows the frequency of articles across the 150 samples, which contain 5783 article tokens. The article the accounted for 74.7% of all article occurrences in the compositions, the article a for 17.7%, and the article an for the smallest share, 7.6%. Specifically, the article the occurred 4320 times, the article a 1021 times, and the article an only 442 times.
These data make it clear that the article the was used most heavily by the students. The article an was used the least, while the article a was moderately used throughout the compositions. The high combined frequency of a, an and the shows that articles are heavily used in the language, particularly in students' writing. This indicates that articles are not generally avoided or omitted in writing; hence, mastery of the subject matter is a must for the students.
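The percentages reported above follow directly from the raw counts and can be rechecked with a short calculation. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Counts reported in the text for the 150 sampled essays.
counts = {"the": 4320, "a": 1021, "an": 442}
total_tokens = 32079

total_articles = sum(counts.values())

# Share of each article among all article occurrences, in percent.
shares = {article: round(100 * n / total_articles, 1)
          for article, n in counts.items()}

# Share of articles among all running words, in percent.
overall = round(100 * total_articles / total_tokens, 1)

print(total_articles)  # 5783
print(shares)          # {'the': 74.7, 'a': 17.7, 'an': 7.6}
print(overall)         # 18.0
```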
This finding further supports the previous study by Mukundan, Leong & Nimehchisalem (2012), which claimed that the article system is one of the most heavily used aspects of grammar in the English language. The extensive use of definite and indefinite articles shows that the article system is one of the fundamentals of grammar. This is especially true in writing, where accurate use of the article system provides an insight into students' level of comprehension in grammar use. Master (1994) observed that in speech, articles rarely hinder the message from getting through or being intelligible, since other linguistic features can compensate for omitted articles and still allow naturalistic use of English; in written language, however, precision and accuracy in article use reflect the writer's fluency.
According to Chin (2000), "the most beneficial way of helping students improve their command of grammar in writing is to use students' writing as the basis for discussing grammatical concepts". In this study, the frequency of articles in students' writing shows the importance of articles as a fundamental aspect of English grammar. Students therefore need to master this component of grammar in order to progress to comprehension of many other aspects of grammar. Chin (2000) further stressed that teachers should give more attention to grammatical concepts that are necessary for clear communication of meaning in students' writing. She added that teachers should prioritize and teach the grammatical elements that most affect their students' ability to write effectively, rather than being over-ambitious and striving to teach them all.
Another strong view comes from Weaver (1998), who proposed that what students need most is guidance in understanding and applying the grammar items most relevant to writing. Teachers should be aware of frequencies (in this case, of the English articles a, an and the) to help determine which grammar elements to prioritize in their teaching (Conrad, 2000). Furthermore, Philip, Mukundan & Nimehchisalem (2012) emphasized that without adequate knowledge of the grammar system students would not be competent enough; an understanding of the grammar of the language is therefore necessary in order to function well in it. Consequently, on the basis of the findings of this study, it is recommended that teachers give more priority to the teaching of articles, as they are used extensively in students' writing. This will help students write better and more confidently, because articles are present in every type of written composition and, whether students are aware of it or not, they have to use articles in their writing.

Table 3 shows the frequency of occurrences of colligations of a according to the specified word classes, which indicates irregularity in its usage. A very imbalanced proportion can be seen in the other colligations according to their word classes as well. As presented in Table 4, the colligations of an with the associated word classes are not well balanced. Table 5 shows the use of the article the, where S7 and S8 have the highest numbers of occurrences.

Table 5 (rows recovered from the source)
Structure   Frequency
[...]       138
S11         111
S12         75
S13         286
S14         17
S15         31

Colligation Patterns of Articles According to Word Classes
This study revealed the distribution patterns of article colligations with word classes in students' written compositions, based on frequency. The findings show that students' use of articles varied across different types of colligation. The number of occurrences of colligations with particular word classes was inconsistent: some colligation patterns have very few occurrences, while others are heavily used. This supports the view that vocabulary choice in English sentences is not free but governed by constraints on the co-occurrence of words. Stubbs (2001), in a study of English phraseology, observed that the freedom to combine words in text is much more restricted than is often realised.
According to Yamasaki (2008), one linguistic paradigm is that words tend to occur in preferred sequences. This helps to account for the irregularity of the patterns in each group of article structures. Colligation is often not addressed thoroughly in lessons. Students' exposure to words and to the art of combining them is often limited; hence there is a lack of variety in their choice of words and in their use of appropriate articles in natural contexts. Römer (2004) highlighted a problem in the teaching of English to L2 learners, namely learner input: educators often fail to address the input pupils actually receive in their lessons, while so much stress is put on learner output. These students lack exposure to a variety of colligation patterns in their classroom activities and materials.
One point worth noting is that many words in English carry several meanings or, in other words, are ambiguous. Corpus-based studies in particular have revealed that different unspecific nouns such as problem, reason, idea, and fault differ in their favoured syntactic patterns and in the favoured premodifiers used in each pattern (Yamasaki, 2008). The data and findings suggest that learners are often confused about, or even worse, cannot interpret, the exact meaning of these words in the contexts in which they are used. These learners are therefore unable to colligate the words with the right articles, and cases of article misuse, avoidance and omission can be seen. One obvious case from the corpus is the use of the articles a and the before the word Facebook, for example 'I have Facebook account', 'the Facebook is a social networking website' and 'a Facebook can help me with'.

Conclusion
With the results of the study, English teachers are better equipped to recognize the structures students use most commonly in writing, thereby enabling them to apply this knowledge in the classroom, where they can better assist students in mastering articles. By observing the sentence structures students use in their writing, teachers can gain a clearer picture of how students comprehend and apply articles in compositions.
For Malaysian teachers and researchers in particular, the findings provide insight into Malaysian students' comprehension and use of articles, aiding research on grammar as well as ESL teaching in Malaysian education. They will also help in the development and improvement of materials used in the teaching and learning of English, and help teachers design activities that are more practical and relevant to the needs of these learners. As suggested by Lawson (2001), only a corpus can provide clear perceptions of certain linguistic features in real-life applications, such as lexico-grammatical associations. This study should help teachers and researchers gain a clearer understanding of how articles are used by Malaysian students, who, under the current educational curriculum, are now moving towards the notional-functional approach in language learning.
Teachers are advised to place more emphasis on the aspects that should be prioritized, such as exposing students to language use in context. Materials and activities designed for students should include all sorts of article colligations with different word classes, not just specific ones. This can help learners familiarize themselves with the variety of colligation patterns in the English language. Students should be given more practice in using words with the articles they should colligate with. Exposure is the key to better learning; hence teachers should expose these students to different types of colligation patterns throughout the whole process of learning English, not just in specific grammar lessons. Thornbury (2002) claimed that the meanings and functions of words in a language are better remembered and understood if the words are met at least seven times. Apart from that, the ability of these students to distinguish one noun from another is also crucial, and extra practice on the different types of nouns, their meanings and their usage would therefore help them use the appropriate article before nouns. The same stress should also be put on other grammatical items in the language.
The study also brings to light the use of corpora for research on articles. In addition, it highlights the importance of understanding article use in context. The use of corpus linguistics in education, in this case the teaching of English as a second language, is deemed vital, as it explores most of the detailed aspects of the language produced by L2 students. The findings of the study provide recommendations to help determine areas for further research by language teachers and researchers alike regarding the use of articles and their colligation patterns. Corpus linguistics may provide teachers with new data on etymological and definitional aspects of words used and produced by learners, particularly with regard to linguistics and sociolinguistics respectively.
Further research is required to determine the competency of Malaysian secondary and college students in their use of articles. An error analysis may be conducted on the same corpus used in this study; its findings would shed more light on the improvements needed to bring students as well as language teachers to a higher level of comprehension and competence regarding articles. Investigations into the extent to which students are exposed, in materials and activities, to the variety of article usage and colligations may help to further analyze the distribution patterns of articles not only in writing but in other components as well.