Linguistic Features Differences in Arabic Textbooks Used at Islamic Schools in Malaysia

The dearth of studies that assess linguistic differences in Arabic textbooks is the motivation for the present study, which identified significant differences in the use of linguistic features in textbooks for different secondary school levels. The study analysed 315 samples of 100-word texts, randomly selected from 105 Arabic textbooks used at Islamic secondary schools in Malaysia. Seven linguistic features were analysed using descriptive and inferential analyses through one-way analysis of variance (ANOVA). The analysis showed a moderate use of simple sentences, complex sentences and noun sentences. However, the use of complex sentences was higher than that of simple sentences. A high number of common and frequent words were used, but the use of abstract words was low. Conjunctions and discourse markers were used at moderate levels. The ANOVA showed significant differences in the use of complex sentences, common and frequent words, conjunctions and discourse markers, and average sentence length. The study also found that long sentences are higher in frequency in Form 3 textbooks compared to those in Forms 4 and 5 textbooks. The current study suggests that the frequency of linguistic features should correspond to students’ needs by taking into account their school level, Arabic language proficiency and subject content.


Introduction
Success in the teaching and learning process depends on many factors, one of which is the selection of teaching materials. Teaching materials are an important element in the curriculum (Haris 1991). They stimulate learning, maintain student interest, increase diversity in learning and reflect the association between the subjects involved in the language learning (Combs 2009). Reading materials such as books, newspapers and magazines are examples of important teaching materials. Yahya (2003) claimed that teachers, unfortunately, find it difficult to select reading materials that are compatible with the student's unique interests, ability and readiness. Suitable reading materials enhance success in teaching reading. Therefore, attention to the preparation and selection of suitable reading materials that are easy to understand and that enable effective teaching and learning is crucial.
With regard to Arabic language education, studies on the readability of reading material are scarce. This is discouraging because such knowledge would provide insight into producing optimal Arabic reading materials that can assist students in mastering the Arabic language. Arabic reading materials that are not commensurate with the student's abilities or level may demotivate students (Kamarulzaman 2010; Kamarulzaman and Hassan Basri 2009). In other words, skill-level-appropriate texts are essential in enhancing students' interest and helping them master the language.
The issue of readability in relation to the difficulty of reading a text has received much attention in the literature. According to Tay (2005), readability is distinct from reading ability. Readability relates to the reading material, while reading ability relates to the reader's ability to read. The concept of readability is quantifiable. It consists of formulas to assess the reading level difficulty of the reading materials. Klare (1969) explained readability in three aspects: (1) materials which can be read easily, either handwritten or typographical, (2) materials that are pleasurable or comfortable to read due to interesting content and (3) material containing good writing style/smoothness of writing.
The linguistic features in this study are based on the Rumelhart Reading Interactive Model (1977) which explains reading as a process of interactions between the reader and the text. This cognitive process involves understanding word meaning, judging the word, engaging in grammatical and structural analysis by translating the meaning of words in the sentences, linking the intent between sentences and paragraphs, linking intent from one paragraph to the next and associating texts with existing knowledge in order to understand the text. This interaction process requires linguistic characteristics that are comprehensible input and existing knowledge, maturity and language-speaking skills. Selected linguistic features are features that contribute to the readability of the text, especially word and sentence variables (DuBay 2004).
There are limited studies on linguistic text features that engender text suitability with the reader's language proficiency level. Studies related to the linguistic characteristics of readability in Malaysia such as Kamarulzaman, Ahmad Sabri and Nik Mohd Rahimi (2014) focused on the said characteristics in the context of foreign language learning in Malaysia. The study identified the linguistic characteristics that predicted difficulty in the Arabic text readability formula among non-native Arabic speakers in Malaysia. The results indicated that linguistic features including three categories (words, sentences and content density) have an imbalanced distribution of consumption. The five linguistic features that were most consistent in text reading consumption were sentences and content density categories. The tendency of sentences and content features in determining the difficulty of 15 reading texts of Form 4 Bahasa Arab Tinggi (BAT), i.e., advanced Arabic textbooks, was consistently high. This study, however, only used one textbook in reviewing the Arab linguistic characteristic pattern in the context of readability measurement.
Kamarulzaman, Ahmad Sabri and Nik Mohd Rahimi (2017) studied linguistic features in textbooks used at different secondary school forms (levels). Of interest was the relationship between linguistic features and the level of readability of the Arabic texts. The study analysed 315 samples of 100-word texts selected from 105 Arabic textbooks used for students in secondary levels 1 to 5. The average sentence lengths were found to be high across all forms. The Pearson correlation analysis indicated a significant negative correlation between complex sentences, common and frequent words, conjunctions and discourse markers and the level of text readability. There was a significant positive correlation between the average length of sentences and the level of readability.
The questions that arise are, in the preparation and selection of appropriate reading materials, what are the fundamental characteristics of text readability for non-native speakers, and what are the differences between the linguistic features in textbooks according to students' levels? Arabic text readability is not as well-researched as English text readability (Kamarulzaman, Ahmad Sabri and Nik Mohd Rahimi 2014). Indeed, the assessment of Arabic text readability and studies on Arabic language readability are limited, especially in the Malaysian context (Kamarulzaman 2011). As pointed out by Khadijah Rohani (1989), linguistic features in Arabic textbooks and their differences across texts are underresearched in Malaysia. This, together with the long-standing tradition of using Arabic as the medium of instruction in Islamic schools in Malaysia, warrants research on Arabic text readability to find benchmarking measures for selecting Arabic texts for non-native speakers (Kamarulzaman, Ahmad Sabri and Nik Mohd Rahimi 2014; 2017).

Methodology
This study set out to identify the use of linguistic features and their differences according to level. The data of the study was analysed using descriptive and inferential statistical analysis. The examination of the use of linguistic features in the Arabic texts selected for the study was based on a readability analysis and was verified by Arabic linguists.
The data sources for the current study were Arabic textbooks from the al-Azhar syllabus or the Dini integrated curriculum. Endorsed by the Malaysian Ministry of Education, the textbooks are used in teaching Form 1 to Form 5 (secondary levels 1 to 5) students of Islamic schools known as Sekolah Agama Bantuan Kerajaan (SABK) (government-funded religious schools [GFRS]) and Sekolah Menengah Agama Negeri (SMAN) (state religious secondary schools [GRSS]) in different states in Malaysia. The books were selected based on simple random sampling of the names of three states in Malaysia. The names of all the states were written on separate pieces of paper and placed in a box. The draw was done three times and the names of the states that were drawn were Johor, Selangor and Pahang. Thus, the books used in schools in these three states were used in the present study.
For each state, 35 textbooks, i.e., 7 textbook samples for each form (Form 1 to Form 5), were selected. Therefore, a total of 105 textbooks from all three states were used for the study. A stratified random sampling design was employed in selecting document samples from the textbooks. Three passages from each of the 105 textbooks were randomly selected for analysis. The data for analysis therefore comprised 315 text passages. Table 1 shows the distribution of the study sample texts.
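The sampling arithmetic described here can be sketched in Python. This is a minimal illustration rather than the authors' actual procedure: the function name `draw_passages`, the number of candidate passages per textbook (`passages_in_book`) and the fixed random seed are hypothetical. Only the counts (three states, five forms, seven textbooks per form per state, three passages per textbook) come from the study.

```python
import random

STATES = ["Johor", "Selangor", "Pahang"]  # the three states drawn in the study
FORMS = [1, 2, 3, 4, 5]                   # Form 1 to Form 5
BOOKS_PER_CELL = 7                        # textbooks per state per form
PASSAGES_PER_BOOK = 3                     # 100-word passages per textbook


def draw_passages(passages_in_book=20, seed=0):
    """Return (state, form, book, passage_index) tuples.

    passages_in_book is a hypothetical number of candidate
    100-word passages available in each textbook.
    """
    rng = random.Random(seed)
    sample = []
    for state in STATES:
        for form in FORMS:
            for book in range(1, BOOKS_PER_CELL + 1):
                # Draw passages without replacement within one textbook.
                picks = rng.sample(range(passages_in_book), PASSAGES_PER_BOOK)
                for passage in sorted(picks):
                    sample.append((state, form, book, passage))
    return sample


sample = draw_passages()
print(len(sample))  # 3 states x 5 forms x 7 books x 3 passages = 315
```

Under these counts, each form contributes 63 passages (3 states x 7 textbooks x 3 passages) before any outliers are removed.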
The focus of the analysis was on the linguistic characteristics relevant to text readability. The first step was to define the units of analysis and the categories. According to Krippendorff (2004), the determination of analytical units and categories should be based on the purpose of the study so that text analysis procedures can be carried out objectively. The unit of analysis in this study is linguistic characteristics consisting of common and frequent words, abstract words, simple sentences, complex sentences, noun sentences, average sentence length, and conjunctions and discourse markers that indicate the relevance of the content. These linguistic features have been widely used in determining the readability of reading comprehension texts. In the current study, the frequency of linguistic characteristics in the sample texts was analysed. Previous research used content analysis methods in the study of linguistic characteristics that contributed to text readability, constructing readability formulas to render a systematic and objective method of measuring the communication effect of written material. Among the linguistic characteristics studied were word features, such as the frequency of difficult words, different words, polysyllabic words and abstract words. In addition, sentence features, such as average sentence length and sentence types, also contribute to the readability of the text (Holsti 1969). This method was used by Sherman (1893), Gray and Leary (1935), Flesch (1948), Dale and Chall (1948), Dawood (1977) and al-Heeti (1984), and in other text readability research up to the present. Seven categories of linguistic feature variables were analysed, as shown in Table 2.
A pilot study was conducted earlier, with analysis by a different encoder, to obtain data transparency, to test the methods and procedures of category analysis and to identify the sample, so that corrective measures could be implemented before conducting the analysis on the actual sample (Krippendorff 2009).
The study also used an inter-rater reliability analysis, utilising the intraclass correlation coefficient (ICC) to measure the consistency, stability and repeatability of measurements made separately by different encoders on the same analysis units (Shrout and Fleiss 1979). Three encoders were used in this study, and a value greater than 0.667 was adopted as the threshold for inter-rater reliability, as recommended by Gall, Gall and Borg (2003), as it is not too extreme and has been adopted by many researchers. Table 3 displays the ICC index value. The ICC inter-rater reliability analyses of the consistency, stability and repeatability of scores given by the three encoders were conducted by utilising average measures. For the seven linguistic features, the ICC values ranged from 0.860 to 0.968, indicating a high degree of agreement, as they fall within Cicchetti's (1994) recommended value range of 0.75 to 1.00. The encoders had a high agreement correlation and presented approximately the same scores in the measurement of an analysis unit. A high ICC value also indicates that the measurement error of the encoders is small (Cicchetti 1994).
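An average-measures ICC of the kind reported here can be sketched as follows. The study does not state the formula it used, so this sketch assumes the two-way, consistency, average-measures form from Shrout and Fleiss's framework, computed as (MS_rows - MS_error) / MS_rows. The function name `icc_consistency_average` and the scores are hypothetical and serve only to show the calculation.

```python
def icc_consistency_average(ratings):
    """Average-measures consistency ICC from a two-way layout.

    ratings: list of rows, one row per analysis unit (text sample)
    and one column per encoder. Assumes no missing scores.
    """
    n = len(ratings)      # number of analysis units
    k = len(ratings[0])   # number of encoders
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Partition the total sum of squares into units, encoders and residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


# Hypothetical complex-sentence counts for five texts scored by three encoders.
scores = [[5, 6, 5], [7, 7, 8], [3, 4, 3], [6, 6, 6], [4, 5, 5]]
icc = icc_consistency_average(scores)
print(round(icc, 3), icc > 0.667)  # compare against the adopted threshold
```

A consistency form ignores constant differences between encoders (one encoder always counting one more than another), which is one reason the choice of ICC form matters when reporting agreement.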
Descriptive analysis was performed to obtain systematic information on facts and features of a population or area of study, empirically and accurately (Gall, Gall and Borg 2003). Analysis of variance (ANOVA) was used to test whether the dependent variables, measured on an interval or ratio scale, differed across the categories of the independent variable, measured on a nominal or ordinal scale. The F-ratio was considered significant at p < 0.05.
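The one-way ANOVA F-ratio can be illustrated with a short Python sketch. The function name `one_way_anova` and the per-form counts are hypothetical. The sketch computes only the F-ratio and its degrees of freedom; obtaining a p-value additionally requires the F distribution, which would normally come from a statistical package.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)

    # Between-groups variation: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-groups variation: observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical conjunction counts per 100-word text, grouped by form.
by_form = {
    1: [8, 7, 9, 6],
    2: [5, 6, 5, 4],
    3: [5, 4, 6, 5],
}
f_ratio, df_b, df_w = one_way_anova(list(by_form.values()))
print(round(f_ratio, 3), df_b, df_w)
```

In this layout the form is the independent variable and the feature count is the dependent variable, so the analysis is run once per linguistic feature.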

Findings
A comprehensive analysis was conducted of the use of linguistic features, namely common and frequent words, abstract words, simple sentences, complex sentences, noun sentences, average sentence length, and conjunctions and discourse markers. Table 4 shows the mean distribution of the linguistic features studied in the entire sample of texts (n = 313). Of the 315 text samples, two were omitted because they were outliers. In terms of the sentence types used, the mean use of complex sentences (mean = 5.46) was higher than the mean use of simple sentences (mean = 1.99). The mean value of noun sentences was 4.21. The mean average sentence length was 14.18. For word use, common and frequent words dominated the sample texts with a mean value of 95.44, compared with a mean value of 11.87 for abstract words, out of 100 total words in each text sample. Meanwhile, the mean value of the use of conjunctions and discourse markers was 5.38.
In terms of the SD value, simple sentences and complex sentences had the smallest SDs among all of the linguistic features, with SD = 1.49 for simple sentences and a smaller value for complex sentences (SD = 1.25). This shows that the range between the texts that used complex sentences most and least often is smaller than the corresponding range for simple sentences. Overall, these findings show that the use of both simple sentences and complex sentences in the 313 text samples is the most standardised.
With regard to noun sentences, the data indicated that the range between the text most often and least often using noun sentences is greater than that of simple sentences and complex sentences. Nevertheless, this value suggests that the distribution of noun sentences in the texts is standardised. Regarding the average sentence length, the results reveal that the range between the highest and lowest average sentence length in the text is large and therefore less standardised.
The results for the category of common and frequently used words show that the range between the most and least frequent use of abstract words is greater relative to the usage of common and frequent words. This indicates that the distribution of common and frequent word use in the text is more standardised relative to abstract words, which are not standardised.
The findings on the use of conjunctions and discourse markers, SD = 2.26, suggest that the conjunctions and discourse markers in the sample texts (n = 313) were standardised.
Overall, the SD values suggest that the linguistic features analysed differ in how standardised their distributions are. The most standardised frequency distribution is that of complex sentences, followed by simple sentences, noun sentences, conjunctions and discourse markers, and common and frequent words. The distribution of the average sentence length, by contrast, was less standardised, and abstract words had a distribution that was not standardised.
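The comparison of features by mean and SD can be sketched with Python's standard `statistics` module. The function name `describe` and the counts are hypothetical. The study does not give the cut-offs it used to separate "standardised", "less standardised" and "not standardised" distributions, so the sketch only ranks features from smallest to largest SD.

```python
from statistics import mean, stdev


def describe(counts_by_feature):
    """Return (feature, mean, SD) tuples sorted from smallest to largest SD."""
    rows = [
        (name, mean(values), stdev(values))  # stdev is the sample SD
        for name, values in counts_by_feature.items()
    ]
    return sorted(rows, key=lambda row: row[2])


# Hypothetical per-text counts for three of the seven features.
counts = {
    "complex sentences": [5, 6, 4, 7, 5],
    "simple sentences": [2, 1, 4, 0, 3],
    "abstract words": [3, 20, 9, 15, 12],
}
for name, m, sd in describe(counts):
    print(f"{name}: mean = {m:.2f}, SD = {sd:.2f}")
```

A smaller SD means the per-text counts cluster more tightly around the mean, which is the sense in which the text calls a feature's use standardised.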
The results of the ANOVA analysis in Table 5 suggest that there were no significant differences for the following variables: the frequency of simple sentences, noun sentences and abstract words according to forms. The results however show that the difference in the use frequency of complex sentences between Forms 1, 2, 3, 4 and 5 is significant, with the value F (43.519) = 7.511 and p = 0.000 (p < 0.05). The results of comparison between the texts from textbooks of different forms (multiple comparisons) to identify those with significant differences in the use of complex sentences showed that the overall difference was due to the difference between samples from Forms 1 and 3 textbooks and between Forms 2 and 3 textbooks. The mean value for Form 2 (mean = 5.9206) showed that it has the highest frequency of complex sentences.
Similar results were found on the use of common and frequent words, where as a whole, there was a significant difference in the use of common and frequent words in textbooks of all levels. Comparison of results between the forms indicated that the mean scores between Form 1 and Forms 3, 4 and 5, respectively, showed a significant difference. The mean value for Form 1 (mean = 97.5484) shows the highest frequency of common and frequent words.
With regard to the use of conjunctions and discourse markers, the analysis revealed a significant difference between Forms 1 to 5, namely F (339.457) = 20.839 and p = 0.000 (p < 0.05). The comparison between the Forms indicated a significant difference in mean scores between Form 1 and each of Forms 2, 3, 4 and 5. There was no statistically significant difference in the use of conjunctions and discourse markers between Forms 2 and 3, 2 and 4, and 2 and 5 (p > 0.05). The mean value for Form 1 (mean = 7.3387) showed the highest frequency for the usage of conjunctions and discourse markers.
Finally, analysis of the average sentence length between forms revealed a significant difference, namely F (152.707) = 3.813 and p = 0.005 (p < 0.05). Comparison of results between the Forms indicated that the mean scores between Forms 1 and 2, 3, 4 and 5, were not statistically significant. However, an overall significant difference was due to slight differences between Forms 1 and 3. The mean value of Form 3 (mean = 15.0869) indicated that Form 3 texts have longer sentences relative to the other forms.
The ANOVA shows that there were no significant differences in simple sentences, noun sentences and abstract words by form. Interestingly, however, the test revealed that complex sentences, common and frequent words, conjunctions and discourse markers, and average sentence length differed significantly according to forms (levels).

Discussion
The outcome of the present study on the linguistic features in the Arabic texts used by students revealed balanced and unbalanced standardisation based on the SD values. This means that the potential linguistic features, which may influence the difficulty of a text, are mostly complex sentences and simple sentences, based on the standardisation frequency of their use in the texts studied. Theoretically, the characteristics of the words potentially influence the readability of the text (Durayhim 1998; al-Khalifa and al-Ajlan 2010), and the texts in this study showed a standardised frequency distribution of conjunctions and discourse markers and of common and frequent words. Although the distribution of the average sentence length lacks standardisation, theoretically it influences text readability (al-Ajlan, al-Khalifa and al-Salman 2008; al-Tamimi et al. 2014).
Concerning the text difficulty level as a result of complex language context and linguistic elements, Drews-Bryan and Schleifer (1993) found that when prior knowledge for a certain meaning is scarce, readers encounter difficulty in comprehending the intended message. Rye (1982) also stated that when a text has higher cohesion, it becomes an easier text to read.
These findings are consistent with past studies (e.g. Kamarulzaman 2010; Kamarulzaman, Ahmad Sabri and Nik Mohd Rahimi 2014; Zulazhan 2012) which found that the use of standardised complex sentences dominated the texts. Although the common and frequent words also dominated the texts, their use was not standardised. Pikulski (2002) and Gunning (2003) stated that sentence elements are the second most important factor in measuring readability. Yngve (1961), Chall (1974) and Khadijah Rohani (1982) noted that the longer a sentence is, the more difficult it is to comprehend, and the more frequent long sentences are in a text, the higher the complexity of the text, as it carries abundant information. In relation to this, the current study confirms that not only does information become more compact, but the syntactic structure also becomes more complex. Miller and Selfridge (1950) and Harrison (1980) discovered that students did not understand science texts because of complex, long sentences and information-rich content. According to these researchers, this resulted in the absence of contextual clues, which reduced message comprehension.
Surprisingly, Dawood (1977) and al-Heeti (1984) revealed that standardisation in common and frequent word use and average sentence length contributed to text readability. Nevertheless, it is noteworthy that Dawood's (1977) study involved a large sample and texts used by native speakers, whereas the studies conducted by Kamarulzaman (2010) and Zulazhan (2012) involved smaller samples.
According to early research in English text readability such as Chall (1974), Klare (1969), Gunning (2003), Oakland and Lane (2004) and Crossley, Greenfield and McNamara (2008), linguistic features such as word elements assist readers in understanding and are purposed for learning. New words are provided with keywords which help to describe the word. Apart from that, sentence structure and type often become the measurement of text readability. This includes sentence structure, which contains description, examples and illustrations with the assistance of words which serve as content connectors.
From these findings, it can be concluded that the standardised use of features potentially predicts text readability. Those supplying textbooks to non-native students should also apply a standardised use of features: because the standardised distribution of linguistic features influences text readability, it should be taken into consideration alongside the learning content required for a particular level.
The ANOVA showed a significant difference in the frequency of complex sentence use. Overall, the significant difference was due to differences between Forms 1 and 3 texts, and between Forms 2 and 3 texts, with a medium effect size. In terms of common and frequent words, there were significant differences between the texts of the different Forms, with a large effect size. Significant differences were found between Form 1 texts and Forms 3, 4 and 5 texts. Forms 1 and 2 texts, by contrast, did not differ significantly in the frequency of common and frequent words.
The results also showed significant differences in the use of conjunctions and discourse markers between Forms 1 to 5 texts, with a large effect size. The significant differences were between Form 1 texts and Forms 2, 3, 4 and 5 texts. The average sentence length also differed significantly between Forms. The difference in mean scores between Form texts was small, with a small effect size. The overall significant difference was due to a slight difference between Forms 1 and 3 texts, where the Form 3 texts included longer sentences relative to the other Forms.
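The effect sizes mentioned above can be illustrated with eta squared, the share of total variance explained by group membership. The study does not name the effect-size statistic or the cut-offs it used, so both the choice of eta squared and the use of Cohen's conventional benchmarks (0.01 small, 0.06 medium, 0.14 large) are assumptions of this sketch. The function names and data are hypothetical.

```python
def eta_squared(groups):
    """Proportion of total variance explained by group membership."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total


def label_effect(eta2):
    """Label an eta squared value using Cohen's conventional benchmarks."""
    if eta2 >= 0.14:
        return "large"
    if eta2 >= 0.06:
        return "medium"
    if eta2 >= 0.01:
        return "small"
    return "negligible"


# Hypothetical average sentence lengths for texts from three forms.
lengths = [[13, 14, 15, 12], [14, 13, 15, 14], [15, 16, 14, 15]]
eta2 = eta_squared(lengths)
print(round(eta2, 3), label_effect(eta2))
```

Reporting an effect size alongside the F-ratio matters here because, with more than 300 texts, even small mean differences between forms can reach statistical significance.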
As previously mentioned, no significant difference was found in the frequency with regard to the following variables: simple sentences, noun sentences and abstract words.
The difference in complex sentences, common and frequent words, conjunctions and discourse markers, and average sentence length according to Forms was expected. The four variables dominated the text and their use as a whole was standardised. This accords with Pikulski's (2002) study, which showed that the match between linguistic features and the reader's proficiency level is one aspect that may assist in estimating the suitability of reading material for the target reader. Therefore, the frequency level of specific linguistic features needs to vary relative to the level of the students.
In terms of complex sentence use, the findings indicated that Forms 1 and 2 texts showed significant differences, and complex sentence frequency was higher in Forms 1 and 2 texts. These findings are slightly different from those of Schwarm and Ostendorf (2005), Kamarulzaman (2010), and Kamarulzaman, Ahmad Sabri and Nik Mohd Rahimi (2014), who found that texts for lower forms needed to differ from those for higher forms in terms of complex sentences, as such sentences are long and dense in content.
Complex sentences consist of independent clauses and dependent clauses, joined by conjunctions and discourse markers. It is these conjunctions and discourse markers that guide readers in comprehending texts by linking content ideas, especially for readers who already have some prior knowledge of constructing meaning from sentences. In general, theories regarding sentence difficulty conclude that simple reading materials have short and simple sentences with a small number of prepositions (Chall 1974; Coleman 1962), and long sentences usually contain a great deal of information (Yngve 1961 in Khadijah Rohani 1982). According to the theory of human working memory, a message can be recalled accurately only immediately after its delivery. The working memory capacity of a younger person is less than that of a mature or less intelligent individual (Miller and Selfridge 1950). Thus, the comprehension of reading materials indeed depends on the length of a sentence due to the limits of working memory.
The findings of the current study are in keeping with those of Kintsch and Miller (1981), and Davison and Kantor (1982) but contrast with those by Cavalli-Sforza, Mezouar and Saddiki (2014) and Hunt (1970). Their studies found that students should be exposed to sentence structure complexity in stages according to their proficiency level, particularly students learning a second (or foreign) language.
The text should also have signposts that help associate one idea with another, which eases understanding of the text. It should be noted that non-native speakers typically learn Arabic as a foreign (or second) language in a formal and structured manner based on a teacher-guided curriculum. Hence, frequent use of conjunctions in Form 1 texts would help students who have existing knowledge of linking ideas to comprehend complex and long sentences. The higher the level of learning (from Form 1 to Form 5), the lower the frequency of conjunctions and discourse markers needed in the text. This aligns with the development of learning and language acquisition, where students in the higher Forms possess skills and strategies that allow them to construct meaning from their reading.
The use of common and frequent words in this research is similar to several previous studies (Kamarulzaman 2010; Zulazhan 2012; al-Tamimi et al. 2014; Cavalli-Sforza, Mezouar and Saddiki 2014) which revealed that the text provided for lower level students consists of more common and frequent word use and its frequency distribution decreases with higher level texts. So, lower form students, as in Form 1 students in the current study, should be exposed to common and frequent word use in order to encourage reading and comprehension. (2014) with regard to average sentence length. Past studies found that the sentence length used in the text should differ according to the student's proficiency level such that students at lower forms are exposed to shorter sentences in order to minimise the burden of cognitive processing. This is in line with Pang (2008) and Koda (2005), who claimed that lower-level readers have lower-level processing capacity in interaction with their texts relative to higher-level readers. This occurs because lower-level readers have difficulty digesting long sentences since they are still in the early stages of learning the language, particularly when it is a foreign language. These findings converge with those by Kamarulzaman (2010) who found that long sentences make it difficult for non-native speakers to read and understand Arabic reading materials. Readers at lower forms need to be provided with shorter sentence texts to increase language syntax proficiency.
Interestingly, the current study found that the Form 3 texts had a higher average sentence length than the Form 4 and Form 5 texts, which is contrary to past research findings. In Malaysia, Form 3 is the transition level from lower to upper secondary. Long sentences therefore should be lower in frequency in Form 3 texts than in Forms 4 and 5 texts. A high frequency of long sentences in the Form 3 texts would increase the difficulty of the content.
The above findings confirm significant differences in the frequency of complex sentences, average sentence length, common and frequent words, and the usage of conjunctions and discourse markers. There was a higher frequency of complex sentences in lower form texts relative to higher form texts. On the other hand, lower form texts had a lower average sentence length relative to higher form texts. In addition, the findings showed that lower form texts have a higher frequency of common and frequent words relative to higher form texts. A similar trend is seen in the frequency of conjunctions and discourse markers, which were higher in lower form texts relative to the higher form texts.
The findings of the current study suggest that linguistic features which have significant differences across forms should be taken into account when assessing the readability of Arabic reading materials for non-native speakers. The frequency of these linguistic features should correspond to students' school level and proficiency level, as well as to the Islamic subjects that use Arabic as the medium of learning and instruction.

Implications
Understanding the linguistic features that contribute to Arabic language text readability could be beneficial for both teaching and learning the Arabic curriculum for I'dadi studies (lower secondary). It could be a guide for effective selection or development of reading materials for non-native students. The identification of helpful (and detrimental) linguistic features in Arabic texts should be elucidated; it enables the assessment of text readability such that textbooks can be aligned with the student's proficiency level.
The books from the al-Azhar syllabus or the Dini integrated curriculum were newly introduced as learning texts. Given the issues raised in the earlier analysis, any effort to republish the texts must consider revisions to the linguistic features, specifically by controlling the frequency and density of the content and by ensuring that the content matches the student's level of proficiency as a non-native Arabic speaker. Textbooks for Form 1, for instance, must have a different ratio of long sentences to short sentences from that of Forms 2, 3, 4 and 5, in accordance with the development of content, student proficiency and school level.
Authorities who manage resources for education in Arabic and Islamic studies using Arabic as the medium of instruction can take advantage of the present study's findings. Individuals involved in procuring textbooks should take into account the linguistic features that contribute to text readability as a guideline when authoring or selecting textbooks and reading materials in Arabic. The writing guideline should include the selection of words, sentence structures, use of conjunctions and discourse markers to ease assumptions about meaning, level of difficulty in language and content.

Conclusion
The current study suggests that texts used for non-native Arabic speakers in Malaysian schools should suit students' school level and proficiency in Arabic. Linguistic features should enhance text readability in Arabic. This is an essential step in considering text suitability with respect to student proficiency level, in ensuring effective reading processes and in encouraging students' learning of the Arabic language. The findings of the study, it is hoped, provide impetus for further research in Arabic language learning in Malaysia and other non-native Arabic contexts.