Assessing Children’s Native Language in Mandarin Using the Adapted New Reynell Developmental Language Scales-Mandarin (NRDLS-M)

Early child language acquisition in Mandarin by Malaysian Chinese children is underexplored. Following the establishment of the first Speech Sciences academic programme at Universiti Kebangsaan Malaysia (UKM) in 1994, there is a need to develop language tests so that professionals such as speech therapists can assess children who might be at risk for language disorders and provide remediation accordingly. The present study aims to adapt a contemporary British English test, the New Reynell Developmental Language Scales (NRDLS), which is widely used to assess language comprehension and language production, to Mandarin. It also aims to provide preliminary norms (age of acquisition for target structures, and errors) for local children. Underlying factors which might influence child language development, i.e. age, gender and socio-economic status (maternal education), are also examined. Using a cross-sectional sample of 40 children aged 2;00-6;11, the present study describes child language acquisition based on performance on the adapted NRDLS. The results show that language skills advanced with age, whereas gender and maternal education did not affect child language development. Overall, children demonstrated better language comprehension than language production. The adapted New Reynell Developmental Language Scales-Mandarin (NRDLS-M) is developmentally sensitive, though further revisions are required. The findings imply an influence of both universality and ambient language effects on the acquisition of Mandarin, and a need to develop a bilingual Mandarin-English version of NRDLS-M.


INTRODUCTION
Young children acquire their first words in the first year of life, and combine words to form sentences using morphemes (e.g. prepositions) from the second year of life (Owens, 2016). However, some children may not follow this typical course of language development for reasons such as Specific Language Impairment (SLI) (Reed, 2012; Norsofiah, Rogayah & Lim, 2016). It is therefore essential to identify children at risk for language disorder at an early age in order to provide remediation. Assessing child language skills and treating child language disorders are the roles of speech-language therapists (SLTs). Existing literature has reported language disorders in Chinese pre-school children, including local ones (Looi, 2010). Currently, local SLTs face challenges due to the lack of standardized assessment tools. Based on the "common errors" used for the target structures? Is NRDLS-M developmentally sensitive? In addition to age, do gender and socio-economic status (SES) (maternal education) affect acquisition of Mandarin?

LITERATURE REVIEW
THE NEW REYNELL DEVELOPMENTAL LANGUAGE SCALES (NRDLS) (EDWARDS, LETTS & SINKA, 2011)
The Reynell Developmental Language Scales (RDLS) was first published in the UK in 1977 (Reynell, 1977) to assess child language skills. It was revised in 1985 (Reynell & Huntley, 1985) and redesigned and renormed in 1997 (Edwards, Fletcher, Garman, Hughes, Letts & Sinka, 1997). The earlier version of RDLS has also been adapted to Cantonese in Hong Kong (RDLS-C; Hong Kong Society for Child Health and Development, 1987). RDLS-C is a valid tool for assessing language abilities in Cantonese-speaking children (Au et al., 2004), and for years it has been widely used by speech therapists in Hong Kong. In recent years, however, the norms in RDLS-C have been thought to be outdated, with a tendency to overestimate child language abilities; hence suggestions such as replacing RDLS-C with new tests have been made (Klee et al., 2009). The most recent version of RDLS, the New Reynell Developmental Language Scales (NRDLS), is a major revision of the 1997 version. NRDLS was published in 2011 (Edwards, Letts & Sinka, 2011) and incorporates contemporary knowledge of typical and atypical child language development. Because of the large differences between the items used in the earlier version of RDLS (from which the Cantonese version was adapted) and the latest NRDLS (from which the Mandarin version was adapted in the present study), only the study of NRDLS is reviewed in this section.
NRDLS is a formal standardized test which examines important domains of child language: vocabulary, sentence structure, verb morphology, inference and grammaticality judgement. Areas that are diagnostically important (clinical markers) are selected to help identify potential child language impairment. For instance, poor use of tense markers (e.g. -ed) and pronouns (e.g. him/himself) has been observed in children with Specific Language Impairment (SLI); hence verb morphology and pronouns are included in the test. Comprehension and production of language are tested separately on two different scales, since comprehension of structures may precede production. Each scale consists of sections of increasing difficulty to reflect children's language development. A summary of the test construct and procedure with examples is given in Table 1. It is recommended that the tester stop administering the test once an individual child has failed all items in a section plus a couple of items in the subsequent section. This gives an overview of a child's language development and a comparison of his/her performance with that of peers (norms).
As many children in the UK have a first language other than English, a multilingual toolkit is proposed in NRDLS to guide speech therapists in adapting the test for use with these children. The guidelines give general pointers about potential cultural or linguistic differences between English and other languages. The underlying assumption is that the world's languages share universalities in the acquisition of language structures, and so the test can be adapted to other languages. The test materials comprise toy objects (e.g. ball) and a colourful test picture book. In general, for Comprehension tasks, the children are required either to carry out action commands with the toy objects or to point to the target picture in the picture book. For Production tasks, the children are required to answer questions verbally.

Table 1 (excerpt). Summary of the test construct and procedure of NRDLS, with examples
Columns: Comprehension Scale; Production Scale
Section A: Selecting Objects. Task: Understand single nouns (e.g. ball).
Section Di-ii: Sentence Building. Task: Understand simple sentences containing transitive/intransitive verbs (e.g. Rabbit walk).

Contributing factors in child language development, namely age, gender and socio-economic status (SES), were examined in the study of NRDLS. SES was measured by years of maternal education and indices of deprivation. Maternal education was divided into four levels: 1. Statutory minimum education by age 16. 2. Further education e.g. A-levels or diploma. 3. Higher education to degree level. 4. Postgraduate qualifications. Indices of deprivation, in turn, were derived from the postcodes of schools. Schools were divided into five equal-sized bands (quintiles), ranging from the most to the least deprived fifth of the population, with quintile 1 being the most deprived.
Using robust statistical analyses, a positive age effect was found on children's language performance on NRDLS: older children performed better than younger children. A mild gender effect was found, with girls outperforming boys. Because the effect of gender was mild, separate language norms for girls and boys were felt to be unwarranted for NRDLS. A significant maternal education effect was found for both the Comprehension and Production scales, with higher scores for more years of maternal education. Post-hoc analysis, however, revealed that maternal education affected children's language skills only when comparing children whose mothers had statutory minimum education (leaving full-time education at 16) with children whose mothers had higher levels of education (A-levels, diploma, degree and postgraduate), and only up to the age of 3;06. Once children start receiving full-time education, the effect of maternal education appears to weaken. A mild poverty effect was found on children's language performance on the Production scale only, when comparing quintile 1 and quintile 4. Environmental variables such as the number of books provided to the child, frequency of visits to the library and parental teaching activities were thought to have more direct impacts on children's language performance than the broader SES measures.
In conclusion, no adaptation of NRDLS to Mandarin has yet been published. Efforts to adapt the test from English to other languages such as Mandarin will enhance knowledge about cross-linguistic acquisition. For a community in urgent need of child language acquisition tests (norms), such as the Malaysian Chinese, adaptation work will bring considerable clinical value.

PAST STUDIES ON ASSESSING MANDARIN CHILD LANGUAGE SKILLS IN MALAYSIA
Inflections govern the grammar of a sentence in English, whereas in Mandarin, particles and word order control the grammar and meaning of a sentence (Fung, 2009). Mandarin has aspect markers, not tense markers; Mandarin verbs do not express tenses because temporal properties are expressed through temporal terms, aspect markers and contexts (Chen & Shirai, 2010). The four aspect markers in Mandarin are perfective -le0 and experiential -guo4 (perfective form c.f. -ed in English), progressive -zai4 and durative -zhe4 (imperfective form c.f. -ing in English) (Duff & Li, 2002).
Another striking difference between the grammar of English and Mandarin is that Mandarin has numeral classifiers. In English, a number is attached directly to a noun without a classifier (e.g. four cars). In contrast, in Mandarin, a classifier must be attached to a noun, e.g. liang4 as in si4 liang4 che1 (four liang4 cars). These classifiers classify noun referents based on perceptual or conceptual dimensions: shape, size and function. For instance, liang4 is used for vehicles such as cars. Classifiers of this type are known as sortal classifiers. A second type, mensural classifiers, is used to indicate quantity, for instance pai2 as in san1 pai2 shu4 (three rows of trees) (Li, Huang & Hsiao, 2010).
Few local Mandarin child language studies have explored linguistic structures shared with English (e.g. vocabulary, prepositions) and structures used in Mandarin only (e.g. classifiers). Using a parental checklist, Chok (2001) investigated first word acquisition amongst 30 children aged 10-20 months. She reported more advanced receptive word development than expressive word development (e.g. nouns, verbs). Older children were found to have acquired more words than younger children, and girls were found to have acquired more words than boys. Children whose mothers had higher education (Form Six secondary school education, known as STPM (c.f. A-level), and above) acquired about the same number of words as children whose mothers had lower education (below STPM).
eISSN : 2550-2131 ISSN: 1675-8021
Using action command tasks, Teng (2003) studied acquisition of prepositions by 36 children aged 2-5. Teng reported that prepositions emerged from 3 years onwards.
Phoon (2003) investigated comprehension of wh-questions (what-, who-, where-, how-, why-, when-) in 48 children (24 boys and 24 girls) aged 2-6. Older children were reported to be more capable of answering wh-questions than younger children. No significant gender effect was found on test performance. What- and who-questions were acquired before how-, where-, when- and why-questions.
Thus far, only one standardised test has been published for Mandarin child language: the Chinese Language Assessment, Remediation and Screening Procedure (C-LARSP) (Jin, Rogayah & Oh, 2012). C-LARSP was adapted from the British English LARSP (Crystal, Fletcher & Garman, 1989). The C-LARSP study utilised spontaneous language sampling (free conversation and story-telling) with 130 Malaysian Chinese children aged 1-6. A developmental trend of acquisition was reported: single words (1;00-1;06), short phrases (1;06-1;11) and complex sentences (3;00). In summary, C-LARSP is useful for clinical work and research. It can be commended for collecting data in children's most naturalistic environment. However, collecting spontaneous data is effortful and time-consuming, and the desired language structures might still not be captured. Based on observation, many Chinese SLTs are reluctant to use C-LARSP because it is impractical. Structured language tests (e.g. NRDLS), by contrast, are more time-efficient and are capable of assessing ambient structures which have been predetermined in the test.

METHODOLOGY
RESEARCH APPROACH AND LOCATION
The present study is a cross-sectional study of 40 children aged 2;00-6;11. As in existing normative language studies (e.g. NRDLS), a cross-sectional approach was employed to investigate language skills in children at a given point in time. The children were recruited from nurseries and via personal contacts (friends, relatives). Data collection took place in Chinese nurseries and children's homes on Penang Island, and was carried out by the second author as the single tester. The second author is a final year student in the Speech Sciences Programme at Universiti Kebangsaan Malaysia who speaks fluent Mandarin. All test sessions conducted by the second author were video-recorded (see further Testing Procedure). All video recordings, together with the scoring forms and analysis forms completed by the second author, were passed to the first author to check the accuracy of data scoring and analysis. In addition, to avoid scoring bias from a single tester/rater, 10% of the test data (video recordings and scoring forms) was passed to a qualified Chinese speech-language therapist for independent scoring/rating. The percentage of agreement between the two raters was high (see further Inter-rater Reliability). Both parental consent forms and head teacher consent forms were distributed and collected prior to the testing date.

PARTICIPANTS
The children were randomly selected based on the following criteria and were divided into ten six-month age bands (Table 2): 1. Malaysian Chinese ethnic origin, defined as having Malaysian Chinese parents. 2. No reported mental or physical disorders, syndromic disorders or hearing disorders. 3. Dominant in Mandarin as reported by parents (and teachers) and as observed by the tester. 4. Mothers with a secondary education qualification (SPM) or further education (diploma and degree).
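The assignment of a child to one of the ten six-month age bands can be sketched as follows. This is a minimal illustration, assuming the bands run from 2;00-2;05 to 6;06-6;11 and that ages are written in the years;months notation used in this article; the function name `age_band` is hypothetical and not part of NRDLS-M.

```python
def age_band(age: str) -> str:
    """Map an age in 'years;months' notation (e.g. '3;07') to its six-month band."""
    years_text, months_text = age.split(";")
    years, months = int(years_text), int(months_text)
    if not 0 <= months <= 11:
        raise ValueError(f"months must be 0-11, got {months}")
    total_months = years * 12 + months
    # Assumed sampling range: 2;00 (24 months) up to 6;11 (83 months).
    if not 24 <= total_months <= 83:
        raise ValueError(f"age {age} is outside the 2;00-6;11 range")
    # The first half of each year is ;00-;05, the second half is ;06-;11.
    start_month = 0 if months < 6 else 6
    return f"{years};{start_month:02d}-{years};{start_month + 5:02d}"


# Enumerate every band covered by the assumed range.
ALL_BANDS = sorted({age_band(f"{y};{m:02d}") for y in range(2, 7) for m in range(12)})
print(len(ALL_BANDS))    # 10
print(age_band("3;07"))  # 3;06-3;11
```

Under these assumptions the range yields exactly ten bands, which matches the ten age groups described above.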
The children were representative of Malaysian Chinese children. They used Mandarin as their dominant home language. All children aged 3;00 and above (70%) were attending nursery at the time of data collection. The youngest children, aged between 2;00-3;00 (30%), had not attended nursery and were taken care of by parents or grandparents at home. Mandarin was used as the medium of instruction in the nursery. The children were also learning English and Malay as subjects in the nursery. Some of the children had received exposure to Chinese dialects at home (e.g. Hokkien, 38%; Hakka, 3%). As none of the parents of the children in the urban area where the present study was conducted had less than secondary education (i.e. primary school education only), SES was approached by incorporating correlates of statutory minimum education until age 17, i.e. Form Five secondary education known as SPM (c.f. O-level), and further education (diploma and degree) only (Table 2). SES measured by deprivation indices (e.g. income) was excluded since such indices are not available in Malaysia.
In general, changes to the original test materials, including toys and pictures, were kept to a minimum for two main reasons. First, consistency had to be maintained in factors such as the style and materials of the drawings and the sizes and characteristics of the toys. Second, there were constraints of time and manpower, namely a single tester over a time span of 9 months. The entire procedure of adaptation is summarized below and detailed in the following sections.

DETAILS OF ADAPTATION
Examples of adaptation of the test, including the rationale for modifications, are provided in Appendix A. A summary of the test sections of the adapted NRDLS in Mandarin is presented in Appendix B. NRDLS-M comprised 84 items for Comprehension and 74 items for Production. In general, items containing words that were culturally or linguistically inappropriate for the local children (e.g. sledge, badge) were either replaced with more appropriate words or deleted (Appendix A). The present vs. past tense markers (Table 1) were replaced with two Mandarin aspect markers, progressive -zai4 vs. perfective -le0, described earlier. Pronouns (e.g. him, her, himself, herself) (Table 1) were replaced with the Mandarin reflexive pronoun zi4ji3 (self). Due to the potential ambiguity involved in testing the third person singular non-reflexive pronoun (anaphor), ta1 (him/her) was not tested, but was replaced with zi4ji3 (self). It is worth noting that, in spoken Mandarin, ta1 is gender neutral. It is also worth noting that, elsewhere, the terms the boy, the girl and the baby are less familiar in spoken Mandarin. They were replaced with di4di0 (younger brother) for both "boy" and "baby", and mei4mei0 (younger sister) for "girl" (Qi, 2010). As code-mixing is widely used by local speakers (Lim, Wells & Howard, 2015), English translation equivalents for the three main toy figures of the test (Monkey, Rabbit and Teddy) were allowed in the test. For passive sentences (Table 1), the passive marker -bei4 was used. Because of linguistic differences, most of the items in Grammaticality Judgement (Table 1), which requires a child to judge whether a sentence heard is grammatically well-formed or ill-formed, were modified by incorporating target structures commonly found in early Mandarin child language, e.g. SVO, SVC, AVO (c.f. Jin et al., 2012; Tsang & Stokes, 2001) (Appendix C).

FACE VALIDITY
The face validity check of the adapted test was conducted twice: first before the pilot study, and again after the pilot study but prior to the main study. The panel of experts consisted of a Chinese linguistics lecturer from a local university in Malaysia, an experienced local Chinese SLT, and a Chinese primary school teacher, all three of whom are fluent in Mandarin and English. Two changes were recommended by the panel: 1. Comprehension of Verbs in Section Ci: The target intransitive verb "wave" was switched to "stand" for Make monkey wave because the precise Mandarin translation equivalent for "wave", hui1shou3, is formal and unfamiliar to the local children. The English term "bye-bye" (goodbye) is commonly used to signify waving hands, and so the initial proposal was to modify this item to bai4bai0 (bye-bye). However, confusion arose on the scoring form because the written form is ambiguous with bai4bai4, which means "prayer" in colloquial Mandarin. 2. The translation equivalent used for the original English word "Make…", as in Make monkey wave, across the Instruction Sections. Initially the term gei3 (give) was used to replace "make". However, gei3 is more of a Chinese dialectal term in this context, and therefore rang4 (let), a more appropriate Mandarin term, was used to replace gei3 across the Instruction Sections.

PILOT STUDIES
In the first pilot study, a total of five children were asked to do the newly adapted test to confirm the appropriateness of the test items. The results of the pilot study generally indicated low test sensitivity, as children across age groups obtained rather similar test scores.
Careful thought was then given to the constructs of the adapted test. One obvious reason for the low sensitivity would be the neglect of important aspects of Mandarin not found in the original English version, such as the temporal terms and classifiers reviewed earlier in this article. The adapted test was then revised by introducing new test components for Mandarin Temporal Terms and Classifiers (Appendices D-E). Three temporal terms were tested on the Comprehension scale: xian1…ran2hou4, yi3qian2, yi3hou4 (first…then, before, after). Children were required to carry out action commands using the three key animal figures, i.e. Monkey, Teddy and Rabbit, e.g. hide rabbit in the box first, then hide monkey. Ten classifiers, on the other hand, were tested on both scales. Both sortal classifiers (e.g. ben3 for book) and mensural classifiers (e.g. pai2 for trees) were incorporated. Children were required to point to one of four pictures in the Comprehension task and to name pictures in the Production task.

TESTING PROCEDURE
The children were assessed individually in a quiet room at home or in the nursery. The adapted New Reynell Developmental Language Scales-Mandarin (NRDLS-M) took approximately 30-40 minutes to administer. All test sessions were recorded using a high quality video recorder (Sony Cyber-shot DSC-W350). These recordings were used for post-hoc checking of scoring accuracy. Prior to testing, a brief warm-up free play session (e.g. colouring) was conducted to build rapport with the children. The children were asked to do the test following the original testing procedure in NRDLS. In each section of the test, clear instructions with trial items were given. No cues were given to the children; hence, only spontaneous data was included in the analysis. The children were rewarded with a sticker at the end of the test.

SCORING PROCEDURE
The scoring form of NRDLS-M was devised through careful translation and modification of the English NRDLS scoring form (Edwards et al., 2011). The same scoring procedure as in NRDLS was employed in NRDLS-M. One mark was given for a correct response and zero marks for a wrong or nil response; these marks were entered on the scoring form. The maximum score is 84 marks for Comprehension and 74 marks for Production. Code-switching and code-mixing data (e.g. Mandarin-English) were noted on the scoring form.
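The scoring rule described above can be sketched in a few lines. This is an illustration only: the response coding (True, False, None) and the names `MAX_SCORES` and `scale_score` are assumptions made for the example, and the item responses are hypothetical rather than study data.

```python
MAX_SCORES = {"comprehension": 84, "production": 74}  # maximum marks reported for NRDLS-M


def scale_score(responses, scale):
    """Give one mark per correct response; wrong or nil (None) responses score zero.

    `responses` holds one entry per administered item:
    True (correct), False (wrong) or None (no response).
    """
    if len(responses) > MAX_SCORES[scale]:
        raise ValueError("more responses than items on this scale")
    return sum(1 for response in responses if response is True)


# Hypothetical child: 5 correct, 2 wrong and 1 nil response on the first 8 items.
example = [True, True, False, True, None, True, False, True]
print(scale_score(example, "comprehension"))  # 5
```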

INTER-RATER RELIABILITY
In the present study, the percentage of agreement between two raters was used to measure inter-rater reliability. The data of five children (10%) was independently scored by a local Chinese speech-language therapist who is fluent in Mandarin. Overall, the percentage of agreement for scoring/rating between the two raters was high (98% for Comprehension; 96% for Production). Hence, it can be concluded that the inter-rater reliability for NRDLS-M was high.
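For readers unfamiliar with the measure, percentage agreement is simply the share of items on which both raters gave the same mark. The sketch below assumes item-level marks coded 1 (correct) and 0 (wrong or nil); the marks shown are hypothetical and the function name is illustrative.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters gave the same mark."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


# Hypothetical marks (1 = correct, 0 = wrong/nil) for ten items.
tester = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
second_rater = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
print(percent_agreement(tester, second_rater))  # 90.0
```

Percentage agreement does not correct for agreement expected by chance; a chance-corrected index such as Cohen's kappa would be a possible alternative, although it was not the measure used here.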

TEST-RETEST RELIABILITY
Five children (10%) were asked to repeat the test within a time span of 7-14 days. Pearson's correlation coefficient revealed a high test-retest correlation (0.991 for Comprehension and 0.994 for Production).
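The test-retest coefficient is a Pearson product-moment correlation between the scores from the two sessions. A minimal sketch of the computation follows; the scores are hypothetical and are not the study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical Comprehension scores for five children at test and retest.
first_session = [30, 45, 58, 66, 80]
second_session = [32, 44, 60, 69, 81]
r = pearson_r(first_session, second_session)
print(round(r, 3))
```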

RESULTS
Both quantitative and qualitative analyses were used to address the research questions of the present study.

QUANTITATIVE ANALYSIS
As ceiling effects and heterogeneity of variance were present in the data, the nonparametric Kruskal-Wallis one-way analysis of variance (Howell, 2002; Siegel & Castellan, 1988) was used. Table 3 shows that there was improvement with age on both the Comprehension and Production scales. Statistical analysis confirmed significant age effects on both scales (Kruskal-Wallis χ² = 32.72, df = 9, p < 0.001 for Comprehension; Kruskal-Wallis χ² = 33.78, df = 9, p < 0.001 for Production). Hence, it can be concluded that age affects early child language development.
Table 4 shows that girls seemed to have outperformed boys on the Comprehension scale, whereas boys seemed to have outperformed girls on the Production scale. However, statistical analysis showed no significant differences between the scores of boys and girls on either scale (p > 0.5). Hence, it can be concluded that gender does not affect early child language development.
Table 5 shows that children whose mothers had high SES (further education) seemed to have outperformed children whose mothers had middle SES (secondary school education) on both scales. However, statistical analysis showed no significant differences between the scores of the two groups of children on either scale (p > 0.5). Hence, it can be concluded that SES (maternal education) does not affect early child language development.
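To make the age analysis more concrete, the Kruskal-Wallis statistic can be computed from the pooled ranks of all scores. The sketch below is a plain-Python illustration with the standard tie correction; the function name and the toy data are assumptions for the example, and a statistics package would normally be used in practice.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return (H, df) for the Kruskal-Wallis test, with the standard tie correction."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    counts = Counter(pooled)
    # Assign each distinct value the mean of the ranks it occupies (midrank).
    midrank = {}
    position = 0
    for value in sorted(counts):
        ties = counts[value]
        midrank[value] = position + (ties + 1) / 2  # mean of ranks position+1 .. position+ties
        position += ties
    # H = 12 / (n(n+1)) * sum(R_j^2 / n_j) - 3(n+1), with R_j the rank sum of group j.
    rank_term = sum(sum(midrank[v] for v in group) ** 2 / len(group) for group in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all scores are identical; H is undefined")
    return h / correction, len(groups) - 1


# Toy data: three hypothetical age groups with three scores each.
h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2), df)  # 7.2 2
```

The statistic is referred to a chi-square distribution with k - 1 degrees of freedom for k groups, so ten age bands give df = 9, as reported above.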

QUALITATIVE ANALYSIS
As the sample size of the present study was relatively small compared to that of the NRDLS study, robust statistical analysis was not feasible, and more depth was therefore given to qualitative analysis. In NRDLS, quantitative (statistical) analysis is highlighted, whereas qualitative analysis of acquisition (e.g. age of acquisition for target structures or common errors made by children) is not mentioned. As such normative information is highly desirable for local Mandarin, some of the conventional ways of analysing child language were employed. Shipley, Maddox and Driver (1991), for instance, provide a list of "age of development" for irregular past tense verbs in children aged 3-5 based on a predetermined group success criterion of "80% of children using irregular past tense verbs correctly" in a sentence completion task (c.f. 50% in other studies). In the present study, a group success criterion of 75% was adopted for uniformity of comparison with other local studies (e.g. Tan, 2004). Children's acquisition of structures is discussed in terms of "age of acquisition" and "order of acquisition", both of which contribute to preliminary norms (Table 6). The "age of acquisition" is defined as the age at which a target structure was scored correctly by at least 75% of children in an age group (Table 6). The "order of acquisition" is derived from the most to the least number of children in an age group scoring a target structure correctly (Table 6). Some of the ways in which the children approached the task of responding to the test battery, including simplification strategies (errors), are illustrated in Table 7.
Table 6 (excerpt). Age and order of acquisition for target structures
qiu2, bei1zi0, ya1zi0, qian1bi3, wa4zi0 (2;00-2;05), zhuo1zi0, yi3zi0, shu1zi0, hou2zi0 (2;06-2;11), he2zi0 (3;06-3;11). (ball, cup, duck, pencil, sock, table, chair, brush, monkey, box).
bei1zi0, yi3zi0 (2;06-2;11), qiu2, zhuo1zi0, wa4zi0, ya1zi0 (3;00-3;05), qian1bi3 (3;06-3;11), shu1zi0 (4;00-4;05), hou2zi0, he2zi0 (4;06-4;11). (cup, chair, ball, table, sock, duck, pencil, brush, monkey, box).
C: Comprehension. P: Production.
Table 7 (excerpt). Response strategies and errors
Made more errors on whose-questions than who-questions, e.g. whose daughter is having a birthday party?
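The 75% group success criterion can be applied mechanically to per-band counts. The sketch below assumes that the age of acquisition is the youngest age band reaching the criterion, which is one reading of the definition given above; the counts are hypothetical and the names are illustrative.

```python
CRITERION = 0.75  # group success criterion adopted in the present study


def age_of_acquisition(results):
    """Return the youngest age band in which at least 75% of children scored correctly.

    `results` is a list of (age_band, n_correct, n_children) tuples ordered
    from youngest to oldest. Returns None if no band reaches the criterion.
    """
    for band, n_correct, n_children in results:
        if n_children > 0 and n_correct / n_children >= CRITERION:
            return band
    return None


# Hypothetical counts for one target structure, four children per band.
item_results = [
    ("2;00-2;05", 1, 4),
    ("2;06-2;11", 2, 4),
    ("3;00-3;05", 3, 4),
    ("3;06-3;11", 4, 4),
]
print(age_of_acquisition(item_results))  # 3;00-3;05
```

A stricter variant could additionally require every older band to stay at or above the criterion; that choice is not specified in the text and is left open here.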

DISCUSSION
The present data indicate that language comprehension was generally superior to language production on virtually all test components. The children in the present study exhibited rapid growth of vocabulary, short phrases and sentences from 2;00-2;11. These findings are consistent with what has commonly been reported in local child language studies (e.g. Chok, 2001; Jin et al., 2012) and beyond (e.g. Owens, 2016). Their use of semantically close words (e.g. table → chair) or associated functions of an object (e.g. pencil → to write or pencil box) indicates partial understanding of the target words. This observation indicates that acquisition of words in young children is gradual. Some children code-switched Mandarin vocabulary with English, resulting in late mastery (100%) of vocabulary (by 5;00). Unlike in English (Letts et al., 2011), prepositions posed more challenges than verbs in Mandarin, as evidenced by better test scores for verbs in the present study. Despite some discrepancies in the age and order of acquisition for prepositions between the present study and the previous local study (Teng, 2003), both studies indicate a relatively later development of prepositions (from 3;00 onwards) than of verbs (from 2;00 onwards). Further, children were confused when responding to the target preposition behind using a toy truck. One plausible explanation is confusion between the front portion (driver seating area), commonly known as qian2mian4 (front) in spoken Mandarin, and the back portion (loading area), commonly known as hou4mian4 (back) in spoken Mandarin. One common criticism of structured language tests is the small number of stimuli involved for each target structure. However, this test allows comparison of word and grammatical knowledge, alongside the underlying processes used, among children within similar age ranges who have undergone the same test.
These processes, manifested through the children's error patterns, imply difficulties with memory, processing, auditory discrimination, social cognition and inferencing (c.f. Bishop, 2003). In the present study, the younger children gave more incorrect responses and nil responses than the older children. Whenever in doubt, the younger children showed a tendency simply to answer "yes" or to repeat the whole or part of the test stimulus presented. The younger children deleted one noun in a two-noun phrase (e.g. apple and bed → bed) or in a longer SVO sentence (e.g. Rabbit beat Teddy → Teddy), whilst the older children were observed to swap around the two nouns in the SVO sentence (e.g. Rabbit beat Teddy → Rabbit beat Monkey). Both patterns might be a consequence of memory or processing overload, with older children doing slightly better than the younger children. The present findings reflect one other strength of structured language tests, namely the ability to capture fundamental structures which spontaneous language sampling might not be able to capture. This strength was seen in the acquisition of the passive marker bei4 by 5;00-6;11 in the present study. The passive marker bei4 was found to be absent in the language sampling study of C-LARSP (Jin et al., 2012) reviewed earlier.
Both aspect markers, perfective -le0 (c.f. -ed) and imperfective -zai4 (c.f. -ing), were acquired later than single nouns and verbs in the present study (4;06-5;05). Chen and Shirai (2010) reported early emergence of -le0 and -zai4 in the spontaneous language samples of four young Chinese children (1;04-3;05). Li and Bowerman (1998) reported increasing correct responses in 3 to 6 year-olds in their picture identification tasks involving four aspect markers, including -le0 and -zai4, with six types of verbs. The few existing studies have consistently reported frequent use of -le0 with achievement verbs, and frequent use of -zai4 with activity and stative verbs (Li & Bowerman, 1998; Chen & Shirai, 2010). In the present study, -le0 was tested with telic verbs (e.g. achievement), as in shu1tou2le0 (brushed hair), while -zai4 was tested with atelic verbs (e.g. activities), as in ta1 zai4 he1shui3 (she is drinking), indicating that the present measures of aspect markers are developmentally appropriate.
The reflexive pronoun zi4ji3 (self) was acquired by 5;06-5;11 in the present study. Existing studies (e.g. Chien & Wexler, 1987; Hao, Sheng & Gao, 2014) have shown that, as with English-speaking children, Mandarin-speaking children achieve full competency in comprehension tests of reflexive zi4ji3 by age 5. In the present study, ta1 (him/her) was not tested due to differences between the pronoun systems of English and Mandarin. For example, for the pronoun "her" in is the mother painting her? (a picture showing a mother painting a picture of a girl who is standing near her), the Mandarin translation equivalent ma1ma0 shi4 bu4 shi4 zai4 hua4zhe0 ta1? can be interpreted as: 1. The mother is painting a picture of herself. 2. The mother is painting a picture of the girl (reflexive anaphors) (Wang, 2011). To avoid this ambiguity, the target stimulus ta1 was changed to zi4ji3, e.g. is the mother painting her? → is the mother painting herself? (ma1ma0 shi4 bu4 shi4 zai4 hua4zhe0 zi4ji3?). A second look at the literature shows that, to avoid this ambiguity, researchers have incorporated, for instance, more than one mother in a picture, e.g. is every mother painting her? (a picture showing three mothers painting a picture of a girl who is standing near them) (Hao et al., 2014). This strategy is also included in the trial item of the original NRDLS: is every grandfather painting him? (a picture showing three grandfathers painting a picture of a boy who is standing near them). A future revision incorporating ta1 in the test using this kind of strategy is needed.
The relatively late acquisition of complex sentences (relative clauses, passive sentences and wh-questions) in the present study is consistent with existing findings across languages, including English and Mandarin. Studies with comparable findings in Mandarin include Fahn (2003) and Phoon (2003) for wh-questions, Hu, Gavarro & Guasto (2016) for relative clauses, and Zeng, Mao & Duan (2016) for passive sentences. However, only a small sample of relative clauses, passive sentences and wh-questions was used in the present study, and an expansion to include more structures is recommended. For example, grammatical structure, such as the order of who- and which- in a wh-question, is reported to influence children's understanding of wh-questions, namely object wh-questions vs. subject wh-questions (Fahn, 2003). The same applies to subject relative clauses vs. object relative clauses (Hu et al., 2016) and, for passive sentences, to whether the agent in a passive sentence is preverbal (Fahn, 2003).
Temporal terms are acquired late across languages (e.g. for English: Owens, 2016). Consistent with the local study reviewed earlier (e.g. Ooi, 2003), the temporal term xian1…ran2hou4 (first… then) was acquired before yi3qian2 (before) in the present study. This finding, however, is contrary to the classic study of English by Clark (1973).
In the present study, classifiers were also acquired late by the children. The results were inconclusive with regard to the dimensional theory of the order of acquisition of classifiers (3Ds > 1D > 2Ds). On the other hand, the present findings support previous findings that sortal classifiers (e.g. shape) are acquired earlier than mensural classifiers (quantity) (e.g. Li, Huang & Hsiao, 2010). In the present study, both mensural classifiers tested, namely dui1 (for rocks) and pai2 (for trees), were acquired late (5;00-6;11) on both scales.
The children in the present study showed awareness of the syntactic structures of the ambient language (grammaticality judgement) from 4;06 onwards. They scored higher in this test section than in the preceding Classifier section. A second look at the proposed items for Grammaticality Judgement (Appendix C) shows that only one item in this section tested awareness of morphological violation. All other items tested word order violation. This explains the good performance in this section. As reported in the literature for Chinese languages, children score higher on sentences involving word order changes than on sentences with morphological errors (Tsang & Stokes, 2001). The present findings add to the literature on Mandarin child language by confirming the existing finding that word order changes are detected more readily than morphological violations in Mandarin syntactic awareness tasks.
In the present study, inferencing skills were elicited through a series of wh-questions. The results indicate that who-questions were acquired earlier than whose-questions. The present findings support the view that inferencing skills involve integrating the ability to understand language with non-linguistic resources such as world knowledge and experience (Edwards, Letts & Sinka, 2011).
The effects of age, gender and maternal education on child language development were examined in the present study, since the existing literature suggests that both internal (biological) and external (environmental) factors have an impact on children's language acquisition. The present finding of a positive age effect on language performance concurs with previous local findings on Mandarin child language (e.g. Teng, 2003) and with previous findings from the NRDLS on British English child language (Letts, Edwards, Sinka, Schaefer & Gibbons, 2013).
The gender effect, though generally reported to be weaker than the age effect and sometimes found to be absent, has consistently pointed to superior language performance by girls compared with boys (Zhang et al., 2008). No gender effect was found in the present study. The study of the NRDLS on British children reported a mild gender effect, with girls slightly outperforming boys on the language test. Of the two pilot studies reviewed earlier in this article, one found non-significant gender differences in the acquisition of wh-questions (Phoon, 2003), whilst the other found a significant gender effect, with girls outperforming boys in the acquisition of first words (Chok, 2001). One explanation for the mixed findings of these three local studies is the difference in the age ranges investigated. The subjects in the present study and in the study of Phoon (2003) are older (2 years and above) than the subjects in the study of Chok (2001) (below 2 years).
Maternal education level did not affect language performance in the present study. In the earlier study of the NRDLS on British English-speaking children, Letts, Edwards, Sinka, Schaefer & Gibbons (2013) found a mild effect of maternal education on language performance only up to 3;06, when comparing children whose mothers had the least education (statutory minimum education by age 16) with children whose mothers had further education, higher education or postgraduate education. The input received by older children at childcare centres is thought to be associated with the diminished effect of maternal education on their language acquisition. Supportive language teaching practices such as eliciting conversation, using picture cards and telling stories have been reported to have a positive impact on language development in children (Zhang et al., 2008). This explains, to some extent, both the findings of Letts et al. and the present findings. On the whole, the majority of children (70%) in the present study were receiving robust input from nurseries that were strongly academically oriented, for 5 hours a day and 5 days a week. Nonetheless, the correlation between age and maternal education was not examined in the present study.

CONCLUSION AND IMPLICATIONS
Consistent with previous findings for English, the present data indicate that language comprehension is more advanced than language production. These findings point to the influence of language universality, which is not surprising given that all children are subject to a language acquisition device, cognition and world knowledge (Genesee, 2003). Cross-linguistic similarities (in terms of age and order of acquisition) were found between the present study and existing studies for all language aspects under investigation: vocabulary, short phrases, sentences, aspect markers, the reflexive pronoun, complex sentences, temporal terms, classifiers, grammaticality judgement and inferencing. On the other hand, cross-linguistic differences (in terms of age and order of acquisition) were also observed between the present study and existing studies, pointing to an influence of the ambient language as well. These include, for instance, earlier acquisition of verbs than prepositions in Mandarin (c.f. English) (Edwards et al., 2011) and earlier acquisition of the temporal term yi3hou4 (after) than yi3qian2 (before) in Mandarin (c.f. English) (Clark, 1973).
The present findings suggest that the revised NRDLS-M is developmentally sensitive, although further revisions to the test are required. These include revisions to the following sections: 1. To swap Section B (prepositions) with Section C (verbs) in NRDLS-M in order to reflect the developmental approach to the test. 2. To change the test stimulus (toy truck) for the target preposition hou4mian4 (behind) in Section B to avoid ambiguity. 3. To incorporate the target pronoun ta1 (him/her) in Section F using the same strategies employed in the original trial items in NRDLS (Edwards et al., 2011). 4. To add more structures to Sections F & G Complex sentences: object wh-questions vs. subject wh-questions (Fahn, 2003); subject relative clauses vs. object relative clauses (Hu et al., 2016); whether the agent in a passive sentence is preverbal (Fahn, 2003). 5. To reorder Section G & I (classifiers), Section H (temporal terms) and Section I (grammaticality judgement) in NRDLS-M in order to reflect the developmental approach to the test: grammaticality judgement, followed by temporal terms, and lastly classifiers. The consistent cross-linguistic findings of positive age effects on language development in the present study and in previous studies (e.g. Teng, 2003) suggest that language acquisition is universal. The present study indicates that gender and maternal education do not affect child language development. The existing cross-linguistic literature provides mixed findings about the effects of gender and maternal education on child language acquisition (Chok, 2001; Phoon, 2003). One explanation for these mixed findings is that some studies incorporated more sophisticated statistical analysis, namely the correlation of age with gender or maternal education.
Future local studies that analyse the correlation of age with gender and maternal education are needed to provide a comprehensive picture of the effects of gender and maternal education on local child language development. The sample size in the present study was relatively small owing to constraints of time and manpower, and future studies using a larger sample are recommended. Future investigation of the psychometric properties of the revised NRDLS-M, such as reliability and validity, is also desirable. The present study focused on only one of the local children's dominant languages, i.e. Mandarin. The code-switching between Mandarin and English vocabulary observed in the present study suggests that a bilingual Mandarin-English version of the test is needed to reflect the bilingual or multilingual repertoire of these children; otherwise, an inaccurate diagnosis of a child's vocabulary skills may be made (Lim et al., 2015).

washing (xi3) is a less familiar verb than mopping (ma1) in this context to the local children.

44. Show me the man who pulled the sledge. Omitted. Sledge (xue3qiao4) is not a culturally appropriate item to the local children.
Section G: Complex sentences
53. The boy who is wearing a badge is smiling. Omitted. Badge (hui1zhang1) in Mandarin is not a familiar word to the local children.