Chinese teachers’ conceptions of assessment for and of learning: Six competing and complementary purposes

As China continues to involve teachers in the implementation of an assessment for learning or formative assessment policy, a clearer understanding of how they conceive of the purposes and functions of assessment is necessary. This paper synthesises eight interview and survey studies that have examined how diverse samples of practicing teachers in China have described the nature and purpose of assessment. Through inductive analyses and factor analytic techniques, variations in the constructs found in teachers’ thinking are identified and aligned across the study methods. Six major constructs were identified, ranging from the positively regarded ideas that assessment develops the personal qualities and academic abilities of students to the more negatively viewed role of assessment for management and inspection of schools. This framework allows better insights into the challenges policy-makers might have in involving teachers in an effort to reduce negative consequences associated with high-stakes examination systems.

Subjects: Assessment; Assessment & Testing; Attitudes & Persuasion; China; Curriculum; Education Policy; Teachers & Teacher Education; Teaching & Learning


PUBLIC INTEREST STATEMENT
China is an examination-driven society which is discontent with the educational outcomes of its system. China is currently seeking to involve teachers in the delivery of an assessment for learning or formative assessment policy. Understanding what assessment means to teachers who have to do assessment is an important goal. This paper synthesises eight interview and survey studies which have examined how diverse samples of practicing teachers in China have described the nature and purpose of assessment. Six major constructs were identified, ranging from the positively regarded ideas that assessment develops the personal qualities and academic abilities of students to the more negatively viewed role of assessment for management and inspection of schools. This framework allows better insights into the challenges policy-makers might have in involving teachers in an effort to reduce negative consequences associated with high-stakes examination systems.

Introduction
What teachers believe about assessment matters to how they implement, interpret and respond to evaluative practices. In the People's Republic of China, educational assessment is dominated by high-stakes examinations, especially at the end of middle school and senior secondary school (zhong kao and gao kao respectively). In light of large class sizes common in China and the importance of examination success, Chinese classroom practices focus on (1) rigorously controlled teaching demonstrations that are frequently evaluated by peers and administrators, (2) maximising student scores on all forms of evaluation, (3) removal of curriculum content not explicitly evaluated by the examination system, (4) high student workloads to ensure mastery of examination material, (5) teacher-centric transmission of discipline-specific and "bookish" knowledge, and (6) mechanistic, rote-learning, memory-driven learning and pedagogical strategies (OECD, 2011).
There is potential to change these seemingly reductive practices, because it has been established that Chinese teachers do not believe that such practices constitute excellent teaching (Chen, 2008; Chen, Brown, Hattie, & Millward, 2012). Since successful change requires the active endorsement of the agents expected to carry out the changes (Hargreaves & Fullan, 1998), there are grounds for thinking that assessment reform is feasible in China. Indeed, there are policy pressures, especially in the basic curriculum, to use assessment more educationally to guide improved teaching and learning (Gao, 2002; Liu & Qi, 2005). Thus, while teachers may have little control over official examination policy and practices, they are increasingly expected to use assessments to improve the quality, nature and quantity of student learning outcomes by using data about student performance to modify their own classroom practices. Simply, then, if teachers do not believe assessment can serve these mandated goals of education, the current reforms are unlikely to be successful; attempting to change teacher behaviour (i.e. increase formative assessment practices) without taking into account the pre-existing reasons and beliefs teachers have for their current practices is likely to fail (Robinson & Lai, 2006). A deeper understanding of how Chinese teachers conceive of assessment will be useful to policy-makers, school leaders and even assessment developers, in supporting the implementation of the new curriculum.
The aim of this paper then is to synthesise recent studies of teachers in the People's Republic of China as to their conceptions of assessment and form a framework that characterises those beliefs. To achieve this, rather than conducting a comprehensive review of all relevant research in China, we provide insights gained from two streams of inductively analysed interviews and factor-analytic survey research in which we have been involved. The conceptual framework developed in this paper draws on (1) recent graduate student theses carried out under the supervision of the second author and (2) collaborative research between the two authors and their respective research teams in Guangzhou and Hong Kong. While these studies are a convenience sample of all possible research, we consider that they provide a sufficient basis for outlining the major aspects of teacher thinking in China concerning the impact and effect of assessment policy and practice in contemporary China.
Thus, this paper is not a review of all research on Chinese teacher thinking about assessment; it is rather an attempt to develop a conceptual framework by which teacher conceptions of assessment can be understood and researched further. An additional value of this paper is that all the studies synthesised were conducted in China by Chinese education researchers and have been interpreted jointly between Chinese and non-Chinese researchers, potentially adding cross-cultural strength to the analyses (Katyal & King, 2013). An added benefit of the paper, for the Western reader, is to introduce Chinese teacher thinking about assessment, which may differ from that of Western teachers. This is of importance considering the exceptional performance of Shanghai on the OECD PISA tests; an aspect of Chinese education that may be relevant to Western educators is how teachers understand and implement assessment. The resulting conceptual framework also allows us to highlight important similarities to, and differences from, studies of teacher conceptions of assessment conducted outside China. This may also help China's education practices and policies to steer a course between the more progressive education envisaged by the basic curriculum and the importance of examination success within Chinese contexts.
The paper first defines the construct "conceptions of assessment" and then briefly reviews earlier non-Chinese studies of teacher conceptions of assessment. Then, the assessment policy context of the People's Republic of China is outlined, before we describe the data sources used in this paper. The paper then provides an integrated synthesis of the studies, and it concludes with reflection as to the implications of the framework for China itself and as a basis for comparison to international studies.

Conceptions of assessment
We use the term conception to refer to the general, usually implicit, knowledge a person has about the nature of a phenomenon (Brown, 2008; Marton, 1981; Thompson, 1992). That is, conceptions refer to the ideas, values and attitudes people have toward what something is (i.e. what they think it is and how it is structured) and what it is for (i.e. its purpose). Conceptions are formed gradually through experiences with a phenomenon (i.e. conceptions of assessment arise from experiences of being assessed as a learner) and become the mechanism by which a person's reactions or responses to the phenomenon are shaped (Ajzen, 2005; Fives & Buehl, 2012; Pajares, 1992).
It has been established that teacher beliefs, attitudes and responses affect, to the extent teachers have control, the quality of what happens in school curriculum, teaching and assessment (Fives & Buehl, 2012). Specifically, teachers' beliefs about students, learning, teaching and subjects influence assessment techniques and practices (Asch, 1976; Cizek, Fitzgerald, Shawn, & Rachor, 1995; Kahn, 2000; Tittle, 1994). For example, Yan (2014) reported that Hong Kong teachers had negative views of school-based assessment (SBA) requirements, but those who believed that SBA would be useful and who had confidence to use SBA were more likely to self-report the intention to use SBA in their classrooms. Science teachers in Taiwan who viewed assessment as improving learning tended to view science learning as increased understanding rather than memorising (Lin, Lee, & Tsai, 2014). Further, there is evidence that using formative assessment processes to differentiate instruction results in more profound learning outcomes for students (Christoforidou, Kyriakides, Antoniou, & Creemers, 2014). Additionally, among Iranian English-language teachers, believing that assessment was for improvement was positively correlated with a reduced sense of burnout, while believing assessment was irrelevant led to increased self-reported burnout (Pishghadam, Adamson, Shayesteh Sadafian, & Kan, 2014). In contrast, the more prospective teachers in China endorsed the notion that assessment was about teaching for examination success, the more they thought assessment was (1) not diagnostic and formative, (2) developmental of life character and (3) irrelevant (Chen & Brown, 2013). These examples show that conceptions of assessment matter to teaching practice.
Conceptions of assessment, then, refer to a teacher's understanding of the nature and purpose of how students' learning is examined, tested, evaluated or assessed. Brown (2008) has argued that all purposes of assessment (e.g. Newton, 2007 identified 17 uses) fall into one of three major purposes and an "anti-purpose". These are: (1) assessment as improvement of teaching and learning (improvement), (2) assessment as making schools and teachers accountable for their effectiveness (school accountability), (3) assessment as making students accountable for their learning (student accountability) and (4) assessment as irrelevant to the life and work of teachers and students (irrelevant). Nonetheless, by grouping the school and student accountability functions, it is possible to conceive that there are only two major purposes of assessment within any society (i.e. accountability and improvement).
In accountability, the goal is to make a judgement about the value of assessed performance, while in improvement, the goal is to identify weaknesses so as to bring about change. Accountability assessments include the traditional end-of-unit, course, or year examinations and tests which determine whether students should be awarded a certificate, promoted a grade or experience other high-stakes consequences. More recently, especially in the USA, accountability testing has included assessments used to determine whether schools and teachers are making adequate progress or meeting expected standards (Ravitch, 2013). In contrast, in some societies more than others, there has been an explicit policy shift toward using assessments to diagnose and inform changes to instructional settings and practices so that learning is improved; that is, formative assessment (Berry, 2008; Black & Wiliam, 1998). Within this policy emphasis, assessment is much less about making value judgements about students, schools or teachers, and much more about diagnosis and reflection upon the quality of teaching and learning.
These two purposes create a tension in how assessment is perceived and experienced by teachers. Using assessment for improvement requires deliberately using assessment to identify learning needs that have not yet been met by the teacher and/or school, so as to lead to potential improvement action. However, if the same information is used to evaluate schools or teachers with consequences for poor performance, exposure of weakness could potentially be counterproductive for the reputation of the teacher or school. A strong emphasis on accountability as the purpose of assessment is likely to discourage active seeking out of weaknesses, because among the known effects of accountability (Lerner & Tetlock, 1999) is the need to maximise performance so as to meet superiors' expectations.
It seems to us that all societies use assessment as a mechanism to fulfil one of four major purposes; that is, (1) evaluate learners, (2) judge the quality of teachers, (3) guide student learning and (4) inform teaching changes (Brown, 2008; Heaton, 1975; Torrance & Pryor, 1998; Warren & Nisbet, 1999; Webb, 1992). Hence, given these persistent and probably universal tensions between accountability and improvement functions, it is likely that teachers' conceptions of assessment will reflect the relative emphasis put on either accountability or improvement-oriented purposes of assessment within each system.
Understandably, the more a phenomenon is similar in its structure and functions across time or place, the more people in different locations or across different cultural contexts, who experience those similarities, will have similar conceptions of the phenomenon. Thus, there is a legitimate expectation that teachers should have similar conceptions of assessment depending on whether the evaluative or improvement functions are prioritised within a policy context.

Conceptualisation of teacher conceptions of assessment
Research into teacher conceptions of assessment conducted in non-Chinese contexts has focused largely on the competing tensions between formative and summative functions of assessment (e.g. Black & Wiliam, 1998; Berry, 2008; Brown, 2008; Carr, 2001; Torrance & Pryor, 1998). The dominant conceptual frameworks juxtapose the positively endorsed formative and improvement-oriented uses of assessment against the negatively viewed evaluative and terminal uses of assessment. In other words, most debate about assessment sits between the formative use of assessment, which provides feedback to learners and teachers and leads to improved outcomes, and the summative examination and judgement of learner outcomes or teacher effectiveness.
More finely tuned variations on this simple continuum separate the formative functions into improved student learning and improved teaching (e.g. Brown, 2004; Remesal, 2011) and bifurcate the summative and evaluative functions of assessment by considering the certification and examination of students separately from the evaluation of schools and teachers (Brown, 2008). Finally, resistance to and rejection of the evaluative judgement of students and schools (e.g. Shohamy, 2001) has been incorporated into a multidimensional model of teacher conceptions (Brown, 2008).
A self-report inventory (i.e. the Teacher Conceptions of Assessment inventory) captures the multidimensional nature of teacher thinking about assessment by statistically modelling the intercorrelation and mean level of endorsement for four contrasting purposes of assessment.
Research with this multidimensional framework shows that teachers have patterns of belief across teaching improvement, student learning improvement, school accountability, student accountability and irrelevance purposes of assessment consistent with the policies and priorities of the society in which teachers are employed.
Generally, in the same studies, teachers gave less, but still positive, endorsement to using assessment to evaluate students, with greater endorsement among high school teachers whose students are working towards high-stakes qualifications (Brown, 2011). In contrast, teachers viewed the use of assessment to evaluate schools in a relatively negative light, especially in societies that have prioritised the formative, improvement functions of assessment (e.g. New Zealand, Queensland, Cyprus, and Spain). Importantly, in all the studies, teachers reject the notion that assessment is irrelevant; assessment matters whether it be for evaluation or improvement.
However, replication studies of the TCoA inventory in Hong Kong and China (Brown, Kennedy, Fok, Chan, & Yu, 2009; Li & Hui, 2007) have indicated that the four-factor framework was insufficient to capture the full range of teacher conceptions of assessment in Chinese contexts. Specifically, it was found among Hong Kong teachers that the correlation between assessment for improved teaching and learning and using assessment to evaluate students was very high (r = .90). This was attributed, in part, to the impact of the high-stakes examination society in Hong Kong in which evaluating students is seen as an essential component of becoming a more developed person (Hui, 2012). Additionally, the historic and philosophic conditions underpinning educational assessment and examination in collectivist China are quite different to those of Western individualistic societies (i.e. examinations have existed in China for >2,000 years and have substantial social and economic consequences, with few alternative pathways to social success), so it seems that an alternative framework would be necessary to account for Chinese teacher thinking about assessment. Thus, the major goal of this paper is to overview and synthesise multiple studies which have sought to understand the purposes and functions of assessment from the perspective of practicing teachers in the People's Republic of China.

Understanding assessment in China
The education system of China places great emphasis and value on success in the many high-stakes examinations used to select students for entry to further opportunities and better schools; hence, the evaluative function of assessment matters for both students and teachers (He, Levin, & Li, 2011; Li, 2009; Liu & Qi, 2005; Min, 1997; Niu, 2007; Wang, 1996). The long use of assessment as a means of social and personal life improvement (China Civilisation Centre, 2007) and the strong association between academic achievement and beliefs about personal worth and virtue (China Civilisation Centre, 2007; Li, 2009; Niu, 2007; Tsui & Wong, 2009) mean that examinations hold great sway in education. However, the current curriculum reform in China advocates building a new system of "assessment for development" which aims at the all-round development of students consistent with the concept of formative assessment (Gao, 2002; Liu & Qi, 2005; PRC Ministry of Education, 2005; OECD, 2011). It is noteworthy that this concern for all-round development of good character and good person attributes has been a curriculum expectation in China since the mid-1950s (Han & Yang, 2001; Wang, 1996).
Thus, China has an assessment context that presses teachers towards two different ends; that is, high performance on summative examinations and formative improvement. Cross-cultural psychology research has also identified that the values of high power distance and high collectivism in Confucian-Heritage societies create significantly different conditions under which teachers must work with

Sources for the model of Chinese conceptions of assessment
This paper draws on four graduate student studies into teacher conceptions of assessment supervised by the second author. Two studies (i.e. Lian, 2006; Wu, 2007) used phenomenographic techniques (Marton, 1981) to examine how teachers experienced assessment and accordingly describe their experiences of assessment. Phenomenography elicits from relevant participants how they have experienced a phenomenon (e.g. the nature and purposes of assessment) and seeks patterns of variation in how those experiences can be classified. This inductive approach has been used successfully to investigate (1) how students understand the nature of learning (Marton & Booth, 1996), (2) how teachers understand teaching (Gao & Watkins, 2001) and (3) how teachers understand the purposes of assessment. Since phenomenography is necessarily a data-intensive research method, sampling is generally extremely limited, and, thus, generalizability to populations is weak.
Consequently, many phenomenographic studies are complemented by survey studies in which participants rate their attitude towards statements derived to reflect the various conceptions identified through phenomenographic analysis of interview data. Thus, two survey studies supervised by the second author (i.e. Shang, 2007; Wu, 2007) tested the conceptual frameworks developed in the phenomenographic studies. These studies are contrasted with four survey studies carried out by collaborative research teams at South China Normal University and the Hong Kong Institute of Education (Brown, Hui, Yu, & Kennedy, 2011; Li & Hui, 2007; South China Normal University [SCNU], 2010; Wang, 2010). Note that these latter four survey studies were extensions or replications of Brown's (2004, 2008) survey work on teacher conceptions of assessment. Details of the data collection methods, samples and analytic techniques used in each of these studies are provided in Table 1.
Lian (2006) carried out interviews with 17 secondary school English teachers in Guangzhou, China, while Wu (2007) interviewed 17 secondary school science teachers in Foshan, a city near Guangzhou. Shang (2007), based on Lian (2006), surveyed 272 teachers in three high schools in Yichang City, Hubei province, spanning three bands of achievement (i.e. a top-level provincial school, a middle-level city school and a bottom-level, ordinary district school). Wu (2007) extended his own qualitative study with a survey administered to 220 science teachers in Foshan City. Li and Hui (2007) translated Brown's (2004) TCOA-III inventory into Chinese and surveyed 97 lecturers in a tertiary vocational college in Hangzhou, Zhejiang province. The SCNU (2010) survey used a new inventory developed from Western and Chinese sources, and was administered to 1395 teachers in 15 primary and secondary schools in the provinces of Guangdong and Guangxi. Wang (2010) surveyed 501 teachers of Chinese in 32 schools across six provinces in China (Guangdong, Hunan, Shanxi, Hebei, Guangxi and Xinjiang) and conducted follow-up interviews using a photo-elicitation technique. Brown, Hui, Yu, and Kennedy (2011) utilised an inventory similar to that of SCNU (2010) but developed a common model to fit the responses of teachers in two major cities of southern China.

Constructing teachers' conceptions of assessment in China
Utilising phenomenographic data analysis methods to identify patterns in how participants described their experience and views of assessment, Lian (2006) and Wu (2007) each identified five dimensions in teacher thinking about assessment. In both cases, these dimensions were characterised by a continuum running from an accountability-oriented conception of assessment for management at one end to an improvement-oriented focus on assessment for personal development at the other. Each category contained information about the aims of assessment, content being assessed, methods of assessment and the purpose of assessment.
The most managerial category in Lian (2006) was "Behavioural management" in which the aim of assessment was to maintain order and discipline, by using classroom records and teacher impressions of student attendance and performance in class and in extracurricular activities. Less stringent, the second category, "Teaching diagnosis", aimed at improving teaching by assessing knowledge and skills taught by teachers through posing questions to the class, giving homework, quizzes and exams. The third category, "Target orientation", aimed at ranking and checking student progress towards reaching course targets, as defined by public examinations. Classroom drilling, tests and periodic exams of the knowledge and skills covered by public examinations are used. The fourth category, "Ability development", focuses on improving students' skills and ability through open-ended tests and exercises so as to check and improve students' abilities. The fifth and most personal category, "All-round development", aimed at promoting the overall development of students' learning motivation, attitude, awareness and independence by using teacher-student interactive methods (e.g. chatting, discussing, communicating and commenting through portfolio).
Wu (2007) described the most managerial category as "Management oriented", which had the aim of maintaining discipline by focusing on student behaviour through teachers' comments during the class or the recording of student behaviour. Second, "Exam oriented" aims at preparing students for public examinations by strictly following examination prescriptions and simulating examinations with tests and exams. The third category, "Teaching-oriented", assesses to identify diagnostic information and to encourage students to work harder by focusing on knowledge taught in class and tasks given by teachers after class. Teachers use in-class quizzes, pen-and-paper tests and after-class worksheets. The fourth category, "Ability-oriented", aims to raise students' interest and improve their learning capabilities (such as enquiry, exploring, problem solving and overall scientific literacy) by using inquiry-based activities and experiments both inside and outside the classroom. The most personal category (i.e. "Personality-oriented") aimed to foster students' individual development and stimulate their creativity by focusing on skills, motivation and attitude towards learning through a variety of processing assessments.
Factor-analysed survey studies of Chinese teacher conceptions of assessment have found similar belief categories. Li and Hui (2007) translated Brown's (2004) TCOA-III inventory into Chinese and reported nine factors which formed three major groupings (i.e. negative, improvement and accountability). They concluded that the participants held contradictory views about assessment because, on the one hand, they agreed that it would be used for improvement and that they would not ignore it, while, on the other hand, they indicated that assessment was not an accurate or valid way to describe student learning. Li and Hui (2007) concluded that this was a rational response to the overemphasis on examinations, rather than judgement of professionals, as the dominant assessment policy and practice.
The SCNU Team (2010), led by the second author, further modified Brown's TCOA-III by developing items that focused on categories identified by Lian (2006) and Wu (2007). They identified six factors (i.e. "Student development" in which learning motivation, strategy, ability, values, personality and responsibility are developed through assessment; "Teacher accountability" in which assessment is used to monitor and evaluate teachers' work; "Exam preparation" in which assessment is both a means for selecting students and preparing them for high-stakes examinations; "Supervising and urging" in which assessment helps teachers to monitor student behaviour and learning; "Negativity" in which assessment is perceived as unfair and interfering with teaching; and "Error" in which assessment is imprecise and its measurement error has to be considered). Wang (2010) used the SCNU (2010) inventory and reported five conceptions of assessment. They were "Nurturance" in which assessment facilitates the cultivation of students' abilities, values and responsibility so as to benefit their all-round development; "Improvement" in which assessment acts to improve learning and teaching; "Management and control" in which assessments manage schools and control student behaviour in and out of classrooms; "Exam-oriented" in which assessment teaches students how to succeed in public examinations; and "Irrelevant" in which assessments are not only irrelevant but also negatively affect normal teaching.
Brown, Hui, Yu, and Kennedy (2011) found seven factors which grouped into three meta-factors (i.e. improvement, accountability and irrelevance) and reported that model parameters were statistically invariant between the two groups. There were statistically significant differences in mean level of agreement, with the PRC teachers agreeing with irrelevance more and Hong Kong teachers agreeing more with improvement and accountability. In both groups, the correlation between improvement and accountability was strongly positive (r = .80) and irrelevance was weakly, inversely related to improvement (r = −.22) and weakly, positively related to accountability (r = .28). These intercorrelations suggested that aspects of the examination and control effects of accountability were viewed with some suspicion, while helping learning and developing students were viewed positively.
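As a rough aid to reading these intercorrelations, a correlation can be squared to give the proportion of variance that two factors share. The sketch below is illustrative only and is not part of the analyses in the studies synthesised here; the dictionary `reported_r` and the function `shared_variance` are hypothetical names introduced for this example, and the values are simply the correlations quoted above.

```python
# Illustrative sketch: names are hypothetical, values are the
# inter-factor correlations quoted in the text above.
reported_r = {
    ("improvement", "accountability"): 0.80,
    ("irrelevance", "improvement"): -0.22,
    ("irrelevance", "accountability"): 0.28,
}


def shared_variance(r: float) -> float:
    """Return r squared, the proportion of variance two variables share."""
    return r ** 2


for (factor_a, factor_b), r in reported_r.items():
    # Print each pair with its correlation and the shared variance as a percentage.
    print(f"{factor_a} ~ {factor_b}: r = {r:+.2f}, "
          f"shared variance = {shared_variance(r):.0%}")
```

On this reading, improvement and accountability share roughly 64% of their variance, whereas irrelevance shares only about 5% with improvement and about 8% with accountability, which is consistent with the description of the latter two relationships as weak.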
Despite differences in methods, samples and contexts, there appear to be useful conceptual similarities across these selected studies. A detailed re-sorting and re-summarising of the common ideas across the studies generated six major common conceptions of assessment, each of which is identified and described next.

Management and inspection
This conception emphasises the external management and inspection roles of assessment. The aims of assessment are inspection and control of schools, teachers and students, purportedly to urge better teaching and achievement (Zhou & Reed, 2005). Assessment includes students' performances in subject courses (i.e. summative tests and quantitative records) and their conduct and discipline. Student performance is an indicator of the quality of teachers and schools. http://dx.doi.org/10.1080/2331186X.2014.993836

Institutional targets
This conception sees assessment as a way of checking whether students have fulfilled the pre-set learning targets and standards as instantiated in public examinations. Students are prepared by drilling the set knowledge and skills of the examination prescription through pen-and-paper tests or "boosting" exercises. Students are ranked according to their marks as an effective way of meeting targets.

Facilitation and diagnosis
In this conception, assessment provides valid information to diagnose the effectiveness of teaching and to guide teaching improvements. The focus is on both knowledge and skills students have acquired and on students' approaches to learning. Methods include periodic summative tests, classroom quizzes, and interaction and communication between teachers and students. The emphasis is on analysing student performance, exploring problems students might have in their learning and making adjustments accordingly.

Ability development
This conception aims to increase students' learning motivation and to enhance learning abilities. While it takes knowledge into account, more attention is put on students' abilities, such as comprehension, problem-solving, inquiry and creativity skills. Assessments are varied, including formal and informal tests and performances that are integrated with learning activities.

Personal quality
In this conception, assessment aims to enhance the overall quality of students as humans by encouraging them to establish correct attitudes towards learning; develop their personalities and character; strengthen their interpersonal and organizational skills; foster a sense of responsibility and honour; and become self-regulating, cooperative and creative beings. Assessments emphasise encouraging comments, free conversations, learning portfolios, self-reflection and peer review, and are integrated with learning activities such as project learning, DIY, writing scientific essays, speech contests, role performances and so on.

Negativity
In this conception, assessment itself is considered not to be accurate and may come with errors. Under these conditions, it is logical that teachers would consider that invalid interpretations would lead to improper negative effects for students and their learning. Assessment might even disrupt teaching, forcing teachers to adopt methods incompatible with their beliefs about their teaching.
Five of these factors appear to range from a more external management and control perspective to a much more individualistic-developmental view of the purpose of assessment. The sixth conception reflects a more negative view of assessment. This arrangement is depicted in Figure 1. Table 2 provides a summary of the factors reported in each identified study relative to these six categories and compares them to the New Zealand developed TCoA results. This analysis shows that, while certain conceptions appear to replicate across the Western and Chinese contexts (e.g. irrelevance, improvement, and accountability), there are different priorities among Chinese teachers. For example, improvement includes not just learning and skills but also personal development, which included statements such as: (1) assessment is used to provoke students to be interested in learning, (2) assessment cultivates students' positive attitudes towards life, (3) assessment fosters students' character, (4) assessment develops a sense of responsibility in students and (5) assessment provides opportunities for students to practice lifelong skills. It is unusual, perhaps, to a Western educator to consider that assessment contributes so positively to learning, positive attitude, responsibility and lifelong skills; these expectations do not seem to accord with the evaluative function of assessment seen more in Western discourse. Nonetheless, student accountability in Chinese contexts is strongly tied to the examination system, as is school accountability. http://dx.doi.org/10.1080/2331186X.2014.993836

Factors affecting teachers' conceptions of assessment
Having identified the range of options in Chinese teachers' conceptualisation of assessment, we now consider the variables associated with statistically significant differences in endorsement of these conceptions. It is expected that an evaluation of teachers' personal characteristics (e.g. sex, age, qualifications and teaching experience) and work environments (e.g. type of school, teaching subject, class size, teaching level, school band and employment status) will shed light on the reasons for certain beliefs.
Public schools in China belong to and are run by different levels of local government (i.e. provincial, municipal, district and township), with schools funded by the highest level of government being the best resourced and most prestigious. Entry into a provincial-level school reflects positively on the socio-economic resources and academic achievement of the student. Within each government level of schooling, schools are divided into different bands according to their conditions and achievement levels; the schools are classified by student entrance-examination scores so that there are elite (1st-level) schools and ordinary schools, which are of lower quality. Schools are normally classified into three stages according to the grade level of students: primary (grades 1-6), junior secondary (grades 7-9) and senior secondary (grades 10-12). Note that there are major public examinations that determine entry into the next level of schooling at the end of all three stages. Class sizes in China, according to central government standards, ought to be 40-50 students per class. However, in some primary schools, the number might be less than 30 due to recent decreases in birth rates, whereas in some senior secondary schools, class sizes might be as large as 70, with even up to 100 being reported (OECD, 2011).
The studies have identified some statistically significant effects according to some teacher characteristics and work environment factors. However, the picture is not consistent across studies; indeed, Wu (2007) reported that all comparisons were statistically non-significant.

Teacher characteristics
Sex and teaching experience were the only teacher characteristics to have a statistically significant effect. In two studies, male teachers were found to agree more strongly with the management and inspection and institutional targets factors (SCNU, 2010; Shang, 2007). This may be a function of more school managers being men. Teachers with considerable experience (i.e. 20 or more years) were found to agree more strongly with management and inspection factors (SCNU, 2010; Shang, 2007) and institutional targets (SCNU, 2010; Wang, 2010). Further, highly experienced teachers had lower levels of agreement with two factors: personal quality (SCNU, 2010) and facilitation and diagnosis (Wang, 2010). The effect of teaching experience on the conceptions of assessment may reflect the much stronger school management and accountability roles that experienced teachers have in schools.

Work environments
As might be expected, teachers in the senior secondary school sector, where the greatest consequences of public examinations (i.e. entry to university) come to the fore, had statistically significantly different beliefs. Senior secondary teachers had the lowest agreement with personal quality factors (Wang, 2010) and the highest agreement with management and inspection factors (Wang, 2010) and institutional targets (Wang, 2010). At the same time, senior secondary teachers had the highest agreement with irrelevance (Wang, 2010), and teachers in the final year of senior secondary school agreed most with personal quality factors (Shang, 2007). Shang (2007) also suggested that teachers working in higher band (i.e. more prestigious) schools agreed more with personal quality factors. Teachers working in large classes (i.e. >50) were least confident that assessment would facilitate and diagnose (Wang, 2010) or that assessment was negative (Wang, 2010). Lian's (2006) qualitative study sheds some light on these survey results. Teachers in higher ranking schools indicated that schools tend to admit students according to their entrance examination marks, and that the style of school management is similar to industrial and commercial business environments. Lian (2006) also reported that teachers adopted institutional target conceptions because contemporary Chinese parents, in contrast to traditional priorities, expect schooling to contribute directly to maximising future employment opportunities for their children rather than to personal development.
In sum, it would appear teachers' conceptions of assessment become more managerial and target-oriented and less nurturing as teachers work in environments that are high-stakes (e.g. senior secondary, high-status schools). However, given the low level of interstudy consistency in these findings, there are clearly complex sources in teachers' thinking about assessment that are not sufficiently captured in the variables studied so far. More insights into teacher thinking about assessment in China may arise in examining the relationship of beliefs about assessment to practices of assessment.

The relationship between assessment conceptions and practice
Two studies have examined in somewhat greater detail the relationship of teacher conceptions of assessment to their practices. Wang (2010) used cluster analysis to create three groups of teachers based on their responses to the five factors she found in the survey study. These three categories, based on their agreement or rejection profiles, were active, compromising and conflicted. The largest group was called compromising because, while they disagreed with management and control, they agreed with the examination orientation and were neutral on the other three scales. The second group, almost as large as the first, was deemed conflicted because they agreed with all five scales despite the logical tensions between accountability and improvement perspectives. The third group, the smallest, was called active because they agreed with the nurturance and improvement factors and rejected the management and control, exam-oriented and irrelevance factors. It seems likely that only the active group would be inclined to adopt and implement an assessment for learning policy direction.
To more deeply characterise these conceptual orientations, qualitative analysis of selected individuals' representations of each orientation was undertaken (Wang, 2010). A combination of interviews and evaluations of drawings led to characterisation of the content, participants, methods and interpretations of assessment for each cluster. Teachers in the "active" group tended to promote students' development in their assessment practices. For example, they used more qualitative methods, involved students more, included a wider range of learning outcomes, and were more focused on describing student progress and considering external conditions when interpreting and utilizing assessments. Teachers in the compromising group were characterised only by a stronger tendency to prefer teacher-based assessment as the only legitimate form of assessment. The conflicted group seemed to appreciate the use of discussion as a supplementary assessment method and preferred to rank students according to assessment results. Nonetheless, despite cluster membership, there appeared to be inconsistency around public examinations. For example, teachers in the active group disagreed with the exam preparation conception, but in practice, they made use of test-like assessments and focused heavily on public examinations. This practice, despite contrary beliefs, reinforces the powerful effect of accountability evaluation of schools through student performance. Teachers seem unable to resist the pressure of the system.
Wu (2007) examined the relationship between assessment conceptions and teaching practice through the concept of "teaching approach" (Trigwell & Prosser, 2004), in which teaching strategies are related to teaching intentions and, consequently, practices and results. The Approaches to Teaching Inventory, with 18 items in four sub-scales (i.e. student-centred intention, student-centred strategies, teacher-centred intention and teacher-centred strategies), was translated into Chinese.
The correlations provide evidence of convergence and divergence around student- and teacher-centred thinking. Convergence was found around the moderate correlations between teacher-centred approaches and the management and exam-oriented conceptions of assessment (mean correlation r = .49), and between the student-centred approaches and the teaching- and ability-oriented conceptions (mean correlation r = .42). Divergent validity is seen in the low intercorrelations between the student-centred approaches and the management and exam-oriented conceptions (mean correlation r = .17), and between the teacher-centred approaches and the teaching- and ability-oriented conceptions (mean correlation r = .29). This indicates significant alignment between teaching and assessment beliefs around external inspection and self-development constructs. This result seems consistent with Sang, Valcke, Tondeur, Zhu, and van Braak (2012), who reported stronger endorsement of constructivist than transmission-oriented approaches to teaching among teacher education students in China.
Nonetheless, it needs to be borne in mind that these results are based on teachers' "espoused or declared beliefs" rather than their observed "beliefs-in-action". Hence, this section only tells half the story since the studies did not carry out classroom observations to demonstrate that these conceptions matter in practice.
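As a hedged illustration of the convergent and divergent comparison described here, the following minimal sketch contrasts the four mean correlations reported for Wu (2007). Only the four r values come from the text; the pair labels, the `summarise` function and the use of r squared as a rough index of shared variance are assumptions introduced for illustration, not part of the original analysis.

```python
# Mean correlations as reported in the text for Wu (2007).
# The tuple labels are illustrative shorthand, not the original scale names.
reported = {
    ("teacher-centred", "management/exam"): 0.49,   # expected to converge
    ("student-centred", "teaching/ability"): 0.42,  # expected to converge
    ("student-centred", "management/exam"): 0.17,   # expected to diverge
    ("teacher-centred", "teaching/ability"): 0.29,  # expected to diverge
}

# Pairs hypothesised to converge (an assumption that mirrors the text).
CONVERGENT = {
    ("teacher-centred", "management/exam"),
    ("student-centred", "teaching/ability"),
}


def summarise(correlations, convergent_pairs):
    """Compare the weakest convergent r with the strongest divergent r."""
    conv = [r for pair, r in correlations.items() if pair in convergent_pairs]
    div = [r for pair, r in correlations.items() if pair not in convergent_pairs]
    return {
        "min_convergent": min(conv),
        "max_divergent": max(div),
        # True when every convergent r exceeds every divergent r.
        "separated": min(conv) > max(div),
        # r squared, rounded, as a rough index of shared variance.
        "shared_variance": {p: round(r * r, 2) for p, r in correlations.items()},
    }


result = summarise(reported, CONVERGENT)
print(result["min_convergent"], result["max_divergent"], result["separated"])
```

Under these figures, the weakest convergent correlation (.42) exceeds the strongest divergent one (.29), which is the pattern the paragraph describes, although the gap between those two values is modest.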

Discussion
The goal of this paper was to create an interpretive framework from these multiple studies with different samples and methods. We propose that Chinese teachers' conceptions of assessment include six major categories which can be placed on a continuum of agreement ranging from most positively endorsed (i.e. personal quality development) to most strongly rejected (i.e. negativity about accuracy and effects of assessment). There is also evidence that there is general consistency between Chinese teachers' conceptions of assessment and their practices. However, this consistency is constrained by many external factors which may limit how effective teachers can be in implementing formative assessment practices. Nonetheless, no matter how constrained the situation is, it would appear Chinese teachers (1) have the idea that assessment exists to facilitate student learning and nurture personal development and (2) try to deliver such benefits in addition to exam-oriented preparation.
A strong characteristic of this interpretation of Chinese teacher beliefs is that assessment was seen as playing a significant role in managing, controlling or holding schools accountable. Assessment not only controlled school curriculum and teaching; it was also used to evaluate schools and teachers. Teachers were aware of the same powerful controlling effect assessment had on students. Examination success was seen not just as a matter of forcing students to learn, or as a means of meeting institutional targets, but, more importantly, as the duty of a teacher and the school to improve the child's moral and personal character. Hence, we suggest that Chinese teachers appear to be aware of, and possibly positive about, the role assessment played in organising classroom activities and controlling students' classroom behaviour.
It is accepted that the evidence base used in this synthesis may be considered somewhat weak since it relies on graduate student studies, including two master's-level dissertations. Nonetheless, those studies, supervised by the second author, contribute meaningfully to our theorisation of how Chinese teachers conceive of assessment. It is argued these studies have credibility since, despite differences in content and method, they provided complementary results. That the graduate student studies aligned with the peer-reviewed journal papers also lends credibility. Although research in this field would benefit from large, high-quality studies that are subjected to international peer scrutiny, we offer the current studies as an initial classification of the major choices faced by teachers in China when they think about assessment.

Comparisons with non-Chinese teachers
Like the Western teachers described earlier, the Chinese teachers agreed most with the role assessment plays in improving learning and teaching. The core idea is that assessment can provide useful and effective information for exploring learning problems so as to diagnose and improve teaching and learning. However, Chinese teachers, in this synthesis, tended to see improvement as inseparable from evaluative accountability of teachers and schools, much as the earlier survey of Hong Kong teachers found.
As a consequence of this strong association, Chinese teachers appeared to agree that by examining students they are assisting students to become more responsible for their own learning. It is this assertion that assessment contributes to the development and nurturing of students' personality that may strike the Western teacher as alien. In the West, assessment may be seen as contributing to a child's learning ability and related motivations and attitudes. However, to the Chinese teacher, assessment examines the overall quality of a student as a whole and, thus, benefits students' all-round development. The strong relationship of accountability and improvement may arise because of the strong emphasis on collectivist rather than individualistic philosophies in China or it may be a function of the strong tendency to conform to authority in China. Nonetheless, this strong association suggests that there will be considerable difficulty for Chinese teachers to implement a version of the assessment for learning reform that avoids using tests or examinations; assessment that does not readily contribute to accountability purposes will likely not be conceived as contributing to improvement. This underlines the relevance and potential of Carless's (2011) approach in helping teachers in Hong Kong to use their summative testing formatively.
While both Chinese and non-Chinese teachers recognised purposes which were negative or irrelevant, the reasoning behind that evaluation differed. Chinese teachers focused on the negative aspects of assessment at the system level (i.e. high-stakes, selective public examinations and the linking of teacher incomes to student examination results; Harris, Zhao, & Caldwell, 2009; Zhang & Ng, 2011). In contrast, in non-Chinese jurisdictions, assessment is seen as negative because of its impact on the individual child or because of its inherent flaws. However, this difference may be a function of where previous non-Chinese studies took place (i.e. in environments where the consequences of public testing or examinations are relatively low for teachers). Perhaps similar negativity about "national testing" systems would be exhibited in Western countries (e.g. the UK or the USA) where substantial negative consequences are imposed on schools and teachers.
The current results seem consistent with the quite different educational priorities seen in the various countries studied to date. To clarify, China has a strong public examination society with high-stakes consequences for students based on their observed test scores. This situation is used by Chinese society to reflect on the quality of both teachers and schools; hence, Chinese teachers, perhaps to meet societally approved goals or perhaps to protect their own standing in a school, have to prioritise maximum performance on the externally administered tests. As long as the examination culture dominates the distribution of educational resources and rewards, it is highly unlikely Chinese teachers can think of assessment in any other way. The argument made in Hong Kong by Kennedy, Chan, and Fok (2011) probably applies here as well; formative, SBA is a good but soft policy, while the hard policy is examinations.
In contrast, teachers in the Western jurisdictions studied to date (e.g. New Zealand, Queensland, Cyprus or Spain) work in environments that have long prioritised assessment as predominantly a formative or diagnostic function to guide teaching and have reserved qualifications assessments for the last few years of secondary schooling, and to some extent involved teachers in awarding those qualification results independent of external examinations (esp. in New Zealand and Queensland). This work context permits the prioritisation of assessment for improvement, and downplays the importance of evaluating teachers and schools through student results. With a relatively open school system without tracking or elite selection, unlike China, assessment results appropriately play a much weaker role in evaluating the quality of education.

Conclusion
The current review is based on a small number of studies conducted in the southern part of China, which made use of phenomenographic or survey methods, and which focused on the aims or purposes of assessment. Additional research from different analytic perspectives, different regions, different school types and different stages of education would be useful. Further, the relationship of teacher conceptions to students' conceptions and outcomes is not yet studied. Future studies could examine some of the speculations offered in this paper as to whether collectivism or conformity to authority are adequate explanations for the strong relationship between accountability and improvement purposes of assessment.
Notwithstanding the strong policy environment in which assessment is used as a means of student selection, it would appear that there is breadth in teachers' conceptions of assessment. Depending on times and locations, teachers can express different conceptions and might agree with conflicting conceptions at the same time. Furthermore, since teachers are not free to use assessment in any way they might choose, the effect of their conceptions on assessment and teaching practices is complex and often indirect.
The effects of China's new national curriculum can also be seen in these studies. The philosophy of assessment advocated by the new curriculum has become part of the teachers' ideas and their practice. If China's education management system could be less rigid, and if political and educational leaders could view schools as something other than "factories", teachers may be free to put the new curriculum philosophy into practice. This would require, of course, resisting millennia of tradition concerning the proper role of examinations. It seems most unlikely that a change of teacher conceptions could be brought about purely by professional development or pre-service teacher education. Indeed, it is doubtful that, as long as students experience rewards through performance on public examinations, prospective teachers will adopt more reformist views of assessment (Chen & Brown, 2013). Environmental constraints, such as teacher evaluation, remuneration, and recruitment policies, let alone student evaluation mechanisms, must be changed to see a change in teacher conceptions.
It seems plausible that policy-makers could confidently remove consequences associated with some high-stakes examinations (esp. the zhong kao examination at the end of middle school), because Chinese teachers have a strong commitment to educational improvement through assessment. The sky would probably not fall if mechanisms different from the current systems were used to assign students to schools or to identify and reward teachers and schools that are doing a good job. This oversight needs to be addressed.
These studies into Chinese teachers' conceptions of assessment provide a start for understanding how assessment is perceived among teachers in the People's Republic of China and what differences there may be to previous studies with Western teachers. The results also shed light on the feasibility of implementing an assessment for learning agenda in China.
While there are many commonalities between Chinese and Western teachers in how they conceive of assessment, many of the differences seem to be linked to the vagaries of the education system in China (e.g. making exam marks the only standard for enrolment into higher levels of schooling; copying administration models and rules from factories and enterprises; ranking schools according to their resources and student performance in public examinations; and large class sizes). These structural features reduce the freedom of teachers to act in accordance with their dominant beliefs about the purposes of assessment.