The Effect of Language of Survey Administration on the Response Formation Process

Introduction
With the increasing number of people of multiple cultural backgrounds in modern societies, surveys of ethnic minorities and immigrants are becoming more common. One obvious source of measurement differences is the necessary use of different languages when intending to measure the same phenomena in multiple ethnocultural groups. Typically, surveys allow respondents to answer in the language of their choice, possibly introducing self-selection bias to the extent to which those who choose their mother tongue differ in background characteristics (e.g., level of acculturation, education), substantive answers, and response patterns (e.g., "don't know" responses) from those who choose the mainstream language. However, although self-selection certainly plays a role in differences observed across the different language versions of a survey, it is premature to consider it the sole source of all observed differences.
There is a known link between language and cognition (e.g., Whorf, 1956). To study language influences on the response formation process in surveys, we need to assume that the various language versions of a survey are free of translation problems and convey the same constructs. Under that assumption, any observed differences between responses provided by the same respondent in different languages can be attributed to language priming a particular mind frame and influencing the thought processes.
To examine the potential effects of language on survey responses, we focus on the response formation model (Sudman, Bradburn, & Schwarz, 1996; Tourangeau, Rips, & Rasinski, 2000). The right-hand side of Figure 1-1 presents the tasks that respondents perform to answer a survey question: attending to the question and response options (comprehension), retrieving the necessary information (retrieval and judgment), assessing the completeness and relevance of the memories (formatting), and editing the response before mapping to the provided response categories (editing). These tasks are not necessarily sequential or independent but are presented as such for simplicity. The left-hand side of Figure 1-1 represents the mechanisms related to language influences that are most likely to be present at each stage of the response formation process. We limit our discussion only to mechanisms well known to yield reporting differences, namely, cultural frame switching, language-dependent recall, language codability, and spatial frames of reference inherent in each language. We acknowledge that other language influences might be at play, but as of now, they remain undiscovered.
The language influences presented in the model can only be apparent among bilingual respondents, so we discuss them within the context of more than one language being available to communicate with respondents. We describe each of these mechanisms and examine their possible effects at each step of the response formation model by reviewing the existing literature from relevant fields and deriving conclusions about consequences for surveys.

Comprehension
Survey data are meaningless if respondents do not understand the survey questions as intended by the researchers. Question comprehension involves processing the syntactic structure and understanding the semantic (literal) and pragmatic (intended) meaning. In cross-cultural surveys, in addition to the direct impact of translation, comprehension problems may occur as a result of differences related to cognition. Because language is a tool for information exchange among people of the same culture, it reflects the meaning system of the culture. Thus, word meaning and sentence meaning in language comprehension depend on preexisting background knowledge about not only the grammatical norms associated with the language, but also the cultural norms and practices related to it. Furthermore, lexical ambiguity is inherent in languages, and recall of the lexical meaning of words is often context dependent. Languages differ in their contextual dependency, and this difference is reflected in the conversational norms across cultures. For example, many words in Chinese acquire meaning only in the conversational context and cannot be translated word for word; this is related to the practice of East Asian cultures to read between the lines (for an overview, see Nisbett [2003]). Thus, the same question presented in Chinese or English to a bilingual respondent may convey a different meaning depending on how much contextual information is incorporated from previous questions.

Cultural Frame Switching
Differential context dependency can have consequences for question interpretation when partially redundant information is presented (e.g., Haberstroh, Oyserman, Schwarz, & Kühnen, 2002). In bilingual respondents, such context sensitivity is likely to depend on which cultural frame is primed by the survey question. Research on acculturation has demonstrated that individuals can possess more than one cultural identity (e.g., Berry & Sam, 1996; Hong, Morris, Chiu, & Benet-Martinez, 2000) and move between different cultural meaning systems, depending on situational cues and requirements. This phenomenon, known as "cultural frame switching" (Briley, Morris, & Simonson, 2005; Hong et al., 2000), is likely to have a strong effect on survey responding because each cultural meaning system serves as an interpretive frame that affects an individual's cognition, emotion, and behavior (Geertz, 1993; Hong, Chiu, & Kung, 1997; Kashima, 2000; Mendoza-Denton, Shoda, Ayduk, & Mischel, 2000).
Language can serve as a situational cue for the cultural system associated with it; thus, it may prompt bilingual respondents to interpret questions differently depending on the cultural frame it induces. Indeed, studies that have experimentally manipulated language assignment among bilinguals report responses consistent with the cultural system associated with the assigned language (e.g., Peytcheva, 2019; Ross, Xun, & Wilson, 2002; Trafimow, Silverman, Fan, & Law, 1997). Such studies provide evidence that language is a powerful cue for the interpretive frame bilingual respondents adopt when answering survey questions.
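The experimental manipulation of language assignment described above can be illustrated with a short sketch. It is only an illustration under stated assumptions and does not reproduce the procedure of any cited study: the function name assign_languages, the respondent identifiers, the language labels, and the fixed random seed are all hypothetical.

```python
import random


def assign_languages(respondent_ids, languages=("English", "Chinese"), seed=0):
    """Randomly assign each bilingual respondent to one interview language.

    Hypothetical helper: respondents are shuffled with a seeded generator and
    then dealt round-robin, so group sizes differ by at most one.
    """
    rng = random.Random(seed)  # seeded so the assignment can be reproduced
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: languages[i % len(languages)] for i, rid in enumerate(ids)}


# Eight hypothetical bilingual respondents, identified by the numbers 1 to 8.
assignment = assign_languages(range(1, 9))
```

Because the language is assigned at random rather than chosen by the respondent, differences between the two groups cannot be explained by self-selection into a language, which is what allows such designs to isolate the effect of language itself.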

Codability
Language codability is the ease with which a concept can be expressed in a language. Not surprisingly, the most highly codable concepts are represented by the most frequently used words, which are short and easy to write and pronounce (see Whitney, 1998). Codability affects cognitive processes such as retrieval (Lucy, 1992; Lucy & Shweder, 1979; Lucy & Wertsch, 1987) and comparative judgment (Kay & Kempton, 1984). However, codability may also influence question comprehension in surveys; the question target may be very different depending on whether a specific word exists in the language for a given attitude or behavior or whether several less specific terms are used to describe it. For example, in Chinese, there are separate terms for family members that have only one English equivalent: different words describe whether your "uncle" is your mother's brother or father's brother and whether he is a younger or older brother. Thus, it can be hypothesized that when asked in Chinese about two or more related people who can be labeled differently, respondents may think of them differently relative to when questions are asked in English and a common label is used. This difference may lead to inclusion errors when respondents are asked in English because of the failure to draw a lexical distinction across referents. Such interpretational differences across two languages may affect various respondent tasks in surveys (for example, household roster construction).

Spatial Frames of Reference
Languages have inherent frames of reference for describing relationships among objects. Psycholinguists distinguish between relative and absolute languages (also known as egocentric and allocentric). Relative languages, such as most Western languages, use a viewer-centered perspective, giving rise to descriptions such as "in front of me" and "to the left." Absolute languages use external reference frames, such as cardinal directions or an up-down axis; for example, speakers of Arrernte (Australia) will say "the fork is to the north of the spoon" (Majid, Bowerman, Kita, Haun, & Levinson, 2004).
Such intrinsic language differences may potentially affect comprehension in bilingual speakers of languages with different dominant spatial frames of reference because these reference frames have been found to determine many aspects of cognition (see Levinson [2003]). Experiments by Pederson et al. (1998) demonstrate that the domination of a linguistic frame of reference in a language reliably correlates with the way its users conceptualize in nonlinguistic domains. For example, speakers of Mopan (Mayan) and Kilivila (Austronesian) cannot distinguish between two photographs of a man facing a tree when the position of the man and the tree are left-right mirror images of one another because such a relationship between the objects in both photographs is described as "tree at man's chest." For survey practitioners, such findings suggest that speakers of languages that use different frames of reference may interpret survey visual images and response scales differently. For example, the orientation of a scale (vertical or horizontal) may influence how similar or distinct response categories are perceived, depending on the language used and its inherent frame of reference. However, such effects are likely to occur only in cases where the dominant frames of reference used in two languages are not functional equivalents of one another (as in the example with Mopan speakers where there were no functional equivalents of "left" and "right" in the described mirror-image photographs); thus, their impact on the survey response processes may be very limited. Still, the relationship between dominant frames of reference and cultural orientation (individualistic vs. collectivistic; for reviews on documented social and cognitive differences, see Oyserman, Coon, and Kemmelmeier [2002] and Oyserman and Lee [2007]) remains unknown.
To the extent to which ego-centered frames of reference are related to individualistic identities across cultures that use such languages and vice versa, the language of administration will be an important factor influencing survey responses. Similar to cultural frame switching, a speaker of languages that use different frames of reference would endorse more individualistic or collectivistic responses depending on the cultural identity evoked by the egocentric or allocentric frame of reference inherent to the language of survey administration. Such possibility deserves further investigation.

Retrieval and Judgment in Behavioral Reports
The information requested in a survey question is rarely readily available, and often respondents need to retrieve memories and assess their relevance on the spot. Because this process is somewhat different for behaviors and attitudes, we discuss each separately, starting with behavioral reports.
Behavioral questions often ask about past events that took place in a respondent's life. When such events have low frequency of occurrence or are of particular importance to the respondent, they may be directly accessible in memory (for reviews of issues related to asking behavioral questions, see Bradburn, Rips, and Shevell, 1987; Schwarz, 1990; and Strube, 1987). However, respondents often need to recall relevant information and count instances of occurrence (enumeration) or compute a judgment (rate-based estimation). The success of retrieving the information and its accuracy depend on time on task (e.g., Williams & Hollan, 1981), the elapsed time since event (Cannell, Miller, & Oksenberg, 1981; Loftus, Smith, Klinger, & Fiedler, 1992; Means, Nigam, Zarrow, Loftus, & Donaldson, 1989; Smith & Jobe, 1994), the availability and adequacy of retrieval cues (for a review, see Strube, 1987), and the match between the encoding and recall contexts (Tulving & Thomson, 1973). The context may vary from physical context (Godden & Baddeley, 1975; Smith, 1988) to mental and emotional states (Bower, 1981; Bower, Monteiro, & Gilligan, 1978; Eich, Weingartner, Stillman, & Gillin, 1975). Several studies have demonstrated that the language in which mental activity is carried out during information encoding creates an internal context analogous to a mental state and can serve as a retrieval cue during information recall; similarly, the language spoken aloud during an event creates an external context analogous to a physical context and can serve as a situational cue during event recall (Marian & Neisser, 2000; Schrauf & Rubin, 1998). Thus, a match between language of encoding and language of recall in surveys should yield more accurate responses among bilingual respondents.

Language-Dependent Recall
Language-dependent recall is the notion that the language in use at the time of recall may influence retrospective reports. This phenomenon has been demonstrated in several bilingual groups in terms of number of recalled memories (e.g., Bugelski, 1977) and time in life when the recalled events took place (e.g., Schrauf & Rubin, 1998). Going beyond earlier findings of language-congruity effects, Marian and Neisser (2000) investigated whether a match between language of encoding and recall facilitated retrieval because the language matched words used during the original event or because the language at the time of recall induced a more general mindset, resembling the processes assumed to underlie state-dependent memory. The results showed that the effect of ambient language was significantly stronger than the effect of word-prompt language, further "strengthening the analogy between language-dependent recall and other forms of context dependency" (Marian & Neisser, 2000, p. 366).
The implication of such findings for surveys that involve immigrant and ethnic minority populations is that the choice of language of survey administration affects both the quality and quantity of recall. Specifically, first-language cues tap into first-culture memories, while second-language cues likely activate more recent memories. This suggests that language of survey administration in bilingual respondents may be switched throughout the survey, depending on life periods for which researchers are interested in collecting data. Additionally, bilingual immigrants or ethnic minorities are likely to use different languages in different life domains, for example, at work and at home. We can expect that the match between language spoken at home and language of survey administration will yield the most accurate information regarding home events, the highest number of such reported events, and the lowest response latencies for home-related questions, and vice versa. Such hypotheses, if supported, would further argue for a language switch across domains in surveys of bilinguals.

Codability
Often, there is no direct correspondence across languages with respect to terms that describe the same phenomenon; thus, using phrases or multiple words to describe the concept of interest is necessary during translation. Research related to language codability would predict difficulty in recall with difficult-to-code words because easily coded words (and, therefore, events associated with them) are remembered more easily (Lucy, 1992; Lucy & Wertsch, 1987). However, analogous to question decomposition, multiple words may provide more contextual cues that can ease recall and eventually improve report accuracy. To date, it remains unknown how such processes operate for users of two languages with different levels of specificity for the same concept.

Spatial Frames of Reference
A different aspect of language-dependent recall is demonstrated in studies of spatial cognition; the frames of reference used in a language to describe specific situations are likely to induce the same frame of reference in the nonlinguistic coding of the same situations (Levinson, 2003). Various experiments (Levinson, 2003; Pederson et al., 1998; Wassmann & Dasen, 1998) have shown that when speakers of languages with different dominant frames of reference are given various memory and spatial reasoning tasks, the nonlinguistic frames of reference used to carry out these tasks match the dominant frames of reference of the languages. Specifically, speakers of languages that use absolute frames of reference (e.g., Balinese, Indonesia; Belhare, Nepal; Arrernte, Australia) preserved the absolute coordinates of objects when performing tasks such as memorizing order and direction of objects within an array, while speakers of relative languages, such as Dutch, Japanese, and Yukatek (Mexico), preserved the relative coordinates of objects (Levinson, 1996; Pederson et al., 1998).
The cognitive consequences of being bilingual in languages that use different frames of reference remain unclear. One possibility is differential perceptual tuning due to the use of different frames of reference because languages have been found to affect perception such that individuals become more or less attuned to certain features of the environment (Goldstone, 1998; Sloutsky, 2003). For survey practitioners, this may mean that what is reported during recall tasks may be related to what language is used during initial information encoding and later, during the survey interview. In an extreme example, certain information may not be encoded at all because the language spoken during an event predetermines where speakers focus their attention. Furthermore, similar to language-dependent recall, it can be expected that a match between language frames of reference during encoding and retrieval could facilitate remembering.

Retrieval and Judgment in Attitudes
Attitude questions often require respondents to form an opinion on the spot in the specific context of a survey (Sudman et al., 1996). To do so, they need to form a mental representation of the question target based on the most accessible relevant information. Preceding questions, visual aids, and interviewer characteristics can make certain information more accessible; language of survey administration can also determine what information is accessible at any given time by activating the cognitive-affective cultural framework associated with it. Using a particular language activates a "language-specific self," which acts as a filter through which information is both encoded and retrieved (Schrauf, 2000).

Cultural Frame Switching
Language can affect what information is temporarily accessible by evoking a particular mindset related to the cultural meaning system associated with it. For example, a study of Greek students attending an American school in Greece showed that the correlation between the same attitudinal questions administered in English and in Greek was low for domains in which the Greek and American norms differed in what was considered socially desirable and high for domains in which the cultural values converged (Triandis, Davis, Vassiliou, & Nassiakou, 1965). Similar results were reported for English-Spanish bilinguals by Marín, Triandis, Kashima, and Betancourt (1983).
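The within-respondent comparison behind such findings can be sketched as a per-domain correlation between the answers the same respondents give in each language. The sketch below is illustrative only: the domain names, the language labels, and all response values are invented for the example and are not data from Triandis et al. (1965) or Marín et al. (1983); pearson_r is a hypothetical helper written here so that the block has no dependencies.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numeric answers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant answers")
    return sxy / sqrt(sxx * syy)


# Invented 5-point ratings from six hypothetical bilingual respondents.
# Position i in each list is the same respondent answering the same item
# once in each language.
responses = {
    "norms_converge": {
        "language_a": [1, 2, 3, 4, 5, 4],
        "language_b": [1, 2, 3, 4, 5, 5],
    },
    "norms_differ": {
        "language_a": [1, 2, 3, 4, 5, 4],
        "language_b": [3, 4, 2, 3, 3, 4],
    },
}

# One cross-language correlation per attitude domain.
cross_language_r = {
    domain: pearson_r(answers["language_a"], answers["language_b"])
    for domain, answers in responses.items()
}
```

In this toy setup the correlation is high in the domain where the invented answers agree across languages and near zero where they diverge, mirroring the pattern the cited studies describe; real analyses would also need to account for measurement error and test-retest unreliability before attributing a low correlation to language.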
Another aspect of cultural frame switching relates to differences in how Westerners and East Asians organize the world: Westerners show preference for grouping objects based on taxonomy or common category membership, while East Asians prefer groupings based on relationships (Chiu, 1972; Ji, Schwarz, & Nisbett, 2000). Such grouping preferences can be manipulated by the language used during the cognitive task; for example, Ji, Zhang, and Nisbett (2004) found that relationship-based grouping shifted to categorical when Chinese speakers from Mainland China and Taiwan were asked questions in English. Recent studies in psycholinguistics have also demonstrated that language can affect comparisons (Bowerman & Choi, 2003; Gentner, 2003), and to the extent to which languages classify according to different criteria, the extracted similarities also differ (Boroditsky, 2001; Boroditsky, Schmidt, & Phillips, 2003; Lucy & Gaskins, 2001).
These findings have several implications for surveys of bilingual respondents. First, the information that is accessible to form an opinion will vary depending on the language of survey administration. Hence, to achieve maximum equivalence of different language versions, open-ended questions should be avoided. Second, the same question can be perceived to have different affective characteristics depending on the language and cultural norms it activates; thus, more or less socially desirable opinions will be expressed, depending on language. Knowing in advance how cultures differ in terms of a question's affective characteristics may better inform questionnaire design, and various techniques can be used to reduce social desirability or sensitivity across language versions. Third, judgments can be language dependent because comparisons are based on culture-approved practices and how language systems are organized. Such hypotheses necessitate systematic investigation of language effects and the underlying dynamics across question types.

Codability
Studies in psycholinguistics have demonstrated that codability affects judgment. Kay and Kempton (1984), for example, showed that color-naming practices affect judgments of colors: speakers of Tarahumara (a Mexican Indian language that does not have separate words for blue and green) differentiated among color chips on the blue-green color continuum based on their physical characteristics, namely, the wavelength of reflected light. In contrast, English speakers differentiated among the same color chips based on labels, such as "shade of green" and "shade of blue." Thus, English speakers evaluated colors in terms of categories in which they were easily coded, while Tarahumara speakers, lacking such codability of colors, based their evaluations on physical characteristics. Similarly, Hoffman, Lau, and Johnson (1986) examined the extent to which the codability of personality description (existence of stereotypes) in a language influenced the impression about a person. The study found that terms that were readily available in the language led to stereotyped impressions, and participants were more likely to elaborate on the described person's characteristics using terms consistent with the stereotype than when a verbal label was not available.
Such findings may have implications for the use of scales in surveys of bilingual respondents. For example, scales may be judged differently depending on whether scale labels are easily codable in both languages. If label equivalents are not easily codable in one language, respondents may be more likely to consider solely the numeric values of the scale when making judgments, resulting in response differences across language versions.

Response Formatting
The ability to differentiate among response options may be influenced by language codability, and the stimuli used to anchor the points of a rating scale may be affected by the cultural meaning system primed by language.

Cultural Frame Switching
Cultural frame switching can further complicate the investigation of language effects at the formatting stage because scale anchoring may be affected by the reference frame primed by a language. Such differences in scale anchoring may be reflected in the observed differential response styles across cultures. For example, several studies have reported that East Asians avoid extreme responses (Chen, Lee, & Stevenson, 1995; Chun & Campbell, 1974; Hayashi, 1992; Stening & Everett, 1984; Zax & Takahashi, 1967). Although such differences are often attributed to differential emphasis on conflict avoidance and humbleness, it is unclear whether these differences are an artifact of self-presentation as a result of language priming culture or true differences in perception, independent of language. Moreover, the extent to which respondents use the range of a presented frequency scale as a frame of reference when answering survey questions is also culture dependent. A study by Ji, Schwarz, and Nisbett (2000) demonstrated that Chinese students were influenced by the range of frequency scales only when asked to report private, unobservable behaviors (e.g., having nightmares, borrowing books from the library). However, no scale effects were found for public behaviors (e.g., being late for class), possibly reflecting the importance of "fitting in" in Asian cultures related to monitoring (and thus having better memory representation of) one's and others' public behaviors. In contrast, consistent with previous research on scale effects (for a review, see Schwarz [1996]), American students relied on the presented response scale frequency range to estimate both private and public behaviors. For surveys of bilingual respondents, such findings suggest that, depending on the cultural identity primed by the language of interview, different estimation strategies may be employed.

Codability
Similar to the effect of language codability on retrieval and judgment, response formatting may also be affected by the availability of a label for a given concept. For example, scales may be used differently by speakers of different languages as a result of different scale label codability; thus, the meaning of the same number on a labeled scale may be affected by what language is used. Taken to an extreme, there are cultures whose languages have terms only for one, two, and many (Greenberg, 1978), which further limits the ability of their speakers to make comparisons (Hunt & Agnoli, 1991). At this point, little is known about how this may affect the cognitive processes in bilinguals whose other language allows for utilization of the whole numeric scale. It can be speculated that the ability to make comparisons may remain language dependent.

Response Editing
Respondents sometimes edit their responses before reporting them, reflecting social desirability and self-presentation concerns (Sudman et al., 1996). Gender, age, socioeconomic status, and various survey design characteristics have been found to be correlates of socially desirable responding (for a review, see DeMaio, 1984). Recent work in cross-cultural research suggests that culture influences social desirability through interpretation based on cultural experiences, and response editing depends on the need to conform with particular social norms (Lee, Xu, Fu, Cameron, & Chen, 2001).

Cultural Frame Switching
The same survey question may be perceived to have different levels of socially desirable content depending on the respondent's cultural identity. For example, maintaining harmony and face-saving are more socially desirable traits in Asian cultures than in the Western world (Triandis, 1995). Similarly, mental health is stigmatized in Arab and Hispanic societies (Bazzoui & Al-Issa, 1966; Chaleby, 1987; Okasha & Lotalif, 1979; Silva de Crane & Spielberger, 1981) to a greater extent than in the United States. For bilingual respondents, this means that, depending on the language of the survey interview and the cultural frame primed by it, such questions might be perceived to have different affective characteristics, and respondents would be likely to edit their answers to match the values of the culture associated with the language. The studies by Triandis et al. (1965) and Marín et al. (1983) presented earlier illustrate this effect. For survey practitioners, such language effects would require thorough advance knowledge of where cultural differences related to questions' affective characteristics are to be expected to determine the language assignment of bilingual respondents or to employ questionnaire design techniques that reduce differentially perceived social desirability or sensitivity across language versions.

Summary
A substantial body of literature in psycholinguistics and cross-cultural psychology suggests that language used in survey interviews can affect every stage of the response formation process, and different mechanisms may simultaneously play a role at each step. As our discussion indicates, depending on language, respondents may answer the same question differently as a result of different question interpretation, different mental representations of the question target, a mismatch between the language of encoding and language of recall, different accessible information at the time of the survey request, differential anchoring of response scales, and differential self-presentation concerns.
Two shortcomings of the presented theoretical framework relate to its application. First, it is desirable to directly connect the outlined model to published survey research and possibly reinterpret puzzling results in light of the proposed language influences, but the existing cross-cultural survey data do not offer such an opportunity. Thus, the proposed framework remains largely speculative. Second, some of the presented mechanisms are demonstrated through research in settings, tasks, and languages that are very different from common survey tasks and languages in which surveys are typically conducted. At this stage, it is unclear to what extent the outlined mechanisms would be detectable in survey responses collected in mainstream (rather than indigenous) languages or whether they are task and language specific. We believe the merits of this theoretical model are to present possibilities for language influences and to stimulate further discussion and action related to these issues.