Details Matter: The Effect of Different Instructions and Their Order on the Bias of Measured Personality Traits by Social Desirability

Social desirability is a tendency to respond to items in a socially acceptable way. It can bias results and threatens the validity of a measure. The current study explored the effect of different instructions on measured personality traits. The sample consisted of 363 Slovak adults (260 women and 103 men) between 18 and 62 years old (M = 25.6; SD = 6.76). The Big Five Inventory-2 was used for measuring personality traits and social desirability. The participants were split into two groups depending on which instruction was administered first: the honest setting or the social desirability inducing setting (imagining a selection situation). All participants responded to both scenarios. We hypothesized that extraversion, agreeableness, and conscientiousness are socially desirable traits and would therefore be higher under the selection-simulating instruction than under the honest instruction. The opposite was hypothesized for negative emotionality. The social desirability of open-mindedness was explored. The results confirmed all our hypotheses and showed that open-mindedness is a socially desirable trait as well. Importantly, we found an effect of the order in which the instructions were administered: the effect of induced social desirability was also present in the honest instruction setting.


Introduction
Researchers in the social sciences often implicitly assume that participants respond honestly to the items of self-report measures and that these measures capture only the intended variables (e.g., Baumgartner & Steenkamp, 2001). However, this assumption does not hold. Self-report measures are biased by numerous factors, and response styles are one of them (e.g., Tóth-Király et al., 2020). There are several response styles, for example, acquiescence, disacquiescence, extreme responding, careless responding, and others (e.g., Rammstedt & Farmer, 2013; Wetzel et al., 2013). In this study, we focus on the socially desirable response style. It is defined as a tendency of participants to present themselves in a way that is acceptable to the majority of the society they belong to, even though this self-presentation does not correspond with reality (e.g., Bergen & Labonté, 2019; Halama, 2011; Liu & Liu, 2021).
Social desirability can distort research results and threaten the validity of a measure. Numerous studies report social desirability as a possible limitation in the interpretation of their findings, for example, in research on perfectionism (e.g., Stoeber & Hotham, 2013), anger and aggression (Fernandez et al., 2018), empathy (Nook et al., 2016), violence (Cava et al., 2020), and other topics. Such bias is more likely in assessment situations that could have serious social consequences (e.g., Keiser & Payne, 2019). One of the most investigated assessment situations is a job interview or selection process, either simulated or real (e.g., Bensch et al., 2019; Liu & Zhang, 2020; Preiss et al., 2015).
We investigated possible bias by social desirability in measuring the Big Five personality traits (extraversion, agreeableness, conscientiousness, negative emotionality, and open-mindedness) by using an honest instruction and a simulated selection instruction. In general, there is a consensus that extraversion, agreeableness, conscientiousness, and open-mindedness are socially desirable traits and that negative emotionality is a socially undesirable trait (e.g., Bäckström & Björklund, 2013; Grieve & de Groot, 2011). However, the relationship between social desirability and open-mindedness was not confirmed in several studies (e.g., Anglim et al., 2017; Jakubek & Krafčíková, 2016). Based on these findings, we assume that participants will score higher in extraversion, agreeableness, and conscientiousness and lower in negative emotionality under the simulated selection instruction. The possible difference in open-mindedness between the two instructions will be explored. Moreover, we will investigate whether the simulated selection scenario has an impact on participants' responses in an honest instruction setting that follows it.

Procedure
All participants responded to the BFI-2 items twice, each time with a different instruction. At the beginning of the survey, participants were divided into two groups. The first group responded using the honest instruction first and the simulated selection instruction second. For the second group, the instruction order was reversed. In the honest instruction, we asked the participants to respond honestly to the following items ("Here are a number of characteristics that may or may not apply to you. Please select a response to each statement to indicate the extent to which you agree or disagree with that statement. No answer is right or wrong, thus, please try to answer honestly"). In the simulated selection instruction, we asked them to imagine that the responses to the following items are part of a selection process ("Here are a number of characteristics that may or may not apply to you. Please select a response to each statement to indicate the extent to which you agree or disagree with that statement. Try to imagine that answering to following items is part of the selection process to your preferred job with the knowledge that your answers will have an effect on the selection process").

Studia Psychologica, Vol. 65, No. 2, 2023, 154-164

Sample
The sample consisted of 363 participants from the Slovak general adult population, 260 women (71.6%) and 103 men (28.4%). The age of participants ranged from 18 to 62 years (M = 25.6; SD = 6.76 for the whole sample; M = 25.4, SD = 6.76 for women; M = 26.1, SD = 6.78 for men). The only two inclusion criteria were that participants had to be at least 18 years old and be from Slovakia. All participants responded to the personality inventory twice. They were divided into two groups at the start of the research. Participants in the first group (N = 187, 51.5%; 134 women, 53 men; M age = 25.78; SD age = 7.33) first responded to the items with the honest instruction and then with the simulated selection instruction (honest-first group). Conversely, participants in the second group (N = 176, 48.5%; 126 women, 50 men; M age = 25.35; SD age = 6.11) first responded to the items with the simulated selection instruction and then with the honest instruction (honest-second group). Data were collected via the research platform formr (Arslan et al., 2018) in 2019. All participants were informed about the voluntary nature of their participation, and their consent to participate was obtained by their completing the presented items. Participants were entered in a drawing for a bookstore voucher. Our data are available on the Open Science Framework (OSF): https://osf.io/r8a2x/?view_only=ad3d9f2205354262895c81df2360844b

Measures
First, participants responded to socio-demographic items. Next, they responded to the items of the BFI-2 (Halama et al., 2020) twice, each time with a different instruction. The BFI-2 contains 60 items and measures 5 domains. Every domain contains 3 facets (subscales): extraversion contains the facets sociability, assertiveness, and energy level; agreeableness contains the facets compassion, respectfulness, and trust; conscientiousness contains the facets organization, productiveness, and responsibility; negative emotionality contains the facets anxiety, depression, and emotional volatility; and open-mindedness contains the facets intellectual curiosity, aesthetic sensitivity, and creative imagination. Every domain contains 12 items, and every facet contains 4 items. Domains and facets are balanced in the number of positive (pro-trait) and reverse (con-trait) items. Participants respond to items on a Likert scale from 1 (Disagree strongly) to 5 (Agree strongly). The domains have very good internal consistency. With the honest instruction, the values of Cronbach's α range from .82 (open-mindedness) to .90 (negative emotionality) with a mean value of .86 on the domain level, and are slightly lower on the facet level, from .65 (trust) to .85 (sociability) with a mean value of .75. With the simulated selection instruction, the values range from .82 (open-mindedness) to .91 (conscientiousness) with a mean value of .87 on the domain level, and from .58 (intellectual curiosity) to .82 (organization) with a mean value of .75 on the facet level. Participants also responded to several control items intended to minimize careless response bias (e.g., Shaman & Berning, 2020) and thus increase data quality. Participants who failed the control items were excluded from the sample.
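As an illustration of the scoring and reliability steps described above, the following minimal Python sketch reverse-scores a con-trait item on the 1 to 5 scale and computes Cronbach's α using the standard formula. It is not the authors' procedure (reliability was presumably obtained in jamovi); the function names and the toy responses are hypothetical and chosen only for this example.

```python
from statistics import variance


def reverse_score(response, low=1, high=5):
    """Reverse a Likert response on a low..high scale (1 <-> 5, 2 <-> 4, 3 stays 3)."""
    return low + high - response


def cronbach_alpha(rows):
    """Cronbach's alpha for a scale.

    rows: one list per respondent holding that person's k item scores,
    with con-trait items already reverse-scored.
    """
    k = len(rows[0])
    item_columns = list(zip(*rows))  # regroup scores item by item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical toy data: 4 respondents x 3 items; the third item is con-trait.
raw = [[1, 2, 5], [2, 2, 4], [4, 5, 2], [5, 4, 1]]
scored = [[a, b, reverse_score(c)] for a, b, c in raw]
alpha = cronbach_alpha(scored)
```

With real BFI-2 data, each facet would contribute 4 item columns and each domain 12, so the same function could be applied per facet and per domain under each instruction.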

Data Analysis
The data were analyzed using jamovi software (the jamovi project, 2021). Student's paired samples t-test was used to assess differences between domain and facet scores obtained with the honest and the simulated selection instruction. Differences between groups, based on which instruction was presented first, were analyzed using Student's independent samples t-test.
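The two tests named above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the original analysis was run in jamovi, the effect size definitions used here (mean difference divided by the SD of the differences for paired data, and by the pooled SD for independent groups) are common conventions that may differ from those reported in the tables, and all function names and numbers are hypothetical. The sketch returns the t statistic, degrees of freedom, and Cohen's d; p-values would additionally require the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(x, y):
    """Student's paired-samples t statistic, degrees of freedom, and Cohen's d.

    x and y hold the same participants' scores under two instructions.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    sd_diff = stdev(diffs)
    t = mean(diffs) / (sd_diff / sqrt(n))
    d = mean(diffs) / sd_diff  # assumed definition: d based on SD of differences
    return t, n - 1, d


def independent_t(g1, g2):
    """Student's (equal-variance) independent-samples t, df, and Cohen's d."""
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    pooled_sd = sqrt(pooled_var)
    diff = mean(g1) - mean(g2)
    t = diff / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2, diff / pooled_sd  # d based on pooled SD


# Hypothetical domain scores for four participants under each instruction.
selection = [4.0, 3.5, 4.5, 4.0]
honest = [3.0, 3.0, 4.0, 3.0]
t_paired, df_paired, d_paired = paired_t(selection, honest)

# Hypothetical honest-instruction scores for the two order groups.
honest_second = [3.0, 4.0, 5.0]
honest_first = [1.0, 2.0, 3.0]
t_ind, df_ind, d_ind = independent_t(honest_second, honest_first)
```

The paired function corresponds to comparing the two instructions within the same participants, while the independent function corresponds to comparing the honest-first and honest-second groups on scores obtained under a single instruction.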

Results
The results showed significant differences between the honest and simulated selection instructions in all personality domains as well as facets. We confirmed our hypotheses that extraversion, agreeableness, and conscientiousness are socially desirable traits and that negative emotionality is socially undesirable. Moreover, we found that open-mindedness was higher with the simulated selection instruction, suggesting that it is socially desirable as well. At the domain level, the largest differences, of medium size, were found for extraversion, negative emotionality, and conscientiousness, followed by small differences for open-mindedness and agreeableness. At the facet level, the differences ranged from d = 0.22 (aesthetic sensitivity) to d = 0.72 (energy level). These results are presented in Table 1.
In an exploratory analysis, we focused on differences in participants' responses depending on the order of instructions, that is, on whether participants first responded to the items with the honest or with the simulated selection instruction. In the group of participants who first responded with the honest instruction, the differences in traits between the two instructions are in the same direction and all significant, but with larger effect sizes. The differences between the two instructions in participants who responded first with the simulated selection instruction are more interesting. Here, we did not find a significant difference in the conscientiousness domain; all other domains differ significantly with small effect sizes, with the exception of extraversion, which shows a medium effect size. The facet-level results are interesting in this group as well: of all the extraversion facets, only energy level differs significantly, with a large effect size. More detailed results are presented in Table 2.
Another analysis was performed because of the interesting results of the planned exploratory analysis. We were interested in the difference in personality traits measured with the honest instruction according to its order of administration. We found significant differences in extraversion, conscientiousness, and negative emotionality. Participants who responded first with the simulated selection instruction reached a higher level of extraversion and conscientiousness and a lower level of negative emotionality with the honest instruction than participants who responded first with the honest instruction. Similar results were found at the facet level. The facets of conscientiousness and negative emotionality are of particular interest: significant differences were found in the facets organization, productiveness, anxiety, and depression (small to medium effect sizes). Detailed results are presented in Table 3.

Table 3 Differences in personality traits with honest instruction between groups divided by the order of administered instructions (honest or simulated selection instruction first)

We also investigated differences in personality traits measured with the simulated selection instruction depending on the order of the administered instructions. We found significant differences in all traits: participants who first responded with the honest instruction reached higher scores in extraversion, agreeableness, conscientiousness, and open-mindedness, and lower scores in negative emotionality, than participants who first responded with the simulated selection instruction. On the facet level, we found significant differences in energy level, respectfulness, trust, and all facets of conscientiousness, negative emotionality, and open-mindedness. Detailed results are presented in Table 4.

Discussion
We hypothesized that extraversion, agreeableness, and conscientiousness would be higher and negative emotionality would be lower with the simulated selection instruction than with the honest instruction. Such a difference suggests the social desirability of the measured variables (e.g., Anglim et al., 2017). All four of our assumptions were confirmed. We also found a higher score of open-mindedness with the simulated selection instruction than with the honest instruction. Therefore, we consider extraversion, agreeableness, conscientiousness, and open-mindedness to be socially desirable traits and negative emotionality to be a socially undesirable trait. The results are interesting on the facet level as well: energy level is the most socially desirable facet of the extraversion domain, and creative imagination is the most socially desirable facet of the open-mindedness domain.
Our results confirmed that awareness is needed when self-report measures are used for psychological assessment in selection processes. Besides our study, there is other evidence that both simulated (see also Liu & Zhang, 2020) and real selection processes (see also Preiss et al., 2015) increase socially desirable responding to self-report measures. Thus, practitioners who use such measures in real-life situations should be aware of this issue and make every effort to prevent socially desirable responding.
The exploratory analysis provided interesting results as well. We found that the order in which the instructions are administered has an effect on socially desirable responding. When the simulated selection instruction was presented first, the differences between the two instructions were smaller, and sometimes they were even practically meaningless. These results have a plausible interpretation: if participants responded in a socially desirable way the first time, it could be more difficult for them to subsequently present a worse image of themselves.
Thus, for future research focused on social desirability or faking we recommend using honest instruction as the first one, if participants are responding to both honest and simulated selection instruction in one session.
In further exploration, we focused on possible differences in socially desirable responding to personality items with the honest instruction between participants who first responded with the honest instruction and those who first responded with the simulated selection instruction. Interpretation of these results is not so simple, because after responding with the simulated selection instruction, participants probably responded in a socially desirable way even with the honest instruction. It is an open question why the order of instructions affected honest-instruction scores in extraversion, conscientiousness, and negative emotionality but not in agreeableness and open-mindedness. One possible explanation could be that extraversion, conscientiousness, and negative emotionality are generally socially (un)desirable personality traits, while agreeableness and open-mindedness could be socially desirable mainly in the case of the simulated selection instruction. This is just a reflection and a possible interpretation; further research is needed, for example, using another way of measuring social desirability or adding another faking instruction that would not depend on the selection process.
Interpretation of the differences in personality traits with the simulated selection instruction between participants who first responded with the honest instruction and those who first responded with the simulated selection instruction is similarly difficult. In this case, we surprisingly found that participants who responded first with the honest instruction responded in a more socially desirable way than participants who responded first with the simulated selection instruction. We are not aware of any research with a similar analysis, but we suppose that participants who responded first with the honest instruction responded in a socially desirable way twice. As mentioned above, social desirability is more present in situations with serious social consequences; however, it is present even in regular assessment situations (e.g., Caputo, 2015). This could mean that if participants responded in a socially desirable way with the honest instruction, they did so even more when they responded with the simulated selection instruction.
Our exploratory results suggest that the effect of an instruction may persist even after another instruction is administered. If participants first respond with the honest instruction and, even with this instruction, respond in a socially desirable way to some degree, they will do so even more with the simulated selection instruction. We can look at it the opposite way as well: if participants first respond with the simulated selection instruction, their socially desirable responding may persist and bias even the responses given with the honest instruction.
We consider the confirmation of the social desirability of specific Big Five personality traits in our cultural setting to be the main contribution of our study. We also consider the confirmation of simulated selection as a tool for measuring faking or social desirability (e.g., Grieve & de Groot, 2011; Preiss et al., 2015) to be another important contribution. Our results also emphasize the need for correctly formulated instructions if the researcher would like to measure social desirability by means of different instructions. Possible mistakes in giving instructions include omitting information on whether answers are (not) anonymous, that there is no right or wrong answer (if that is true), etc. This is also important because social desirability is present not only during a selection process (or its simulation), but also in other assessment situations that can evoke anxiety in participants (e.g., Halama, 2011). Our results also showed that it is probably more beneficial to use the honest instruction first and the simulated selection instruction second. Even the order of the administered instructions or their formulation can affect the results.
The study has a few limitations. The results are limited by using the simulated selection instruction as a proxy for social desirability. This limitation is due to the nature of such an instruction: it is questionable whether social desirability in this context can be generalized to other contexts (e.g., Kovačić et al., 2014; Wetzel et al., 2021). A more precise term could be social desirability in the work environment or in the selection process. Another methodological limitation is the use of a simulated selection process instead of a real-life selection process. Such an instruction could affect participants' responding differently: it is possible that they are not as highly motivated to fake (Costa et al., 2001). However, there is some evidence that the difference in faking between a real-life selection process and its simulation is not substantial (e.g., Preiss et al., 2015). Another limitation is the relatively small sample size. Also, using the manifest approach to data analysis limits the ability to separate participants' true trait scores from bias due to social desirability (e.g., Liu & Liu, 2021).

Conclusions
We confirmed that extraversion, agreeableness, and conscientiousness are socially desirable traits and that negative emotionality is a socially undesirable trait. We explored and found that open-mindedness is a socially desirable trait as well. We also investigated and confirmed that the simulated selection instruction can be used as an effective tool for measuring social desirability. Our main exploratory result and contribution is that even the order of instructions matters for social desirability. This issue needs further investigation; however, it seems that when social desirability is measured, details matter.