Positive and negative personality descriptors: UK dataset of self-referential valence, imageability and subjective frequency ratings of 300 adjectives for use in cognitive-emotional tasks

Experimental tasks comparing participants’ performance for categorising, remembering, and recognising positive and negative words are widely used in the emotional cognitive domain. Such tasks are commonly used in experimental psychology and psychiatry research, and have been shown to be sensitive biomarkers of depression and antidepressant drug action [1,2]. In addition, several of these tasks investigate self-referential processing, i.e., the processing of information relevant to oneself; this has been shown to modify the way emotional words are encoded and remembered and may be a target that is amenable to treatment [3,4]. In practice, the development of such tasks for implementation in research studies often depends on the selection and matching of words according to characteristics such as valence or arousal, imageability, word frequency and word length, to investigate differences in a chosen domain of interest whilst keeping important confounds constant. This introduces a need for ratings covering a range of word attributes that have been shown to affect processing. In particular, ratings of self-referential valence (how positively or negatively subjects feel about a word when it is used to describe themselves or their circumstances) have seldom been included in databases, despite the frequent investigation of the concept in research [1,5]. Other important attributes often considered in the process of matching and selection are word imageability and subjective frequency [6,7]. To facilitate the word selection and matching process required in cognitive-emotional task development, the present dataset provides subjective ratings for 150 positive and 150 negative adjectives describing personality characteristics. Across four online surveys, the 300 words were rated on self-referential valence, imageability and subjective frequency by representative samples of 200 UK-based, English-speaking adults.
Basic demographics and data on depressive symptoms and trait anxiety were collected from all participants. Comprehensive descriptive statistics and word length were calculated for each of the 300 words. All data cleaning and statistical analyses were performed in R. Our work is based on years of experience using the Oxford Emotional Test Battery [1,5] and may be particularly relevant for researchers using self-referential cognitive tasks with UK-based samples.



Value of the Data
• This dataset provides ratings of self-referential valence, word imageability and subjective frequency which can be used in the development of experimental cognitive tasks for psychological and psychiatric research.
• These data are primarily of interest to researchers working in emotional information processing, either as a standalone field, for understanding psychiatric conditions, or for assessment of drug efficacy.
• The words in this dataset can be used for the development or validation of existing or novel experimental tasks used in a wide range of cognition research.
• Our focus on self-referential valence rather than more traditional arousal ratings (i.e., where subjects rate the feeling elicited by a word) was motivated by feedback from study participants completing specific self-referential emotional cognition tasks [1,5]. We believe this to be a strength of the current dataset, as self-referential processes are frequently studied in psychological and psychiatric research, but self-referential ratings are seldom included in large databases. Thus, the current dataset provides a valuable resource which may be used alongside more comprehensive norms such as the English Word Database of EMOtional Terms (EMOTE) database [8].

Data Description
• Positive and negative personality descriptor words dataset: calculated self-referential valence, imageability and subjective frequency statistics (central tendency, range, and variability) and word length for the final list of 300 adjectives
• Positive and negative personality descriptor words data dictionary: data dictionary for "Positive and negative personality descriptor words dataset"
• Raw dataset 1: initial self-referential valence ratings for the list of 482 adjectives, demographics, and depression and trait anxiety questionnaire responses collected from 100 participants, plus data dictionary
• Raw dataset 2: further self-referential valence ratings for the final list of 300 adjectives, demographics, and depression and trait anxiety questionnaire responses collected from 102 participants, plus data dictionary
• Raw dataset 3: imageability ratings for the final list of 300 adjectives, demographics, and depression and trait anxiety questionnaire responses collected from 200 participants, plus data dictionary
• Raw dataset 4: subjective frequency ratings for the final list of 300 adjectives, demographics, and depression and trait anxiety questionnaire responses collected from 202 participants, plus data dictionary
• Self-referential valence reliability dataset: results from an analysis of variance exploring the effects of data collection phase on self-referential valence ratings for the final list of 300 adjectives, plus data dictionary

Experimental Design, Materials and Methods
An initial list of adjectives denoting personality characteristics was drawn up by the research team using previous rating studies [9] and online repositories. Since our focus was on building a dataset of recognisable words that could be used in cognitive assessments with varied participant samples, words that were independently identified by multiple members of the research team as inappropriate or excessively obscure were removed. This resulted in an initial list of 482 personality characteristics.
In Phase 1 of data collection, we included these 482 adjectives in an initial Qualtrics survey (see Supplementary file 1) to collect self-referential valence ratings from a representative sample of UK-based, English-speaking adults. Survey instructions were adapted from Anderson [9] with one notable change: whilst Anderson advised participants to rate each word according to how much they would like a person described using the word, we instructed participants to imagine another person describing them as each of the words and to rate their feeling towards being described as such. The motivation for this change was to allow the resulting dataset to be employed directly in self-referential cognitive tasks such as the Emotional Categorisation Task of the Oxford Emotional Test Battery [1,7], where participants are given equivalent instructions when sorting words presented on screen.
Participants were thus presented with each of the 482 adjectives and asked to rate how they would feel about being described in this way, from 0 (most negative) to 100 (most positive). If they did not know the meaning of a word, they were instructed to leave the rating at the default score of 50. The order of word presentation was randomised for each participant. Throughout the survey, participants had to answer 9 simple arithmetic questions (e.g., what is 50 + 10?) intended to check their engagement in the task. We also collected basic demographic data. Responders additionally completed self-report measures of current depressive symptomatology (Center for Epidemiologic Studies Depression (CESD) scale) and trait anxiety (Trait subscale of the State-Trait Anxiety Inventory (STAI-T)). Phase 1 data collection ran on 17 and 18 August 2020, and 100 responders completed the survey.
Data from Phase 1 were downloaded from Qualtrics (see Raw dataset 1) and processed in R (see Supplementary file 2). Survey responses were checked for completeness and the proportion of engagement questions answered correctly, but none were excluded from the subsequent analysis: the maximum number of missing responses from any one participant was 83 (17%), and all engagement scores were above our cut-off of 50% correct. CESD and STAI-T responses were converted to numerical values. We computed a series of statistics (mean, standard deviation, standard error, number of ratings received, median, minimum rating, maximum rating, range, skew, kurtosis, word length) for each of the 482 adjectives and arranged them by their mean self-referential valence rating in ascending order, from most negative to most positive. We then picked the 150 most negative and 150 most positive adjectives to characterise further for our final dataset. All negative words had a self-referential valence rating under 30, and all positive words had a rating over 70. The frequency histogram of self-referential valence ratings for the final list of 300 words is displayed in Fig. 1; the dip in the centre reflects an absence of neutral words. The minimum number of ratings received for any included word was 87/100.
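The per-word summary and word-selection procedure described above can be sketched as follows. The original pipeline was written in R; this is an illustrative Python translation with hypothetical names, not the authors' code, and it omits skew and kurtosis for brevity.

```python
import statistics

def summarise(ratings):
    """Descriptive statistics for one word's ratings (0-100 scale)."""
    n = len(ratings)
    sd = statistics.stdev(ratings) if n > 1 else 0.0
    return {
        "mean": statistics.fmean(ratings),
        "sd": sd,
        "se": sd / n ** 0.5,          # standard error of the mean
        "n": n,
        "median": statistics.median(ratings),
        "min": min(ratings),
        "max": max(ratings),
        "range": max(ratings) - min(ratings),
    }

def select_extremes(word_ratings, k=150):
    """Sort words by mean rating (ascending, most negative first) and
    keep the k most negative and k most positive, mirroring how the
    final list of 300 adjectives was chosen from the 482."""
    means = {w: statistics.fmean(r) for w, r in word_ratings.items()}
    ordered = sorted(means, key=means.get)
    return ordered[:k], ordered[-k:]
```

With real data, ratings left at the default of 50 for unknown words would be included as given, since Phase 1 stored them as ordinary responses.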
In Phase 2 of data collection, the resulting list of 300 words was used in three further Qualtrics surveys (see Supplementary file 1). The first sought to validate the self-referential valence ratings acquired in the initial survey and was identical in the questions it posed to participants. The second survey was used to collect word imageability ratings; instructions were adapted from Cortese and Fugett [10], and responders were asked to rate how easily they could form a mental image of each of the 300 adjectives, from 0 (very difficult) to 100 (very easy). The third survey was used to collect subjective word frequency ratings; instructions were adapted from Stadthagen-Gonzalez and Davis [11], and responders were asked to rate how often they came across each of the 300 adjectives in their everyday life, from 0 (never) to 100 (several times a day). For these three surveys, if participants did not know the meaning of a word, they were instructed to tick a box next to the item labelled "word not known". These responses were stored as missing ratings and counted towards exclusions based on completeness. All three surveys featured 8 arithmetic engagement questions, demographic questions, and the CESD and STAI-T scales. The order of word presentation was randomised for each participant. In total, 102 responders completed the additional self-referential valence survey, 200 completed the imageability survey and 202 completed the subjective frequency survey. Phase 2 data collection ran between 24 February and 9 March 2021.
Data from Phase 2 were downloaded from Qualtrics (see Raw datasets 2-4) and separately processed in R (see Supplementary file 2). Survey responses were checked for completeness and the proportion of engagement questions answered correctly. Four survey responses (three for the imageability survey and one for the subjective frequency survey) were excluded from analysis because the proportion of engagement questions answered correctly was lower than 50%.
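The screening rules described above amount to two small checks per response. A minimal Python sketch (the actual cleaning was done in R; function names and the use of None for missing ratings are assumptions for illustration):

```python
def engagement_ok(answers, answer_key, cutoff=0.5):
    """True if the proportion of arithmetic engagement questions
    answered correctly meets the cutoff (50% in these surveys)."""
    n_correct = sum(a == k for a, k in zip(answers, answer_key))
    return n_correct / len(answer_key) >= cutoff

def n_missing(ratings):
    """'Word not known' responses are stored as missing ratings
    (represented here as None) and count towards completeness checks."""
    return sum(r is None for r in ratings)
```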
CESD and STAI-T responses were converted to numerical values. Cleaned data from the two self-referential valence surveys were merged prior to additional analyses being performed, resulting in valence ratings from 202 participants. Preliminary mean values for the self-referential valence, imageability and subjective frequency ratings of each of the 300 adjectives were calculated.
For each type of rating collected (self-referential valence, imageability, subjective frequency), we computed a "participant difference score" as follows: for every word, we calculated the difference between each participant's rating and the overall mean rating for that word; for every participant, we then summed those differences and divided the result by the number of words the participant had rated. This "participant difference score" was thus a measure of how much, on average, a participant's ratings deviated from the average group rating. We excluded participants whose difference scores were more than 3 standard deviations above the mean difference score; this resulted in three responses being excluded from the self-referential valence survey and four responses being excluded from the subjective frequency survey.
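The difference-score exclusion rule can be sketched as follows; this is an illustrative Python version (the original analysis was in R) that implements the signed differences exactly as described in the text, with hypothetical data structures.

```python
import statistics

def difference_scores(ratings):
    """ratings: {participant: {word: rating}}, with unrated words omitted.
    Returns each participant's mean signed deviation from the pooled
    per-word mean ratings."""
    by_word = {}
    for participant_ratings in ratings.values():
        for word, rating in participant_ratings.items():
            by_word.setdefault(word, []).append(rating)
    word_means = {w: statistics.fmean(v) for w, v in by_word.items()}
    return {
        p: sum(r - word_means[w] for w, r in pr.items()) / len(pr)
        for p, pr in ratings.items()
    }

def flag_outliers(scores, n_sd=3):
    """Participants whose difference score lies more than n_sd standard
    deviations above the mean difference score."""
    values = list(scores.values())
    mean, sd = statistics.fmean(values), statistics.stdev(values)
    return {p for p, s in scores.items() if s > mean + n_sd * sd}
```

Note that the rule is one-sided (scores above the mean), matching the wording "more than 3 standard deviations above the mean difference score value".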
We then computed a series of statistics (mean, standard deviation, standard error, number of ratings received, median, minimum rating, maximum rating, range, skew, kurtosis) for each type of rating for each of the 300 personality descriptors. The statistics for self-referential valence, imageability, subjective frequency and word length were merged into a final dataset (see Positive and negative personality descriptor words dataset). We pooled scores from all participants for the reported statistical analyses, based on exploratory analyses showing that age, gender and depression/anxiety symptoms had little effect on participant ratings (see Figs. 2-8). However, if greater stratification is desired, population-specific statistics can be re-calculated from the raw datasets. We also explored the relationship between the initial self-referential valence ratings collected in Phase 1 (first Qualtrics survey) and those collected during Phase 2 (second Qualtrics survey) for our final list of 300 words. The mean ratings for each word were highly correlated between the two surveys (Spearman's rho = 0.97, p < .01; see Fig. 9). Additionally, we conducted a mixed effects analysis of variance to statistically assess the effect of data collection phase on the self-referential valence ratings acquired for each personality descriptor (see Self-referential valence reliability dataset). After correction for multiple comparisons, the rating for only 1 of 300 words ("modern") differed significantly between Phase 1 and Phase 2.
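The phase-to-phase reliability check rests on a rank correlation of the per-word means. For readers re-running it on the raw datasets, a minimal Python sketch of Spearman's rho (Pearson correlation of ranks, with ties sharing the average rank) is shown below; in practice an off-the-shelf routine such as R's cor.test with method = "spearman" would normally be used instead.

```python
def _average_ranks(values):
    """Ranks starting at 1; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks.
    Assumes x and y have equal length and are not constant."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```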
No identifiable data were collected by the research team. Prolific Academic collects the identifiable data necessary for contacting participants, is GDPR compliant, and does not store these data longer than is required. Only fully anonymised data are provided with this article; all pseudonymous variables were removed by the research team prior to sharing.

Ethics Statement
Collection and redistribution of these data were approved by the University of Oxford Central University Research Ethics Committee (CUREC), under reference number R71109/RE002. Informed consent was obtained from all participants prior to filling in each survey, and participant data are fully anonymised.