A corpus-based approach to (im)politeness metalanguage: A case study on Shakespeare's plays



Introduction
The paper seeks to contribute broadly to (im)politeness (meta)pragmatics by establishing a corpus-based method for inductively identifying a large body of (im)politeness metalinguistic items, and by illustrating one way they might be explored. In their criticisms of traditional theories of politeness, Watts et al. (2005 [1992]) and Eelen (2001) call for more attention to 'first-order' politeness. One of their major criticisms of the traditional theories (notably Brown and Levinson (1987 [1978]), but also Lakoff (1973) and Leech (1983)) was that they took the definition and delimitation of politeness (and impoliteness) for granted. They called for the starting point of any theory to be how speakers of a language or members of a community themselves define such a notion. This is well summarised by Verschueren (1996:196): "… conceptualisations and practices are inseparable. Consequently, there is no way of understanding forms of behaviour without gaining insight into the way in which the social actors themselves habitually conceptualise what it is they are doing."
'friendly', 'rude', 'offensive', and so on, is more frequent and diffuse. The present study focusses on a body of lexical items which fall under what Eelen (2001:35) refers to as "classificatory politeness", though there are of course occasions where such items are used in more metapragmatic discussions about (im)politeness. To avoid terminological ambiguity, the language under study will continue to be referred to by the more generic '(im)politeness metalanguage'.
It is important to acknowledge at this stage that while the traditional approaches to (im)politeness were criticised for being biased towards the speaker (Eelen, 2001:96 et passim), evaluations of (im)politeness could arguably go too far the other way, and privilege the perspective of a participant or hearer responding to another speaker's behaviour. However, while it is likely that evaluations are typically used in response to another's behaviour, such linguistic expressions can be and are used to evaluate both the other and one's self. For instance, in example (1), taken from Henry VIII, a character questions how their own behaviour may have been impolite: (1) Alas Sir: In what have I offended you? What cause Hath my behaviour given to your displease, That thus you should proceed to put me off, And take your good Grace from me? (Henry VIII, 2.4)

Previous research on (im)politeness metalanguage
Kádár and Haugh (2013:192–193) identify three types of existing (im)politeness metalanguage research.
Corpus analysis: Studies that use corpora to study language quantitatively, typically applying some kind of statistical analysis.
Lexical/conceptual mapping: Studies that focus on metalanguage to define the semantics or concepts of politeness or impoliteness, typically through a mixture of qualitative and quantitative means. These studies normally have a comparative focus.
Metapragmatic interviews/questionnaires: Studies that use interviews and questionnaires to elicit from participants their understanding and usage of (im)politeness metalanguage.
(Adapted from Kádár and Haugh, 2013:192–193).
This is a useful division, though it is worth noting that lexical and conceptual mapping studies occasionally rely on corpus analysis (e.g. Taylor, 2017) and metapragmatic interviews/questionnaires (e.g. Pizziconi, 2007). Indeed, Culpeper (2011: chapter 3) employs all three. The present study proposes a slightly simpler, two-way distinction of (im)politeness metalanguage research based on the approach to data:
Studies where naturally-occurring data have been analysed for (im)politeness metalanguage, e.g. through corpus linguistics (see Culpeper, 2009, 2011; Culpeper et al., 2019; Jucker et al., 2012; Waters, 2012; Taylor, 2015, 2017) or qualitative analysis (see Sifianou, 2019).
Studies where new data have been elicited for the purposes of interpreting (im)politeness metalanguage, for example through questionnaires, surveys, interviews, etc. (see Blum-Kulka, 2005; Ide et al., 2005 [1992]; Pizziconi, 2007; Bolivar, 2008; Gagne, 2010; Culpeper, 2011).
The first set of studies involve looking at how people already talk (or have talked) about (im)politeness in interaction. The second set of studies involve prompting speakers specifically to talk and consider their conceptualisations of (im)politeness.
From the example studies listed above, it may be noted once again that (a) some studies use both types of data, such as Culpeper (2011: chapter 3), which applies corpus analysis to preexisting data but complements this with reports undertaken by undergraduates; and (b) there is nonetheless a general division in methodology, with the corpus studies generally falling into the former group, and Kádár and Haugh's (2013) first two categories falling into the latter group. Due to its historical focus, this study naturally fits into the first category.
There is little space to consider these studies in depth here. However, one shortfall across many studies with preexisting data is the range of (im)politeness items they consider. Frequently, a relatively small group of items is preselected and explored. For instance, Waters (2012) focusses entirely on 'rude', Ide et al. (2005 [1992]) compare the term 'politeness' in English with 'teineina' in Japanese, and other work looks at six different impoliteness items. These items are typically selected due to their perceived centrality to the issue at hand, but do not represent the full range of evaluative items. One notable exception to this strategy is Jucker et al. (2012), who undertake "metacommunicative expression analysis" on politeness terms in the history of English. Their study begins with a search in the Historical Thesaurus of the Oxford English Dictionary (hereafter 'HT') for historical synonyms of 'courtesy', 'courteous', and 'courteously', removing words with homonyms in other word classes and adding spelling variations, and arriving at a list of 185 (im)politeness metalanguage terms. They entered these terms into the Helsinki Corpus and returned 1,164 instances of these terms in total. This clearly goes beyond a handful of terms, though it still begins with synonyms for 'courtesy', 'courteous', and 'courteously'. So while they capture a broader range of evaluative items than previous studies, the focus on the semantic area of 'courtesy' could still potentially exclude certain behaviours described in the (im)politeness literature, for instance the notion of affection or friendliness (cf. Section 4.4). The present study is heavily influenced by Jucker et al. (2012) but adapts their methodology in a few aspects, described in Section 3.

(Im)politeness metalanguage in drama
It is important at this stage to acknowledge that (im)politeness metalanguage in drama comes with its own considerations. McIntyre and Bousfield (2017:765–766) point out that fictional data grants new affordances in this area, for instance direct access to the thoughts of the characters involved and so to their inner perceptions and evaluations of (im)politeness. While in play texts we do not often get access to thoughts in the same way as in prose texts, characters do offer monologues or asides which provide the audience with access to their personal thoughts, perceptions, assessments, and so on.
In that vein, it is important to consider the discourse architecture of play texts. Short's (1996:169) model of the discourse architecture of plays outlines that there is interaction between the characters in a play, but also the layer of the playwright and the audience. In other words, the text is an act of communication between playwright and audience, involving acts of communication between the characters involved (Jucker, 2016:95–96). (Im)politeness metalanguage may offer the playwright opportunities to signal events crucial to the plot, or may offer characterisation cues about others (cf. Culpeper and Fernandez-Quintanilla, 2017:105–106). However, given that this study will use a corpus of plays, it will naturally focus on the interactions between characters, and therefore says little on the playwright and audience. Dividing the first-order perspective of the characters from that of the audience (or indeed the analyst) in this way is arguably appropriate: the interactions in the plays are fictional and fixed, whereas Shakespeare's audiences can and do change enormously, not least diachronically. How the audiences perceive the interactions involved is another (interesting and worthwhile) set of data, but outside of the scope of this approach.

Methodology and findings: locating (im)politeness metalanguage
This study employs the Enhanced Shakespearean Corpus (hereafter 'ESC') compiled as part of the Encyclopedia of Shakespeare's Language project at Lancaster University. It is built from the First Folio, plus The Two Noble Kinsmen and Pericles, therefore covering 38 plays for a total of roughly 1 million words. The source texts were supplied to the compilers of the ESC from the Internet Shakespeare Editions (https://internalshakespeare.uvic.ca/) based at the University of Victoria. The corpus is freely available on CQPWeb (see http://wp.lancs.ac.uk/shakespearelang/) and has been highly annotated: spelling variation has been regularised and it is tagged for part-of-speech, semantics, and genre, as well as social information such as the gender and social status of characters speaking (Murphy et al., 2020).
In the previous section, earlier (im)politeness metalanguage studies were criticised for preselecting the items they explore, for instance 'rude' or 'politeness'. This study employs an inductive approach to (im)politeness metalanguage by not preselecting specific items for analysis and instead, like Jucker et al. (2012), employs the freely accessible HT (Kay et al., 2022) to identify the metalinguistic items. Their concern was primarily the language of 'courtesy', and they therefore drew from the HT's category 01.15.21.04.01 ('Courtesy'); however, as mentioned above, this study seeks to capture a broader range of behaviours that concern issues of (im)politeness, and therefore goes higher in the hierarchy, taking all the words classed under 01.15.21.04 ('Good Behaviour'). As it also considers impoliteness, it also took all the words classed under 01.05.21.05 ('Bad Behaviour'). All words with recorded instances from between 1500 and 1700 were drawn. This period was designed to capture all words that could have been used in Shakespeare's life, starting with 1500 to exclude words that had fallen out of use, and extending to 1700 to account for potential errors in first recorded use. This resulted in 432 potential politeness terms and 193 potential impoliteness terms. Of this number, only 164 politeness terms and 57 impoliteness terms were found in the corpus. However, not all items were relevant to issues of (im)politeness. For instance, the 'Good Behaviour' category referred to above included the word 'form', referring to etiquette or manners in sense 14 of the Oxford English Dictionary (n.d.). A scan of concordance lines for 'form' reveals that it is rarely used in this sense. Thus, a process was required to eliminate irrelevant items, so that the resultant corpus search was refined to items which were predominantly used with (im)politeness senses, to focus on a core (im)politeness metalinguistic lexicon.
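The two filtering steps described above (attestation between 1500 and 1700, then presence in the corpus) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ThesaurusEntry` record, its field names, and the placeholder words and dates are all hypothetical, and do not reflect the HT's actual data format or any real citation dates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThesaurusEntry:
    """Hypothetical record for one thesaurus word; not the HT's real schema."""
    word: str
    first_cited: int
    last_cited: Optional[int]  # None means the word is still in use


def attested_between(entry: ThesaurusEntry, start: int = 1500, end: int = 1700) -> bool:
    """True if the word's citation span overlaps the window [start, end]."""
    last = entry.last_cited if entry.last_cited is not None else end
    return entry.first_cited <= end and last >= start


def candidates_in_corpus(entries, corpus_vocabulary):
    """Keep period-attested words that actually occur in the corpus vocabulary."""
    return [e.word for e in entries
            if attested_between(e) and e.word in corpus_vocabulary]


# Placeholder words and dates, invented purely for illustration.
entries = [
    ThesaurusEntry("word_a", 1300, 1450),  # obsolete before 1500: excluded
    ThesaurusEntry("word_b", 1480, 1620),  # overlaps the window: kept
    ThesaurusEntry("word_c", 1690, None),  # attested in the window, but absent from the corpus
    ThesaurusEntry("word_d", 1750, None),  # first cited after 1700: excluded
]
corpus_vocabulary = {"word_b", "word_d"}
print(candidates_in_corpus(entries, corpus_vocabulary))  # only word_b passes both filters
```

Treating the date criterion as an interval overlap is one reading of "recorded instances from between 1500 and 1700"; a stricter reading would change only `attested_between`.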
To resolve this, concordance lines for each term were examined to identify whether it is used most frequently to evaluate social behaviour. This could be linguistic or non-linguistic behaviour (cf. Lakoff, 1973:303). Each term was searched individually. For words with fewer than 50 instances, every instance was examined; for words with more than 50 instances, an initial sample of 25 instances was examined; for those with 100–200 instances, 50 instances were examined; and for those with more than 200 instances, 100 instances were examined. 'Fair' was the most frequent of the words examined, occurring 771 times in the corpus. If fewer than 50% of the instances examined were used to evaluate social interaction, the word was eliminated. If, after any sample, it remained ambiguous or marginal whether a word was used to evaluate social behaviour, a further sample of the same size was taken. 351 terms were examined: for 303 of these, every instance of their use was manually examined; of the 48 remaining terms only 3 had fewer than 25% of all concordance lines examined (100/771 for 'fair', 100/597 for 'grace', and 100/406 for 'seem*').
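The sampling bands and the 50% retention rule can be expressed as a short Python sketch. The function names are hypothetical, and the handling of boundary values is an assumption: the text does not say what happens at exactly 50 instances, so here a frequency of 50 to 99 is treated as the band that receives an initial sample of 25.

```python
def sample_size(frequency: int) -> int:
    """Size of the initial concordance sample for a term with this corpus frequency."""
    if frequency < 50:
        return frequency  # every instance is examined
    if frequency < 100:
        return 25         # assumed band: 50 to 99 instances
    if frequency <= 200:
        return 50
    return 100


def keep_term(relevant: int, examined: int) -> bool:
    """Retain a term unless fewer than 50% of examined instances evaluate social behaviour."""
    return examined > 0 and relevant / examined >= 0.5


# The three most frequent terms reported above: each initial sample covers under 25% of lines.
for term, freq in [("fair", 771), ("grace", 597), ("seem*", 406)]:
    n = sample_size(freq)
    print(term, n, f"{n / freq:.1%}")
```

The sketch covers only the initial sample; the further same-size samples taken for ambiguous or marginal terms would be a repeated call with a fresh draw of concordance lines.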
Judgements on whether an individual instance of a word, or a pattern of its usage, was being used to evaluate social behaviour were made based on concordance as well as reference materials, particularly the Oxford English Dictionary (hereafter 'OED'). The OED is semantically comprehensive with historical citations, typically citing earliest uses of a sense, making it clear which sense of a word could be used in this period.
Three other processes were employed to nuance the terms and expand the lists beyond what appeared in the HT:
Testing other terms found in context: In the corpus, (im)politeness metalinguistic terms were frequently used in coordination with others. New terms were drawn and tested with the same process above when they were used in coordination with other words already located in the initial list.
Using the corpus's 'wild card' function to collect morphological variations: For instance, entering 'benevolen*' in the corpus will return all items that begin with that graphological string of letters, i.e. 'benevolence', 'benevolent', and 'benevolently'. Placement of the wild cards relies on the user's decision, but informed decisions can be made based on the known morphological variation of the words. This step greatly increased the recall of the search terms (cf. Jucker et al., 2012, section 3.3).
Using the corpus's part-of-speech tagger to reduce homonyms: All words in the corpus have been annotated for part-of-speech using CLAWS (see Culpeper and Archer, 2020:195), allowing greater precision. For instance, 'kind' in the corpus includes the adjective relating to friendliness, but also the noun meaning generic types of things. By suffixing 'kind' with '_JJ', the part of speech tag for adjectives, it restricts the results to only the adjective. With 'despite', the tag '_NN1' restricted it to the singular noun, excluding the preposition.
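To make the effect of the wild card and the part-of-speech suffix concrete, the following Python sketch mimics both on a toy list of tagged tokens. It is not the CQPweb query engine: the `matches` function is a hypothetical stand-in, the token list is invented, and the preposition tag 'II' is given as an illustrative CLAWS-style tag rather than taken from the corpus.

```python
import re


def matches(query: str, word: str, tag: str) -> bool:
    """Check one (word, tag) token against a query such as 'benevolen*' or 'kind_JJ'."""
    q_word, _, q_tag = query.partition("_")
    # Escape the query, then turn the escaped wild card back into "any word characters".
    word_pattern = re.escape(q_word).replace(r"\*", r"\w*")
    if not re.fullmatch(word_pattern, word, flags=re.IGNORECASE):
        return False
    # With no tag suffix, any part of speech is accepted.
    return q_tag == "" or q_tag == tag


# Invented tagged tokens for illustration only.
tokens = [
    ("kind", "JJ"), ("kind", "NN1"), ("kindness", "NN1"),
    ("benevolence", "NN1"), ("benevolent", "JJ"),
    ("despite", "II"), ("despite", "NN1"),
]

for query in ["kind_JJ", "benevolen*", "despite_NN1"]:
    print(query, [t for t in tokens if matches(query, *t)])
```

The wild card raises recall (one query returns several morphological variants), while the tag suffix raises precision (the noun 'kind' and the preposition 'despite' are filtered out), which is the trade-off the two steps above are managing.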
The entire process above resulted in 4,023 occurrences of 234 different terms in total. 109 terms covering 2,112 instances were deemed related to politeness, and 125 terms covering 1,911 occurrences were deemed related to impoliteness. For the sake of transparency, the full search strings are listed here: Politeness metalanguage search string.

Second-order analysis: semantically categorising the (im)politeness metalanguage
The search terms above constitute the findings of the methodology described in Section 3. However, a variety of conceptual spaces are occupied by these terms. Talk of 'civility' is different to talk of 'kindness'; to describe someone as 'uncourteous' is not the same as describing them as 'villainous'. Previous work on (im)politeness metalanguage has occasionally employed broad semantic categories to capture similar distinctions (e.g. Culpeper, 2011:94). A large number and range of terms are found in the present study, revealing even intuitive differences, for instance between language of formality (e.g. 'civil', 'courteous', 'proper') and the language of warmth and kindness (e.g. 'affable', 'kind'). There are different types of (im)politeness, which different frameworks and theories describe. This section analyses these findings further by dividing them into five semantic categories:

CIVILITY-UNCIVILITY
GENEROSITY-HARM
GOOD NATURE-BAD NATURE
KINDNESS-UNKINDNESS
SOFTNESS-ROUGHNESS
These groups were established through analysis of how terms were used, collected when identifying whether they should be included in this study (see Section 3). Patterns of usage resulted in semantic clusters. For example, there was a clear pattern of linguistic items which metaphorically related to the notion of softness or roughness, for instance 'gentle', 'mild', and 'rough'. These notes were complemented with further investigation of the terms in context, and again with reference to previous research and the OED, refining the semantic clusters into more clearly defined groups.
The groups all cover both politeness and impoliteness metalanguage and exist as a set of polarities, rather than having separate groups for politeness and impoliteness. This acknowledges that separating words based on 'politeness' and 'impoliteness' may be a somewhat forced distinction, potentially blurring the fact that all these terms are used to describe and evaluate social interaction equally, orienting only towards positive evaluations (politeness) or negative evaluations (impoliteness). The definitions of each group reflect this too. It is also worth noting that some groups are more directed at linguistic behaviour than others, as will become apparent in their description. The names of the groups were designated at the end of this process, not influencing the formation of the categories. However, prototypical features exist within the categories which define and distinguish them, and it is illustrative to apply these as titles retrospectively. Small capital letters (e.g. 'GENEROSITY') are used to designate these as second-order concepts, though the words themselves almost all appear within their groups.
It is important to treat these categories with some caution: these five categories cover 109 politeness terms and 125 impoliteness terms. It would be an oversimplification to suggest that words within the same category are synonymous, or that a word carries the same sense in every instance. This is a key aspect of the discursive perspective (see Watts, 2008:292–293; Bousfield, 2010:118–119). It would also be an oversimplification to say that each group is clearly distinct from the others: there are cases of overlap. However, these groups are intended to appeal to prototypical (in the sense of Rosch, 1973) categories with broad conceptual similarities, and all words fit reasonably well in one group, or at the border of one or two other groups. For the purposes of producing corpus search terms for each category, words at the border of two or more groups were categorised under the category they fit most frequently. For instance, 'kindness' is relevant to both my KINDNESS and GENEROSITY groups, but most frequently the former, and is therefore kept there, whereas the plural 'kindnesses' is kept with GENEROSITY. Similarly, it is worth reiterating that these semantic groups do not describe synonyms, but rather approach prototypical categories. Bryson (1998), for instance, details the differences between (and movement from and to) 'courtesy' and 'civility' in the period, yet both remain in this CIVILITY group.
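The rule for border cases (assign a word to the category it fits most frequently) amounts to a majority vote over instance-level judgements. The Python sketch below illustrates this; the function name and the per-instance labels are hypothetical and invented for illustration, not counts from the study.

```python
from collections import Counter


def assign_category(instance_labels):
    """Return the category label given most often across a word's examined instances."""
    counts = Counter(instance_labels)
    # On a tie, most_common keeps the label that was encountered first.
    return counts.most_common(1)[0][0]


# Invented judgements for one border-case word, for illustration only.
labels = ["KINDNESS", "GENEROSITY", "KINDNESS", "KINDNESS", "GENEROSITY"]
print(assign_category(labels))
```

A tie would in practice call for an analyst's decision rather than the arbitrary first-seen rule used here, which is in keeping with the caution about overlap expressed above.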
In order to illustrate each semantic category, each remaining subsection here (4.1–4.5) will provide a definition of its use with reference to (im)politeness theories and frameworks, a list of all the words included in this group, and an exploration of its use with reference to multiple textual examples.

CIVILITY-UNCIVILITY
The CIVILITY-UNCIVILITY metalanguage labels reflect an understanding of an established code of conduct which individuals are anticipated to follow, especially those of a higher status (or eager social-climbers), and they evaluate individuals for how well they follow it. This is most immediately visible in the 'court' found in the 'courtesy' words, with the court being an institution of high social status (Bryson, 1998:62). A similar observation might be made about the use of the 'civil' words: for instance 'civil', 'civility', 'uncivil'. These relate to ideas of commerce and the rise of a 'civil society' and citizenship in the city, contrasted with the 'savagery' of the country (Bryson, 1998:45 et passim). The code of conduct in Early Modern English society is discussed at length by Bryson (1998) in her monograph on courtesy and civility as she traces the development of this code in the period (cf. also Richards, 2003; Thomas, 2018). This period was also marked by a rise in courtesy manuals detailing such behaviour, for instance Castiglione's Il Cortegiano and Della Casa's Il Galateo (Culpeper, 2017). These labels evaluate the use of linguistic resources such as greetings, terms of address, and word choice, as well as non-linguistic aspects such as gestures, movement, dress, and so on. They are less concerned with morality or sincerity (cf. Bryson, 1998:88 et passim) and consequently this behaviour can be (and is) the target of suspicion. Though addressing a period after Shakespeare's death, Jucker (2020:136) acknowledges how a "concern for proper behaviour" and "the shallow formalities of etiquette in words and gestures" are among the important aspects of the discourse of politeness in the 18th century. Other corollaries in the literature might be Sell's (2005 [1992]) discussion of politeness as obedience, as well as the concept of 'discernment' politeness (e.g. Ide, 1989) in that the use of a linguistic resource such as an honorific is partly determined by what is correct and expected.
I have said that the CIVILITY-UNCIVILITY metalanguage labels evaluate individuals for how well they follow an established code of conduct. One major aspect of this is the notion of 'respect', and the idea that some things ought to be shown the correct behaviour, particularly individuals of higher status. In example (2), one is deserving of the proper 'respect' because of their 'virtue': (2) According to his Virtue, let us use him With all Respect, and Rites of Burial (Julius Caesar, 5.5) And in (3) from Othello, Emilia laments her husband's actions, but notes that it is correct that she obeys him: (3) O villainy! Villainy! … 'Tis proper I obey him; but not now: Perchance Iago, I will never go home. (Othello, 5.2) In (4), the connection with status becomes clear: one's birth has not given him the right to speak to Lords in the way he does, and therefore he is being UNCIVIL ('saucy'): (4) … you are more saucy with Lords and honourable personages, than the Commission of your birth and virtue gives you Heraldry. (All's Well That Ends Well, 2.3) Words relating to the inability to show the appropriate level of gratitude are common in this group too, for instance 'ingratitude' and 'unthankfulness'. This relates to the notion of (im)politeness reciprocity within the context of thanking (cf. Culpeper et al., in press).
From these examples, it is also clear that CIVILITY-UNCIVILITY covers both linguistic and non-linguistic behaviour. For instance, in (4) above, it is speaking which has been evaluated, and in (2) and (3) it is non-linguistic behaviours. In (5) from The Merchant of Venice, Bassanio has criticised Gratiano for being too liberal in his verbal and non-verbal behaviour, and cautions him to temper it around strangers. Gratiano responds: (5) Signior Bassanio, hear me, If I do not put on a sober habit, Talk with respect, and swear but now and then, Wear prayer-books in my pocket, look demurely, Nay more, while grace is saying hood mine eyes Thus with my hat, and sigh and say Amen: Use all the observance of civility Like one well studied in a sad ostent To please his Grandam, never trust me more. (The Merchant of Venice, 2.2) Here he promises to behave more appropriately and with more restraint, especially around strangers: he promises to "talk with respect" (verbal CIVILITY), but the whole pattern of verbal and nonverbal actions is referred to as "observance of civility".
As mentioned earlier, another important aspect of CIVILITY-UNCIVILITY is that it has a tenuous relationship with sincerity and can be the target of suspicion. For example, in (6) from Richard III, CIVILITY is actively criticised: (6) Because I can not flatter, and look fair, Smile in men's faces, smooth, deceive, and cog, Duck with French nods, and Apish courtesy, I must be held a rancorous Enemy.

GENEROSITY-HARM
The GENEROSITY-HARM metalanguage labels draw focus to the costs and benefits that actions or behaviour create for an individual, evaluating the agent for the costs or benefits they have created or transferred to another. In pragmatic terms, they draw attention to the perlocutionary effect, and its equivalent for non-linguistic behaviour. There are two notable corollaries in existing (im)politeness literature: Leech's (1983) Tact and Generosity Maxims and Culpeper and Tantucci's (2021) Principle of Politeness Reciprocity. Leech (1983:132) defines his Tact Maxim as "Minimize cost to other … Maximize benefit to other" and his Generosity Maxim as "Minimize benefit to self … Maximize cost to self". This would appeal to the contrast between, for instance, borrowing someone's car and inviting them for dinner: both are requests or directives, but the former creates cost, and the latter provides a benefit (Leech, 1983:133). Actions that create cost might then be mollified using politeness strategies. Leech (1983:107 et passim) acknowledges the idea of a cost-benefit analysis and even refers to it as a "transaction" (Leech, 1983:134). Interestingly, metaphors of commerce are also picked up by Culpeper and Tantucci (2021). In setting out their Principle of Politeness Reciprocity, they identify that (im)politeness can be (and is) conceptualised as a "debit-credit balance sheet" (2021:146 et passim). Being polite to another may indebt them to you, but the "balance sheet" can be balanced through reciprocal politeness. We might add that the same can be said of impoliteness, for instance in the sense of revenge or 'getting back' at someone. 
Adopting the notions of a cost-benefit analysis from Leech (1983) and a debit-credit balance sheet from Culpeper and Tantucci (2021), the GENEROSITY-HARM category can be defined as a category of metalanguage labels which appeal to the conceptualisation of acts of (im)politeness as a form of transaction involving costs and benefits, drawing attention to the recipient of the behaviour as well as the agent, and all of this is backgrounded by the sense that there exists an ongoing 'debit-credit balance sheet' of (im)politeness.
I have said that GENEROSITY-HARM involves attention on the costs and benefits that one creates for another. On the GENEROSITY side, one is able to create benefits for another. One key notion is a 'favour', i.e. that individuals approve of others, and approval can lead to material or immaterial gifts: both the approval and the gifts can be 'favours'. For instance, in (9) from Henry VIII, it is acknowledged that: (9) … whoever the King favours, The Cardinal instantly will find employment, … (Henry VIII, 2.1) Having the king's 'favour' involves receiving benefits. In (10), 'favour' is the gift itself, i.e. revenge on another's behalf: (10) O what more favour can I do to thee, Than with that hand that cut thy youth in twain, To sunder his that was thy enemy? (Romeo & Juliet, 5.3) The HARM side indicates that one has created costs for another, and it includes many variations of the word 'offend' or 'offence'. For instance, in example (1) above, which is reproduced below: (1) Alas Sir: In what have I offended you? What cause Hath my behaviour given to your displease, That thus you should proceed to put me off, And take your good Grace from me? (Henry VIII, 2.4) This instance from Henry VIII also shows an awareness of the debit-credit sheet (cf. Culpeper and Tantucci, 2021): an 'offence' has caused someone to experience bad feelings (an example of a cost), and therefore they 'take [their] good Grace' from the speaker, i.e. the benefits they were providing. It is suggested that the giver has stopped providing their 'good Grace' because what they received in return has been perceived to be an 'offence', which is not a good reciprocation.
GENEROSITY-HARM is generally less linguistically focussed than some of the other groups, and one can be materially or immaterially GENEROUS, but linguistic behaviours are frequently implicit. For instance, being 'attentive' involves conversational activity vis-à-vis listening in (11): (11) Plantagenet shall speak first: Hear him Lords, And be you silent and attentive too, For he that interrupts him, shall not live. (Henry VI Part III, 1.1) However, there are cases where the GENEROSITY is purely material, such as in (12) where 'bounty' refers to the material gifts beyond literal treasure that Enobarbus has been sent by Anthony:

GOOD NATURE-BAD NATURE
The GOOD NATURE-BAD NATURE metalanguage labels reflect a tendency to evaluate another's inherent nature, with an awareness of how this affects their treatment of others. In this sense they are cumulative moralistic judgements about others rather than evaluations of a moment in time: an evaluation about a person's nature, rather than a specific action or behaviour. An individual with a GOOD NATURE is likely to do good to others and vice versa. However, it is worth noting that it may of course only take a single utterance or action for someone to decide that another has a GOOD or BAD NATURE! Unsurprisingly, there is a strong connection with morality. The connection between (im)politeness and morality has received scant attention, though research on it has grown in recent years (e.g. Kádár, 2017). In Shakespeare's plays, it is important to highlight that the contemporary sense of morality is partly shaped by the Christian notion of original sin, which holds that individuals are naturally inclined towards sinful activity (cf. Trussler, 1989:124). To translate, this would imply that humans naturally lean towards activity and behaviour that would cumulatively lead to an evaluation of BAD NATURE; however, through accumulation of activity and behaviour which is judged positively, one might be evaluated as having a GOOD NATURE.
The following search terms were devised as a subdivision of the main ones discussed in Section 3, to capture GOOD NATURE and BAD NATURE metalanguage respectively: (benign|breeding|carriage|good nature|good natures). A notable omission from this category is 'good' in its GOOD NATURE sense, the antithesis of BAD NATURED 'evil'. However, with 2,858 instances of 'good' in the corpus, and a broad variety of senses, many instances do not relate to social interaction and/or morality, hence its exclusion (as per the process described in Section 3). It is still occasionally relevant.
This category of words is defined as a cumulative moralistic judgement of an individual's nature rather than a judgement of their behaviour at a specific point in time, and how that impacts their treatment of others. The most frequent GOOD NATURE word is 'breeding' in the use of senses 3 and 4 in the Oxford English Dictionary (n.d.), where it refers to how people are brought up, involving aspects of their "personal manners and behaviours; generally … good or proper manners". An individual with 'good breeding' is prone to actions which will be evaluated positively. Indeed, the individual in (13) is prone to many positively evaluated behaviours: (13) … you are a gentleman of excellent breeding, admirable discourse, of great admittance, authentic in your place and person, generally allowed for your many warlike, courtlike, and learned preparations (The Merry Wives of Windsor, 2.2) The same is true of the next most frequent word, 'carriage'. In (14), Anthony Dull from Love's Labours Lost is described by the King as: (14) a man of good repute, carriage, bearing, & estimation (Love's Labours Lost, 1.1) BAD NATURE is most obviously represented by the word 'villain'. When a character has learned of terrible things that another character has done, often in highly emotive moments, the other is occasionally classed quickly as a 'villain'. This can be seen in (2) above and in (15). It is worth noting that 'breeding' and 'carriage' both carry associations of high social status, whereas 'villain' has its origins in reference to individuals of low social status. Traces of this association can still be found, as in (18)

KINDNESS-UNKINDNESS
The KINDNESS-UNKINDNESS metalanguage labels evaluate behaviours for indexing a (close) relationship between individuals. The classical models of politeness address similar aspects: Brown and Levinson's (1987 [1978]) notion of positive face covers making others feel liked and wanted, and Leech's (1983:132) Maxim of Sympathy gives space for individuals expressing good feelings and good will to others. However, the best precedent for KINDNESS is in Lakoff's (1973:301) third mode or principle of politeness: "Make your receiver feel good"; making them "feel wanted, like a friend" and "expressing solidarity", which can manifest itself through (for instance) compliments, nicknames, and so on (Lakoff, 1973:301–302). In this way it relates to concepts such as affect (cf. Baxter, 1984; Slugoski and Turnbull, 1988), it is frequently associated with friendship and love, and it has a stronger relationship with sincerity than the other groups. Because KINDNESS-UNKINDNESS appeals to the ongoing relationship between individuals, Spencer-Oatey's (2000) notion of rapport management is a useful framework to draw from. Spencer-Oatey (2000:32) introduces four "rapport orientations"; KINDNESS has an analogue in her "rapport enhancement orientation", wherein speakers may try to improve relations between themselves and others. Conversely, UNKINDNESS might be explained with Spencer-Oatey's (2000:32) "rapport neglect orientation" or "rapport challenge orientation", where the relationship is respectively ignored or actively damaged. Likewise, if Brown and Levinson's (1987 [1978]) notion of positive politeness is a useful analogue for KINDNESS, then Culpeper's (1996) notion of positive impoliteness may be one for UNKINDNESS. Indeed, many of its strategies index distance and a lack of a relationship: "ignore, snub the other", "exclude the other from an activity", "disassociate from the other", etc. (Culpeper, 1996:357–358).
I have said that this group of words is centred around the notions of solidarity, familiarity, and a close relationship. This might best be illustrated in how frequently it is connected to the idea of 'love', as for instance in (19). 'Love' can be emblematic of a close relationship one might have with another, and the connection with KINDNESS is therefore unsurprising. In (22) from Richard III, this same connection is drawn but instead with UNKINDNESS and the absence of love. Richard angrily tries to discover who has slandered him by saying he has spoken ill of them to the king: (22) They do me wrong, and I will not endure it, Who is it that complains unto the King, That I (forsooth) am stern, and love them not? (Richard III, 1.3) UNKINDNESS otherwise relates to how people might index a distance from others, and particularly where they might not behave in a way which is reflective of their preexisting relationship (cf. Lakoff, 1973:295). In (23) from Troilus and Cressida, Ulysses and Agamemnon plan for the Greek lords to be UNKIND to Achilles: (23) Ulysses: Achilles stands in the entrance of his Tent; Please it our General to pass strangely by him, As if he were forgot: … Agamemnon: We'll execute your purpose, and put on A form of strangeness as we pass along, So do each Lord, and either greet him not, Or else disdainfully, … (Troilus & Cressida, 3.3) They pass his tent and either ignore him or greet him 'disdainfully'; by doing so they are being UNKIND ('strange') by deliberately dissociating themselves from one of their fellow lords.
Another important aspect of KINDNESS-UNKINDNESS is that it is more a matter of sincerity than some of the other groups (cf. especially CIVILITY-UNCIVILITY in Section 4.1). In (19)–(21) the connection with sincerity emerges through the sincere nature of the love: it is 'unfeigned', 'honest', and 'true'.

SOFTNESS-ROUGHNESS
The SOFTNESS-ROUGHNESS metalanguage labels focus on the manner or style of an individual's behaviour (as opposed to the content, sincerity, or the individual's nature), metaphorically connecting physical and social properties, and there is a substantial and inseparable connection to social status. SOFTNESS and ROUGHNESS correspond with assumptions about, respectively, higher and lower status individuals and their behaviours. There seems to be an expectation that higher social status characters behave (and are treated) in a 'gentle' or 'gracious' manner, and that lower status characters are 'rude' and 'rough'. There is also the expectation that women are SOFT. There is no clear parallel in the (im)politeness literature, but there is a clear metaphorical and etymological unity to these labels. With this in mind, it is felicitous here to provide a brief summary of the etymologies of the words in this group (not least to avoid any misconceptions which may arise for the modern reader, for whom many of these senses might be familiar as purely physical or social in Present-Day English). Table 1 draws from the OED to provide the original senses of these words within English.
All except 'gentle' have origins in language to talk about physical sensations. 'Gentle' originally had a predominantly social sense, referring to the nobility and a noble way of acting, but it has been used to refer to physical softness since the 1550s (Oxford English Dictionary, n.d.). 'Gracious', 'mild', 'rough', and 'rude' have had both physical and social senses since their earliest appearance in the English language. The remaining words have a purely physical origin and have developed social applications over time. This would fit the expectation of conceptual metaphor theory (Lakoff and Johnson, 1980) that concrete senses are used to describe abstract senses. Of course, like all words in this study, the use to evaluate social behaviour is predominant in the corpus. However, with the physical origin, it is perhaps unsurprising that this category of words frequently focusses on the style or manner of behaviour, as mentioned above. This is reflected in the fact that these words are frequently applied to words referring to behaviour or language itself, or communicative acts.
I have said that the SOFTNESS-ROUGHNESS words focus more on the manner or style of the behaviour, as opposed to its content or sincerity. This might be linguistic, more broadly social, or more material, and this group has a stronger relationship with language use than other groups. For instance, in (24) And in (27) from Coriolanus, the eponymous character says he will still defend himself from accusations, but promises to do so 'mildly', i.e. in a SOFTER style: (27) Cominius: Away, the Tribunes do attend you: arm yourself To answer mildly: for they are prepared With Accusations, as I hear more strong Than are upon you yet. SOFTNESS-ROUGHNESS may be more broadly social as well. For instance, in (28), the behaviours of Miranda and her father Prospero are respectively evaluated as SOFT and ROUGH with no specific reference to language: (28) O She is Ten times more gentle, than her Father's crabbed; And he's composed of harshness. (The Tempest, 3.1) In (29), a general 'smooth civility' is easily dropped when faced with an offence: (29) You touched my vein at first, the thorny point Of bare distress, hath taken from me the show Of smooth civility: … (As You Like It, 2.7) 'Smooth civility' would suggest a mixture of (unsurprisingly) SMOOTHNESS and CIVILITY, and given that these two groups have a strong relationship with high social status and surface-level behaviour, it would make sense that there be some overlap (cf. also 'smooth' in (5)).
There also appears to be a strong expectation that women show SOFTNESS, illustrated by (30) and (31). In (30) from Henry VI Part III, the Duke of York assesses Queen Margaret's role in leading an army against him, and criticises her as follows: (30) Women are soft, mild, pitiful, and flexible; Thou, stern, obdurate, flinty, rough, remorseless. (Henry VI Part III, 1.4) He contrasts the traditional SOFTNESS of women with the ROUGHNESS she shows in occupying such an active political and military role. And in (31) from As You Like It, Rosalind in disguise as Ganymede suggests that an insulting letter could not possibly have been written by the character Phoebe, because: (31) … women's gentle brain Could not drop forth such giant rude invention, (As You Like It, 4.3)

Conclusion
This study has broadly contributed to (im)politeness (meta)pragmatics by establishing a corpus-based method for inductively identifying a large body of lexical items which are used to evaluate (im)politeness (the 'mention' of politeness from Jucker, 2020:19). In this way, this method is able to identify the shape and nature of (im)politeness in an empirical fashion, and is theoretically applicable to any period or variety of language. It has also more specifically sought to produce data which can be used to fill the current deficit in historical (im)politeness metalanguage studies, and the complete vacuum of studies in the realm of (im)politeness metalanguage and stylistics. To this end, it has provided an account of (im)politeness metalanguage in a corpus of Shakespeare's plays. It employed a series of strategies to identify a total of 234 (im)politeness lexical forms in the corpus, covering a total of 4,023 instances in a corpus of approximately 1 million words. Noting that certain themes appeared across the words used, five second-order semantic categories were established and explored with textual examples: CIVILITY-UNCIVILITY, GENEROSITY-HARM, GOOD NATURE-BAD NATURE, KINDNESS-UNKINDNESS, and SOFTNESS-ROUGHNESS.
It is worth reflecting on the uses to which this methodology and its resultant data can be put. Metalinguistic discussions of (im)politeness themselves can be complementary to other approaches and data. As argued in Section 1, corpus-based metalinguistic data grants access to (sometimes differing) perceptions of (im)politeness, from both hearers and speakers, as well as instances in which such norms are discussed within a pre-defined context (here, Shakespeare's plays). It remains for the connection between these evaluations and different linguistic forms and choices to be teased out. For instance, which speech acts cause an evaluation of CIVILITY, and how do they compare to those which prompt an evaluation of KINDNESS? Are certain forms of the same speech act more or less likely to be evaluated as ROUGH? This is to say little of how such work might be carried out on different sets of data. Would the same categories apply to present day data, or would they need adjusting? Metalanguage can be used to better understand how different linguistic choices are understood, even when assuming a level of generalisation and stability in such linguistic choices, and therefore sitting uncomfortably with the original discursive perspective. This is to say even less of a more sociological perspective on these data. Individual words and semantic categories can be explored in more depth. Much of the (im)politeness metalanguage here has a strong and inextricable relationship with social status, which there has not been space to explore properly, as well as a relationship with both gender and religion. What is the nature of these relationships? Individual plays or characters can also be explored, considering how the notions of (im)politeness introduced here might relate to their linguistic realisation. What is the role of (im)politeness metalanguage for the playwright and their audience? Such questions highlight the fruits that (im)politeness metalanguage research yet bears.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Declaration of competing interest
None.

Data availability
The data used is a freely available corpus, links to which are in the article.