Magical women: Representations of female characters in the Witcher video game series
Discourse, Context & Media

Several videogames allow players to form their own narratives by making the player choose certain options with different dialogues and thus different representations. This can be problematic when exploring the representation of gender from the perspective of players' experiences. I argue that one way to overcome this is to use corpus linguistic methods. In this paper, the videogame series The Witcher (CD Projekt Red 2007, 2011, 2015) is taken as a case study for lexico-grammatical analysis of the representation of gender via corpus methods. Keyword analysis shows that male characters are more likely to occur than female characters and have a more diverse range of professions than female characters. I argue that the main female characters of the game are typically sorceresses, and so I explore how this term is used across the corpus. The analysis demonstrates that sorceresses are represented as educated and intelligent, but subject to a glass ceiling effect: they are only ever advisors and not leaders. I argue that regardless of what options players choose, they are statistically more likely to encounter these problematic representations of gender, thus raising questions about whether it is possible to escape sexist discourses in this medium.


Introduction
In 2019, Netflix released The Witcher, a television adaptation of a popular book series by Andrzej Sapkowski. Sapkowski's work had previously been popularised with a series of videogame adaptations, which also gained more popularity following the success of the television show. The Witcher videogame series has three instalments: The Witcher 1 (CD Projekt Red, 2007), The Witcher 2: Assassins of Kings (CD Projekt Red, 2011), and The Witcher 3: Wild Hunt (CD Projekt Red, 2015). Each game was popular at release, with The Witcher 3 winning over 200 game-of-the-year awards and, following the success of the Netflix adaptation, the series as a whole has now sold more than 50 million copies (CD Projekt Red, 2020). Thus, The Witcher videogame series is a highly popular and successful franchise. Given that the media can influence people's perception of gender roles (see Jeffries, 2007; Stermer and Berkley, 2015), such popularity highlights the need for closer analysis of these digital texts.
In this paper, I critically analyse the representation of gender within The Witcher videogames. In Section 2, I start by outlining The Witcher, paying particular attention to previous work which has explored the videogame series from a linguistic perspective. I then discuss previous literature which has taken a critical approach to the representation of gender within videogames. I argue that previous work typically only examines visual communicative modes and that language (i.e., lexis and grammar) is rarely explored (though see Heritage, 2020; Heritage, 2021b). However, representations can cross communicative modes, and language can reveal representations which might not necessarily be exposed from visual content analysis alone. While this is not to dismiss the work of visual content analysis, I demonstrate that linguistic methods of analysis can also provide useful insights into representations, which can then be explored in relation to the hermeneutic situation in which the player encounters them. This then leads to a discussion of previous work which has specifically focused on the language used to represent gender in videogames.
Given the limitations of space, this paper focuses on the linguistic representation of women within The Witcher videogame series. To address this topic, I take a corpus approach and build a corpus of 699,764 words from the videogame series. I start by conducting a keyword analysis, focusing on gendered social actors and gendered characters' names. I argue that named male characters typically occupy positions of political and/or military leadership, while named female characters are likely to be sorceresses (who are political advisers). This leads to a closer analysis of a small subset of named characters, exploring how they are represented. Following this, given how many female characters are sorceresses, I turn to explore how this role is represented in the videogames. I argue that there is a range of representations of sorceresses, but that several of these representations are problematic. Finally, I provide some concluding remarks and directions for future research.

The Witcher
As discussed in the introduction, The Witcher videogame series consists of three fantasy RPGs (role-playing games), developed by CD Projekt Red. The videogames follow certain elements associated with the fantasy RPG genre: players are given quests, may or may not meet certain characters, and there are a range of different narratives which give each player a unique experience (see Heritage, 2021a). The player embodies a specific avatar, the eponymous Witcher (named Geralt), and this avatar must be the focus of their experience. Players meet and interact with a range of secondary characters, known in ludological terms as NPCs (Non-playing characters). NPCs will interact with players in different ways depending on what choices the player makes. For example, if a player decides to kill an NPC, other NPCs may refuse to interact with them. This makes analysing a representative sample of language from the videogame somewhat challenging. While one player will experience one representation of gender (e.g., a female character being represented in a positive way), a different player may experience a vastly different representation (e.g., the same female character being represented in a negative way). However, there are also commonalities around representations running through the game, regardless of what choices players make. In an analysis of Geralt as an avatar, Matuszek (2017, p. 130) summarises this exact problem: "'My Geralt' will obviously differ from 'your Geralt,' yet only to the degree that 'my Batman' would from 'your Batman.' At times he may seem cynical, at others surprisingly principled and conscientious, but only within the strict limits of institutionalized knowledge about the franchise, and with clearly defined reference to the complete cultural baggage that The Witcher alludes to." The fantasy world constructed within The Witcher videogames is expansive: there are hundreds of NPCs, with different levels of relevance to the story lines.
Given the sheer size of the series, it is unsurprising that a number of videogame scholars have explored different aspects of the games. For example, from a linguistic perspective, previous research on the language of The Witcher has explored phonological features associated with different NPCs (e.g., Troelsen, 2021), finding a correlation between the perceived class of a character and their accent. Elsewhere, others have compared central figures and overarching narratives to other videogames to explore patterns across similar texts from the same genre. For example, Lucat (2017) compared the representation of fatherhood in The Witcher with BioShock: Infinite and The Last of Us. Lucat's work presents an interesting comparative analysis, focusing on tropes associated with this kind of digital text (e.g., anti-fathers). However, as with many other analyses of gender in videogames, and as I will argue in the next section, Lucat focuses on overarching narratives, particularly as experienced through playing the games. While both experiences of playing videogames and the visual elements of videogames are vital for understanding representation, I would argue that there is also a need to look at the more "fine grained" elements of representation which players may not consciously detect: for example, the kind of words gendered terms may collocate with, or phonological features used by particular characters, as alluded to by Troelsen (2021).

Gender in videogames
The representation of gender (and intersecting identities) in videogames is a vast and growing line of scholarly inquiry (see, for example, Heritage, 2020; Richard and Gray, 2018; Shaw, 2014; Heritage, 2021a, b). An overarching finding from various studies is that videogames contain numerous examples of hostile sexism 1 (see Fox and Bailenson, 2009; Bègue et al., 2017). Indeed, this notion has been demonstrated and argued by mainstream feminists such as Anita Sarkeesian (see Sarkeesian, 2014), whose synthesis of sexism in gaming culture became the focus of cyber-trolling. However, a commonality running throughout an overwhelming amount of literature into gender in videogames is a focus on how characters are visually realised, as opposed to how they are represented in terms of the lexis and grammar used by and about gendered characters (see, for example, MacCallum-Stewart, 2014; Machin and van Leeuwen, 2016; Carrillo Masso, 2019).
The language used by any form of mass media, including videogames, is a way to normalise views towards gender (see Jeffries, 2007). However, only focusing on how visual communicative modes contribute to the normalisation of gender roles may be problematic. Imagine an image of a visually sexualised woman: we might assume that this is a poor representation of gender. However, what if this woman is referred to with terms like respectable or powerful? This theoretical example demonstrates the multifaceted nature of representations. The representation of gender can occur in different communicative modes, such as in visual communication and in the language, and the representation of gender in one communicative mode might draw on different discourses than other communicative modes. While I would argue that multiple communicative modes need to be examined in context, as I touch upon in later sub-sections, language is often overlooked but can be a useful way to explore how gender roles and norms are constructed. This is not to say that a (corpus) linguistic study is the only way to explore gender in videogames, or even that it should not be combined with other forms of analysis, but that it can illuminate some of the ways gender is constructed in these texts.

Ludolinguistics and gender
Before discussing ludolinguistic approaches to the representation of gender, it is worth making a distinction between ortho discourse and para discourse (Carter et al., 2012). Ortho discourses are discourses within videogames as a text (similar to what is analysed in this paper). By contrast, para discourse is concerned with discourses about videogames and surrounding videogames (e.g., in videogame-related media). Although there is a growing body of research into ludolinguistics (i.e., the language of gaming), the literature typically focuses on para discourse, usually by examining texts related to videogames such as tutorial manuals (see, for example, Ensslin, 2012; Balteiro, 2019; Ensslin and Finnegan, 2019) or the language used by people who engage with the game as a text (Graham and Dutt, 2019; Kiourti, 2019; Rudge, 2019). Although not linguistic in focus, it is also worth acknowledging that some previous literature has explored ortho discourse through the use of broad content analysis in a successful attempt to weave together both feminist perspectives (see Hoffin and Lee-Treweek, 2020) and queer perspectives (Youngblood, 2015; Mejeur, 2018; Colliver, 2020) on videogames. Papers in this vein occasionally use examples of language to underpin individual analytical points, although close linguistic analysis across the data set does not typically occur.
Ludolinguistic studies of gender in videogame para discourse typically examine how gender is represented in fora and magazines dedicated to videogames (see Burgess et al., 2007; Carrillo Masso, 2011; Heritage, 2022; Miller and Summers, 2007; Summers and Miller, 2014). However, while such forms of para discourse might offer an interesting "window" into how gender is represented within the videogame, ortho discourses might be (and often are) very different and should be researched in their own right. Indeed, as various scholars have argued, there is regularly a mismatch between media representations of videogames and the texts themselves (see, for example, Hart, 2020; Kelly et al., 2020).
1 Glick and Fiske (1997, p. 121) define "hostile sexism" as follows: "Hostile sexism seeks to justify male power, traditional gender roles, and men's exploitation of women as sexual objects through derogatory characterizations of women".
To date, there is a dearth of literature which approaches the representation of gender within videogame ortho discourse from a linguistic perspective. Of particular note are Machin and van Leeuwen's (2016) study into how gender is represented through sound and visual elements, as well as Toh's (2015) visual analysis from a critical discursive perspective. Machin and van Leeuwen examine how sound, as a communicative mode, could be gendered in two mobile videogames. One game focused on racing (which the authors claimed demonstrated features of masculinity) and the other focused on actions like bathing animals (which they claimed demonstrated features of femininity). The study yielded interesting results, such as how the completion noises in the game aimed at girls were higher pitched and there were more crashing sounds in the game aimed at boys. However, there were a number of problematic elements to the research. One of the most obvious issues is that the research might only be uncovering elements relating to sound and visuals associated with particular tasks or types of game, as opposed to findings relating specifically to gender (for example, crashing noises are likely to be associated with racing games, and the research could have compared racing games aimed at girls against racing games aimed at boys). Nevertheless, Machin and van Leeuwen demonstrate that gender can be presented in complex ways within videogames and that multiple semiotic modes contribute to the construction of gender within videogames.
Elsewhere, Toh (2015) took the first three hours of gameplay for six different bestselling videogames and audience responses to this, paying attention to how characters were visually represented and how players responded to these visual representations. Focusing on a small number of participants, Toh required participants to play through the videogame and have a preliminary and post-playthrough interview, which allowed Toh to build a ludo narrative model centred around players' experiences. While the analysis did not necessarily focus on the representation of gender, it did consider the visual representation of gendered characters. However, this may not have been a representative sample, particularly considering that longer games can contain many more hours of dialogue, which the players in Toh's research might not have encountered. Yet, it is understandable that it did not contain every possible permutation, simply because of how expansive the games can be, and because participants might have needed to be recorded for in excess of 100 hours if they were to play all of the game several times. Indeed, Toh's research raises questions about how best to sample language from videogames, as they are regularly open-ended, and players can construct their own narrative.
In my own research (Heritage, 2020; Heritage, 2021a, b), I have argued for the use of corpus linguistic methods to examine the representation of gender within videogame ortho discourse. While Toh's analysis demonstrated the need to consider more data, I also previously positioned my work in contrast to the work of Goorimoorthee et al. (2019), who undertook a "typical playthrough" approach to the data (here, playthrough refers to playing a videogame and selecting different dialogue options; this is typically used with regard to games where there are multiple choices and possible narratives for a player to experience). In Goorimoorthee et al.'s study, the researchers played through one version of the story line and gathered phonetic data from different characters. While this might be useful for answering some research questions (e.g., looking at a small number of accents), I would argue that, because players can shape their own narrative, a single playthrough can never be "typical", especially given the range of hermeneutic situations in which a player may encounter such language.
In order to overcome the issue of typicality of playthroughs, in my previous work on corpus approaches to videogame ortho discourse, I have taken large representative random samples of data from different videogames. For example, in Heritage (2020), I found that in a corpus of c.330,000 words taken from 10 of the bestselling videogames published between 2012 and 2016, men were represented as physically strong and as the perpetrators of violence, while women were discussed in terms of their knowledge. The findings broadly suggested that men were likely to sustain hegemonic physical masculine ideals, while women were relegated to conducting mental processes. In other studies, I have argued that by taking a more decontextualised approach to the language about gender, it is possible to explore subtleties in gender representation beyond the hermeneutic situation in which the player encounters it (see Heritage, 2021b). While player interpretation is useful, different players might have varying interpretations, which may lead to a lack of systematicity in the study of gender representation. Furthermore, while my previous research has typically focused on the representation of gender in a broad sense (e.g., looking at search terms for gendered nouns such as man and woman; see Heritage, 2021a, b), this paper differs somewhat. In those previous studies, I explored gendered nouns in a range of series to explore how ideas of gender were coded into the grammar and semantics of videogame ortho discourse. I argued that there has been a significant change over time, with representations becoming more progressive (see Heritage, 2021b). In this paper, rather than examining the representation of gendered categories, I take a sample of gendered characters and examine how these gendered characters build up to broader pictures of representation.
The underpinning idea behind such an approach is that if members of a gendered group are represented in a homogenous way, then such repeated representations can reflect broader gender-based ideologies.
The above review of the existing literature demonstrates a range of gaps in the research into the language used to represent gender in videogames. Thus, taking The Witcher as a case study for how gender can be represented in videogames and explored through (corpus) linguistic methods, I aim to answer the following research questions:
1) What gendered characters are referenced in The Witcher?
a. How frequently are these characters referenced?
b. How are these characters represented?

Gathering the corpus
Before discussing how the sub corpora were built, it is worth noting why this paper utilises corpus linguistic methodologies as a tool for uncovering the mechanisms for mediatizing gender-based discrimination, as opposed to other methodological approaches. Utilising corpus linguistic methodologies on this kind of data is particularly important because it allows researchers not only to examine patterns across the videogame(s) more broadly, but also to see patterns of discrimination regardless of the choices players make within videogames. While previous ethnographic research has examined the representation of gender, this ethnographic work only ever explores the researcher's (or player's) experiences (e.g., Sundén, 2012), rather than the variety of experiences available to a range of players. Autoethnographic work is particularly problematic in the study of ludolinguistics due to the unreliability of recalling linguistic representations with accuracy. Even if a small number of linguistic examples are correctly recalled, it is likely that analysts will not be able to examine patterns across a range of examples, and representations which run counter to researchers' interpretations might be missed. This is not to say that such interpretations and collections of knowledge gained through autoethnographic methods are not compatible with (corpus) linguistic methods; rather, these should supplement the analysis and help in the interpretation of data instead of being the source of the linguistic constructions under analysis.
Although there are a variety of methods for building corpora from videogames (see Heritage, 2020; Heritage, 2021a, b), in this paper I utilise pre-existing computer software to extract as much data as possible from The Witcher series. To build the corpus used in this paper, the videogames were all downloaded onto a Windows-compatible PC using the videogame distribution platform 'Steam' (Valve, 2021). Once a game is downloaded, Steam creates a 'program file' for the software within the main hard drive of the computer. When a videogame is downloaded via Steam, it comes with multiple files which automatically save to the computer system, including the file that the PC uses to know what language to present to the player. This file is called a string file (it is also referred to as a text dump). The string files were illegible without a decryption key provided in the game files. I used three different pieces of software to run the decryption on the string files. For The Witcher 1, I used a programme called UnBIF (Csimbi, 2012); for The Witcher 2, I used a programme called Gibbed Red Tools (Gibbed Red Tools, 2015); and for The Witcher 3, I used a programme called Lua Utis Tools (Zaitsev, 2015). Three different pieces of software were used as each instalment used a different method of encrypting the data (this process, and the limitations associated with it, are discussed in more detail in Heritage, 2021a, b). The language presented in the files (and thus the corpus) contains all the language within the games, from in-game item descriptions 2 to the language which would normally appear as subtitles during interactions. I played through 15 hours' worth of each game, making a note of the language at several randomised points (across different periods in the narrative), and tested this against the data in the corpus to see whether the string files matched what is said by characters; there were no differences.
Details on the size of the corpora, as counted using WordSmith 7 (Scott, 2016), from the three games are outlined below in Table 1.

Corpus structure
Before discussing how the corpora were analysed, it is worth drawing attention to how the sub corpora are structured. Due to the amount of choice a player is given within the game, the corpora were structured in a way similar to a "choose your own adventure" book (see Pickard, 1979). This meant that although the data was presented in the form of full and complete sentences, they occurred in a way which often disrupted the textual cohesion. Take the following example:
1. The witcher lifted the princess's curse and was to get half the kingdom and her hand in marriage when she healed…
2. Temporary insanity!
3. When he asked for the reward and didn't get it - swish! - he slashed Foltest with his witcher's razor…
4. But sergeant, they bathe more often than we do.
5. Step lively, stretch those legs!
As can be seen, examples 1 and 3 are related to each other (they tell the story of Geralt seeking a reward for his work). However, there are parts which are interjected and do not fit with the "flow" of the story. Similarly, language from different stories is presented in 4 and 5, but it also appears close to the narrative presented in 1 and 3.
However, all the language presented above can be read in a "meaningful" way. In other words, it is not structured as follows: The witcher Temporary insanity lifted the princess's for the reward and curse and was -swish! -he slashed to get half witcher's razor the kingdom and her hand in marriage when she healed…But Step lively sergeant, they stretch those legs! bathe more often than we do.
While the way the data is presented is not ideal, it does provide a representative sample of the language used within the series and is sufficient data to answer questions about representations of identity. It should be noted that this decontextualization of the language can (and often should) be used in tandem with an understanding of how the text contextually operates. In other words, while I only talk about gender representation from a (decontextualised) linguistic perspective in this paper due to the limitations of space, gendered roles and characters will be presented in multiple ways, which can only fully be appreciated through also playing and experiencing the videogame.

Corpus tools and analytical procedure
To decide what words to look at in more detail, I ran a keyword analysis on each of the three sub corpora individually and on the combined corpus. To do this, I used WordSmith 7 (Scott, 2016). I took the top 150 keywords 3 (as calculated by triangulating BIC Score, Log Ratio, and Log Likelihood 4), analysed their collocates 5, and conducted a close reading of 100 concordance lines for each keyword.
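The three keyness measures named above can be illustrated with a short sketch. This is not a description of how WordSmith computes them internally: it assumes the commonly cited formulations (log-likelihood over a 2×2 contingency table, Log Ratio as the binary log of the ratio of relative frequencies, and BIC approximated as log-likelihood minus the natural log of the combined corpus size). The function names, the smoothing constant for zero frequencies, and all counts used with them are hypothetical.

```python
import math


def log_likelihood(a, b, n1, n2):
    """Log-likelihood keyness for a word occurring a times in a study
    corpus of n1 tokens and b times in a reference corpus of n2 tokens."""
    expected1 = n1 * (a + b) / (n1 + n2)
    expected2 = n2 * (a + b) / (n1 + n2)
    total = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if a > 0:
        total += a * math.log(a / expected1)
    if b > 0:
        total += b * math.log(b / expected2)
    return 2 * total


def log_ratio(a, b, n1, n2, smoothing=0.5):
    """Binary log of the ratio of relative frequencies (an effect size).
    Zero counts are replaced by a small constant (an assumption here)
    so that the ratio stays defined."""
    a = a if a > 0 else smoothing
    b = b if b > 0 else smoothing
    return math.log2((a / n1) / (b / n2))


def bic(ll, n1, n2):
    """Approximate BIC for a one-degree-of-freedom comparison:
    log-likelihood minus the natural log of the combined corpus size."""
    return ll - math.log(n1 + n2)


if __name__ == "__main__":
    # Toy counts for illustration only; they are not from the Witcher corpus.
    ll = log_likelihood(30, 10, 1000, 1000)
    print(ll, log_ratio(30, 10, 1000, 1000), bic(ll, 1000, 1000))
```

Triangulating the measures means that a candidate keyword has to look distinctive on a significance-style score and on an effect-size score at once, rather than being frequent enough to pass only one test.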
Following this, I applied van Leeuwen's social actor framework (van Leeuwen, 2008) to the sample of concordance lines for each keyword to identify whether it typically related to a gendered social actor. In this paper, I draw attention to two categories of gendered social actors: those who have been referenced through nomination (i.e., terms denoting specifically named people, such as Triss), and those referenced through functionalisation (i.e., terms indicating a profession which has a gendered suffix or is 'culturally gendered', such as sorceress or king). This is not to say that other terms denoting gendered social actors did not occur, rather these were the two categories within van Leeuwen's framework with the most frequently occurring lexical tokens denoting gendered social actors. I start the analysis of these terms by exploring the nominated terms and explore the similarities/differences in the kind of male/female characters players are most likely to encounter. This draws on knowledge from the book and playthroughs of the videogame, thus using elements of autoethnography and play to bolster the analysis.
Following this, I zoom in on the representation of sorceress(es). I explore the collocates for both the singular and plural forms in more detail and group different semantic categories for collocates. I arrived at these categories by reading through the list and inductively grouping the collocates by broadly applying different tags for each collocate: for example, "magic" was marked as "abilities", "power", "supernatural abilities", etc. Once I had noted all the potential categories, I then grouped similar categories together and labelled them with the broadest and most overarching label of what had been noted. To avoid proliferation of categories, I tried to make as many categories of collocate as broad as possible and tried to collapse collocates into only one category. I also created a "miscellaneous" category for collocates which did not fit with any others. While this grouping is somewhat interpretive from the outset, it is used as a starting point for further analysis.
2 In-game item descriptions include a range of different texts that players might encounter in the game. For example, notes which can be read in-game, such as wanted posters, were included.
3 Keywords are words which are statistically more likely to appear in one corpus in comparison to another corpus; they are calculated by comparing one corpus against a second corpus which provides some sort of meaningful contrast against the first corpus (see Baker, 2006).
4 Given the limitations of space, it is not possible to provide a full explanation of the mathematical tests associated with keyness. However, readers interested in these statistical measures may wish to read the work of Brezina (2018), Gabrielatos (2018), and Scott (2021) for more information.
5 Collocates are words which occur 'frequently within the neighbourhood of another word, normally more often than we would expect the two words to appear together because of chance' (Baker et al., 2013: 36).
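The collocate extraction and grouping procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure WordSmith applies: it counts raw co-occurrences within a fixed window instead of applying a statistical association measure, and the tokeniser, the window size, the function names, and the category mapping are all hypothetical. In the study itself, the mapping from collocate to category was produced by hand through inductive reading.

```python
import re
from collections import Counter


def collocates(text, node, span=5):
    """Count words occurring within `span` tokens either side of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts


def group_by_category(counts, categories, default="miscellaneous"):
    """Collapse collocates into analyst-assigned semantic categories.
    Each collocate belongs to exactly one category; anything untagged
    falls into the default category."""
    grouped = {}
    for word, freq in counts.items():
        label = categories.get(word, default)
        grouped.setdefault(label, Counter())[word] = freq
    return grouped


if __name__ == "__main__":
    # Invented sentences and a hypothetical hand-made tag set.
    sample = "The sorceress used magic. A sorceress advises the king."
    tags = {"magic": "abilities", "advises": "political role"}
    print(group_by_category(collocates(sample, "sorceress", span=2), tags))
```

Keeping the category mapping as a separate, explicit dictionary makes the interpretive step visible and easy to revise, which fits the treatment of the grouping as a starting point and not a finding.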

Initial keywords
In Table 2 below, I have listed keywords which fall into the categories discussed in the previous subsection.
There is clearly a greater prominence given to male characters within the keywords of The Witcher series. This prominence occurs both in terms of types which denote a male social actor (i.e., there are more names and functionalised terms used to refer to men) and tokens for such terms (i.e., combined, all the terms that denote men occur more frequently than terms that denote women).
Given the limitations of space, I will zoom in to examine the representation of sorceress(es) in a subsequent section of analysis, primarily because it is a term which denotes women's professions and is a term associated with features of both fantasy RPGs and the fantasy genre more broadly. First, however, I will discuss how different male and female characters are represented and examine similarities/differences within the professions of such characters.

Names
An examination of the different professions which male and female characters held revealed that there were differences between male/female professions and intra-group similarities. One such area where there were overlaps was in what profession male/female characters undertook. These professions, as they relate to named characters, are presented below in Table 3.
The kind of professional roles of the different male/female characters who players are likely to interact with demonstrates what is seen as acceptable within the game world. Named male characters are typically either kings or fighters, while female characters are typically sorceresses. One exception to this is Saskia, who is also the leader of a rebel army or a queen, depending on the choices the player makes, which again speaks to the need to consider the variety of playthroughs that players may experience. Nevertheless, the professions demonstrate a complex relationship here between physicality and gender: male characters are typically those who enter hand-to-hand combat more than female characters, who typically give political advice. This reliance on physical strength for fighting resonates with the idea of physical masculinity (see Connell, 2005), in which men achieve success through physical means (this is also discussed in more detail in a subsequent section).
There are two female characters who take up potentially physical roles: Ciri and Saskia. Ciri is Geralt's ward (as well as a sorceress and a princess), while Saskia leads an army. When examining a random sample of 100 concordance lines for the search term Ciri, it was revealed that there was little language about Ciri fighting, and the majority of concordance lines contained verbs for tracking her, such as finding or learned (where she had been). Interestingly, although there are points in the game where she is visually represented as fighting, these were generally absent from the concordance lines. This could be because Ciri doesn't talk about what has happened in that scene, or because the language within the narrative at that point does not explain what is going on and instead relies on players visually watching and experiencing the action. For example, there are scenarios where the player sees Ciri fighting wolves, but as she is alone in this scene, there is no dialogue, and a monologue might seem unusual in that situation. In turn, this demonstrates the conflicting ways mediation can contribute towards representations: while visual representations can be included, they may not necessarily be spoken about, though they appear to be spoken about more frequently for male characters than female characters. This has the potential to reiterate what male characters have done in terms of demonstrating physical prowess but may leave what female characters have done as forgettable, ambiguous, or unmentioned. This mediation can thus demonstrate a form of repeated discrimination against female characters. For Saskia, there were mixed representations. These mixed representations of Saskia could occur for a variety of reasons. One is likely to be that Saskia is represented as a complex character and encompasses a range of identities.
However, certain elements in her characterisation are likely to also be locked to particular narratives that rely on particular player choices. This therefore demonstrates why corpus approaches can be helpful in understanding the multiple ways characters are constructed: corpus approaches allow researchers to cut across narrative choices and focus more specifically on such ways of characterisation. For example, some concordance lines discussed Saskia being a formidable fighter, such as:
Saskia - the wench who killed a dragon! Facing one another in a chivalrous duel shall be Henselt of Ard Carraigh, King of Kaedwen, (The Witcher 2)
Other concordance lines which discuss fighting depict people fighting for Saskia, such as:
The fact that people are fighting for Saskia is not surprising, given that she is the leader of a rebel army. However, concordance lines showed that there were multiple characters interrogating people about why they were fighting for her 6 (which did not occur for male rulers). Indeed, a search for the phrase fighting for revealed that it is only used to talk about: 1) abstract concepts (fighting for one's life; fighting for freedom), 2) children (a father fighting for his children), and 3) Saskia as a ruler. Such interrogatives (as demonstrated in the above examples) appear to question Saskia's ability as a ruler, something to which male rulers are not subjected.
The final way Saskia is represented is through needing to be saved. There were a number of concordance lines which discussed the player needing to help Saskia in some way to save her life. For example:
a crocodile's tears and a sorceress' smile as ingredients of the antidote for Saskia. I believe the witcher would have had an easier time obtaining either of those. (The Witcher 2)
Such representations are problematic because they end up relying on the classic "damsel in distress" trope (see Sarkeesian, 2014; but see also Stang & Trammell, 2020 for different gendered tropes). Even though Saskia is represented as a perfectly capable fighter in a handful of visual scenes (and indeed, there are instances where she resists the "damsel in distress" trope), there are still a number of instances where she is rendered incapacitated and needs saving by the player, who embodies a male avatar. There are also a number of instances where Saskia is represented as capable at a visual level, which may not necessarily be captured by this linguistic analysis, demonstrating again the disconnect between communicative modes. However, implicit in the above concordance line is the way sorceresses are represented: cold and compared to crocodiles. While this is true of certain individual sorceresses (see Heritage, 2021a), it is worth looking at how this group is represented in more detail, as well as how they contrast with witchers, especially because players act as witchers.
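The concordance sampling used for Ciri and Saskia in this section can be illustrated with a minimal sketch. The software used for the analysis is not named in this section, so the code below is only an assumption about how such a sample could be drawn: the function names (`concordance`, `sample_lines`), the context width, and the whitespace tokenisation are all hypothetical choices made for illustration.

```python
import random


def concordance(tokens, term, width=5):
    """Return (left context, node, right context) for each hit of `term`.

    Matching is case-insensitive; `width` is the number of tokens kept on
    each side of the node (an illustrative setting, not the paper's).
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == term.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def sample_lines(lines, k=100, seed=0):
    """Draw a reproducible random sample of up to k concordance lines."""
    if len(lines) <= k:
        return list(lines)
    return random.Random(seed).sample(lines, k)


# Invented toy text, whitespace-tokenised (an assumption for illustration).
toy_tokens = "We must find Ciri before the Hunt finds Ciri".split()
for left, node, right in sample_lines(concordance(toy_tokens, "Ciri", width=2)):
    print(f"{left:>12} [{node}] {right}")
```

Fixing the seed keeps the sample reproducible, so that another analyst reading the same 100 lines would be coding the same data.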

Zooming in on the words sorceress(es)
Sorceress and sorceresses were the two lexical tokens denoting functionalised female social actors within the top 150 keywords. In addition, as Table 3 demonstrates, most nominated (i.e., named) female characters were also sorceresses. Thus, it is worth exploring how this term is used in more detail.
Sorceresses are a common feature in fantasy literature and RPGs: they are women who are magically gifted, and often very powerful. Within the context of The Witcher, they are members of a secret society called the Lodge of Sorceresses, dedicated to the advancement of magic regardless of political affiliations. This is important contextual information for the player, as they are likely to encounter magically powerful women from a range of political identities. While players may 'mod' their games (i.e., download additional features from third parties) to change their avatar's skin to that of a sorceress, in the official release of the game they only ever encounter these women as NPCs.
I started the exploration of how these women were represented by examining the collocates of sorceress and sorceresses. The top 40 collocates which occurred ≥ 5 times (as measured by MI score 7) have been grouped into semantic and/or grammatical categories in Table 3 below; the frequencies at which the collocations occur are included within the table. As discussed, these categories were established after reading through the collocates a number of times and attempting to identify themes. Although other researchers may have identified different thematic links, the themes presented here are intended to serve as a starting point for more in-depth analysis.
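For readers unfamiliar with the measure, the collocate ranking described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used: it applies one common formulation of Mutual Information, MI = log2(O / E) with E = f(node) × f(collocate) × window size / N, and the ±5 token window, the function name `mi_collocates`, and the toy data are all hypothetical. Corpus tools differ in how they normalise for window size, so the scores produced here would not necessarily match those of any particular package.

```python
import math
from collections import Counter


def mi_collocates(tokens, node_forms, span=5, min_freq=5):
    """Rank collocates of the node word(s) by Mutual Information.

    tokens     : list of word tokens for the whole corpus
    node_forms : set of node word forms, e.g. {"sorceress", "sorceresses"}
    span       : tokens counted on each side of the node (assumed value)
    min_freq   : minimum co-occurrence frequency (the paper uses >= 5)
    Returns a list of (collocate, co-occurrence frequency, MI score).
    """
    n = len(tokens)
    freq = Counter(tokens)
    node_freq = sum(freq[form] for form in node_forms)

    # Observed co-occurrences within the window around each node hit.
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok in node_forms:
            for j in range(max(0, i - span), min(n, i + span + 1)):
                if j != i:
                    observed[tokens[j]] += 1

    results = []
    for word, o in observed.items():
        if o < min_freq:
            continue
        # Expected co-occurrences if the two words were independent.
        expected = node_freq * freq[word] * (2 * span) / n
        results.append((word, o, math.log2(o / expected)))

    results.sort(key=lambda r: r[2], reverse=True)
    return results


# Invented toy data; thresholds lowered so the tiny example returns rows.
toy = "the sorceress knew where the flowers grew so the witcher followed".split()
for word, o, mi in mi_collocates(toy, {"sorceress"}, span=1, min_freq=1):
    print(word, o, round(mi, 2))
```

Because MI tends to favour rare words, a minimum frequency threshold such as the ≥ 5 cut-off reported above is usually applied alongside it.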
The collocates appear to fall into a number of different categories. There is a lot to say about each individual category, such as how the women who are referred to as a sorceress differ from the women who are referred to as part of a collection of sorceresses, or how there is a focus on the aesthetic qualities of women (e.g., beautiful). However, in the following sections, due to limitations of space, I focus on words relating to knowledge and on male names/professions. I focus on the former because words related to knowledge are somewhat unexpected: names of sorceresses might be assumed to collocate with the search terms, but knowledge is an attribute ascribed to them. I focus on the male names/professions because these might demonstrate gendered relationships.
The collocates in Table 4 suggest that when sorceresses is used as a collective plural noun, it collocates with words which denote the group's level of education 8 or general knowledge. 9 For the singular noun sorceress, there was only one collocate ('knew') which denoted knowledge.
The concordance lines which contain these collocates support the idea that a prerequisite of being a sorceress is to be both learned and knowledgeable:
the city of Aretuza, situated on a little islet of Thanedd off the coasts of Temeria, was famous for its School of Sorceresses. Becoming one of its students was considered a great honor (The Witcher 3)
Sorceresses, with all their learned books and schools of magic (The Witcher 3)
This kind of academic knowledge ascribed to the collective group of sorceresses is contrasted with the intelligence ascribed to singular sorceresses:
The sorceress knew where to find the flowers, so the witcher followed. (The Witcher 2)
Four out of the six occurrences where 'knew' collocated with sorceress related to the sorceress knowing something which would go on to help the player complete a quest (such as where to find an item). By contrast, all occurrences where 'knew' and sorceresses collocated discussed the sorceresses' level of education in a more general sense (such as knowing about politics). Thus, there appears to be a difference between how sorceresses are represented as a group and how the individual sorceresses are represented.
From a gender-based perspective, it could be suggested that the authors of the game script construct the representation of sorceresses in different ways depending on whether they are referred to with the singular form or the plural form. The plural form sorceresses positions educated women above non-educated women: such women are able to gain power and prominence through study and education, which appears to be a positive representation of women. This representation is also important because it contrasts with what is expected of men who enact hegemonic masculinity. One part of hegemonic masculinity is associated with physical prowess, while these women seem to be hailed for their cognitive abilities. Thus, what is expected of women (mental aptitude) contrasts with what is expected of men (physical prowess). While it is possibly a positive representation to discuss women as mentally strong, this contrast could in turn further sustain ideals of hegemonic masculinity by suggesting that women do not have the same physical strength as men, because they are represented as either not possessing it or, if they are physically strong, this is not mentioned (Schippers, 2007).
One aspect of these categories is the use of male characters' names which collocate with both sorceress and sorceresses. One reason for this appears to be that sorceresses are political advisors to kings. For example:
How is it that King Radovid's court sorceress and advisor is supporting a rebellion in Aedirn? (The Witcher 2)
One commonality running through these collocates is that sorceresses work for a king, but not vice versa (kings do not work for sorceresses). Potentially, this could suggest a kind of glass-ceiling effect for sorceresses, insofar as they are able to become 'powerful' but are not seen as political leaders; rather, they must still work for or below someone else. Indeed, this is also the case in Sapkowski's original books, in which sorceresses are trained to become political advisers to kings. This representation could thus also be explained in terms of a feature of the books that the game writers wanted to retain. However, needing to have access to this contextual information speaks to a broader issue in the synergies between corpus linguistics and critical discourse studies. Some scholars, such as Sinclair (2004), would argue that researchers should treat the corpus as a 'black box', without any knowledge of what is in it or the kind of data surrounding it. However, Partington et al. (2013) argue for the opposite: that not only should an analyst be familiar with the data, but they should also look at surrounding texts that might change such representations. In this case, I would suggest that knowing how sorceresses are represented within the books helps to understand such choices on the videogame writers' part, and I would therefore urge additional research into videogames which have been inspired by books to follow Partington et al.'s advice.
This advice of not approaching the text from a 'black box' perspective can also be useful when considering the multimodal forms of representation in the game: player experience and kinaesthetic engagement with the text will provide a range of additional contextual factors which can shape the analysis. In essence, while corpus linguistic methods can reveal patterns players might not realise they are experiencing, player experience and broader contextual data cannot be wholly removed from a corpus-based analysis of videogames like The Witcher.

Conclusions and directions for future research
This paper has demonstrated some of the ways that female characters are represented within The Witcher videogame series. I have focused in particular on the names and professional roles of men and women, especially those with which the player is most likely to interact. This paper has shown that there are differences in the frequencies at which gendered characters occur and in the professions they have. I have also demonstrated that there may be a form of digital discrimination within the videogame series: in this digital medium, women's leadership is questioned while men's is not, and women are subjected to a glass-ceiling effect. Such representations can go on to inform how we conceptualise offline professions and gender roles more broadly.
Returning to the work of Goorimoorthee et al. (2019), who took a "typical playthrough" approach to gathering the data from a videogame, one must question how many of these repeated patterns could be seen in such a sample. As I demonstrated with the discussion of Saskia, depending on the player's choice, she could either be a queen or an army commander, and this could have serious implications for the interpretation of gendered professions. By taking a corpus linguistic approach, I have been able to look at the representation of gendered professions across all potential playthroughs. While this is not to invalidate or completely dismiss the work of scholars such as Goorimoorthee et al., I want to emphasise the importance of choosing appropriate methodologies for the scope and aims of research. While the application of a "typical playthrough" was probably more suitable for their research, it would not necessarily have been useful in answering questions about representations across playthroughs and at a broader level. Indeed, there are possible synergies between such an approach and corpus linguistic methodologies: comparing how identities are constructed in a player's interpretation against how they are represented in the series as a whole (i.e., what a player might have missed) could bear fruitful results.
The analysis presented here is also limited in some ways: while I have taken an approach typically associated with "corpus-based" methodologies (see Baker, 2006), I have not necessarily applied other discursive analytical frameworks (e.g., transitivity, modality analysis, appraisal analysis) to the data. Future research might want to examine the way different gendered professional roles are represented through the lens of such frameworks. For example, it could be the case that sorceresses are seen as more agentive than witchers. Such synergies could provide interesting new takes on this dataset.
It is also worth noting that this paper has not explored the demographics of players, or the demographics of the writing team. While this data has not been made available, it could also be the case that these identities influenced how the female characters were represented. More work could be done to explore whether these features, and features from across the genre relating to the representation of women, are the product of the writing team or the audience.
In this paper, I have mostly focused on the language used to represent gender within a videogame series. While I have argued throughout that this can be a complementary method to the analysis of gender in other communicative modes (e.g., visual analysis), I have not fully explored those modes here. This is partly due to the lack of space, and partly due to the need to demonstrate some of the fruitful possibilities that corpus methods can afford. In other words, while a number of previous papers have already explored the representation of gender at a visual level, this paper intended to demonstrate other communicative modes which could be explored, and it is hoped that future researchers find ways to weave the analyses together in a complementary way. While some work has used multimodal corpora (e.g., Balirano, 2013; Weninger & Li, 2022), this has yet to be fully applied to videogame data, but it may be a line for future research.
Finally, it should be remembered that the analysis presented here is a singular case study from one series of videogames, focusing on a small number of gendered names and an in-game profession. Future research could explore a greater range of videogames, professions, and so on to make the findings more generalisable.
The Witcher series is only set to grow in popularity. The release of the Netflix adaptation is only going to help increase game sales, and so it is likely to continue to be a useful source of data for analysis. However, as the series gains more traction, we must continue to be critical towards the ideologies presented in such data and their incremental effects.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.