A Diachronic Cross-Platforms Analysis of Violent Extremist Language in the Incel Online Ecosystem

ABSTRACT The emergence and growth of incel subculture online has triggered a considerable body of research to date, most of which analyzes its worldview or maps its position and connections within the broader manosphere. While this research has considerably enhanced our understanding of the incel phenomenon, it tends to offer a somewhat static, one-dimensional portrayal of what is, like all online subcultures and communities, a highly dynamic and multi-layered environment. Consequently, we lack sufficiently nuanced answers to what is arguably a critical question for law enforcement and academics alike: is this a violent extremist ideology? Using a uniquely extensive corpus covering a range of online spaces constitutive of the incelosphere and spanning several years, we analyze the evolution of incel language across both time and platforms. Specifically, we test whether this language has grown more extreme over time as online spaces shut down and others emerged. Our findings demonstrate that, while levels of violent extremist language do vary across the incelosphere, they have steadily increased in the main online spaces over the past six years. Further, we demonstrate that, while activity on these online spaces is responsive to offline events, the impact of these events on violent extremist ideation is not uniform.


Introduction
In recent years, acts of violence inspired by "incel" ideas, either directly or partially as part of "Mixed, Unclear, Unstable" (MUU) ideologies, have become regular occurrences in the United States, Canada, and more recently the UK, fuelling a growing concern and uncertainty among Western security and intelligence services over this online subculture and its offline impact. When prosecuting acts such as Minassian's van rampage in Canada or Friel's preparation of an attack in Scotland, 1 law enforcement often struggled to identify the appropriate legal categories. Similarly, academic debates have questioned the exact nature of the phenomenon, crucially whether there is such a thing as an incel "ideology" and whether it really constitutes a form of extremism. 2 Far from being mere terminological controversies, these debates have critical consequences for the evolution of terrorism legislation and the attitude of law enforcement vis-à-vis incel communities. In the UK, for example, the 2019 Report from the Independent Reviewer of Terrorism Legislation 3 directly addressed this issue, and in the same year Canada decisively listed Inceldom as a violent extremist ideology. Academics have weighed in, intensifying their scrutiny of the "incelosphere" 4 and critically assessing the pertinence of the terrorism label to characterize this form of misogynistic violence. 5 At the heart of these debates lies the question of exactly how concerning discussions found in the incelosphere are, from a violent extremism perspective. Most scholars point to the extreme level of misogyny in these discussions and their regular endorsement of violence against women, 6 even suggesting that incel spaces score similar levels of "toxicity of discussion" as far-right platforms. 7 Others nuance this analysis by showing that most incel content exhibits typical anxieties of young men transitioning to adulthood, 8 and that most incels reject violence and struggle with mental health issues.
9 While providing important, initial insights into the incel worldview and narratives, most of the existing research has, we argue, been limited by its focus on a single online space and by little consideration for evolution across time. Research into this phenomenon now needs to be complemented by more dynamic approaches, offering both diachronic and cross-platform analyses, in order to construct a more sophisticated appraisal of violent extremist ideation in the incelosphere. Inspired by new ecological/ecosystems approaches to online extremism, 10 the present paper makes an ambitious step in this direction, selecting linguistic markers of violent extremism (narrowly understood here as the endorsement of violence against dehumanized outgroups) and tracking their evolution across time on multiple platforms constituting the incelosphere. This diachronic cross-platform analysis complements Ribeiro and colleagues' 11 recent analysis of the chronological evolution of the structure of the manosphere (including the incelosphere), which paved the way towards dynamic evaluations of misogynistic ecosystems.
We proceed in three main steps. First, we situate our research question within the broader literature on the incel worldview, using the scholarship on extremist language and the evolution of extremist ideologies to raise the specific hypotheses guiding our research. Second, we present our data, a unique corpus of 11,717,516 posts (331,708,990 words and associated metadata) collected across seven different types of platforms (thirty-two separate entities) used by incels, and explain our computer-assisted content analysis method. We built a custom, incel-specific dictionary capturing the intensity of violent extremism in incel text and used it to conduct two types of analyses (longitudinal time-series and semantic correspondence), tracking the evolution of this type of language across time and platforms. Third, we present and discuss our results, which indicate a clear, steady overall increase of violent extremist language in the main lineage of incel online spaces across six years. Our findings also demonstrate the heterogeneity of the incelosphere, which is constituted by a range of platforms whose respective toxicity can vary widely. A conclusion summarizes our main findings and acknowledges shortcomings to pave the way for further research.
This research contributes to existing scholarship in three important ways. Empirically, this paper provides a rigorous analysis of the evolution of violent extremism in the incelosphere across both time and platforms, building a solid basis for ongoing evaluations and debates on the extremist or ideological character of the incel phenomenon. Theoretically, it advances our understanding of how extremist ideologies evolve and change, 12 and enriches the diverse research agenda on the role of the internet in fostering extremism and terrorism. 13 Methodologically, we offer the research community not only a uniquely large corpus of incel content but also a new dictionary of incel violent language.

Towards a dynamic evaluation of incel violent extremism
Research on incel online spaces has proliferated over the past few years, especially in the wake of Minassian's attack in April 2018, with efforts mostly concentrated on deciphering the jargon-heavy lexicon and underlying worldview. Building on Ging's 14 pioneering contextualization of incel communities as a particularly problematic corner of the broader ideational landscape of the "manosphere," both Jaki and colleagues 15 and Baele, Brace, and Coan 16 have used a variety of computational text analysis methods to raise concerns about the toxicity of the incel worldview. The former showed that a "considerable proportion of the discourse classifies as hate speech, with the forum brimming with misogynist, anti-feminist, and homophobic utterances," 17 while the latter concluded that "the Incel worldview as expressed in online spaces like Incels.me is an extremist one in terms of its logics of categorization and explanation." 18 For Baele and colleagues, incel discourse demonstrates typical markers of extremist language: an essentialist categorization of society into sharply delineated ingroups and outgroups where the latter are linguistically dehumanized, and a conspiratorial narrative presenting the ingroup as the victim of an all-powerful structure of oppression.
19 Subsequently, Chang 20 unpacked how the "femoid" label used by incels works to dehumanize women, and O'Malley, Holt, and Holt 21 concluded that the incel worldview is organised around five key normative orders: women being naturally evil and cruel, men being oppressed, traditional masculinity being legitimized and desirable, society being a sexual marketplace, and finally violence being excused as a result of the four others. This worldview and its justification for violence are being cemented in online discussions through a range of linguistic and visual practices; for example, Witt 22 has documented how a glorified, "sanctified" figure of Elliot Rodger, who killed six people in California in 2014, became ubiquitous in incel forums, serving as an alternative "reality construction that allows for the justification, enactment, and celebration of extreme violence." A qualitative analysis of more than 3,000 forum posts discussing the 2018 Toronto van attack similarly revealed "overwhelming support among self-proclaimed incels for the attack and violence more generally." 23 Scholars have situated these ideas within the broader ideational landscape of the "manosphere" and, while other parts of the manosphere might quantitatively have greater offline impact, it has been suggested that the incelosphere constitutes its most extreme or toxic subculture in terms of content.
24 Many incel terms and concepts (e.g., lookism, feminism as a corruption of the natural order, society as a sexual marketplace, "pillings" metaphors (blue, red, and black pills), and dehumanizing conceptions of women as only driven by lust) stem from pre-existing communities of a more mainstream "networked misogyny," 25 such as Men's Rights Activists (MRA), Men Going Their Own Way (MGTOW) or Pick Up Artists (PUA). These groups are similarly concerned with the alleged feminization of the world, the concept of masculinity in crisis, and the perceived erosion of binary and hierarchical gender views, which ultimately creates a sense of male victimization and hatred towards women and liberals. 26 Social media platforms have also created novel opportunities for these misogynistic views to evolve and be disseminated more widely than ever before, resulting in a large online misogynistic ecosystem that some have referred to as a "toxic technoculture." 27 As a result, the manosphere has come to house increasingly violent and hostile views towards women and other select groups, with subcultures such as incels and MGTOW representing the most extreme elements. 28 Moreover, recent work has pointed at both direct overlaps and ideological cross-pollination between the incel (but also the broader manosphere) and far-right online ecosystems, 29 with some arguing that far-right entrepreneurs actively use the insecurities of incels and their anti-female supremacist views to lure them into forms of white supremacy.
30 While this research has greatly enhanced our understanding of the incel worldview and its connections to ideas of the surrounding manosphere and adjacent far-right ecosystem, we suggest that it fails to capture the dynamism and heterogeneity of different incel formations, and therefore points to a simplistic understanding of extremism and associated acts of violence. In reality, incel subculture comprises a range of intersecting communities hosted by diverse platforms, which have their own unique conventions and affordances (sub-Reddits, "chan" image-boards, internet forums, etc.) and logic of evolution over time. Now that a good baseline understanding of the subculture has been established, research needs to adopt a more dynamic, ecological approach to account for potential chronological evolutions and cross-platform differences. Crucially, Ribeiro and colleagues 31 have provided a first diachronic analysis of the broader manosphere, documenting that "milder and older communities, such as Pick Up Artists and Men's Rights Activists, are giving way to more extremist ones like Incels and Men Going Their Own Way," and suggesting that "these newer communities are more toxic and misogynistic than the older ones." The present paper continues in this vein by zooming in on the incelosphere and assessing whether its different online spaces display similar or different levels of violent extremism, and whether these levels have evolved across time. This approach also dovetails with some of the survey research that has been conducted among self-identified incels, which shows that extremist views and the endorsement of violence are not uniformly held.
32 A number of theoretical frameworks relating to the evolution of extremist views, spaces, and communities across time allow us to raise specific hypotheses on the evolution of extremism in the incelosphere. At the most general level, work carried out in the lineage of Myers' paradigm of polarization through group deliberation 33 has repeatedly shown that discussion between politically like-minded people gradually leads, under certain circumstances, to enhanced support for radical views and greater endorsement of violence. 34 These processes apply to the online world: politically coherent online spaces tend to become echo-chambers that foster polarization, 35 and some of these echo-chambers radicalize their members even if they have no offline contact with extremists. 36 From this literature, we can raise our first, main hypothesis:
H1: Discussions hosted by the incelosphere have displayed increasing levels of violent extremism over time.
However, other theoretical frameworks encourage us to nuance this general hypothesis. First, detailed genealogies of extremist movements, including terrorist ones, document the commonality of "splintering" processes, whereby extremist ideologies tend to fragment over time into a range of subideologies supported by rival factions, with one splintering "avenue," usually a minority one, going towards increasing radicalism. 37 Indeed, as the aforementioned study by Ribeiro and colleagues 38 demonstrates, the incel subculture is already a more extreme splinter of the manosphere. Second, studies of other extremist online ecosystems revealed that established extremist spaces can induce a chain-reaction of increasingly radical offshoots, with the main platform closure sometimes acting as a trigger. Baele, Brace, and Coan's 39 analysis of the Chan image-boards, for example, showed that the proliferation of boards on the back of 4chan ended up producing a "three-tier" hierarchy of decreasing popularity but increasing extremism. Third, significant external events are known to encourage/discourage extremism. Of special relevance here are acts of violence inspired by the ideology, which can either inspire or disgust members, thereby increasing extremism and/or triggering the creation of sub-communities with a clear stance against violence. Baele, Brace, and Coan, for example, documented a "Tarrant effect" on 8chan, where the Christchurch shooting was applauded and inspired further violence. 40 Similarly, Witt's 41 abovementioned study offered qualitative evidence that Elliot Rodger is celebrated as a "saint" in incel forums in ways that imply an endorsement of violence. Also relevant for the incel case is the impact of the COVID-19 pandemic and lockdown(s); Davies, Wu, and Frank have already shown that one incel forum experienced a "sustained increase in activity" during the pandemic, which constitutes a fertile situation for more extremist views to develop.
42 In sum, these three different strands of the literature suggest, in different yet convergent ways, that extremist (online) ideologies do not evolve in a uniform, linear way but rather through a more uneven process involving splintering into both more and less radical variants. We can therefore offer the following, second hypothesis with its two sub-hypotheses:
H2: The general evolution of incel online discussions towards more violent extremism has not been uniform.
H2a: Different platforms host more/less violent extremist content, representing lineages of ideological splintering of the incel community.
H2b: External events (acts of incel-inspired violence and the COVID-19 lockdown) have triggered increases/decreases in violent extremist content.

Data and methods
To evaluate whether the incelosphere's violent extremism increased in such a way over time and platforms, we harness computational methods to analyze a uniquely large corpus of online incel content and accompanying metadata.

Data
After extensive exploration of incel online spaces using classic snowballing and outlinks techniques, a series of custom-built web scrapers were developed in Python to collect the content of a range of pertinent spaces. These scrapers utilised several common packages such as Scrapy (https://scrapy.org/) and Requests (https://pypi.org/project/requests/). Depending on the platform being scraped, various APIs were used to aid data extraction where appropriate; for example, the Telegram scraper we developed used the Telethon API (https://docs.telethon.dev/en/stable/). Data collection was done first-hand by the researchers, following a double process of ethics clearance, with the only exception of the sub-Reddits /r/Incel, /r/Incels, /r/Braincels, and /r/IncelsWithoutHate. 43 These sub-Reddits were shut down for violating Reddit's terms of service in relation to hate speech and bullying long before this project began, so their content was extracted from the Pushshift.io open-source data archiving site, which has been collecting Reddit data and making it freely available to researchers since 2015. 44 Table 1 below provides key descriptive statistics for the assembled corpus. This dataset, totalling no less than 11,717,516 posts and 331,708,990 words (with associated metadata), is made available to the scholarly community for further analysis. 45 Figure 1 below plots the activity on the main types of platforms across time, with daily posts for each online space aggregated to the level of platform type; for example, the blue line depicts the total number of posts made to all forums listed in the above dataset on any given day. Figure 2 breaks down the data underpinning Figure 1 by displaying the total posts per individual online space on any given day for the various platforms (forums, sub-Reddits, and chan boards, respectively).
46 Both of these graphs also feature a light green shaded area depicting the time of the first COVID-19 lockdowns in the U.S., Canada, and Europe, as well as vertical green lines that indicate when prominent acts of incel-related violence occurred in these countries; details of these incidents, and the initials used to label them in the graphs, are contained in Table 2. While this is not an exhaustive list of incel-related violent attacks, it provides a useful overview of those that have received considerable media attention and that might have triggered increases/decreases in violent extremism as per H2b.
Four important observations can already be made at this stage, paving the way for the evaluation of our hypotheses in the next section. First, forums currently dominate the incelosphere, with Incels.is occupying a major role as the longstanding anchor of the community since the closure of the sub-Reddit /r/Braincels. 48 With more than four years of continuous text data, this forum provides an excellent source for the analysis of language evolution across time. Figure 2 shows that the less active Incels.net started to increase in activity during 2019 and reached its peak during 2020 at the height of the COVID-19 lockdown(s), but given that this increase started in the summer of 2019, it is hard to establish whether these fluctuations are related to the pandemic or just a natural increase and decrease in site traffic. The recent establishment of another new forum in 2022, Blackpill.club, also allows us to test whether a rival, initially much smaller forum occupies a more extremist niche. Additionally, Figure 2 shows that, although their activity is dwarfed by that of Incels.is, there are a number of thematic forums that have had relatively stable levels of activity that started to increase from mid-2021 into 2022. This includes Wizchan.org, an incel space dedicated to incels who are over thirty years of age, and Neets.me, whose name refers to the acronym NEET ("Not in Education, Employment, or Training"). While Neets.me does not present as an incel space, shared concepts and outlinks between Neets.me and other incel spaces indicate a significantly large shared userbase. The forum Looksmaxxing.org also appears to share a similar userbase and also started to grow in popularity during 2021.
49 Second, although Reddit has hosted important incel spaces, particularly during the initial formative years of the ideology, the platform's implementation of terms of use on hate speech and bullying and its quarantine policy 50 have made it a more unstable place for incel communities. As a result, the Reddit region of the incelosphere has produced a series of different, shorter-lived communities with overlapping yet not identical membership, a situation prone to the appearance of new ideological positionings, especially on the issue of violence. Looking at Figures 1 and 2, we see that the first major online space dedicated to incels (in our dataset) was the sub-Reddit /r/Incels, which quickly gained a high number of daily posts. The closure of this original sub-Reddit resulted in the creation of Incels.is and /r/Braincels, and the spikes and slopes in daily posts exhibited by these two platforms in Figure 2 are indicative of some inter-platform competition. The other large incel sub-Reddit was /r/IncelsWithoutHate, which claimed to be an online space for those who struggled with finding sexual relationships but did not agree with the language and themes of other incel online spaces. While this was also shut down for violating Reddit's terms of service, it does also appear to show some interaction with Incels.is.
Third, the recent rise of incel activity on the Chan imageboards provides a good opportunity to evaluate ideological splintering. Indeed, the longstanding /R9K board of 4chan, which has played a historic role in the development of the incelosphere, is now accompanied, and overtaken, by the /leftcel board, which claims to offer a different, radical left-leaning ideological take on incel issues.
Finally, fourth, the above figures seem to indicate an impact of external events, namely the initial COVID-19 lockdowns and the crimes of Minassian and Hernandez, on online engagement dynamics. On the one hand, a clear spike of activity corresponding to the COVID-19 pandemic lockdown can be seen, as already noticed by Davies, Wu, and Frank's abovementioned work; whether this event has enhanced violent extremism in discussions therefore ought to be assessed. 51 On the other hand, some acts of incel-inspired violence correspond with spikes/drops in activity on certain online spaces, potentially indicating discussions on their nature and morality as well as membership increase/dropout (and platform shutdown) that could have led to more/less violent extremist language. In particular, Minassian's and Hernandez' crimes correspond to sharp spikes or drops in a couple of online spaces, which warrants specific investigation for H2b.

Methods
Given the size of the corpus, we leverage computational methods to assess the evolution of violent extremism in the language of the incelosphere; we believe that such a zoomed-out, large-N study aptly complements the qualitative approach taken so far by the majority of the scholarship on incel spaces, hence participating in the development of a multidimensional understanding of the phenomenon.
Our method rests on the now well-established idea that violent extremists' language possesses specific markers both in terms of content 52 and non-content features, 53 as demonstrated in a wide range of empirical cases. Specifically, we adopt a dictionary approach that locates these markers, using a custom-built, incel-specific lexicon to measure the salience of violent extremism in the linguistic content of the incelosphere. A range of alternative computational text analysis methods are available and potentially relevant to the present inquiry, such as those focusing on the grammatical structure of texts 54 or measuring the specific semantic markers of aggressive language. 55 We believe, however, that at this stage a dictionary-based method is the best first step, which can be further complemented by these other approaches at a later stage, because the dictionary can be tailored to the incel lingo, which increases the likelihood of it providing a solid first set of findings. Following Cohen's argument that dictionaries ought to be adapted to the specific lexical fields under scrutiny, we established a composite dictionary of incel-specific linguistic markers of violent extremism, and measured fluctuations of the resulting ratio (number of dictionary words/total number of words) across time and platforms.
56 Three published experts of incel language independently selected, from the 5,000 most frequent words of the entire corpus, three types of words; in a second step, disagreements on initial categorizations were settled by consensus. First, they included verbs unambiguously expressing acts of violence (e.g., "stab," "kill," "rape"). Beyond the mere fact that these verbs denote discussions of violent actions, evidence indicates that they are more common in violent groups' texts than in non- or less-violent extremist ones. Second, and in the same vein, we included nouns that label weapons (e.g., "gun," "knife," "acid"), which are obvious proxies for discussions of violence. Third, we included nouns that dehumanize the outgroups (e.g., "femoid"/"foid," "roasties," "curry"). Research indeed insists that dehumanizing outgroup labels, which are "the most extreme form of negative out-group identity construction" 57 and usually constitute the cornerstone of radical essentialist "discursive strategies," 58 are a strong indicator for the endorsement of, encouragement for, and engagement in violence, as they imply extremely negative evaluations, foster moral disengagement, and encourage depersonalization. 59 Roozen and Shulman's 60 study of the language of the extremist Hutu radio RTLM showed, for instance, that the use of dehumanizing labels increased in the build-up to the Rwandan genocide and then even further as the killings intensified, and Miller-Idriss' work demonstrated the role of dehumanizing labels in far-right narratives claiming that outgroups pose an existential threat to the ingroup. 61 Chang's study of the "femoid" label on /r/Braincels already showed how important this particular dehumanizing label is to the incel worldview.
62 Overall, our composite "Incel Violent Extremism Dictionary" (IVED), which coalesced these three types of words, comprised 172 words (e.g., "landwhale," "kill") including some of the most recognizable incel terms (e.g., "femoid"). The full dictionary is made available to the scholarly community in Appendix 1.
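The ratio described above (number of dictionary words over total number of words) can be illustrated with a short sketch. This is not the authors' pipeline: the three-term `SAMPLE_TERMS` set is a hypothetical stand-in for the 172-word IVED in Appendix 1 (it reuses terms the text itself cites as examples), and the regular-expression tokenizer is an assumption made here for illustration.

```python
import re

# Hypothetical stand-in for the full dictionary: three terms cited as examples in the text.
SAMPLE_TERMS = {"kill", "knife", "femoid"}

# Assumed tokenizer: runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def dictionary_ratio(tokens, terms=SAMPLE_TERMS):
    """Return (dictionary hits) / (total tokens); 0.0 for an empty token list."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in terms)
    return hits / len(tokens)
```

Calling `dictionary_ratio(tokenize(text))` on the pooled text of a sub-corpus gives one score per sub-corpus. Exact-match lookup is the simplest design choice; whether inflected forms are matched as well depends on how the dictionary entries are specified, which this sketch does not settle.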
The online spaces discussed above were sliced into monthly sub-corpora, and for each, the IVED score was calculated to allow for an analysis of the evolution of the language captured by the dictionary. In the analysis below, we use the IVED scores to build two types of graphs with associated statistics, each providing a different way to identify and visualize potential diachronic and cross-platform evolutions. First, we conducted a classic longitudinal time-series analysis, plotting the monthly scores of each platform. This enabled the visualization of fluctuations in linguistic markers of extremist violence across time-(online)space as well as the statistical evaluation of potential changes across time (with a particular attention paid to COVID-19 and incel-inspired violence, as per H2b). Second, two different correspondence analyses were carried out in order to explore, in an alternate way, the differences between each platform's corpus. Correspondence analysis is a "multivariate exploratory space reduction technique for categorical data analysis," which allows for the identification of patterns of association and disassociation in complex categorical datasets 63 through the generation of "a low-dimensional projection space with simultaneous placement of both documents and features, making it ideal for explorative analysis in text mining"; in our case, the underpinning categorical dataset is the matrix table including, for each word in the multi-platform corpus, its frequency on each platform.
64 Here, we innovated by including in the graphs both the most frequent incelosphere-related yet non-extremist words (such as "woman," "girl," "man," which serve as an indicator of the typical discussion) and the entire IVED, each coalesced into a single signifier that condenses semantic variety into a unique, easily readable datapoint. We constructed two types of correspondence analyses. First, we built a static correspondence analysis, positioning each platform as a single point according to the relative importance of violent extremist language in its content; this provides a general visualization of potential lexical differences between individual online spaces, without consideration for diachronic change. Second, we constructed a dynamic correspondence analysis representing with x timepoints what we identify as the four most scientifically relevant platforms of the incelosphere (/r/Incels, /r/Braincels, Incels.is, and Incels.net), where x is the number of sub-corpora yielded by a platform when its entire corpus is divided into three-month chunks, in order to display the evolution of these four spaces over time. 65 This not only allows for the identification of potential trajectories of platforms towards more/less violent extremist language, but also displays the relative positions of the four platforms with regard to their violent extremist content at different points in time.
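As a rough illustration of the data preparation behind the dynamic correspondence analysis, the sketch below builds the frequency table that such an analysis would take as input: one row per platform and three-month chunk, with the two word lists each coalesced into a single column. Several details are assumptions made here rather than details given in the text: chunks are counted in calendar months from each platform's first post, words outside both lists keep their own column, and the two small word sets are hypothetical stand-ins for the real lists.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical stand-ins; the real lists are the IVED and the frequent non-extremist terms.
IVED_SAMPLE = {"kill", "knife", "femoid"}
GENERIC_SAMPLE = {"woman", "girl", "man"}


def chunk_index(day, first_day, months_per_chunk=3):
    """Index of the chunk containing `day`, counted in calendar months from `first_day`."""
    months_elapsed = (day.year - first_day.year) * 12 + (day.month - first_day.month)
    return months_elapsed // months_per_chunk


def coalesced_table(posts):
    """Build {(platform, chunk): {column: count}} from (platform, date, tokens) tuples.

    Tokens found in either word list are pooled under one column name
    ("IVED" or "GENERIC"); every other token keeps its own column (assumption).
    """
    posts = list(posts)
    first_day = {}
    for platform, day, _ in posts:
        if platform not in first_day or day < first_day[platform]:
            first_day[platform] = day
    table = defaultdict(Counter)
    for platform, day, tokens in posts:
        row = (platform, chunk_index(day, first_day[platform]))
        for token in tokens:
            if token in IVED_SAMPLE:
                table[row]["IVED"] += 1
            elif token in GENERIC_SAMPLE:
                table[row]["GENERIC"] += 1
            else:
                table[row][token] += 1
    return {row: dict(counts) for row, counts in table.items()}
```

The resulting table can then be passed to any correspondence analysis routine; the decomposition itself is deliberately left out of this sketch.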

Results and discussion
To test our hypotheses, we therefore proceed in three steps. First, we look at the static correspondence analysis encompassing all the digital spaces from our dataset. Second, we focus on the dynamic correspondence analysis zooming in on the four most prominent online spaces from our corpus. Third, we conduct the time-series analysis based on the online spaces' monthly IVED scores.

Static all-spaces correspondence analysis
Figure 3 includes, as datapoints, all online spaces from our dataset, as well as the two dictionaries (IVED, and most frequently occurring non-extremist incel terms); this model accounts for 90.2 percent of the variance in the corpora, which is an excellent fit for such a multi-platform corpus. At this stage, the reader should be reminded that correspondence analysis graphs do not show absolute frequencies (in this case highly occurring words on a platform) but instead depict relative ones. In comparing online spaces in the figure, it is therefore important to understand that the uniqueness of a space or word is represented by how far it is from the graph's origin (point 0,0 on the graph, where the vertical and horizontal lines meet), with data points that are further away from the origin being more differentiated. In contrast, online spaces and words that are closer to the origin are less distinct, with data points that are centred on the origin having (in the figure below) an 87.37 percent chance of not having any distinguishing features; this value is the variance accounted for on the x-axis, which is also why the two dictionaries are on opposing ends of this axis. Comparing a platform datapoint and a term datapoint thus involves evaluating the length of the line between the graph origin and both the term and platform datapoint independently, with longer lines indicating an association between them; the angle between these two lines is then assessed, with smaller angles indicating a strong association, 90-degree angles indicating no association, and angles of or near 180 degrees indicating a negative association. Because correspondence analysis can be prone to misinterpretation, we offer a full explanation on how to interpret our graphs in Appendix 2, with examples from our dataset individually described.
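The reading rule just described (distance from the origin, then the angle between the two lines drawn from the origin) can be made concrete with a small geometric sketch. The function names are hypothetical, and the coordinates passed in would be the two-dimensional positions read off a correspondence analysis plot; nothing here reproduces values from Figure 3.

```python
import math


def distance_from_origin(point):
    """Length of the line from the origin (0, 0) to the datapoint."""
    x, y = point
    return math.hypot(x, y)


def angle_at_origin(point_a, point_b):
    """Angle in degrees between the lines joining the origin to each datapoint.

    Small angles suggest a positive association, angles near 90 degrees
    no association, and angles near 180 degrees a negative association.
    """
    norm = distance_from_origin(point_a) * distance_from_origin(point_b)
    if norm == 0:
        raise ValueError("a datapoint located at the origin has no direction")
    dot = point_a[0] * point_b[0] + point_a[1] * point_b[1]
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

For a platform point and a dictionary point, one would first check that both distances are clearly non-zero and only then interpret the angle, since the direction of a point lying close to the origin carries little information.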
Two main observations stem from Figure 3. First, the various platforms used by incels appear to occupy different lexical spaces. All forums (with the exception of lookstheory.com and Wizchan.org, which was expected)66 are situated close together in the graph, as are the two chans and the subreddits, denoting the fact that incel forums, chans and subreddits are generally different when it comes to their violent extremist content. In particular, incel violent extremist language is more specific to the forums than to the chans and the subreddits; indeed, the very small angle between the IVED dictionary and the forums, centring on the origin, means that the extreme words are very specific to the forums, relative to the chans and subreddits, which have a closer relationship with the non-extreme incel words like "incel," "women," etc. This evidences that forums host a greater proportion of violent words than the other two platform types. To summarize, the incelosphere is not homogeneous in terms of violent extremist language use across its constitutive platforms; as a result, any assessment of violent extremism in the incel online ecosystem should take stock of the differences between relevant online spaces.
Second, differences also exist within subreddits that capture key moments in the evolution of the incel online ecosystem across time; specifically, the differences between /r/Incel, /r/Incels, and /r/Braincels are important.67 As the first two dedicated incel subreddits, /r/Incels and /r/Incel (which had life spans that partially overlapped), were shut down, the /r/Braincels subreddit came into being and quickly became the most notable online incel space before being shut down in turn. Our graph shows that these three subreddits get closer to the graph's origin in temporal order, with /r/Braincels being much closer to the data points for the forums (whose emergence and success was a response to its closure). Put differently, the evolution from /r/Incel and /r/Incels to /r/Braincels is one towards a greater proportion of violent extremist language, paving the way to the even more extreme forums. The more recent evolution of incel activity on Reddit has, however, followed the opposite direction. /r/TheRedPill, arguably the most notorious of the currently active subreddits, is located very close to the point representing the generic dictionary of frequent incel words; this shift back to the left indicates that users actively tone down some of the more extreme conversation to avoid having the board shut down after Reddit placed it into quarantine, and also reflects the reduction in posts caused by being in quarantine. To sum up, while Reddit initially hosted communities that increasingly adopted violent extremist language, the platform's actions (quarantine system, closures) seem to have eventually tamed the discussions in an effective way. The emergence of /r/Braincels and Incels.is at the same time, combined with the aforementioned more extreme nature of conversations on forums in the ecosystem, could be symptomatic of users migrating the more extreme conversations to the latter platform type, as it is much more difficult to shut down a forum than a subreddit.
This first correspondence analysis therefore already provides a very preliminary answer to some of our hypotheses. Specifically, there is already partial evidence for H1 and clearer evidence for H2. Discussions hosted by the incelosphere have displayed increasingly violent extremism over time at the ecosystem level (H1), but this evolution has not been uniform (H2), characterizing only the main lineage (H2a) from /r/Incels and /r/Incel to /r/Braincels and the major forums. Contrary to usual processes of splintering, it is the main lineage that seems to have moved towards greater violent extremism, not small splinters. The next steps consolidate these findings.

Dynamic 4-spaces correspondence analysis
Figure 4 zooms in on the diachronic lexical development of this main lineage, plotting together /r/Incels, /r/Braincels, Incels.is, and Incels.net at three-month intervals (with the platform datapoint labels following a sequential order). The relative specificity of generic incel language and incel violent extremist language to each of these time-stamped online spaces is thus displayed, leaving two evident "trails." First, the dominant, longest-running platform, Incels.is, started off not strongly associated with terms from either dictionary, before moving towards the more extreme one, particularly using misogynistic and racist terms, in near sequential order over time. In other words, extreme incel terms have gradually become more specific to Incels.is compared to the three other online spaces displayed here, denoting an increased prevalence of these words in the discussions it hosted. Second, the first major dedicated incel online space, /r/Incels, initially had more extreme content similar to that seen in the later forums, before toning down: its first time point is closer to the graph origin than its later iterations, meaning that its content was at that time less differentiated between the non-extreme incel language and the violent extremist one. Time points 2, 3, 4, 6, 7, 8 then show a greater distance from the origin towards the left, combined with an increasing angle with typical non-extremist terms such as "incels," "female," "girl." /r/Braincels and Incels.net do not display very clear evolutions towards more or less violent content. /r/Braincels' later timepoints do move closer toward the graph's origin, and in spite of some later timepoints for Incels.net having a closer association with extreme terms such as "foids," the forum does not mark a constant lexical evolution. In sum, our dynamic correspondence analysis of the four main online spaces constitutive of the main incel lineage chiefly evidences the particularity of the ecosystem's premier community, Incels.is, which relative to its predecessors has hosted an increasingly differentiated language moving towards greater relative proportions of violent extremist terms. If we understand this forum to be the main discussion space of the ecosystem, then the graph confirms H1 of increasing violent extremism over time, as words denoting general incel considerations leave more and more room to IVED words denoting dehumanization and violent aggression. Yet again, outside this forum the evolution has been less linear.

All-platforms time-series
Figure 5 plots the salience of the IVED (ratio per one hundred words) across time for each of the online spaces of our dataset. This final step allows us to confirm and further clarify the findings of the correspondence analyses when it comes to H1, H2 and H2a, and offer new evidence on H2b.
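For readers unfamiliar with dictionary-based scoring, the "ratio per one hundred words" can be sketched as follows. This is only an illustration under stated assumptions: the function name `ived_ratio`, the simple lowercase tokenizer, and the placeholder dictionary entries are hypothetical, and the actual tokenization and matching rules used to build the IVED scores may differ.

```python
import re


def ived_ratio(text, dictionary):
    """Return the number of dictionary hits per one hundred tokens.

    `dictionary` is a set of lowercase single-word terms (an assumption;
    multi-word entries or stemming are not handled in this sketch).
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return 100.0 * hits / len(tokens)
```

For example, with the placeholder dictionary `{"alpha"}`, the four-token text `"alpha one two three"` contains one hit, which corresponds to a ratio of 25 per hundred words.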
First, the most important online spaces of the incelosphere (where the main "discussions" are taking place) are indeed marked by increasing levels of violent extremism in language, confirming H1, with the four major online spaces included in our dynamic correspondence analysis having positive slopes (Incels.is = 0.00023; Incels.net = 3.21993; /r/Incels = 6.12044; /r/Braincels = 0.00029). As shown in Figure 6 for Incels.is (and Appendix 3 for the other three platforms), plotting these ratios as a box-and-whisker plot shows with even greater clarity the steady year-on-year increase in the ratio of IVED terms.
Second, Figure 5 further evidences that this increase has not been uniform across incel online spaces, some of which do not display increases of IVED words across time. Particularly noteworthy are the lower scores of subreddits since quarantine policies were enforced and /r/Braincels was closed. Newer subreddits indeed exhibit significantly lower levels of violent extremist language than the forums (Wizchan.org being the only exception) or the two chan image-boards. After November 2019, which corresponds to the end of /r/Braincels, the average IVED score for subreddits (/r/TheRedPill, /r/BlackpillScience, /r/IncelExit, /r/FA30plus, /r/AntiFeminist) has been 0.16, against 0.84 for Incels.is, 0.65 for Incels.net, and 0.52 for Neets.me: forums now host significantly more violent extremist discussions than subreddits.
To further confirm the above findings that the incelosphere, at the ecosystem level, has become increasingly extreme over time, the trend data for the four main platforms (/r/Incels, /r/Braincels, Incels.is, and Incels.net) was extracted from the time series and plotted in Figure 7 as a three-month rolling average. While there is variation with /r/Braincels due to Reddit's policies and with Incels.net due to it being in competition with the larger Incels.is, Figure 7 shows that, when considering the genealogy of these platforms, there is a clear upward trend in the level of extremist discussions within the incelosphere between 2016 and 2022. From a starting point at an IVED score of 0 in January 2016, this graph ends with a score of 1 exactly six years later, a sharp increase denoting that at this stage one word out of a hundred was either a dehumanizing label or a direct depiction of violence.
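The smoothing step mentioned above can be illustrated with a short sketch of a three-month rolling average over monthly scores. The function name `rolling_mean` and the trailing-window convention (each value averages the current month and the two preceding ones) are assumptions made for illustration; whether Figure 7 uses a trailing or a centred window is not stated.

```python
def rolling_mean(values, window=3):
    """Trailing rolling average over a list of monthly scores.

    Positions that do not yet have `window` observations are returned
    as None rather than being averaged over a shorter span.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed
```

Applied to the made-up series `[1, 2, 3, 4, 5]`, the first two positions are undefined and the remaining ones are the means of each three-value window.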
Third, the above seems to indicate an impact of external events on online engagement dynamics, particularly with regard to the initial COVID-19 lockdown measures and the attacks by Minassian and Hernandez. Thus, we conducted a two-fold analysis of the impact of these events on the ratio of IVED terms being used in discussions on two of the four major online spaces that were active during these times: Incels.is and Incels.net. First, Figure 8 depicts both the above IVED ratio score and the median IVED ratio score for six distinct time periods: (1) before Minassian's attack, (2) the month of Minassian's attack and the subsequent 2 months, (3) the months from that point until the first COVID-19 restrictions, (4) the 3 months that these initial restrictions were in place, (5) the month of Hernandez's attack and the subsequent 2 months, and (6) the remaining time-span in our data. Second, to test for structural break points in the ratio of IVED term use, we utilised the Chow test, which assumes a priori a structural break in the data due to a specific and major event (in this case the initial COVID-19 lockdown or either of the two attacks) and determines whether the coefficients of the regression line before the event and of the regression line after the event are equal. If they are not equal, then it concludes that there is a structural break in the data; in other words, the data pattern is different after the event. Table 3 shows the results of three different and unconnected Chow tests, one for each of the three events, in chronological order.
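As an illustration of the logic of the Chow test described above, the following sketch computes the F statistic for a single, pre-specified break point. It assumes a simple linear regression of the monthly score on a time index (two coefficients: intercept and slope); the function names `_ssr` and `chow_f` and the example series are hypothetical, and the regression specification behind Table 3 may differ.

```python
def _ssr(xs, ys):
    """Sum of squared residuals of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def chow_f(ys, break_index):
    """Chow F statistic for a break before position `break_index`.

    Compares one pooled regression against two separate regressions
    (before and after the break), each with k = 2 coefficients.
    """
    k = 2
    xs = list(range(len(ys)))
    n1, n2 = break_index, len(ys) - break_index
    if n1 <= k or n2 <= k:
        raise ValueError("each segment needs more than two observations")
    pooled = _ssr(xs, ys)
    split = (_ssr(xs[:break_index], ys[:break_index])
             + _ssr(xs[break_index:], ys[break_index:]))
    return ((pooled - split) / k) / (split / (n1 + n2 - 2 * k))
```

A large F value indicates that the two separate regression lines fit much better than a single pooled line, i.e., that the coefficients differ before and after the event. Under the null hypothesis of no break, the statistic follows an F distribution with k and n1 + n2 - 2k degrees of freedom, from which a p-value can be obtained (for instance with `scipy.stats.f.sf`, if SciPy is available).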
While the median use of IVED on Incels.net increases dramatically during the COVID-19 lockdown/Hernandez attack and actually decreases following Minassian's attack (Figure 8), the Chow test shows that the only event of these three that led to a statistically significant structural break in the use of IVED terms on Incels.net is Minassian's attack. This apparent contradiction between the descriptive statistics and the Chow test is likely due to the relatively large temporal fluctuations we see in the monthly IVED ratio score in Figure 8 impacting the coefficients of the latter two tests, with their identical statistical test scores being explained by the temporal proximity of the two events. The results for Incels.is offer more insight due to its popularity. We see that, despite there being many more daily posts made to the forum around the time of the first COVID-19 lockdown (Figure 2), this did not result in a significant increase in the median use of IVED terms (Figure 8). Moreover, while the COVID-19 lockdown and Hernandez's attack resulted in a statistically significant structural break in the data (Table 3), it was actually Minassian's attack that led to the largest structural break in the IVED ratio data (Table 3) and increase in median use of IVED terms (Figure 8). This is interesting for two reasons. First, it shows that, as we hypothesised (H2b), the behaviour of these online spaces is responsive to real-world, offline events. Second, it indicates that an increase in daily posts does not necessarily correlate with an increase in extremist discussions. This analysis complements Davies, Wu, and Frank's above-mentioned work in showing that there is an enhanced sense of violent extremist discussions during the three-month period of the first set of COVID-19 restrictions, more so for the smaller Incels.net.68 Interestingly, while Hernandez's attack has no noticeable impact, Minassian's attack signals an increase in IVED term usage on Incels.is and a decrease on Incels.net.

Conclusions
Pressing debates about whether the incel worldview belongs to the realm of extremist ideologies (and thus whether acts of incel-inspired violence potentially fall into the "terrorist" category) have so far rested on somewhat monolithic appraisals of the incel online ecosystem. This paper sought to provide a richer, multi-dimensional basis for this important question by exploring the diachronic evolution of the linguistic markers of violent extremism across a range of platforms constitutive of the incelosphere.
Analysing the largest known linguistic corpus of incel online content (spanning all the major incel online spaces between 2014 and 2022) with a custom dictionary of incel violent extremist language (IVED), we showed that the main lineage of incel online discussion has worryingly hosted, over time, an increasing proportion of dehumanizing outgroup labels and words depicting violence. We also highlighted, however, the heterogeneity of the incelosphere when it comes to violent extremist expressions, with IVED language sometimes more than four times more salient in some online spaces compared to others. Overall, particular platforms tend to have their own specific linguistic profile, with forums being more toxic than subreddits (especially since the summer of 2019 with the closure of /r/Braincels). When looking at the current major hub of the incelosphere, Incels.is, the COVID-19 lockdown and the two acts of incel-inspired violence associated with noticeable changes in posting activity (Minassian's Toronto van attack and Hernandez's Glendale shooting) were correlated with a statistically significant change in the pattern of data. However, this change was much larger for Minassian's attack than for the other two events, meaning that while one of our hypotheses holds, the dynamics between offline events and online behaviours are not uniform, and a growth in the number of daily posts to an online space does not correlate with an increase in the amount of extremist content.
The nuanced, dynamic picture of the incelosphere rendered by the present study calls for further research on several fronts. First, the present analysis, which offered an evaluation of violent extremist ideation based on linguistic markers, might be enriched by a complementary study of visual tropes similarly reflecting an endorsement of violent extremism, such as avatar profiles containing pictures of killers or Nazi iconography. Second, while offering a solid longitudinal study of the incelosphere, the present effort did not succeed in retrieving sufficient data from "historical" incel forums tracing further back, like Sluthate. Including them in the analysis would offer insights on the early development of aspects of incel violent extremist lingo such as dehumanizing labels. Third, this study calls for a detailed analysis of cross-platform migrations; our data points to probable migration dynamics between online spaces, especially for Incels.is, Incels.net and /r/Braincels, which is certainly an area worth exploring in more detail. Furthermore, the decrease in the daily number of posts for the platforms in our dataset, particularly Incels.is, starting in late 2020, combined with an increase in daily posts to other sites such as Lookstheory.org and the emergence of sites such as Blackpill.club, offers anecdotal evidence that there might now be other prominent online incel spaces that do not feature in our dataset, suggesting that the incelosphere is not just a series of online spaces dedicated purely to incel ideology and discussions, but a dynamic and continually evolving ecosystem connected to neighbouring ones. Fourth, the present study misses an important dimension of incel violence: suicide and self-harm. These forms of violence, which are ubiquitous themes in online discussions, call for serious investigation from another angle than that of extremism; since our study came from that latter perspective, it was geared towards expressions of outgroup hate, but a complementary study aimed at investigating mental health issues would need to analyse linguistic and visual tropes reflecting ingroup depreciation and suicidal and self-harm ideations, to offer a truly nuanced account of violence in the incel worldview. Fifth, further studies should look beyond forums' IVED scores and examine whether these scores mask variation between particularly extreme contributors and less extreme individuals, in the vein of Scrivens and colleagues' (e.g., Scrivens et al.)69 work on differences in terms of extremist views between types of posters in far-right forums. Additionally, further studies could analyse our data with alternative language analysis tools, as already mentioned in the methods section. Finally, our study offered valuable insight into the development of an extremist ideology over time, calling for other cases to be studied in similar ways in order to gain data that could strengthen existing attempts to theorize this important issue.

Disclosure statement
No potential conflict of interest was reported by the author(s).

row across all platforms (x) by the average frequency for all terms within platforms (y), divided by the average frequency for all term-platform combinations (a): This means that in the figure below, a word that appears frequently in all platforms, such as "woman," would have a low residual, whereas a word that is more unique to a specific platform, such as "anons" on the chan boards, would have a high residual. If the residual for a specific term-platform combination is a large positive one, this indicates a strong positive relationship between the two, with the opposite being true for negative valued residuals.
These residuals then determine the placement of the data points in our graph.However, these data point locations have to be evaluated in conjunction with the amount of variance accounted for in the graph, as depicted along the two axes.The graph below accounts for 94.04 percent of the variance, meaning that it represents a vast amount of the information contained within the residuals, with the majority of the relevance occurring along the x-axis (89.39 percent).
In the graph below, therefore, platforms with similar residuals have been placed closer together; the same applies to the terms shown in the graph. However, it is crucial to understand that the proximity of a platform data point to a term data point does not mean there is a higher residual association between the two, due to the complexity of placing term data points in such a way that their location accurately reflects their residuals for various platforms. Instead, in a correspondence analysis, how unique a platform or term is depends on how far it is from the graph's origin (point 0,0 on the graph where the vertical and horizontal lines meet), with data points that are further away from the origin being more differentiated. In contrast, platforms and terms that are closer to the origin are less distinct, with data points that are centred on the origin having an 89.39 percent chance of not having any distinguishing features.
Comparing associations between specific platforms and terms is a little trickier. Here, we need to look at the length of the line between the term data point and the graph origin, and do the same with the platform data point, with longer lines indicating a high association in both cases. In the graph below, we see that the line drawn between the graph origin and the term "incel" (green line) and the one between the graph origin and time point 4 for Incels.net (red line) are both relatively long, thus indicating a high association. However, to fully understand an association between a term and a platform, we also have to look at the angle between these two lines, with smaller angles indicating an association, ninety-degree angles indicating no association, and angles at or near 180 degrees indicating a negative association. In the graph below, therefore, we see that the green line and red line form a very small angle, indicating that the two are strongly and positively associated.

Figure 1. Evolution of posts across time, aggregated to platform types. Graph shows a 5-day rolling average of number of posts per day.

Figure 3. Correspondence analysis of incel online spaces' lexical proximity with violent extremist language (IVED) and most common incel words.

Figure 4. Dynamic time-split correspondence analysis for four foundational incel platforms' lexical proximity with violent extremist language (IVED). The graph displays the fifteen most frequently occurring IVED words as well as the ten most commonly occurring "non-extreme" terms.

Figure 6. Year-on-year trend of IVED term ratios for Incels.is.

Table 1. Corpus descriptive statistics

Table 2. Key acts of incel-related violence that have gained substantial media attention

Table 3. Chow test for structural breaks in the IVED ratio data for Incels.is and Incels.net