Silent Majority or Vocal Minority: A Corpus-Assisted Discourse Study of Trump Supporters’ Facebook Communication

The topic of immigration remains contentious in American political debate, and it played a prominent role in the presidential campaign of 2016, being exploited by presidential candidate Donald J. Trump. His proposal to ban Muslims from entering the USA stimulated passionate discussions, including those on social networks. The material for the investigation was collected from Trump’s official Facebook page and was analyzed using Critical Discourse Analysis and Corpus Linguistic methods. The results were compared to data from the COCA corpus, taken as an indication of general American attitudes toward Muslims. The results demonstrate a difference in the negativity level between Trump’s supporters and the broader American public and provide a window into the ideology of the section of the US population supporting Trump. Publisher’s Note: This article was originally published with an incorrect peer review statement, which said that this article was an internally reviewed editorial. This has now been amended to reflect the fact that this is a piece of research that underwent double blind peer review by two external reviewers.


Introduction
Online social networks, such as Facebook and Twitter, play an increasingly important role in society. They are widely used by the public, and they are also increasingly employed by politicians, especially since the 2008 U.S. presidential campaign (Cogburn and Espinoza-Vasquez 2011). U.S. President Donald Trump seems to be quite successful in adopting Twitter as a strategic communication tool to disseminate his views and to attain popularity, practice self-promotion and criticize his opponents (Goldfarb 2017; Kreis 2017). He was also present and active on Facebook (FB) during his presidential campaign.
One of Trump's signature positions is his anti-immigrant stance, which was instrumental in attaining and increasing his popularity. A sizable portion of the U.S. population appears to feel threatened by the ongoing flow of immigrants seeking refuge or a better standard of living. These citizens request a halt to or a slowdown of legal immigration and the deportation of "illegals", that is, those who came to the country without proper documentation.

Anti-immigrant anger was initially focused on migrants from Latin and South American countries, especially Mexico, but later, prompted by the Syrian refugee crisis in Europe and exacerbated by reports of several terrorist attacks, it has shifted to include the group designated by the general term 'Muslims'. Followers of Islam are often discussed, and sometimes discriminated against, as a homogenous group as if all of its members shared the same set of characteristics (Akbarzadeh and Smith 2005; Baker, Gabrielatos and McEnery 2012; Poole 2002; Törnberg and Törnberg 2016). This is how the then presidential candidate Trump referred to them when he suggested a "total and complete shutdown of Muslims entering the United States until our country's representatives can figure out what the hell is going on" during one of his campaign rallies (Trump 2015).
This message seemed to resonate with the public: the post on Trump's official FB page with a video clip from the rally where the proposal was made attracted over 17,400 responses within two weeks (over 35,000 at the time of this article's preparation). Many of those responses received replies of their own, and some of Trump's supporters claimed that they represented mainstream U.S. views as the "silent majority" which supposedly had found its voice and refused to be sidetracked by the "vocal minorities" dominating the U.S. social and political scene. Several FB groups combine the phrase "silent majority" and the name of Trump in their title (e.g., "Trump 2016"). Trump, in turn, has referred to his supporters as the "silent majority", promising that they will be silent no more and implying that he would verbalize their beliefs and positions (for example, see Crowley 2016). However, such beliefs might be influenced by the tendency of individuals' social networking accounts to develop info media bubbles and echo chambers (Geschke, Lorenz and Holtz 2019; Pariser 2011; Schwarz and Shani 2016). Social media that allow users to "unfriend" and block others enable people to hide from unfavorable voices, discourage dialogue between different factions, and deepen social division (Baysha 2020; Stroud 2010).
This project aims to determine how closely the ideology of Trump's social base of supporters was aligned with mainstream American views on immigration and Muslim immigrants in the year before his election. This is done by analyzing the discourse of the group devoted to the issue of Muslim immigration and comparing it to the Corpus of Contemporary American English (COCA), which is a widely used, genre-balanced corpus of American English of over one billion words of text (over 20 million words each year from 1990 to 2019) (Davies 2008). By analyzing linguistic phenomena, this article makes inferences about the cognitive and psychological features of Trump's supporters as a discourse community, that is, a group of people sharing a set of basic values, assumptions, and particular ways of communicating (Porter 1992). The representation of Islam and of Muslims in the conversations of the Trump supporters' discourse community is examined by studying the usage of such keywords as 'Muslim', 'Islam', 'Quran' (Koran), 'Sharia' (Shariah), 'immigrant' and 'refugee'.

Knoblock: Silent Majority or Vocal Minority 4

Theoretical Foundations
Immigration (and anti-immigration) discourse has been studied, among others, by van Dijk (2000), Cisneros (2008), Mamadouh (2012), Hart (2013), Gattino and Tartaglia (2015), Knoblock (2017), and Musolff (2019). Representation of Islam and Muslims in traditional media, such as newspapers and magazines, has also been researched (Baker, Gabrielatos and McEnery 2012; Baker and McEnery 2005; Gabrielatos and Baker 2008; Moore, Mason and Lewis 2008; Poole 2002; Richardson 2004), and the examination of the treatment of Islam in the western news media has generally found evidence of negative bias (Akbarzadeh and Smith 2005; Awass 1996; Mårtensson 2014; Kassimeris and Jackson 2015; Richardson 2004; Saeed 2019).
While the analysis of mass media is useful, it makes sense to extend research to the investigation of social networking as a close representation of the public opinion externalized in discourse. Indeed, researchers are turning their attention toward that domain by studying Islamophobia in cyber contexts. For example, Aguilera-Carnerero and Azeez (2016) and Awan (2016) scrutinized Islamophobia on Twitter, Oboler (2016) investigated how Facebook is being used to legitimize hatred of Muslims, and Törnberg and Törnberg (2016) provided an insightful analysis of an online forum known for its right-leaning bias. Unfortunately, studies focusing on these processes within social media are fewer than those studying traditional media, such as newspapers and broadcast journalism. One probable reason for this is the practical difficulty of collecting, processing, and analyzing the large amounts of unstructured textual data in social media. To address these difficulties, the current project takes the form of a Corpus-Assisted Discourse Study.

Corpus-Assisted Discourse Studies (CADS)
The CADS approach combines elements of Critical Discourse Analysis and Corpus Linguistics. Several authors have suggested that corpus linguistic methods can effectively support quantitative and qualitative research in discourse analysis (Gabrielatos and Baker 2008; Partington 2006; Salama 2011).
This combination is lauded as benefitting from both the rigor of the computational [...]
The current project continues the trend of addressing the attitudes toward Muslims in online communities using Critical Discourse Analysis (CDA), a well-established framework for research into the relationship between language and society. It underscores the strategic character of linguistic acts and emphasizes the idea that texts are based upon choices, which are ideologically and sociologically driven (Fairclough 1995). It also emphasizes the interconnectedness between discourse and ideology. According to van Dijk (1995: 17), "ideologies are typically, though not exclusively, expressed and reproduced in discourse and communication". Thus, given the information provided by the discourse, it is possible to reconstruct mental structures existing in the national consciousness that would otherwise be unavailable for direct observation.

Corpus Linguistic Methods
The issues of collecting and processing material for this study have been addressed by utilizing Corpus Linguistic Methods, which have gained popularity with the development of the machine processing of text. This study utilized the online corpus management system Sketch Engine (Kilgarriff et al. 2004), and the analysis involved compiling frequency lists, identifying collocations, and comparing the comments on Trump's FB page with the COCA corpus.
The frequency of particular words in corpora provides insights into the salience of certain terms and topics in genres, modes of communication, or particular groups. Frequency results can be used to draw conclusions about the correlation between the structures of the text and social and political phenomena. Typically, frequencies are calculated as the number of occurrences per million words, since such normalized figures provide more meaningful comparisons between texts of different lengths. Another prominent corpus linguistics technique is identifying collocations.
Collocation is the above-chance, frequent co-occurrence of two words within a predetermined span, usually five words on either side of the word under investigation (the node). The statistical calculation of collocation is based on the frequency of the node, the collocates, and the collocation. A common measure is the Mutual Information (MI) score, which compares the observed frequency of the collocation with the frequency expected if the two words were independent. The higher the MI score, the stronger the link between two items; an MI score of 3.0 or higher suggests evidence that two items are collocates (Hunston 2002: 71). A score closer to 0 indicates a likelihood that the two items co-occur by chance, and a negative MI score indicates that the two items co-occur less often than chance would predict.
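The MI score described above can be illustrated with a minimal sketch. It assumes one common formulation, the base-2 logarithm of the observed co-occurrence frequency divided by the frequency expected under independence, with no adjustment for span size; the function name and its parameters are illustrative, and the exact formula used by a given corpus tool may differ.

```python
import math


def mi_score(pair_freq: int, node_freq: int, collocate_freq: int, corpus_size: int) -> float:
    """Mutual Information (base 2) for a node and a collocate.

    Assumption: the simple form log2(observed / expected) without a span
    correction; real corpus tools may use a variant of this formula.
    """
    if min(pair_freq, node_freq, collocate_freq, corpus_size) <= 0:
        # MI is undefined when any of the counts is zero.
        raise ValueError("all counts must be positive")
    # Co-occurrence frequency expected if the two words were independent.
    expected = node_freq * collocate_freq / corpus_size
    return math.log2(pair_freq / expected)


# Toy counts (not from the study): the pair occurs 8 times where 1 is expected.
print(mi_score(pair_freq=8, node_freq=10, collocate_freq=10, corpus_size=100))
```

With these toy counts the pair occurs eight times as often as chance predicts, so the score is 3, which is exactly the threshold cited from Hunston (2002). A pair that occurs only as often as expected scores 0, and a pair that occurs less often than expected scores below 0.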

Procedures and analysis

The Corpus
The material for the study consists of the comments left after Trump's FB post about his proposal to ban Muslims from coming to the U.S. The choice of FB is dictated by its position as the dominant social networking site, since 67% of American adults use this platform, compared to LinkedIn (20%) and Twitter (16%) (Rainie, Smith and Duggan 2013). Although FB discussions evolve over time, the corpus reflects the state of the conversation at the time it was collected in January 2016. The corpus, nicknamed Ban-the-Muslims (BTM), initially contained 856,769 tokens and was then reduced to 739,466 tokens, or 621,335 words, by the adjustments described below.
To ensure validity, it was necessary to separate the comments of Trump supporters from the writing of those who left critical remarks. However, software was unable to do this, and manually sorting the two sets would have been prohibitively time-consuming. Instead, the researcher manually checked the concordance lines including all tokens of 'Muslim' and deleted comments that expressed a critical attitude toward Trump or his proposal. Even though the resulting corpus almost certainly still contains many comments made by people who joined the discussion to argue against the ban, those comments should not affect the concordances and constructions involving the lemmas discussed here. To reduce the influence of pre-compiled texts which were reposted multiple times, and to concentrate on the spontaneous discussion, frequency lists and collocates were manually scanned, and if multiple postings of identical texts were detected, all but one occurrence were deleted from the corpus. Words written in foreign scripts were eliminated.
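The removal of reposted texts was carried out manually in this study. Purely as an illustration, the same idea can be approximated automatically; the sketch below is an assumption and not the procedure used here. The function name `drop_reposts` and the normalization rule (lowercasing and collapsing whitespace) are hypothetical choices.

```python
from typing import Iterable, List


def drop_reposts(comments: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each comment and drop later identical copies.

    Two comments count as identical if they match after lowercasing and
    collapsing runs of whitespace; comments that are empty after this
    normalization are dropped as well.
    """
    seen = set()
    kept = []
    for text in comments:
        key = " ".join(text.lower().split())
        if key and key not in seen:
            seen.add(key)
            kept.append(text)
    return kept


# Invented sample comments, for illustration only.
sample = ["Ban them", "ban  them ", "Build the wall"]
print(drop_reposts(sample))
```

Exact matching of this kind only catches verbatim reposts. Texts reposted with small edits would still need the manual scan of frequency lists and collocates described above.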
The results obtained from the BTM corpus were compared to the data from the COCA corpus. COCA was chosen because it is arguably the largest well-balanced, up-to-date corpus of the American variety of English freely available for research.
Considering that COCA accumulates a very large sample of texts (approximately 20 million words) a year, evenly divided between several genres (20% each of spoken, fiction, popular magazines, newspapers, and academic texts), it serves as a good reference point of the national view on the issue discussed in the BTM. The choice was also influenced by the fact that COCA allows limiting searches to a particular year. This project needed to stay within the context of the BTM corpus, and Trump's proposal came on 17th November 2015, so the year 2015 was used for COCA searches.
Two research questions were posed in the current study:
1. Is there any significant difference in terms of lexical frequencies and distributions between Trump supporters' discourse and the national discourse regarding Muslim immigration?
2. What are the collocation patterns of the lexemes identifying Muslims in the corpus under analysis? Is there a significant difference in the image of Muslims shared by the focus discourse community and the general American public?
The analysis of the corpus proceeded as follows: first, the word frequency list was created to identify the most frequent and salient lemmas; then, collocations of the keywords were identified and examined; and, finally, the outcomes were compared with the data from COCA in order to identify any mismatch between the ideology of the group under analysis and the overall American attitude toward Muslims.

Frequency
The Sketch Engine Word List tool was used to identify the most frequent words in the corpus. Unsurprisingly, the most frequent content/open class lemma was TRUMP, which was used 7,395 times (10,000.50 words per million or wpm). The next 24 were:
These 25 most frequent lemmas seem to draw a triangle of three main agents: Trump, the USA and its citizens, and Muslims. After that, the high-frequency words describe the needs, intentions, or actions of those agents.
Checking the frequency of these lemmas in the COCA corpus, we see that they are considerably less prominent there. The name TRUMP occurs only 3,064 times during the whole year 2015, which is about 0.015% of the roughly 20 million words sampled for that year (approximately 153 wpm); MUSLIMS occurs 1,882 times, or 0.009% (approximately 94 wpm); and AMERICA/AMERICAN occurs 12,602 times, or 0.063% (approximately 630 wpm). The frequency list showed that the BTM corpus compiled for this project is a good source of information
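The normalization behind these comparisons can be checked with a short sketch. It uses the counts quoted above and assumes a round 20 million words for the COCA 2015 sample, since the text gives only an approximate size; the helper name `per_million` is illustrative.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalized frequency: occurrences per million words (wpm)."""
    return count / corpus_size * 1_000_000


BTM_TOKENS = 739_466          # size of the BTM corpus after clean-up
COCA_2015_WORDS = 20_000_000  # assumption: the approximate yearly COCA sample

# Raw counts as reported in the text.
btm_trump = per_million(7_395, BTM_TOKENS)
coca_2015 = {
    lemma: per_million(count, COCA_2015_WORDS)
    for lemma, count in {
        "TRUMP": 3_064,
        "MUSLIMS": 1_882,
        "AMERICA/AMERICAN": 12_602,
    }.items()
}

print(f"BTM TRUMP: {btm_trump:.1f} wpm")
for lemma, wpm in coca_2015.items():
    # wpm / 10,000 gives the same figure as a percentage of the corpus.
    print(f"COCA 2015 {lemma}: {wpm:.1f} wpm ({wpm / 10_000:.3f}%)")
```

Under that assumption the COCA figures come to roughly 153, 94, and 630 occurrences per million words, against about 10,000 per million for TRUMP in BTM. Expressed as a percentage of the corpus, the COCA values are about 0.015, 0.009, and 0.063.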

Collocational Data
Having identified the focus terms for further investigation, the collocation function of Sketch Engine was used to compile a list of collocations of the lemma MUSLIM.
The collocates were arranged according to the overall frequency of the collocation in the corpus, but they had to have an MI of 3 or higher to ensure collocational significance. They also had to be lexical rather than functional and to appear in the corpus a minimum of twice in order to be included. The results were later compared to the list of collocations of the lemma MUSLIM from COCA texts from the year 2015.
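The selection criteria just described (an MI of 3 or higher, at least two occurrences, lexical rather than functional words, ranking by frequency) can be sketched as a simple filter. Everything named here is hypothetical: the `Collocate` record, the short `FUNCTION_WORDS` stoplist, and the sample values are invented for illustration and are not data from the study or output from Sketch Engine.

```python
from typing import Iterable, List, NamedTuple


class Collocate(NamedTuple):
    word: str
    freq: int   # raw frequency of the collocation in the corpus
    mi: float   # Mutual Information score


# Hypothetical, deliberately short stoplist standing in for "functional" words.
FUNCTION_WORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is", "are", "that"})


def select_collocates(
    candidates: Iterable[Collocate],
    min_mi: float = 3.0,
    min_freq: int = 2,
    stoplist: frozenset = FUNCTION_WORDS,
) -> List[Collocate]:
    """Keep lexical collocates that meet both thresholds, most frequent first."""
    kept = [
        c for c in candidates
        if c.mi >= min_mi and c.freq >= min_freq and c.word.lower() not in stoplist
    ]
    return sorted(kept, key=lambda c: c.freq, reverse=True)


# Invented candidates, for illustration only.
candidates = [
    Collocate("the", 900, 3.5),     # dropped: function word
    Collocate("ban", 40, 4.2),
    Collocate("radical", 75, 5.1),
    Collocate("rare", 1, 9.0),      # dropped: occurs only once
    Collocate("people", 300, 1.2),  # dropped: MI below 3
]
for c in select_collocates(candidates):
    print(c.word, c.freq, c.mi)
```

Ranking by raw frequency after applying an MI floor guards against a known weakness of MI, which tends to favor rare word pairs. The frequency ordering keeps the most widely used collocates at the top of the list.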

Ban-the-Muslims and COCA Comparison
A query for MUSLIM produced 3,683 concordance lines (4,981 wpm) in the BTM corpus. Below is the list of the 50 most frequent collocates. The number before the word indicates its rank by frequency, the number after it is the raw frequency of the collocation in the corpus, and the number in parentheses is the MI score.
Looking at the common collocates of the word Muslim obtained from the COCA collection of texts from 2015, it is easy to see that the discussions mentioning Muslims were more evenhanded and did not carry much negativity toward Muslims.
In the BTM corpus, 'Sharia' is depicted as a scary attribute of Islam that all Muslims want to follow themselves and want to impose and force on everyone else. It is a set of barbaric laws that are not compatible with democracy and are aimed at establishing an Islamic 'Caliphate'. It also supports 'killings', suppression of 'women', deals with 'theft', and is used by 'radicals' and 'terrorists'.

SHARIA/SHARIAH in COCA:
The concept is much less salient for the general American public who produced the texts included in the COCA 2015 collection. Because the minimum frequency of 28 [...]
While the COCA corpus texts also associate immigrants with such problems as crossing the border, staying in the country illegally, crime, and killings, the collocates are spread out over a broader semantic range: in addition to mentioning the problems, they mention the countries of origin and discuss other issues, such as smuggling, scholarships for students who are not legal immigrants, and advocates and activists helping them. In the BTM corpus, 28.2% of the 651 collocations are negative, whereas in COCA the share of negative collocations is lower, at 23.1%.

Conclusion
The discursive constructs of MUSLIM as presented in the responses to the proposal to ban Muslims from entering the U.S. externalize the xenophobic ideological base of the U.S. population whose opinions are voiced by then-presidential-candidate Trump. The topic of Muslim immigration provoked a heated discussion among Trump supporters, and the conversations exhibit a high level of animosity toward Muslims and concern for the safety of the USA. The examination of the frequency of the keywords allows us to describe the conversation as hateful and paranoid about Muslims.
The discourse of the BTM corpus presents Muslims as dangerous, predisposed to terrorism, and as people who ought to be kept away from the USA in order to protect the American citizens. The data obtained from the COCA corpus, which aggregates a large amount of text balanced between spoken, fiction, popular magazines, newspaper, and academic genres from the year 2015 does not show nearly as much negativity and rejection, with the exception of the lemma ISLAM. The general American discourse reflected in COCA demonstrates a lower level of animosity towards immigrants and refugees compared to BTM. On the contrary, discussions there often reflect concern for people forced to flee their home countries because of wars or unrest and contemplate means to help them.
The topics and attitudes prominent in the BTM corpus do not appear as salient in the COCA corpus. This may indicate that the views of Trump's supporters, who use his FB page to express them, are not shared by the broader American public as reflected in the COCA texts. It is possible to argue that despite claiming to be a silent majority (Crowley 2016), the group whose opinions are revealed in the discussions on Trump's FB page is, in fact, a vocal minority, and that their ideology does not represent the US mainstream. Even though racist, xenophobic, and prejudiced voices emboldened by the leadership of Trump have become even louder in the years of his presidency, Trump's supporters are still a fraction of the U.S. population. Thus, their claims to be the "silent majority" might be illusory and caused by their tendency to consume news from a select few outlets and to socialize with groups that are homogenous in terms of political opinion (Baysha 2020; Pariser 2011; Schwarz and Shani 2016; Stroud 2010).
Such networking behavior gives users an illusion of agreement and dominance in terms of worldviews and ideology since they rarely encounter dissonant voices. Thus, it is crucially important for researchers to know the limitations of their data sources and to be judicious in drawing conclusions from them.