Translation and Semantic Shift of Islamic Vocabulary in English Abstracts: A Corpus-Based Study at an Indonesian Islamic University



Introduction
Islam-related vocabulary, largely derived from Arabic, prompts discourse on its representation in English texts. Studies have demonstrated diverse approaches to conveying Islamic concepts in English, a language not traditionally associated with Islam. Brown (1996) identifies Arabic loanwords, like "ayatollah," that have been [...]

[...] subsequently converted into plain text (txt) format to ensure compatibility with the designated analysis tools. The resultant text files, constituting the raw data, were then uploaded to the AntConc corpus software (version 4.0.1). Developed by Laurence Anthony, AntConc is a sophisticated corpus analysis tool that facilitated the comprehensive profiling of vocabulary content within the abstract corpus and the systematic identification of Islam-related terminology.

Data Analysis
In keeping with the study's mixed-method design, a quantitative frequency count of Islam-related terminology was conducted. The results were presented in tables containing descriptive statistics on general, academic, and, most importantly, Islam-related vocabulary found within the abstracts. Islam-related terms were subsequently analyzed qualitatively through meaning and usage verification. Initially saved as txt files, the analysis results were converted into Excel format for enhanced data visualization. Figure 1 illustrates the AntWordProfiler interface displaying the uploaded raw data files, along with the stop list files used to filter vocabulary categories: the General Service Lists (GSL 1 and GSL 2) compiled by Michael West (1953), the Academic Word List (AWL) by Avril Coxhead (2000), and the Islamic Religious Studies Textbook Vocabulary (IRSTV) List by Simbuka (2019). The interface also shows partial analysis results, indicating the word types belonging to the GSL 1st-1000 list. The complete results encompass frequency counts for each vocabulary category, providing comprehensive statistics on their coverage within the abstract corpus.
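The profiling step described above can be illustrated with a minimal Python sketch. It is not the AntWordProfiler implementation; it only assumes that each token is assigned to the first word list containing it, and that tokens, types, and coverage are then counted per category. The word lists and the sample sentence are toy data invented for illustration, and `tokenize` and `profile` are hypothetical helper names.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def profile(tokens, word_lists):
    """Assign each token to the first list containing it, else 'off-list'.

    word_lists is an ordered list of (name, set_of_words) pairs.
    Returns per-category token counts, type counts, and coverage in percent.
    """
    token_counts = Counter()
    type_sets = {}
    for tok in tokens:
        category = next(
            (name for name, words in word_lists if tok in words), "off-list"
        )
        token_counts[category] += 1
        type_sets.setdefault(category, set()).add(tok)
    total = len(tokens)
    return {
        cat: {
            "tokens": n,
            "types": len(type_sets[cat]),
            "coverage_pct": round(100 * n / total, 2),
        }
        for cat, n in token_counts.items()
    }


# Toy word lists (assumptions); the real GSL, AWL, and IRSTV lists are far larger.
word_lists = [
    ("GSL", {"the", "of", "in", "is", "a", "and", "study"}),
    ("AWL", {"analysis", "concept"}),
    ("IRSTV", {"islamic", "hadith", "muslim"}),
]
sample = "The study is an analysis of the concept of zakat in Islamic law and the hadith"
result = profile(tokenize(sample), word_lists)
for category, stats in result.items():
    print(category, stats)
```

Checking the lists in a fixed order means a word that appears in more than one list is counted only once, under the first list that contains it; whether the actual tool resolves overlaps this way is an assumption of the sketch.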

Figure 1. The display of AntWordProfiler analysis
The second phase of the study involved a qualitative analysis to uncover the translation strategies employed by the translators. This aimed to determine whether differences existed in the translation of Islamic terminology from Arabic to Bahasa Indonesia, and subsequently from Bahasa Indonesia to English. Additionally, qualitative analysis was used to examine potential semantic shifts arising from these translation processes. Major reference works, such as the Oxford Encyclopedia of the Islamic World and "Kosakata Keagamaan" by renowned Indonesian Islamic scholar M. Quraish Shihab (2020), were consulted to verify the meaning and semantic features of the translated terms. Content analysis was the technique employed to address the second research question. AntConc version 3.4.3 facilitated the search for Islam-related words within their respective contexts (the sentences in which they occurred). Although corpus software was used, the analysis remained qualitative, as it focused on the contextual meaning of the investigated words rather than their frequency data.
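Searching for a word within its context, as described above, is commonly done with a keyword-in-context (KWIC) display. The following Python sketch shows the general idea only; it is not AntConc's implementation, and the function name `kwic` and the `width` parameter are assumptions made for illustration. The sample sentence is the English excerpt quoted later in this paper.

```python
import re


def kwic(text, keyword, width=30):
    """Return (left, match, right) context triples for each whole-word hit."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines


sample = (
    "Arrum Hajj is a financing to obtain a portion of the pilgrimage "
    "through sharia with an easy, fast, and safe process"
)
hits = kwic(sample, "sharia", width=25)
for left, word, right in hits:
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>25} [{word}] {right}")
```

Each hit keeps a fixed number of characters on either side of the keyword, which is enough for a reader to judge the contextual meaning without opening the full abstract.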

General vocabulary profile of the corpus data
This section begins with the quantitative analysis results, providing descriptive statistics on the frequency of each vocabulary type within the dataset. Each vocabulary category is accompanied by its respective occurrence count, derived from the word profiling analysis conducted using the AntWordProfiler program. This procedure aims to address the first research question: "What are the most frequent vocabulary items (single-word and multi-word) related to Islamic studies that occur in the undergraduate and master's theses at IAIN Manado?" The displayed data, imported from the program's output, additionally included the total word count (tokens) for each vocabulary category and the number of unique "word types" within each category. The general, academic, and other vocabulary types were tabulated alongside their corresponding tokens, total word types, and the frequency of each word, ranked from highest to lowest. The final output of this initial data analysis step was a vocabulary profile, including the most frequent Islam-related terms, which constitute the primary focus of this study, as presented in Table 1.

The analysis of the corpus, as depicted in Table 1, reveals that the undergraduate and postgraduate thesis abstracts contained a total of 97,429 words. Most of these words, approximately 73.02%, were classified as high-frequency words, identified using the first and second General Service Lists (GSL 1 and GSL 2). Academic vocabulary, identified using the Academic Word List (AWL), comprised 12.72% of the corpus. Notably, Islam-related vocabulary, identified using the Islamic Religious Studies Textbook Vocabulary (IRSTV) list, constituted 2.24% of the total words. A small fraction, 0.2%, belonged to the "Kosakata Keagamaan Islam" by Quraish Shihab (2020), a list of Islamic religious terms. The remaining 11.82% of words did not fall into any of the categories, indicating a diverse range of vocabulary beyond general, academic, and specifically Islamic terminology.
Drawing upon established research on word lists (Sutarsyah, Nation & Kennedy, 1994; Hirsh, 1992, cited in Nation & Waring, 1997) and Nation's (2001) concept of ideal vocabulary content, the analysis of the thesis abstracts corpus revealed an approximate alignment with expected coverage for each vocabulary category. High-frequency words constituted around 73.02% of the total word count, slightly below the expected range of 78-90%. Academic vocabulary, accounting for 12.72%, was marginally above the suggested range of 8-12%. However, the coverage of Islam-related vocabulary, categorized as "technical words" by Nation (2001), was lower than the typical 5-7% expectation, comprising only around 2% of the corpus. This deviation is considered acceptable given that abstracts, while academic in nature, are inherently shorter than full-length texts within a specialized field, and thus potentially contain a smaller proportion of technical vocabulary. The lower coverage of Islam-related terms is therefore consistent with the nature of abstracts as concise summaries rather than comprehensive treatises on Islamic studies. Table 2 lists the top 25 most frequent English words within the dataset. These findings align with previous research on typical English vocabulary content, confirming the prevalence of function words such as articles (the, a, an), prepositions (in, of, on), and conjunctions in the English abstracts under investigation.
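The comparison between the reported coverage figures and the expected ranges can be checked with simple arithmetic. The Python sketch below uses only the percentages and ranges quoted in this section; the category labels and the helper `compare` are illustrative names, not part of any cited tool.

```python
# Coverage reported for the English abstract corpus (percent of tokens).
reported = {
    "high-frequency (GSL)": 73.02,
    "academic (AWL)": 12.72,
    "technical (IRSTV)": 2.24,
}

# Expected coverage ranges as quoted in the text (percent of tokens).
expected = {
    "high-frequency (GSL)": (78, 90),
    "academic (AWL)": (8, 12),
    "technical (IRSTV)": (5, 7),
}


def compare(value, low, high):
    """Return the position relative to the range and the distance from it."""
    if value < low:
        return "below", round(low - value, 2)
    if value > high:
        return "above", round(value - high, 2)
    return "within", 0.0


verdicts = {cat: compare(reported[cat], *expected[cat]) for cat in reported}
for cat, (position, gap) in verdicts.items():
    print(f"{cat}: {reported[cat]}% is {position} the expected range (gap {gap} points)")
```

On these figures, high-frequency coverage sits about five percentage points under its range and academic coverage less than one point over, while technical coverage shows the largest relative shortfall.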

Analyses of the Bahasa Indonesia Corpus
Table 3 presents the vocabulary profile of the Bahasa Indonesia corpus, categorized by word frequency and type. This analysis used two stop lists: the list of most frequent Indonesian words by Doyle (n.d.) and the "Kosakata Keagamaan Islam" list based on Quraish Shihab's (2020) compilation of essential Islamic religious vocabulary in Indonesia. As shown in Table 3, 28.85% of the Bahasa Indonesia abstract corpus consisted of the most frequent words in Bahasa Indonesia (Doyle, n.d.), and 0.74% consisted of vocabulary pertaining to Islamic religious concepts ("kosakata keagamaan"). The remaining 70.4% of the corpus consisted of words that were either not highly frequent in general usage or were specific to the field of Islamic studies.
Table 4 presents the 25 most frequent words in the dataset. Consistent with the English abstract corpus, the most frequent words in the Bahasa Indonesia corpus were function words, including the conjunction "dan" and prepositions such as "dalam," "di," and "dari." A secondary word list was employed to filter and count the most frequent words pertaining to Islamic religious vocabulary (in Bahasa Indonesia); the results are presented in the following section on Islam-related vocabulary.

Islamic vocabulary in the English Corpus
The subsequent section details the findings derived from a qualitative analysis of potential Islam-related terminology within the dataset.This analysis is situated within the theoretical framework of Nation's vocabulary classification system.
The current study employed the IRSTV list (Simbuka et al., 2019; Simbuka & Nagauleng, 2021) as a filtering mechanism ("stop list") within the AntWordProfiler corpus tool. This facilitated the categorization of lexical items within the data as "technical vocabulary," as defined by Nation (2001). As previously mentioned, Nation's model encompasses additional categories, including high-frequency words (filtered using the General Service List, GSL 1 & 2; West, 1953) and academic words (filtered using the Academic Word List, AWL; Coxhead, 2000). Results indicate that 3235 tokens (word occurrences), corresponding to 2.32% of the total tokens in the abstract corpus, were classified as belonging to the Islam-related vocabulary list. This list comprised 263 unique word types (distinct words). Table 5 enumerates these word types, along with their respective frequencies of occurrence and range (the number of texts in which they appear). Table 5 reveals that among the top 25 words of Islam-related vocabulary, "Islamic," "Muslim," and "hadith," all of Arabic origin, have been seamlessly integrated into the discourse of academic Islamic studies. These terms have gained acceptance as English words (Brown, 1996) within this field-specific context. This adoption reflects a broader strategy within academic discourse in Islamic studies to engage a wider readership, including those whose expertise lies outside the field (Hassan, 2016).
When examining field-specific vocabulary, relying solely on word frequency counts derived from corpus methods (i.e., frequency counts of individual words) may prove insufficient.Terms such as 'so,' 'one,' 'concept,' or 'curriculum' might occur frequently but possess limited relevance, or even be irrelevant, to the domain of Islamic studies.Therefore, a more nuanced understanding of how these words contribute to the lexicon of Islamic studies can be achieved through qualitative analysis, examining the specific contexts or sentences in which they appear.

Other representations of Islamic vocabulary data outside the IRSTV List
The AntWordProfiler analysis revealed that, in addition to Islam-related words identified through the IRSTV list, numerous terms demonstrated significant relevance to Islam due to their nature as "Arabic loanwords" (Hassan, 2016). The 25 most frequent loanwords are presented in Table 6.

Table 6. Islam-related vocabulary filtered by the Islamic Religion Vocabulary by Quraish Shihab

Table 6 reveals that most Islam-related terms identified outside the IRSTV list are of Arabic origin and function as loanwords within the Indonesian language. Consequently, terms such as "sharia" (or "syari'ah"), "madrasa," "hajj," "zakat," "aqidah," and "aliyah" occur with relatively high frequency. Notably, the spelling of these Arabic-origin words reflects an Indonesian adaptation of their original orthography.

Islamic Vocabulary in the Bahasa Indonesia Corpus
To identify Islam-related terms within the Bahasa Indonesia abstracts, the AntWordProfiler tool was employed with stop lists different from those used in the English corpus analysis. These stop lists included a list of the most frequent Indonesian words compiled by Damian Doyle (n.d.) and the "Kosakata Keagamaan" list developed by Quraish Shihab (2020). While the latter is not a corpus-linguistically derived list, it represents a qualitatively curated collection of prominent Islamic terminology by a renowned Indonesian Islamic scholar. The resulting top 25 Islam-related terms identified within the Bahasa Indonesia abstract corpus are presented in Table 7.

Translation of Islam-Related Vocabulary into English
To address the second research question regarding the translation of Islamic vocabulary into English, a mixed-methods approach was employed. This involved quantitative analysis utilizing AntConc corpus software to ascertain word frequency, complemented by qualitative manual searches to identify instances of Islamic concepts and their corresponding translations (or lack thereof) within the corpora.
Utilizing comparative corpora comprising abstracts in Bahasa Indonesia and their corresponding English translations (English-abstract corpus), we traced the trajectory of Arabic-origin terms, examining their adaptation into Bahasa Indonesia and subsequent rendition in English.
Table 8 demonstrates that certain Islam-related terms present in the Bahasa Indonesia corpus were not consistently translated into English within the corresponding abstracts. For instance, the term "syariah" appeared 118 times across 33 texts in the Bahasa Indonesia corpus, but only 37 times in 14 texts within the English corpus. This discrepancy suggests that the term was retained as a loanword in some English abstracts, albeit with Indonesian-adapted spelling rather than its original Arabic orthography. The remaining 81 or so instances of "syariah" were translated into its English equivalent, "Islamic law" (Baalbaki, 1998:669). The following excerpts illustrate the usage of the term "syariah" within the dataset.

Excerpt a): The term "syariah" is retained in its original Arabic form, albeit with adapted Indonesian spelling, without translation. This usage is exemplified in the following context: "...pembiayaan untuk mendapatkan porsi haji secara syariah dengan proses mudah cepat dan aman" (DATA_INDO_2019_069.txt). The AntConc 3.4.4 concordance tool displays the data containing the word "syariah," as depicted in Figure 2.

Excerpt b) demonstrates a different treatment of the term "syariah" within the same abstract but in the translated (English) version: "Arrum Hajj is a financing to obtain a portion of the pilgrimage through sharia with an easy, fast, and safe process" (DATA_ENG_2019_069.txt). This excerpt is displayed in the AntConc software, as illustrated in Figure 3.

The data reveal that certain Arabic-origin terms underwent a two-step translation process, first into Bahasa Indonesia and subsequently into English. This phenomenon was particularly evident for common Islamic vocabulary adopted into Bahasa Indonesia, despite having existing English equivalents. Consequently, translation strategies varied. In the Bahasa Indonesia abstracts, many well-known Islamic terms were not translated but rather retained in their original Arabic forms, albeit with adapted Indonesian spelling (Hassan, 2016). For instance, terms such as "syariah," "nikah," "haji," and "akad" were incorporated into the Bahasa Indonesia spelling system without transliteration, a common practice for Arabic loanwords. In contrast, in English abstracts such as excerpt b), "syariah" was rendered with a slightly modified spelling, "sharia" (Baalbaki, 1998:669), rather than translated into an equivalent such as "Islamic law" (Baalbaki, 1998:669). This translation strategy, known as "Alteration" (Nida, 1964, in Molina & Albir, 2002), is employed when the target language (English) lacks a direct graphemic representation for a specific sound present in the source language. In this case, alteration appears to be a preferred strategy for translating concepts with strong ideological or cultural values, such as Islamic teachings and law (Anis, Nababan, Santosa & Mashruki, 2022). Alongside the strategy of pure borrowing, alteration allows translators to preserve the original ideology and values embedded within the terms while gradually introducing them into the target language. Similar patterns were observed for terms like "sakinah," "akhlak," "ayat," "fatwa," "sunnah," "iman," "akidah," "halal," "mahram," and "ulama." Conversely, other Islam-related words, including "hidayah," "mut'ah," "aurat," "bid'ah," "muhasabah," "shalawat," "syubhat," and "karimah," exhibited consistent frequencies in both corpora, suggesting a uniform approach to their translation or retention as loanwords in both languages.
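The frequency and range figures compared above (for example, 118 occurrences across 33 texts versus 37 occurrences across 14 texts) can be derived by counting a term per file in each corpus. The Python sketch below illustrates that counting on toy data; the file names and texts are invented for illustration, and `freq_and_range` is a hypothetical helper rather than an AntConc function.

```python
import re


def freq_and_range(corpus, term):
    """Count a term in a corpus given as {file_name: text}.

    Returns (frequency, range): total whole-word occurrences and the
    number of texts containing the term at least once.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    counts = {name: len(pattern.findall(text)) for name, text in corpus.items()}
    frequency = sum(counts.values())
    text_range = sum(1 for c in counts.values() if c > 0)
    return frequency, text_range


# Toy parallel corpora (assumptions), aligned by abstract number.
indo = {
    "indo_001.txt": "pembiayaan secara syariah dan hukum syariah",
    "indo_002.txt": "akad nikah menurut syariah",
    "indo_003.txt": "pendidikan karakter",
}
eng = {
    "eng_001.txt": "financing through sharia and Islamic law",
    "eng_002.txt": "the marriage contract according to Islamic law",
    "eng_003.txt": "character education",
}

source = freq_and_range(indo, "syariah")
borrowed = freq_and_range(eng, "sharia")
translated = freq_and_range(eng, "Islamic law")
print("syariah (Bahasa Indonesia):", source)
print("sharia (English):", borrowed)
print("Islamic law (English):", translated)
```

A gap between the source-term frequency and the borrowed-form frequency only flags candidates for a different rendering; the contexts still have to be read to confirm how each occurrence was actually translated.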
The second observed strategy for translating Islam-related words from Arabic into Bahasa Indonesia and subsequently into English involved pure borrowing. In this approach, the original Arabic terms were retained in their unaltered form, either by maintaining the original orthography or by using transliteration. Examples of such terms include "tahlil" and "Allah." Figures 4 and 5 illustrate the usage of "tahlil" in the Bahasa Indonesia and English abstract corpora, respectively. The data presented in Figure 5 reveal that additional Islam-related words, such as "shalawat" and "diba," were subject to the same translation strategy as "tahlil," wherein the original Arabic forms were borrowed entirely. This observation aligns with the findings of Anis et al. (2022), suggesting that translators opt for pure borrowing due to the specialized nature of these terms, which are primarily understood within the Islamic community. Üstün Külük (2023) further elucidates this phenomenon through the concept of the "intra-ummah paradigm," emphasizing the use of shared vocabulary among Muslims, transcending national boundaries, to facilitate effective communication. Terms like "jihad," "Sunnah," "ayah," and Quranic verses exemplify this shared linguistic currency. The preference for pure borrowing over direct literal translation, which can introduce inaccuracies, is particularly evident when dealing with longer stretches of language, such as Quranic metaphors (Nurbayan, 2019).
The third identified translation strategy involved rendering Islam-related vocabulary into their English equivalents. For instance, the term "nikah," appearing 45 times across 12 abstracts, and its derived form "pernikahan" (occurring 44 times in 16 abstracts), were consistently translated as "marriage" (121 instances in 26 abstracts) or "marriages" (20 instances in 10 abstracts), as depicted in Figures 6 and 7. Figure 6 illustrates the occurrence of the term "nikah" within the Bahasa Indonesia abstracts. Upon comparing this data with the corresponding English abstracts, as depicted in Figure 7, it becomes evident that "nikah" is consistently translated into its English equivalent, "marriage(s)." This translation strategy, termed "Globalization" (Farkhan, 2018), involves the direct conversion of words from the source language (Bahasa Indonesia) into English, under the premise that the Islamic vocabulary in question is universally understood and somewhat familiar to adherents of other Semitic religions due to its presence in their respective traditions.

Semantic Shift in the Translation of Islamic Vocabulary
To address the third research question concerning potential semantic shifts in the translation of Islam-related vocabulary into English, a qualitative analysis was undertaken. This analysis involved comparing entries in Arabic-Indonesian dictionaries, Arabic-English dictionaries, and scholarly references on Islamic terminology. Table 9 presents a sample of Islam-related words, displaying their forms [...]

In conclusion, this study demonstrates the utility of corpus tools in facilitating the identification and frequency-based ranking of specific vocabulary within a given dataset, aligning with the advantages highlighted by Anthony (2013), Masrai & Milton (2021), and Yuliawati & Ekawati (2023). This method offers a time-efficient alternative to manual qualitative searches, particularly in analyzing texts with extensive word counts. Pedagogically, the resultant frequency rankings of words specific to a particular discourse can be leveraged for teaching English for Specific Purposes (ESP). However, a limitation of this method is the potential inclusion of terms not directly related to Islamic teachings, but rather associated with broader fields contributing to Islamic studies, such as philosophy and law.

Conclusion
This study concludes with three primary findings that address the research questions. First, the vocabulary profile of the English abstract corpus broadly aligns with Nation's (2001) theory of vocabulary classification, showing a distribution of high-frequency, academic, and technical vocabulary, including Islam-related terminology. Second, the predominant translation strategies employed for Islam-related vocabulary involve phonetic alteration and direct borrowing from the Bahasa Indonesia version into the English version, thus preserving the already shifted meanings (and occasionally, spellings) within the translated abstracts. Lastly, regarding the semantic shift of the studied Islam-related vocabulary, the findings reveal that changes in the meaning of Arabic-origin words occurred within the Bahasa Indonesia abstracts, where most of these words were borrowed with altered meanings. This semantic shift is attributed to two factors: 1) changes in meaning due to their specialized usage in the Quran, often differing from the meanings understood by contemporary Arabic speakers, and 2) semantic shift due to the traditions of Indonesian Muslim groups. This shift subsequently influenced the translation of these words into English. The implication of this study is that translators of field-specific vocabulary should consider the intended readership, taking into account the specific context and targeted publication. This approach ensures that the translated texts are both accurate and accessible to the intended audience.
Certain limitations of this study highlight the potential for future research utilizing mixed methods, combining the quantitative approach of corpus linguistics with qualitative discourse or critical discourse analysis. This would enable a more nuanced understanding of the most frequent Islam-related words identified in the data. Furthermore, additional studies are warranted to enhance the validity of the IRSTV list, thereby improving its utility in identifying Islamic vocabulary.

Figure 2. The display of the AntConc File View tool showing the data that contain the word 'syariah' in an abstract (Bahasa Indonesia version)

Misbahuddin, Srifani Simbuka

Figure 3. The display of AntConc software showing the data of English abstracts containing the word 'sharia' (syariah)

Figure 4. The use of the word 'tahlil' in the corpus of abstracts in Bahasa Indonesia (DATA_INDO_108.txt)

Table 1. Vocabulary profile of the English abstract corpus

Table 2. Top 25 most frequent word types in the English abstract corpus

Table 3. Vocabulary profile of the Bahasa Indonesia abstract corpus

Table 4. Top 25 most frequent words in the Bahasa Indonesia abstract corpus

Table 5. Top 25 most frequent Islam-related words in the English corpus

Table 8. Comparative profile of the Islam-related words in the Bahasa Indonesia and