The Semi–Algorithmic Approach to Formation of Latvian Information and Communication Technology Terms

Information and Communication Technology terms are mainly formed in English and then secondary–formed in other languages. Because morphological and term–formation traditions differ between languages, the results of secondary term formation tend to be somewhat chaotic. Latvia’s Information and Communication Technology terminologists and linguists have developed a rather rigorous, semi–algorithmic approach to term formation that has been tested in practice for over thirty years. This paper aims to describe this approach and demonstrate its viability using the most commonly used terms as examples. We also analyse how the officially approved terms are used in texts and the possible reasons why they sometimes encounter resistance from everyday users. In conclusion, we summarise the research on the current state of secondary ICT terminology in Latvian and provide insight into possibilities for further development.


Introduction
...it takes all the running you can do, to keep in the same place.
(c) Queen in «Alice in Wonderland», Lewis Carroll
The development of Information and Communication Technology (hereafter ICT) terms has gone hand in hand with the emergence and development of the ICT field; thus, ICT terminology dates back to the very beginning of that field.
The main aims and objectives of this article are as follows.
First, we provide a brief insight into the work carried out on ICT term development in Latvia in general, and in the last 30 years in particular, and give the reasoning behind ICT term development.
Second, we describe in detail the approach to secondary ICT term formation (also called secondary term development), outlining the essential requirements and the system of principles and guidelines we aim to follow, and describing the primary and secondary principles in particular.
Third, we describe and visualise a pseudo-algorithm consisting of four main steps for the systematic development of terms.
Fourth, we describe the methodology for selecting the most common ICT terms in Latvian, followed by an analysis of the empirical material obtained.
Fifth, we look at what happens after an English ICT term is chosen for secondary term creation in Latvian: we examine whether the semi-algorithmic approach is used in the actual secondary term creation process and review commonly used ICT terminology units in Latvian.
Sixth, we examine how ICT terms are actually used in public communication and draw preliminary conclusions for further ICT term development based on this research.
Let us remember that, on the one hand, the Babbage Engine (WEB, a) and technological development during World War II can be considered the beginning of the ICT field and, accordingly, of ICT term development (WEB, b). On the other hand, the rapid expansion of the ICT field, in the world in general and in Latvia in particular, began during the 1960s, when ICT terms developed together with the emerging industry (Skujiņa et al., 2011).
The origins of terminology development as a field can be traced back to the engineer, industrialist and terminologist E. Wüster (Kast-Aigner, 2009), who, almost a century ago, defined the need for and the main principles of terminology standardisation.
Theoretical studies of primary and secondary term formation (Sager, 1990) define primary term formation as the formation of a term in a source language to denote a concept (for example, the English terms computer, mouse, Internet of Smart Things); by means of secondary term formation, functional analogues are then created in the target language (for example, the Latvian dators, pele, viedlietu internets).
Research has been carried out on ICT term development in general, but there are relatively few studies of secondary ICT terminology development worldwide: in African Language Studies (Magagane, 2011), a corpus-based approach to defining Macedonian ICT terminology (Mickosi, 2017), and the role of ICT in English-Spanish computer research (Medina, 2003).
Terms, i.e. domain-specific words or phrases, are an essential part of the language of science. In a specific industry (in our case, the ICT industry), terms express the specific concepts of this industry.
The development of terminology in any industry is a laborious and continuous process, and the development of ICT terms is no exception. The names of newly created industry technologies, methods and products are usually in English, which means that most of the terminology has to be secondary formed in Latvian from terminology that has been primarily formed in English. Since the industry is advancing rapidly, we encounter new equipment and new technologies daily; they reflect concepts that have not been present in our lexicon thus far and for which we have to create appropriate terms in Latvian as quickly as possible (Skujiņa et al., 2011, 43-50).
For more than 30 years, the Terminology Commission of the Latvian Academy of Sciences (LAS-TC) has been engaged in the development of ICT terms. LAS-TC analyses new concepts and then selects and approves corresponding Latvian equivalents. Terms must represent the concepts of the ICT industry as accurately as possible while indicating the place of each concept in the general system of concepts of the industry.
This paper is a combined and extended version of the papers (Borzovs et al., 2001) and (Borzovs et al., 2013, 108-126) published only in Latvian and the paper presented at the Conference on Language Resources and Evaluation (Borzovs et al., 2014, 4012-4017).
It is important to emphasize that although translation and terminology formation theories deal with various types of information recreation from a source language into a target language, namely localisation, functional analogues et cetera, this article deals both with primary and secondary term formation, paying special attention to secondary term formation in particular.
In this paper, we analyse techniques and methodologies for the development of ICT terms. We also research the extent to which these terms are ingrained in the scientific research literature and everyday language and summarise the results.

Requirements for the development of ICT terms
In an ideal world, a newly created ICT term must meet the following requirements: a systemic approach, accuracy of meaning, brevity of form, uniqueness, mononymity, contextual independence, and emotional neutrality (Skujiņa, 1993, 224).
English ICT terms only sometimes meet all the requirements mentioned above; therefore, sometimes difficulties arise when coining the corresponding Latvian equivalent.
The development of terms is hampered by the fact that in the English ICT terminology: 1. there is no strict distinction between a scientific term and a professional colloquialism, and 2. the choice of terms may not comply with the requirements of scientific terms in the traditional sense of the terminology.
For example, the standard terminology requirements prescribe that a scientific term has to be stylistically neutral, without emotional overtones. Nevertheless, following English usage, ICT terminology includes terms such as 'daemon' and 'vampire tap', and other creatures such as 'crawler', 'snail', 'spider' and 'worm', as well as 'mouse', which is familiar to the everyday computer user. Similarly, colloquial elements are not desirable in terminology; however, English ICT terminology contains 'oldbie' and 'newbie'. These are words borrowed from everyday speech denoting experienced and beginner computer users. The corresponding Latvian equivalents, 'veculis' and 'jaunulis', have partially preserved this expressiveness.
ICT term creators also face the challenge posed by the large number of metaphors found in English terminology. Borrowing metaphors from English is not accepted unconditionally: terminologists want the semantics of a term to be clear and unambiguous, whereas a metaphor uses a word or phrase that denotes one object to characterise another, emphasising the similarity or analogy between them, and thereby makes the semantics of the term ambiguous. Nevertheless, since there are metaphors even in ISO standards, Latvian ICT terms also include them, e.g. 'Trojas zirgs' ('Trojan horse'), a seemingly harmless program that involves the unauthorized collection, falsification or destruction of data, or 'maskarāde' ('masquerade'), an attempt by one user to impersonate another (authorized) user to gain unauthorized access to data or resources (Borzovs et al. 2005, 108-121).
Each language has its own system of lexical semantics; therefore, words from different languages and their meanings are not identical, and neologisms or loanwords are needed when coining terms in a target language. Thus, for the rendering of terms to be as appropriate and equivalent as possible, a comparison must be carried out at the semantic level.
For the work on the creation of ICT terms to be successful, it is necessary to have a defined term development procedure which respects the laws of the target language. In this work, we focus on the Latvian language.

A system of principles for the development of ICT terminology
In this subchapter, we will describe the types of terms developed and the term development process.
We can conditionally divide newly created ICT terms into three groups:
1. Terms created from words used in everyday life and words found in dictionaries.
2. Terms created as loanwords from terms in other languages that are already traditionally used in Latvian colloquial speech.
3. Terms that are neologisms created within the ICT industry in the last 30 years. They may originate both from words of Latvian origin and from borrowings from other languages, namely adapted translingual borrowings (Borzovs et al., 2010, 329-340).
Thus, in order to develop a unified Latvian term system in ICT, the following guidelines have been established in the Information Technology and Telecommunications Sub-commission (hereafter ITTS) (Borzovs et al., 2002, 25-32):
1. For different terms in the source language, correspondingly different terms should be created in the target language, for example, error – kļūda; failure – kļūme; fault – bojājums; bug – blusa; malfunction – disfunkcija.
2. For a polysemantic word that functions as a term in the source language, one should try to find a word with a similar range of meanings in the target language, for example, hard disk – cietais disks.
3. For a neologism, it should be easy to use the newly created term as a basis for further derivations.
4. The Latvian equivalent (namely, functional analogue) of the term must be chosen so that back-translation unequivocally yields the same word in the source language. This is achieved either by "pairing" the basic meanings of words from everyday vocabulary (e.g. mouse – pele) or by creating a neologism in Latvian (e.g. menu – izvēlne; prompt – uzvedne).
5. When borrowing a word, it is necessary to take into account how it fits into the target language semantically, phonetically and morphologically. For example, English borrowings with the endings -ings or -ments do not fit into the Latvian morphological system because they duplicate Latvian words with the endings -šana and -ība.
6. If both an international word and a word of Latvian origin are available as synonyms, preference should be given to the word of Latvian origin, such as file – datne, fails; interface – saskarne, interfeiss, etc.
7. In practice, already established terms should only be changed with sufficient justification.
8. More attention should be paid to everyday terms. They should be short, accurate, euphonious, and easy to perceive. For rarely used terms, the requirements may be more flexible.
When designing a system of principles and criteria, it must be considered that they are all closely interrelated and interact; none of them is entirely isolated. According to Inta Freimane, comprehension of the principles is based on the system of principles and criteria of the literary language norms (Freimane 1993), applying them to the terminology industry.
Following the abovementioned guidelines, we can name primary and secondary principles for the development of terminology; now, let us take a more detailed look at them.

Primary principles
For a term to fit into a system of terms by meaning, the semantic correspondence principle must be observed when creating terms. This principle holds that each lexical pattern has a specific semantic weight that is characteristic of the corresponding language system. For example, terminology distinguishes the meanings of the suffixes -um- and -šan-: apliecinājums (affirmation), apgalvojums (statement) and izcēlums (emphasis) name the result of an action or process, whereas the corresponding derivatives with the suffix -šan- denote the activity or process itself.
The formal correspondence principle holds that words that share a similar form in the original language should share a similar form in the target language.New forms, new words, and syntactical units are developed based on stable models.The principle of formal correspondence is fundamental when creating industry terminology.
Here it is necessary to observe the relationship between the subordination of terms and the analogy of form. The following terms are created by analogy: programmatūra – software; aparatūra – hardware; bezmaksas programmatūra – freeware; grupprogrammatūra – groupware. Formal correspondence is related to equating the form, the formal side, by analogy, using the criterion of the literary exemplar. According to stable models, new forms, new words, and syntactic units are formed.
The functional correspondence principle is related to such basic signs as the brevity of a term, ease of use, and euphoniousness.This principle also holds that short terms are easy to use: they form a system more efficiently, and new elements can be added to them, thus creating sub-concept terms.
Creating compound words is one of the most productive ways of forming short terms in computer terminology. Compounds are created for functional reasons, so that the term can be incorporated into context more easily when the concept to be named consists of the designations of multiple components.
For example, atbalstsistēma (support system) is created from sistēma atbalstīšanai (a system for support); darbderīgs (off-the-shelf) from derīgs darbam (valid for work).
When borrowing or creating a new term, attention is usually also paid to ease of use, i.e., to ensuring that the term can be inflected and used conveniently in collocations, that other parts of speech can be derived from its root, etc. Long terms should be avoided because, firstly, they are challenging to use in written text and oral speech; secondly, such words are seldom used as a base for further derivation. Therefore, short word forms such as aizture (delay), atteice (failure) and piekļuve (access) are the most productive in terminology.
Euphony is essential in both cases: borrowing a term from another language and coining a new word. The principles stated by the Information Technology and Telecommunications Sub-commission (ITTS) allow terms to be borrowed from other languages while paying particular attention to the euphony of the language.
The Latvian language is considered a sonorous language, i.e. it has a sufficient balance of vowels and consonants. This balance should not be upset by contaminating the language with uncharacteristic clusters of sounds, e.g., -pjū- in the word kompjūters (computer).

Secondary principles
Secondary principles are the principle of term dissemination and the principle of tradition. To comply with the principle of term dissemination, the ITTS pays special attention to terms that could be part of the everyday language, namely frequently and widely used terms.
They must be short, apt and euphonious and correspond to the criteria established by Inta Freimane. Terms should be divided into four groups:
1. Terms that are widely used, i.e., words that are already broadly implemented in the language or words that should become part of the everyday vocabulary, such as dators (computer), programmatūra (software).
2. Terms used by many people who work with a computer, regardless of their profession, such as tastatūra (keyboard) or izvēlne (menu).
3. Terms used among specialists, such as aizmugurgaismojums (backlight), serdeņatmiņa (core storage), soļkompilators (incremental compiler).
4. Terms used in a narrow circle of specialists. Due to the lack of time and narrow specialization, the requirements for the terms of this group are not so strict. These terms are often very similar to professionalisms and do not have the function of a scientific term, such as klinčs (clinch), krosasambleris (cross assembler).
The principle of tradition applies if a term is already widely used or was approved several years ago.The terms accepted and approved shall only be altered with justification.
The primary purpose of terminology is to enable effective communication. Isolated and uncoordinated coining and use of terms hinders communication.
In all industries, the term coining process should be based on experience in terminology work, using the already existing term system and developing it according to principles established in practical terminology work.

Process of systematic development of terms
Finding a correct, entirely appropriate, and euphonious term in a target language expressing the same concept as in the source language is challenging.
Therefore, a description of the terminology development process is proposed to facilitate this work and to follow the above-mentioned principles of term development and related criteria consistently. The description reflects the process of terminology development by selecting or creating a Latvian equivalent corresponding to the English term.
This description is based on the work experience of LAS ITTS but can also be used in other industries.
In order to understand the decisions in secondary term creation, the pseudo-algorithm for the term-creating process is visualised in four flow-charts, one in each of the following four subchapters. In most cases, the numbering within the blocks of a flow-chart indicates the paragraph of the subchapter in which the corresponding decision or action is described in more detail.

Use of existing term sources (see flow-chart 1.)
The first step towards finding an adequate equivalent is to understand the concept. Various term dictionaries and general explanatory dictionaries in English, as well as research conducted within the industry, are helpful for comprehending the concept.
When the concept is understood, the next step is more complicated: the quest for an equivalent term in the target language that names the concept precisely (see flow-charts 1, 2, 3 and 4).
Thus, the first step in secondary term creation is examining existing term sources to ascertain whether the term has already been secondary created in Latvian. The first step consists of evaluating four main criteria, visualised in the flow-chart below.

Flow-chart 1. Use of existing term sources
Let us take a closer look at each of the criteria.

Can the term be found in the industry term base?
For example, standards of ICT industry terms can be found on the Internet in AkadTerm (http://www.akadterm.lv/), the academic term database.
If the term correctly denoting the concept can be found in the AkadTerm, then the equivalent of the term already accepted in Latvian is used.
If the term is not found, then we check:

Can the term be found in Latvian term dictionaries or other term resources?
Not only the term resources of the ICT industry but also those of other industries can be used; for example, the Latvian National Portal of Terminology (https://termini.gov.lv) provides numerous term dictionaries developed by the Terminology Commission of the Latvian Academy of Sciences (LAS-TC), bulletins of the Terminology Commission and other reliable sources.
If the term can be found in term dictionaries or other term sources, the corresponding Latvian equivalent of the English term shall be used.
If the term is not found, then check:

Can this term be found in the general English-Latvian dictionary as a term?
For the identified term to be relevant, it is recommended to choose the newest and most complete edition of the dictionary.
If this term is found in the general English-Latvian dictionary as a term, the corresponding Latvian equivalent of the English term shall be used. If this is not possible, we check:

Can the term be found in the general English-Latvian dictionary as a common word?
If it can be found in the general dictionary, we choose the Latvian word that could express the respective concept.
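The four checks of this first step amount to an ordered lookup that stops at the first source listing the term. A minimal sketch of that cascade follows, assuming each source is available as a plain English-to-Latvian mapping; the function name, the source labels and the toy entries are hypothetical and serve only as illustration, since the real sources (AkadTerm, the term portal, printed dictionaries) are consulted by terminologists rather than through such an interface.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical labels; the order mirrors the four checks described above.
SOURCE_ORDER = (
    "industry term base",
    "term dictionaries and other term resources",
    "general dictionary (as a term)",
    "general dictionary (as a common word)",
)


def find_existing_equivalent(
    english_term: str,
    sources: Mapping[str, Mapping[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return (latvian_word, source_label) from the first source that lists
    the term, or None if no source lists it."""
    key = english_term.strip().lower()
    for label in SOURCE_ORDER:
        table = sources.get(label, {})
        if key in table:
            return table[key], label
    return None


# Toy data built from examples in this paper; where each entry is stored
# is an assumption made for the sketch.
sources = {
    "industry term base": {"menu": "izvēlne"},
    "general dictionary (as a common word)": {"mouse": "pele"},
}
print(find_existing_equivalent("Menu", sources))
print(find_existing_equivalent("mouse", sources))
```

A hit in the last source yields only a candidate common word, which still has to pass the checks of the second step; a result of None means that a neologism or a borrowing has to be considered.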

Checking common words (see flow-chart 2.)
The second step in secondary term creation, checking common words in Latvian, also follows four criteria described in the flow-chart.

Flow-chart 2. Checking common words
Let us take a closer look at each of the four criteria.

Does the chosen Latvian word correspond exactly to the concept to be expressed?
Explanatory dictionaries can be used to help determine whether the word conforms to the concept. If it expresses the concept precisely, the Latvian word is chosen as a candidate equivalent of the English term. Next, we check:

Is this Latvian word already used as a term in industry terminology?
Efforts should be made to ensure that one term in the source language corresponds to one specific term in the target language, at least within one industry, e.g. security – drošība; safety – nebīstamība. If the Latvian word is already used as a term to express another concept, polysemy arises, which complicates the correct perception of information.
If the chosen word is occupied in the term system, there are two options: to create a new word, a neologism (see the flow-chart for creating a neologism), or to choose a synonymous word (see flow-chart 1).
If the word is not used as a term in the industry terminology, the following aspects shall be verified:

Is the back-translation into English possible for the chosen Latvian word?
It is essential to choose an equivalent in the target language such that back-translation of the term results in the same word as in the source language; this is achieved by creating stable pairs of words in both languages, "pairing" the basic meanings of the words, e.g. item – vienums, unit – vienība.
If back-translation from the target language (Latvian) into the source language (English) yields the same word as in the source language, the Latvian word is used as the term. If not, we either create a neologism (see flow-chart 3) or borrow the source-language word (see flow-chart 4).
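The occupancy check and the back-translation check described above can be illustrated with two small predicates. This is only a sketch under stated assumptions: the term base is represented as an English-to-Latvian mapping and the back-translation source as a Latvian-to-English mapping listing all renderings of a word; both structures and both function names are hypothetical, and the entries are taken from the examples in this paper.

```python
from typing import Mapping, Sequence


def is_occupied(latvian_word: str, english_term: str,
                term_base: Mapping[str, str]) -> bool:
    """True if latvian_word already renders a different English term."""
    return any(lv == latvian_word and en != english_term
               for en, lv in term_base.items())


def back_translates(latvian_word: str, english_term: str,
                    reverse_dictionary: Mapping[str, Sequence[str]]) -> bool:
    """True if the only back-translation of latvian_word is english_term."""
    return list(reverse_dictionary.get(latvian_word, [])) == [english_term]


# Toy data from the examples above.
term_base = {"security": "drošība", "safety": "nebīstamība"}
reverse_dictionary = {"vienums": ["item"], "vienība": ["unit"]}

print(is_occupied("drošība", "safety", term_base))             # word already taken by "security"
print(back_translates("vienums", "item", reverse_dictionary))  # stable pair item / vienums
```

Requiring exactly one back-translation is a deliberately strict reading of "the same word as in the source language"; a looser variant could accept the candidate whenever the expected English word is the first or dominant rendering.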

Is this term a classical internationalism?
When borrowing a word, preference should be given to international words, i.e. words that have the same meaning in different languages and are similar in pronunciation and writing. In this article, a word of Greek or Latin origin is considered a classical internationalism. When determining whether a word is international, it is necessary first to find words in other languages that can be recognised as the same word in their spoken and written form as well as in meaning.
An example: in English, process is defined as the "systematic execution of operations to obtain a particular result, e.g., the conversion of raw information into usable data"; the corresponding words are process in Swedish, processus in French, Prozeß in German and process in Latvian.
The meaning of international words that have been long-established in Latvian should be clarified, and then they can be used in the terminology of various industries.For example, the terms navigācija (navigation) and naviģēt (navigate) are used in maritime terminology and ICT terminology.
Nevertheless, when borrowing an international word, special attention should be paid to false friends, i.e. words with formal similarities that may have different meanings. These are usually internationalisms that are based on classical languages but have developed different meanings in different languages.
An example: the word komunikabls, commonly used in Latvian nowadays, corresponds to English communicative, whereas English communicable means lipīgs (contagious) or paziņojums (announcement).
If the term is not international or if it is ambiguous, we either create a neologism (see flow-chart 3) or borrow the source-language word (see flow-chart 4).

Create a neologism (see flow-chart 3.)
The third step in secondary term creation, creating neologisms in Latvian, follows six criteria, reflected in the flow-chart.

Flow-chart 3. Examination of neologisms
Let us take a closer look at each of the six criteria.
First of all, we create a new word when the following criteria are met: there is no appropriate word of Latvian origin to express the concept; there is no corresponding international word, or it is ambiguous, i.e. it has another meaning in Latvian; and the borrowing from English does not fit into the Latvian language system. The neologism should be as short as possible, free of negative connotations, euphonious, and faithful to the original meaning.

Does the neologism express the concept precisely?
A neologism must express the concept precisely. Thus, when evaluating which features of the concept are primary and which are secondary, both the written and the spoken form of the neologism should be considered. It is essential to understand what can be expressed in the language and by what means.
Hence, if the word does not express the concept precisely, an attempt should be made to refine the choice of word-formation means. The meanings of word-formation elements are specialised and differentiated, and with their help it is possible to achieve conceptual accuracy of terms.
In Latvian, a word is usually formed with a suffix, a prefix or an ending. In terminology, these acquire a specialised meaning, which should be considered when creating new terms in Latvian. Misunderstandings about the meaning of prefixes usually do not arise.
The auslaut (the final element of the word) gives the word a special meaning. It is essential to choose the derivative whose auslaut most accurately corresponds to the concept to be expressed. For example, words with the auslauts -šana and -šanās are derived to name a process. However, they are sometimes used inaccurately in the sense of a completed action or its result. In this case, derivatives with the auslaut -ums should be used.

Does the neologism fit into the Latvian language system?
The neologism must fit into the language system; thus, the following aspects should be checked:

Is it phonetically suitable?
Although the judgement of which words are disharmonious or difficult to pronounce is often subjective, there are sound sets that are not euphonious and thus not acceptable in Latvian, so it is desirable to avoid them.

Does it correspond to Latvian word-formation models?
In each language, a system of word-formation models has developed, which the speakers of the language use to create units of communication. These models are persistent, and speakers generally use them unconsciously.
For example, words with two prefixes, except for the prefix ne- (not-), are not characteristic of the Latvian language; therefore, they are not recommended in terminology either. Still, when transferring a term from a source language to a target language, challenges might be encountered when including it in the industry term system.
An example: the Latvian equivalent of the English term select is atlase (selection), but the equivalent of English deselect should theoretically be atatlase (literally "dedeselect"), since the English prefix de- is rendered as at- in the Latvian term system.
Thus, on the one hand, the term atatlase is contrary to the rules of the Latvian language, but on the other hand, it conforms to the system of terms.
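The two-prefix restriction lends itself to a rough automatic pre-check that flags candidates for human review. The sketch below is a naive string heuristic, not a morphological analyser: the prefix list is an illustrative selection of common Latvian prefixes that is not taken from this paper, the function names are hypothetical, and ordinary words whose stem merely begins with a prefix-like syllable will be flagged by mistake.

```python
# Illustrative list of common Latvian prefixes (an assumption of this sketch).
PREFIXES = ("aiz", "ap", "at", "ie", "iz", "no", "pa", "pār", "pie", "sa", "uz")
EXEMPT_PREFIX = "ne"  # the exception named in the text


def strip_prefix(word):
    """Return the word without its leading prefix, or None if none matches."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) > len(prefix):
            return word[len(prefix):]
    return None


def has_stacked_prefixes(word):
    """Flag a word that appears to begin with two prefixes in a row.

    A leading "ne" is removed first because that combination is permitted.
    """
    w = word.lower()
    if w.startswith(EXEMPT_PREFIX):
        w = w[len(EXEMPT_PREFIX):]
    rest = strip_prefix(w)
    return rest is not None and strip_prefix(rest) is not None


for candidate in ("atlase", "atatlase", "izvēlne"):
    print(candidate, has_stacked_prefixes(candidate))
```

Such a check can only raise a question; whether a flagged form is nevertheless acceptable within the term system remains a decision for the terminologists.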

Is the neologism easy to use in the Latvian language system?
The newly developed term should be easy to use: easy to inflect, to derive further words from, and to use in word groups. Long terms should be avoided because, firstly, they are challenging to use in both written and spoken communication; secondly, such words can rarely be used as a basis for further derivatives.

Is the neologism emotionally neutral?
Emotional neutrality is essential in forming terms, especially in forming neologisms. However, if a term in the source language has a stylistic or emotional connotation, this justifies creating an emotionally expressive term in Latvian.
An example: the English oldbie and newbie already belong to a colloquial style. Thus, in Latvian, they are rendered as veculis (elderly) and jaunulis (youngster) to denote an experienced computer user and a novice in this field. The term muļķudrošs (foolproof) is likewise not typical of academic language in either the source or the target language.

Borrowing of an English word (see flow-chart 4.)
The fourth step in secondary term creation, borrowing English words in Latvian, follows five criteria, as seen in the flow-chart.

Flow-chart 4. Checking the borrowings
An English word is borrowed when there is no corresponding word of Latvian origin for the relevant concept and it is challenging to coin one, and when there is no corresponding international word or the international word is ambiguous.
When terms are borrowed from other languages, they are adapted to the phonetic, morphological and lexical systems of the Latvian language.

Is the term pronounced in English the same way as it is written?
The pronunciation and spelling rules of the English language differ significantly from the phonetic system of the Latvian language.
Thus, those English words with no difference in pronunciation and spelling fit well in the Latvian language.

Does this English term denote a widely used concept?
If a widely used English term is pronounced differently than it is written, then a word of Latvian origin should be chosen.
For a professional term, if it is already used among industry specialists, it is acceptable to borrow the word according to the transcription of the word pronunciation.

Is the borrowing euphonious?
The judgement of which sound sets are disharmonious or difficult to pronounce is usually subjective. Some borrowings are in common use and are not perceived as foreign. If a borrowing is perceived as foreign, it is replaced by a word of Latvian origin.
For example, the term skrembleris (scrambler) is perceived as foreign, so it would be desirable to replace it with a name of Latvian origin.

Does the borrowing fit into the morphological system of the Latvian language?
In Latvian, the borrowings should be included in the inflectional system and used as a further derivative base.

Does the borrowing create misleading associations?
When the root of the English word coincides with a Latvian word, it can create misleading associations. Therefore, such borrowings should be avoided and replaced with a neologism of Latvian origin. An example: the forms skaneris (scanner) and skanēt (to scan) are used incorrectly.
These forms create associations with the Latvian verb skanēt (to sound), meaning radīt skaņu (to create sound), thus sometimes creating a comical effect. Therefore, skeneris is approved as the term; in addition, it complies with the established principle of borrowing a word according to the transcription of its pronunciation.
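The five questions of this fourth step can be read as a small decision procedure over judgements made by a terminologist. The following is a hedged sketch of one such reading: the class, its field names and the outcome labels are hypothetical, the order of the branches is an interpretation of the text rather than a transcription of flow-chart 4, and rejecting a borrowing that does not fit the morphological system is an assumption, since the text only states that borrowings should fit.

```python
from dataclasses import dataclass

REJECT = "reject this borrowing and look for a word of Latvian origin"
BORROW = "borrow the word, adapted to the Latvian language system"
BORROW_BY_PRONUNCIATION = "borrow the word according to the transcription of its pronunciation"


@dataclass
class BorrowingJudgement:
    """Human judgements about one candidate borrowing (hypothetical field names)."""
    spelled_as_pronounced: bool
    widely_used_concept: bool
    euphonious: bool
    fits_morphology: bool
    misleading_association: bool


def assess_borrowing(j: BorrowingJudgement) -> str:
    # Borrowings that sound foreign or evoke an unrelated Latvian word are avoided.
    if not j.euphonious or j.misleading_association:
        return REJECT
    # Assumption: a borrowing that cannot be inflected or derived from is rejected too.
    if not j.fits_morphology:
        return REJECT
    if j.spelled_as_pronounced:
        return BORROW
    # Spelling and pronunciation differ: widely used concepts get a Latvian word,
    # specialist terms may be transcribed by pronunciation.
    return REJECT if j.widely_used_concept else BORROW_BY_PRONUNCIATION


# The form "skaneris" evokes the verb "skanēt" (to sound), so it is rejected.
skaneris = BorrowingJudgement(
    spelled_as_pronounced=False,
    widely_used_concept=False,
    euphonious=True,
    fits_morphology=True,
    misleading_association=True,
)
print(assess_borrowing(skaneris))
```

Keeping the judgements as explicit fields makes it visible which criterion led to a rejection, which is useful when a decision is later revisited.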
On the one hand, the development of terminology should not be a formal process; on the other hand, such a constructive approach requires compliance with all the principles of terminology development, which are essential for the inclusion of a new concept and term in the system of industry terms and concepts.

Methodology for selecting the most common ICT terms
For the study (Borzovs et al., 2013, 108-126), the ICT corpus FMC (MP) was assembled using the Focussed Monolingual Crawler (FMC) tool of the ACCURAT project. The corpus includes Latvian ICT webpages of Latvian news portals (Apollo, DELFI, Diena.lv, etc.), blog entries (krizdabz.lv, aidzis.lv, knagis.miga.lv, etc.), product reviews (Androids, iPods, etc.), press releases and tutorials (Microsoft, Samsung, Lattelecom, etc.). The corpus was deliberately collected from web domains directly or indirectly linked to the ICT field, as well as from social networking portals; otherwise, there would be many more non-essential (in other words, semantically inappropriate) text units. Even so, the corpus contains almost 5.5 million tokens in unique sentences. This is a representative number for creating an overview of trends in the use of terms and for evaluating different functional genres and styles, including slang.
The analysis is based on the official English-Latvian information technology, telecommunications and electronics term database, approved by the Subcommission of Information Technology, Telecommunication and Electronics Terminology of the Latvian Academy of Sciences. The Term Database contains more than 7500 entries of ICT terms.
Officially approved terms were used to analyse and produce statistics on the English-Latvian bilingual corpus, which was obtained from the web (both English and Latvian sources are used). The corpus statistics are summarised in Table 1. The purpose of the analysis was to find the most commonly used ICT terms in English that have corresponding equivalents in Latvian. The analysis was restricted to a corpus containing content created in both languages (English and Latvian) on the same subject matter. The aggregated corpus was used to identify the 200 most popular officially approved terms using valid criteria and to perform further analysis based on a list of these terms. At the next stage, a list of Latvian equivalents corresponding to all English terms found was created semi-automatically.

Three sources were used to create pairs:
The first source was the officially approved dictionary of terms; the second source was the Latvian equivalents of English terms found in the bilingual corpus.
Since the corpus is distinctly comparable, it was aligned at the phrase level (using the statistical machine translation platform LetsMT!); namely, a set of possible equivalents in Latvian was found for each English phrase (up to 7 units of text). Subsequently, using the phrase alignment, all possible forms of Latvian words were found for each English term (including different case variants). Automatic alignment also introduces statistical "noise", so a field expert manually checked the alignment results and deleted erroneous alignments.
For example, after phrase alignment and data validation, the following Latvian equivalents were obtained for the English term "cookie": "sīkdatne", "sīkfails" (and the grammatical forms of these words). A complete list of possible form variants was not needed, since only words without endings were used in the analysis of the Latvian language corpus, and inflectional forms were grouped.
The third source was equivalents that were added manually.
Not all equivalents used in society were found in the bilingual corpus, so two field experts manually added words to the list of equivalents. The list also included colloquial variants of words known to the experts.
In the semi-automated pairing process, 997 different Latvian terms (excluding inflectional forms) were found for the 200 most popular English terms. Thus, an average of five Latvian equivalents were assigned to each English term.
To analyse the use of Latvian ICT terms in society, a specialised monolingual corpus of Latvian texts related to the field of ICT was collected on the Internet. The Focussed Monolingual Crawler (FMC) program developed by the ACCURAT project was used to collect the corpus. Table 3 (below) summarises the corpus statistics. After collecting and filtering the data, the frequency of use of the 997 different Latvian terms in this corpus was calculated. Statistics on the word forms of terms in web domains were compiled, and the frequencies of the individual word forms were added up, both for each Latvian term and for the English term. Latvian ICT terms are coined according to a semi-algorithmic approach, following the principles of ICT terminology development.
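Counting term frequencies over grouped inflectional forms can be illustrated with a short sketch. It assumes, purely for illustration, that the forms of a term are recognised by a shared ending-less stem matched as a token prefix; the function name `count_stem_frequencies`, the stems and the sample sentences are hypothetical and are not taken from the tools used in the study. Prefix matching can over-count, because a short stem may also match unrelated words, so real data would need additional filtering.

```python
import re
from collections import Counter
from typing import Iterable


def count_stem_frequencies(text: str, stems: Iterable[str]) -> Counter:
    """Count tokens that start with one of the given ending-less stems."""
    stems = [s.lower() for s in stems]
    # \w is Unicode-aware for str patterns, so Latvian letters such as "ī" are kept.
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for token in tokens:
        for stem in stems:
            if token.startswith(stem):
                counts[stem] += 1
                break  # assign each token to at most one stem
    return counts


# Hypothetical sample text containing inflected forms of two equivalents of "cookie".
sample = "Sīkdatnes tiek saglabātas pārlūkā. Šī sīkdatne ir neliels sīkfails."
frequencies = count_stem_frequencies(sample, ["sīkdatn", "sīkfail"])
print(frequencies["sīkdatn"], frequencies["sīkfail"])
```

Here "sīkdatnes" and "sīkdatne" are grouped under one stem, which mirrors the grouping of inflectional forms described above.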

Is the semi-algorithmic approach used?
For the list of 200 ICT terms obtained in the selection, we contrasted the officially accepted Latvian terms and terms coined according to the criteria described above.
We determined that the corresponding Latvian term sources can be conditionally divided into three groups.
1. Terms created from words of Latvian origin or words that are found in a general English-Latvian dictionary. The validity of the choice of these terms is checked in flow-chart 1. First of all, it is checked whether the term can be found in the reference base of industry terms or in other term dictionaries, or is listed as a common word in a general English dictionary. If the answer is affirmative, the accepted Latvian equivalent of the English term is chosen. Flow-chart 2 shows the steps needed to check whether the selected word corresponds precisely to the concept to be expressed.
2. Terms coined from foreign words used in Latvian before the personal computer age and, thus, before the establishment of official terminology. These terms are examined the same way as the terms of the first group.
3. ICT terms formed in the personal computer age using words of Latvian origin, transliterations and transcriptions of foreign words. The examination of neologisms is described in flow-chart 3, and flow-chart 4 analyses whether a neologism based on borrowing from another language fits into the morphological system of the Latvian language.
Thus, it can be concluded that terminologists have acted per the rules defined more than 30 years ago.
The list includes 115 ICT terms based on words of Latvian origin and 104 terms based on foreign words used before the computer age. 62 terms have been newly coined, including 39 ICT neologisms of Latvian origin, 19 terms coined from foreign words by transliteration and 4 terms created from foreign words by transcription. The number of Latvian terms is larger because synonyms of terms are also included in the total.
Although the words used in terms of the first group belong to the modern general vocabulary, once they are assigned the function of a term they denote a specific concept, which is usually narrower than the general meaning of the word. For example:
poga (button): a physical button is usually a constructive element of a pointing device (e.g. a mouse), or it is imitated on the screen (icon). An imitated button is "pressed" by moving the cursor onto it and clicking the mouse.
lasītājs (reader): a device that reads encoded information recorded in a data medium and converts it into a form suitable for further processing.
žurnāls (log): a file used by the operating system for collecting and accounting statistical information, various reports and other data.

Habits of using ICT terms in public communication
It should be acknowledged that until now, there has been almost no research on the habits of using official Latvian terms in the ICT industry. Primarily, there have been only assumptions and prejudices.
Therefore, analysis is needed to bridge the gap between the "terminology commission" and the "terminology users" and to dispel unnecessary stereotypes. In order to be as objective as possible, the use of official terms was evaluated on a scale from 0 to 5, depending on how often (as a percentage) the official term is used in comparison with other known variants of the term. In particular, the following division into categories was used:
0-1%: 0 points
1-10%: 1 point
11-30%: 2 points
31-50%: 3 points
51-80%: 4 points
81-100%: 5 points
Each Latvian term was evaluated separately (for example, interfeiss, a word of English origin, and saskarne, a word of Latvian origin, both mean "interface"); thus, the number of entries to be analysed increased, and the list of the most popular terms included 252 Latvian ICT terms (see Diagram 1).
Diagram 1. Popularity scale of Latvian ICT terms.
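The scoring scale can be expressed as a small function. The following is a minimal sketch under stated assumptions: the names `usage_share` and `popularity_score` are hypothetical, each band's upper limit is treated as inclusive (the published bands overlap at 1% and leave the 10-11% range unspecified, so this is one possible reading), and the occurrence counts in the example are invented for illustration.

```python
def usage_share(official_count: int, all_variants_count: int) -> float:
    """Share (in percent) of occurrences that use the given term among all known variants."""
    if all_variants_count <= 0:
        raise ValueError("no occurrences to compare against")
    return 100.0 * official_count / all_variants_count


def popularity_score(share_percent: float) -> int:
    """Map a usage share in percent to the 0-5 point scale."""
    if not 0 <= share_percent <= 100:
        raise ValueError("share must be between 0 and 100")
    # (inclusive upper limit in percent, points); inclusiveness is an assumption
    bands = [(1, 0), (10, 1), (30, 2), (50, 3), (80, 4), (100, 5)]
    for upper_limit, points in bands:
        if share_percent <= upper_limit:
            return points
    raise AssertionError("unreachable: 100 is the last upper limit")


# Invented counts: 120 occurrences of "saskarne" and 80 of "interfeiss".
total = 120 + 80
print(popularity_score(usage_share(120, total)))  # share of 60%
print(popularity_score(usage_share(80, total)))   # share of 40%
```

Because each Latvian term is scored separately, two synonyms for the same English term receive their own scores, as in the interfeiss/saskarne example.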
We will now explain the rating of each category in more detail.
One-tenth of the terms included in the analysis were left unrated for statistical or semantic reasons, namely:
1. if the frequency of use of the corresponding English equivalent in Latvian was too low, that is, not more than 100 cases (e.g. darblapa (worksheet) and vaicājums (query), which are, respectively, the 178th and 181st most popular English ICT terms);
2. if it is a polysemic word whose frequency of use cannot be analysed qualitatively without context (for example, translations of the English words "set", "map", "sign").
In this analysis, a term that has two or more meanings in the system of terms of one industry is considered to be a polysemic or ambiguous term, according to the explanation given in Valentīna Skujiņa's book "Principles of Development of Latvian Terminology".
A term is not considered ambiguous if it has acquired a terminological function by transferring the meaning of a common word. Examples from the electronics industry are atmiņa (memory) and adrese (address).
Mark 0 (0-1% frequency of use) was assigned to the following official Latvian ICT terms, almost all of which are newly coined terms or borrowings from English:
When the term "application software" is rendered as aplikācija in Latvian, it might create a misleading association with aplikāciju papīrs (paper used for creating applications) and dūņu aplikācijas (applying mud for healing purposes), especially when the context is ambiguous and allows multiple interpretations.
Neologisms: dators (computer), galvene (header), izšķirtspēja (resolution), vietturis (placeholder)
Trends:
• users like to use common words as terms;
• users readily accept the official term if it aptly describes the functionality and arrives at the right time;
• initial objections to a term do not mean that it will not become popular over time.
Summing up the research on the popularity of terms, it can be said that the Latvian equivalents of the terms "application" and "blog" have been unsuccessful for the time being. One reason for the failure could be too many synonyms and a difference of opinion among the terminologists.
Preliminary conclusions regarding the data analysed could be formulated as follows:
1. Users of Latvian terminology tend to merge semantic groups: terms that form one semantic block in English and are used as synonyms are also named interchangeably in Latvian (display/monitor/screen - displejs/monitors/ekrāns; folder/directory - mape/direktorijs/katalogs; key/button - taustiņš/poga). Minor distinctions do not matter in general use.
2. Terms of Latvian origin and terms of English origin occur in almost equal proportions (interfeiss/saskarne - interface/interface); significant preference is given to common words used in the meaning of terms.

Conclusions
1. The main task of creators of Latvian ICT terminology is to choose words of Latvian origin for newly coined terms as much as possible.
2. The task of ICT terminologists is to assess the need to introduce English borrowings into ICT terminology and to maintain a balance between terms based on Latvian words and borrowings from English. Since most terms are created in the ICT professional environment, where communication takes place in English, maintaining this balance and creating new Latvian terms is continuous daily work.
3. The driving force behind the development of ICT terminology is the need to translate into Latvian the English-language documentation that provides information about new technologies and equipment entering our daily lives (Skujiņa, 2011).
4. For users, the systematicity of terms is essential, as well as the accuracy of meaning and the brevity of form; the uniqueness of terms is less important.
5. Official ICT Latvian terms are used in public communication: 43% of the most frequently used Latvian terms are used in more than half of the number of cases mentioned in the collected body of texts; 32.5% of the terms are used in most cases (more than 80% of cases).
6. The popularity of terms is only partially based on linguistic aspects. Mostly, the functional aspect and timeliness are the deciding factors that explain why even properly created ICT terms sometimes do not gain popularity, while other terms enter the everyday language easily.
7. If a term is needed, it is gradually introduced, even if it is not euphonious or is initially highly criticized by users, e.g., maršrutētājs (router), galvene (header).
8. There is a need for rapid dissemination of newly coined terms and further analysis of user reactions when a term is not publicly accepted.
9. The insight into the last 30 years shows how important systematic and methodical work in the ICT terminology field has been to Latvian terminology development. Up to February 2023, more than 9000 terms have been discussed and approved.
10. Latvian ICT terms have been approved, disseminated, and used across a broad stylistic range: from academic discourse to popular science articles, from broadcasting programs to everyday communication.
11. It can be seen from the description and analyses that the systematic approach to secondary ICT term formation provides solid guidelines and serves as a guiding star to orient the term formation process. Speaking figuratively, although a "perfect translation" seldom exists, it is still worth striving for this perfection. The same can be said about creating the "perfect secondary term".
12. To facilitate term dissemination in actual use, it is crucial to collaborate with translation agencies in Latvia and the EU, ministries, schools, translation departments in the EU and other institutions.
13. Based on the conclusions from the research work carried out in this article, we outline the possible research avenues for facilitating the three most essential aspects of secondary ICT term development in Latvian, namely: the preparation process (the selection of terms for coining the corresponding terms in Latvian); automation of the process of searching for and amalgamating information on term definitions, terms and parallel texts; and, last but not least, the process of term dissemination in society, from the academic environment to popular science and everyday use.

Table 1: Statistics of the English-Latvian language bilingual corpus collected on the web.

Table 2: Top 10 most popular English terms in the bilingual corpus (excluding parts of speech and grammatical number).

Table 3: Latvian language ICT corpus statistics collected on the web.

• use a semantically similar but simpler word, sometimes a term used in the databases of specific products, e.g. Microsoft (use: logrīks, vidžets; do not use: ekrānvadīkla; use: rīki, tūļi; do not use: rīkkopa; use: attiecība, saistība; do not use: relācija);
• some terms have too many synonyms in Latvian.
Mark 1 (1-10% frequency of use) was assigned to the following official terms of Latvian origin: birka (tag), būvējums (build), datne (file), datubāze (database), iesūtne (inbox), īsinājumikona (shortcut), kārtula (rule), klēpjdators (laptop), krātuve (storage), lietojumprogramma (File Manager), lietotne (application program), piezīmjdators (notebook computer), rakstzīme (character), rediģēt (edit), sīkrīks (gadget), tīmekļa dienasgrāmata (weblog), tīmekļa pārlūkprogramma (web browser), vadīkla (control).
• a commonly used term is replaced by another one that might be more functionally accurate (use: mainīt; do not use: rediģēt; use: atzīme, tags; do not use: birka; use: darbība; do not use: operācija);
• use a semantically similar but simpler word (use: zvans; do not use: izsaukums; use: simbols, burts; do not use: rakstzīme);
• do not use too "correct" and embellished terms (use another official term: fails instead of datne; use: kontrole instead of vadīkla; use: programma, risinājums, aplikācija, proga instead of lietotne; use: blogs instead of emuāri; use: noteikums instead of kārtula);
• use another common word instead of words of English origin (use: variants, iespēja, izvēle, not opcija; use: ekrāns, monitors, not displejs);
• if there are several official terms as synonyms, do not use any of them and choose a word-for-word translation from English, which already has different semantics in