Telemedicine as a special case of machine translation

doi:10.1016/j.compmedimag.2015.09.005

Computerized Medical Imaging and Graphics

Volume 46, Part 2, December 2015, Pages 249-256


Highlights

• We created a machine translation system for the medical text domain.
• We adapted it for the PL-EN language pair.
• We improved translation quality by adapting additional training data and interpolating language models.
• A number of experiments with different methodologies were performed.
• High-quality translations were obtained in the evaluation process.

Abstract

Machine translation is evolving quite rapidly in terms of quality. Nowadays, several machine translation systems are available on the web that provide reasonable translations. However, these systems are not perfect, and their quality may decrease in some specific domains. This paper examines the effects of different training methods on a Polish–English Statistical Machine Translation system for medical data. Numerous elements of the EMEA parallel text corpora and the unrelated OPUS Open Subtitles project were used as the basis for creating phrase tables and different language models, including the development, tuning, and testing of these translation systems. The BLEU, NIST, METEOR, and TER metrics were used to evaluate the results of the various systems. Our experiments deal with systems that include POS tagging, factored phrase models, hierarchical models, syntactic taggers, and other alignment methods. We also carried out a deep analysis of the Polish data as preparatory work before automated data processing such as true casing or punctuation normalization. Normalized metrics were used to compare results. Scores lower than 15% mean that the machine translation engine is unable to provide satisfactory quality, scores greater than 30% mean that translations should be understandable without problems, and scores over 50% reflect adequate translations. The average scores for Polish to English translations for BLEU, NIST, METEOR, and TER were relatively high and ranged from 70.58 to 82.72. The lowest score was 64.38. The average score ranges for English to Polish translations were a little lower (67.58–78.97). Real-life implementations of the presented high-quality machine translation systems are anticipated in general medical practice and telemedicine.


Introduction

Statistical Machine Translation (SMT) is the translation of text by a computer, with no human involvement. SMT systems have no knowledge of language rules. Instead, they “learn” to translate by analyzing large amounts of data for each language pair. They can be trained for specific industries or disciplines using additional data relevant to the sector in question. Typically, SMT systems deliver fluent-sounding but less consistent translations.

Machine translation is evolving quite rapidly in terms of quality. Currently, several machine translation systems that provide reasonable translations are available on the web. These systems are not perfect, and their quality may decrease in some specific domains. The scientific community is also actively involved in machine translation: scientific organizations, conferences, and events dedicate great effort to its improvement. One of the biggest advantages of machine translation is that most users do not require perfect translations [1]. Users may only be interested in roughly understanding a text to get an idea of what it is about. However, other users may not be that flexible. For example, the correctness and beauty of writing in medicine may not be important, but the precision and adequacy of the translated message are crucial. In medical communication, a translation error between the patient and the physician, or an error in communication regarding treatment or a diagnosis, may have serious consequences for a patient's health [2].

Recently, due to the growing success of and interest in language technologies, machine translation has been applied to the field of medicine. For example, one study [3] analyzed the feasibility of post-editing machine translations of health-promotional English documents from local and national public health websites in the USA. It was assumed, a priori, that machine translation would not provide high enough quality for the documents to be used as official versions. Nevertheless, language technologies are steadily increasing in quality. It should be expected that, in the not-too-distant future, machine translation will be capable of translating any text in any domain with the required quality.

The medical data domain is, in our opinion, a very narrow, but relevant and promising field of research for language technologies. MT systems can be used for translation of medical records of any kind. Accessing and translating a foreign patient's medical history might even save their life. Preparation of direct speech-to-speech translation systems is also possible. The foreign patient's speech is recognized using an Automated Speech Recognition (ASR) system. After recognition, the speech is translated into another language and synthesized in real-time. For example, the EU-BRIDGE project aims at developing automatic transcription and translation technology that will permit the development of innovative multimedia captioning and translation services of audiovisual documents between European and non-European languages [http://www.eu-bridge.eu].

Obtaining and providing medical information in comprehensive ways appears to be of crucial importance for both patients and physicians [4], [5], [6], [7]. For example, as emphasized by Healthcare Technologies for the World Traveler (HTH) [8], a foreign patient may require an explanation and description of their diagnosis and comprehensive information about available treatment options. In several countries, many residents and immigrants communicate in languages other than the official one.

According to Karliner et al. [9], it is necessary to analyze how human translators could enhance access to health care, including improvement of its quality [10]. Nevertheless, human translators experienced in telemedicine information are very often unavailable to both patients and medical professionals [11]. Although existing machine translation capabilities are imperfect [11], machine translation should reduce the costs associated with medical translation. At the same time, its availability and quality need to increase [12].

Medical professionals, researchers, and patients require adequate access to the abundance of telemedicine information on the Internet [6], [13]. This information can potentially improve our health and well-being. Sharing medical information could improve medical research, as well. English is the most dominant language used in medical science, but not the only one.

Polish is considered to be one of the most challenging West-Slavic languages, due to its complexity. It is a tough language for an SMT system. For example, Polish grammar includes complicated rules and elements, together with an immense vocabulary (a result of its complex declension). Nearly free word order in sentences is also problematic. These are the main reasons for its challenging character. In addition, the Polish language includes 7 cases and 15 gender forms for both nouns and adjectives.

As expected, these facts strongly influence the data and data structure used in statistical translation models. The lack of available and appropriate resources for training SMT systems presents another problem. SMT systems give the best results for concrete and narrow text domains, yet parallel data of adequate quality in the required domains is scarce. In addition, Polish and English differ strongly in syntax. Above all, English is a positional language. This means that the syntactic order, including the word order within a sentence, is highly significant, especially because of the limited inflection of words (for example, the lack of declension endings). Sometimes, the position of a word in a sentence is the only indicator of the sentence's meaning. In English sentences, the subject comes before the predicate, so a sentence is structured in Subject–Verb–Object (SVO) word order. In contrast, Polish has no fixed word order, and the word order itself has no decisive impact on a sentence's meaning. In Polish, one can express the same idea in many ways, which is simply not possible in English. For instance, the sentence “I have bought myself a new car.” can be expressed in Polish as “Kupiłem sobie nowy samochód.”, or “Nowy samochód sobie kupiłem.”, or “Sobie kupiłem nowy samochód.”, or “Samochód nowy sobie kupiłem.” As one can see, this freedom of word order increases the complexity of the translation process.

As a consequence, the development of SMT systems for the Polish language has been considerably slower than for English and other languages. The primary goal of this research is to develop an SMT system for translation from Polish to English and vice versa, with an emphasis on medical data. This paper has the following structure: Section 2 contains an introduction to the preparation of the Polish data. Section 3 presents the English language issues. Section 4 describes the methods associated with translation evaluation. Section 5 presents the results. Sections 6 (Discussion and conclusions) and 7 (Future work) summarize the potential implications and opportunities for future work.


Polish data preparation

The Polish data we used were derived from the European Medicines Agency (EMEA) parallel corpus. This corpus was created from biomedical PDF documents from the agency. It includes documents related to medical products and their translations into 22 official languages of the European Union. It contains roughly 1500 documents for most of the languages, but not all of them are available in every language [14]. It comprises around 80 MB of data and 1,044,764 sentences constructed from

English data preparation

The preparation of the English data was far less complicated than that of the Polish data. We developed a tool to clean the English data by eliminating foreign words, strange symbols, etc. Compared to the Polish data, the English data contained drastically fewer errors. However, some problems needed to be fixed. The most problematic were translations into languages other than English, strange UTF-8 symbols, repetitions, and unfinished sentences. Such errors are typical when corpora are
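The kinds of clean-up named in this section (strange symbols, repetitions, unfinished sentences) can be illustrated with a minimal Python sketch. This is not the authors' tool, which is not described here in detail; the function name `clean_english_line`, the character whitelist, and the rule that a sentence must end in terminal punctuation are all assumptions made for illustration. Foreign-word filtering is deliberately left out, since it would need a language identifier or dictionary.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical character whitelist; a real tool would tune this to the corpus.
_DISALLOWED = re.compile(r"[^A-Za-z0-9\s.,;:!?()'\"%/+\-]")
_REPEATED_WORD = re.compile(r"\b(\w+)(\s+\1\b)+", re.IGNORECASE)


def clean_english_line(line: str) -> Optional[str]:
    """Return a cleaned sentence, or None when the line should be dropped."""
    text = unicodedata.normalize("NFKC", line)
    # Replace symbols outside the whitelist with a space.
    text = _DISALLOWED.sub(" ", text)
    # Collapse immediate word repetitions ("the the" -> "the").
    text = _REPEATED_WORD.sub(r"\1", text)
    # Normalize whitespace and remove spaces left before punctuation.
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([.,;:!?])", r"\1", text)
    # Assumption: a line without terminal punctuation is an unfinished sentence.
    if not text or text[-1] not in ".!?":
        return None
    return text
```

In a parallel corpus, a line dropped on one side would also have to be dropped on the other side so that sentence alignment is preserved.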

Methods of evaluation

Human evaluations of machine translation outputs require considerable effort and are expensive. Human evaluations can take days or even weeks to finish. Automatic metrics are therefore needed to measure the quality of translations produced by SMT systems. Different automated metrics are used to compare SMT output against human translations. Among the most widely used SMT metrics are:

- the Bilingual Evaluation Understudy (BLEU);
- the U.S. National Institute of Standards & Technology (NIST) metric;
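To make the idea behind BLEU concrete, the following Python sketch computes a single-reference, unsmoothed, sentence-level score from clipped n-gram precisions and a brevity penalty. It is a simplified illustration, not the evaluation tooling used in the experiments: the function names, whitespace tokenization, and scaling to 0-100 are assumptions, and standard BLEU is normally computed at corpus level.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Single-reference, unsmoothed BLEU on whitespace tokens, scaled to 0-100."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # unsmoothed: any zero precision gives a score of 0
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, times the brevity penalty.
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to its reference scores 100, while a candidate that drops the last word of the reference keeps perfect n-gram precision and is lowered only by the brevity penalty.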

Results

We executed the following experiments to determine the optimal translation method from Polish to English, and vice versa. These experiments were performed with test and development data. The data were obtained through random selection and removal from the corpora themselves. We accumulated 1000 sentences for each case. The results of these experiments were evaluated with the BLEU, NIST, TER, and METEOR metrics. It is worth mentioning that a low value of the TER metric is considered to be a
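The random selection and removal of 1000 held-out sentences described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `split_held_out`, the representation of the corpus as a list of (Polish, English) pairs, and the fixed random seed are hypothetical and not taken from the paper.

```python
import random


def split_held_out(pairs, n_held_out=1000, seed=0):
    """Randomly select n_held_out sentence pairs and remove them from the training data."""
    if n_held_out > len(pairs):
        raise ValueError("corpus is smaller than the requested held-out set")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    held_idx = set(rng.sample(range(len(pairs)), n_held_out))
    # Held-out pairs are removed from the corpus, so the two sets never overlap by index.
    held_out = [p for i, p in enumerate(pairs) if i in held_idx]
    train = [p for i, p in enumerate(pairs) if i not in held_idx]
    return train, held_out
```

Calling the function twice on successive remainders would yield separate development and test sets. Selecting by index keeps both sides of each sentence pair together, which matters for a parallel corpus.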

Discussion and conclusions

Wu et al. [40] analyzed statistical machine translation output for six foreign language to English translation pairs (bi-directionally). They built a high-performing in-house system and evaluated its output for each translation pair on a large scale, both with automated BLEU scores and human judgment. They also evaluated Google Translate's performance specifically within the biomedical domain. In their study, automated BLEU scores did not exceed 36.24 for Polish to English

Future work

Using machine translation for medical texts has a high potential for providing benefits to patients, including tourists and people who do not know the language of the country in which they require medical help. Improved access to medical information can be very beneficial for patients, medical professionals, and eventually medical researchers.

Human interpreters with proper medical training are extremely rare and costly. Machine translation could also assist in the evaluation of

Acknowledgements

This work was supported by the European Community from the European Social Fund within the Interkadra project UDA-POKL-04.01.01-00-014/10-00 and Eu-Bridge 7th FR EU project (Grant agreement no. 287658).

Krzysztof Wolk is a PhD student and assistant at the Polish-Japanese Academy of Information Technology. He is the author of numerous technical books (mostly on Windows Server and Mac OS X Server) and a researcher in the area of machine learning. He has been a certified Microsoft, Adobe, w3schools, and Apple specialist since 2008.

References (42)

A.M. Turner et al. Modeling workflow to design machine translation applications for public health practice. J Biomed Inform (2015)

M. Costa-jussà et al. Latest trends in hybrid machine translation and its applications. Comput Speech Lang (2015)

P. Pecina et al. Adaptation of machine translation for multilingual information retrieval in the medical domain. Artif Intell Med (2014)

P. Koehn et al. Open source toolkit for statistical machine translation

M.R. Costa-jussà et al. Machine translation in medicine. A quality analysis of statistical machine translation in the medical domain

K. Kirchhoff et al. Application of statistical machine translation to public health information: a feasibility study. J Am Med Inf Assoc: JAMIA (2011)

O. Dušek et al. Machine translation of medical texts in the Khresmoi Project

N. Pletneva et al. Requirements for the general public health search. vol. Public Technical Report (2011)

L. Goeuriot et al. Report on and prototype of the translation support

M. Gschwandtner et al. Requirements of the health professional search

Worldwide H: Medical Phrases and Terms Translation Demo. In.;...

L.S. Karliner et al. Do professional interpreters improve clinical care for patients with limited English proficiency? A systematic review of the literature. Health Serv Res (2007)

Y. Schenker et al. Patterns of interpreter use for hospitalized patients with limited English proficiency. J Gen Intern Med (2011)

G. Randhawa et al. Using machine translation in clinical practice. Can Fam Physician (2013)

S. Deschenes. 5 benefits of healthcare translation technology. Healthcare Finance News (2012)

C. Zadon. Man Vs Machine: The Benefits of Medical Translation Services (2013)

Tokenization...

P. Koehn et al. Moses: open source toolkit for statistical machine translation

A. Radziszewski. A tiered CRF tagger for Polish

KantanMT—a sophisticated and powerful Machine Translation solution in an easy-to-use package...

