
SNN-BS: A Clinical Terminology Standardization Method Using Siamese Networks with Batch Sampling Strategy

  • Conference paper
Advanced Data Mining and Applications (ADMA 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14177)


Abstract

Clinical terminology standardization is important for the effective integration and sharing of medical information. It aims to convert colloquial clinical descriptions into standard clinical terminologies. However, the accuracy and efficiency of this task are challenged by the gap between colloquial descriptions and standard terminologies, the slight discrepancies among standard terminologies, and the low efficiency of terminology retrieval. To address these challenges, we propose SNN-BS, a novel method for standardizing clinical terminology based on a Siamese network with a batch sampling strategy. SNN-BS enhances its discrimination ability by sampling a set of terminologies that, together with the target terminology, forms a retrieval set. By combining two kinds of similarity, we amplify the differences in features between colloquial descriptions and clinical terminologies while considering deeper semantic relationships. Moreover, we use the lighter Bert-tiny model to encode the terminologies, and we improve the efficiency of terminology retrieval by treating it as a question-and-answer selection task, which reduces the number of comparisons. Finally, we conducted experiments on two datasets to evaluate the performance of our model. The experimental results show that our method reaches accuracies of 91.30% and 90.24% on the two datasets, respectively, outperforming the baselines.
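The retrieval-set idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which two similarities are combined, how negatives are sampled, or how the scores are weighted. The sketch assumes cosine similarity and a negative Manhattan distance as stand-ins, uniform random sampling, and a hypothetical mixing weight `alpha`. The toy vectors stand in for encoder outputs (for example, from a small BERT model), and all function names are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def neg_manhattan(u, v):
    """Assumed second similarity: negative L1 distance (higher is closer)."""
    return -sum(abs(a - b) for a, b in zip(u, v))


def build_retrieval_set(target_id, all_ids, k, rng):
    """Sample k other terminologies and place them alongside the target."""
    negatives = rng.sample([t for t in all_ids if t != target_id], k)
    return [target_id] + negatives


def rank_candidates(query_vec, candidate_ids, term_vecs, alpha=0.5):
    """Score each candidate by a weighted sum of the two similarities."""
    scored = [
        (alpha * cosine(query_vec, term_vecs[t])
         + (1 - alpha) * neg_manhattan(query_vec, term_vecs[t]), t)
        for t in candidate_ids
    ]
    return [t for _, t in sorted(scored, reverse=True)]


# Toy embeddings: T1 is the standard term closest to the colloquial query.
term_vecs = {
    "T1": [1.0, 0.0],
    "T2": [0.0, 1.0],
    "T3": [-1.0, 0.0],
    "T4": [0.7, 0.7],
}
query_vec = [0.9, 0.1]

rng = random.Random(0)
retrieval_set = build_retrieval_set("T1", list(term_vecs), k=2, rng=rng)
ranking = rank_candidates(query_vec, retrieval_set, term_vecs)
print(retrieval_set, ranking)
```

Because the query is compared only against the sampled retrieval set rather than the full vocabulary, the number of comparisons per query is `k + 1`, which reflects the efficiency argument made in the abstract.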



Acknowledgement

This work is partially supported by the National Natural Science Foundation of China under Grant No. 62202282 and the Shanghai Youth Science and Technology Talents Sailing Program under Grant No. 22YF1413700.

Author information

Corresponding author

Correspondence to Nengjun Zhu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wei, X., Wang, X., Zhu, N. (2023). SNN-BS: A Clinical Terminology Standardization Method Using Siamese Networks with Batch Sampling Strategy. In: Yang, X., et al. Advanced Data Mining and Applications. ADMA 2023. Lecture Notes in Computer Science, vol 14177. Springer, Cham. https://doi.org/10.1007/978-3-031-46664-9_19

  • DOI: https://doi.org/10.1007/978-3-031-46664-9_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46663-2

  • Online ISBN: 978-3-031-46664-9

  • eBook Packages: Computer Science, Computer Science (R0)
