Mining in Discharge Summaries

Advances in Artificial Intelligence (JSAI 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1357)

Abstract

This paper proposes a method for constructing a classifier of discharge summaries. First, morphological analysis and correspondence analysis generate a term matrix from the text data. Then, machine learning methods are applied to the term matrix. Several machine learning methods were compared using discharge summaries stored in a hospital information system. The experimental results show that random forest is the best classifier, compared with deep learning, SVM, and decision tree.
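As a rough illustration of the first step described in the abstract, the sketch below builds a small term matrix (documents by terms) from raw text. It is only a minimal sketch under stated assumptions: whitespace tokenization stands in for morphological analysis, the correspondence analysis step is omitted, and the function name `build_term_matrix` and the toy summaries are hypothetical, not taken from the paper.

```python
from collections import Counter


def build_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    # Assumption: whitespace splitting replaces real morphological analysis.
    tokenized = [doc.lower().split() for doc in documents]
    # Sorted vocabulary gives a stable column order.
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocabulary)
        for term, count in Counter(tokens).items():
            row[column[term]] = count
        matrix.append(row)
    return vocabulary, matrix


# Hypothetical toy summaries, not data from the paper.
summaries = [
    "chest pain admitted chest xray",
    "fever cough admitted",
]
vocabulary, matrix = build_term_matrix(summaries)
```

Each row of `matrix` could then serve as the feature vector of one summary for whichever classifier is being compared.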

This research is supported by Grant-in-Aid for Scientific Research (B) 18H03289 from the Japan Society for the Promotion of Science (JSPS).

Notes

  1. Outpatient clinics are still based on the action-based payment system, even in large hospitals.

  2. The method can also generate \(p\)-dimensional coordinates \((p \ge 3)\). However, higher-dimensional coordinates did not give better performance in the experiments below.

  3. The darch package has been removed from the R package repository. Please check the GitHub repository: https://github.com/maddin79/darch.

  4. 2-fold cross-validation was selected because its estimator gives the lowest estimate of parameters such as accuracy, and the estimation bias is minimized.

  5. DPC codes form a three-level hierarchical system, and each DPC code is defined as a tree. The first level denotes the type of disease, the second level gives the primary selected therapy, and the third level shows the additional therapy. Thus, in the tables, characteristics of the codes are used to represent similarities.
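The 2-fold cross-validation mentioned in note 4 can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the names `two_fold_accuracy`, `train_majority`, and `predict_majority` are hypothetical, and a majority-label classifier stands in for the methods actually compared in the paper.

```python
import random


def two_fold_accuracy(features, labels, train, predict, seed=0):
    """Estimate accuracy with a single round of 2-fold cross-validation."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)  # reproducible random split
    half = len(indices) // 2
    fold_a, fold_b = indices[:half], indices[half:]
    correct = 0
    # Each fold is used once for testing while the other is used for training.
    for test_fold, train_fold in ((fold_a, fold_b), (fold_b, fold_a)):
        model = train(
            [features[i] for i in train_fold],
            [labels[i] for i in train_fold],
        )
        correct += sum(predict(model, features[i]) == labels[i] for i in test_fold)
    return correct / len(labels)


# Hypothetical placeholder classifier: always predicts the majority training label.
def train_majority(train_features, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, feature_row):
    return model
```

Calling `two_fold_accuracy(X, y, train_majority, predict_majority)` with different `seed` values and averaging the results would give a repeated 2-fold estimate.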

Author information

Corresponding author

Correspondence to Shusaku Tsumoto.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Tsumoto, S., Kimura, T., Hirano, S. (2021). Mining in Discharge Summaries. In: Yada, K., et al. Advances in Artificial Intelligence. JSAI 2020. Advances in Intelligent Systems and Computing, vol 1357. Springer, Cham. https://doi.org/10.1007/978-3-030-73113-7_6
