Abstract
Despite the rapid accumulation of scientific literature, manually searching it for information useful to the study of structure-activity relationships is inefficient and laborious. Here, we propose an efficient text-mining framework for discovering credible and valuable domain knowledge from abstracts of scientific literature, focusing on Nickel-based single crystal superalloys. First, the credibility of abstracts is quantified in terms of source timeliness, publication authority, and the author’s academic standing. Second, eight entity types and domain dictionaries describing Nickel-based single crystal superalloys are predefined to perform named entity recognition on the abstracts, achieving an accuracy of 85.10%. Third, by formulating 12 naming rules for alloy brands derived from the recognized entities, we extract the target entities and refine them into domain knowledge through the credibility analysis. We then map out the academic cooperative “Author-Literature-Institute” network, characterize the generations of Nickel-based single crystal superalloys, and obtain the fractions of the most important chemical elements in these superalloys. The extracted knowledge is rich and diverse, provides important insights into the structure-activity relationships of Nickel-based single crystal superalloys, and is expected to accelerate the design and discovery of novel superalloys.
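To make the dictionary-driven recognition step concrete, the following is a minimal sketch of greedy longest-match tagging against domain dictionaries. It is an illustration under stated assumptions, not the authors' implementation: the entity type names, the dictionary entries, and the function `tag_entities` are all hypothetical, and matching is case-sensitive for simplicity.

```python
import re

# Hypothetical mini-dictionaries: the type names and entries are illustrative
# assumptions, not the eight entity types or the dictionaries used in the paper.
DICTIONARIES = {
    "ALLOY": ["CMSX-4", "René N5"],
    "ELEMENT": ["Re", "Ru", "Ta"],
    "PROPERTY": ["creep resistance", "stress rupture life"],
}

def tag_entities(text, dictionaries=DICTIONARIES):
    """Return (surface form, entity type, start offset) tuples.

    Uses greedy longest match over tokens: at each position the longest
    token span that appears in a dictionary wins (case-sensitive).
    """
    lexicon = {term: etype for etype, terms in dictionaries.items() for term in terms}
    max_len = max(len(term.split()) for term in lexicon)
    # Tokens are runs of characters other than whitespace and basic punctuation,
    # so hyphenated alloy names such as "CMSX-4" stay in one piece.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"[^\s,.;()]+", text)]
    entities, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            candidate = text[start:end]
            if candidate in lexicon:
                entities.append((candidate, lexicon[candidate], start))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # no dictionary term starts at this token
    return entities

if __name__ == "__main__":
    sentence = "Adding Re improves the creep resistance of CMSX-4."
    for surface, etype, start in tag_entities(sentence):
        print(f"{start:>3}  {etype:<8}  {surface}")
```

A production system would typically add case normalization, abbreviation handling, and a faster lookup structure such as a trie; those are omitted here to keep the sketch short.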
Additional information
This work was supported by the National Natural Science Foundation of China (Grant No. 52073169), the National Key Research and Development Program of China (Grant No. 2021YFB3802101), and the Key Research Project of Zhejiang Laboratory (Grant No. 2021PE0AC02). We also thank the High Performance Computing Center of Shanghai University and the Shanghai Engineering Research Center of Intelligent Computing System for providing computing resources and technical support.
Supporting Information
The supporting information is available online at tech.scichina.com and link.springer.com. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.
Supplementary Information for Domain Knowledge Discovery from Abstracts of Scientific Literature on Nickel-based Single Crystal Superalloys: 11431_2022_2283_MOESM1_ESM.docx, approximately 5.07 MB.
Cite this article
Liu, Y., Ding, L., Yang, Z. et al. Domain knowledge discovery from abstracts of scientific literature on Nickel-based single crystal superalloys. Sci. China Technol. Sci. 66, 1815–1830 (2023). https://doi.org/10.1007/s11431-022-2283-7
DOI: https://doi.org/10.1007/s11431-022-2283-7