The Study of Pigments in Cultural Heritage: A Review Using Machine Learning

In this review, topic modeling, an unsupervised machine learning tool, is employed to analyze research on pigments in cultural heritage published from 1999 to 2023. The review answers the following question: What are the topics and time trends of the past three decades in the analytical study of pigments within cultural heritage (CH) assets? In total, 932 articles are reviewed, ten topics are identified, and time trends in the share of these topics are revealed. Each topic is discussed in depth to elucidate the community, purpose and tools involved in the topic. The time trend analysis shows that dominant topics over time include T1 (the spectroscopic and microscopic study of the stratigraphy of painted CH assets) and T5 (X-ray based techniques for CH, conservation science and archaeometry). However, both topics have experienced a decrease in attention in favor of other topics that more than doubled their topic share, enabled by new technologies and methods for imaging spectroscopy and image processing. These topics include T6 (spectral imaging techniques for chemical mapping of painting surfaces) and T10 (the technical study of the pigments and painting methods of historical and contemporary artists). Implications for the field are discussed in the conclusion.


Introduction
In the last three decades, the analytical investigation of pigments has received increased attention, especially for the analysis of cultural heritage (CH) assets. It is now a core research topic in the field of heritage science and in adjacent fields, such as archaeometry, conservation science, and technical art history. Different scientific tools and methods have been adopted, developed, and optimized for this type of study, involving expertise from the fields of chemistry and physics and, more recently, computer science. Different research questions and issues have been identified, discussed, and addressed. As a result of this development, a vast body of literature is now available. The growth in literature on the subject makes it increasingly challenging to maintain an overview of scholarly knowledge, an issue that will become even more demanding in the coming decades given the current growth rates of academic literature. Additionally, traditional methods of literature review, such as integrative literature reviews and meta-analyses, typically require in-depth manual coding of individual articles, which becomes less feasible on massive datasets and could be prone to coding errors. However, it is important to take stock of published information in a particular subject area to trace the scholarly development in the field, including identifying major topics, time trends, and research communities [1].
Recent applications of natural language processing (NLP) to review the content of a large collection of published studies in medicine [2], public administration [3], economics [4], business studies [5], and heritage science [6] show promising results. In particular, the unsupervised machine learning topic model latent Dirichlet allocation (LDA) has been found to be a reliable literature review tool for synthesizing a large corpus of existing research on a subject. LDA is a generative probabilistic model developed by Blei, Ng and Jordan in 2003 to automatically cluster text documents into a user-set number of clusters of related content, commonly referred to as topics [7]. The basic idea behind the topic model is that similar words tend to co-occur in texts dealing with the same subject. Hence, the LDA topic model is an automated way of text analysis and thus an alternative to traditional literature review methods [4].
This review article is structured as follows: Section 2 describes the methods used for the collection of journal articles and for LDA topic modeling. Section 3 presents the results of the LDA topic modeling together with an analysis of the topics and time trends. Section 4 discusses the implications of the findings and the relevance of the topic modeling for the CH field and provides a conclusion.

Data Collection Process
The text corpus of 932 articles was collected by undertaking a systematic review process. The first step consisted of conducting a search in the Web of Science (WoS) Core Collection by using the following search formula: (TS = "heritage" AND TS = pigment*) (The search was conducted on 18 January 2024 in the Web of Science Core Collection. TS is the abbreviation for topic search in WoS). Subsequently, the search results were refined by setting the document type filter to article and the language filter to English. This search yielded an initial dataset of 1067 articles. During the second step of the data collection process, the Full Records of these articles were imported into an Excel spreadsheet (This was carried out by selecting the Export Records to Excel option in WoS, and subsequently filling in the 'Records from ... to ...' fields in the Record Options, which allow the export of a maximum of 1000 records at once). The third step involved data preprocessing by transforming the initial dataset consisting of 1067 articles into a text corpus suitable for the topic modeling. This step encompassed the manual checking of the Article Title, Abstract, and Source Title of all articles to obtain an optimal dataset of articles for the topic modeling process. Articles were removed from the Excel spreadsheet based on the following criteria. First, articles without an abstract, or which were included twice, were excluded from the initial dataset. Second, conference proceedings or technical briefs, which still made it into the dataset despite setting the document type filter to article, were eliminated. Third, articles focusing on a topic distinct from the analytical study of pigments present in heritage assets (e.g., binding media, cultural heritage stone materials, plastics, gemstones, geo-tourism, biomarkers, skin cancer, albinism) were also removed from the Excel data file. In some instances, the full article was accessed when the Article Title, Abstract, and Source Title did not provide enough details to decide whether it should be included or excluded from the dataset. The outcome of the data collection process was an Excel spreadsheet containing 932 articles, including their meta-information.
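The first two exclusion criteria are mechanical and can be illustrated with a short sketch. The screening in this review was carried out manually in Excel; the Python function below is only an illustration, and the field names (title, abstract, doc_type) are hypothetical rather than the actual WoS export column names. The third criterion, topical relevance, requires human judgment and is not covered.

```python
def screen_records(records):
    """Drop records without an abstract, duplicates, and non-article types.

    Each record is a dict with hypothetical keys 'title', 'abstract' and
    'doc_type'. Topical relevance (the third criterion) stays manual.
    """
    seen_titles = set()
    kept = []
    for rec in records:
        title_key = rec["title"].strip().lower()
        if not rec.get("abstract", "").strip():
            continue  # criterion 1a: no abstract
        if title_key in seen_titles:
            continue  # criterion 1b: included twice
        if rec.get("doc_type", "").strip().lower() != "article":
            continue  # criterion 2: proceedings, technical briefs, etc.
        seen_titles.add(title_key)
        kept.append(rec)
    return kept
```

Matching duplicates on a normalized title is an assumption made for brevity; a DOI or accession number would be a more robust key where available.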

Topic Modeling
The corpus of 932 articles is clustered for text analysis by means of topic modeling. The procedure employs the LDA topic model in the statistical software package Stata (the name is a combination of statistics and data) (StataCorp. 2023. Stata Statistical Software: Release 18. College Station, TX, USA: StataCorp LLC). The ldagibbs command, as developed by Schwarz [4], is used to implement the LDA topic model in Stata. The detailed procedure for LDA topic modeling is described by Harth [6]. The current review is conducted according to this procedure. In the following paragraphs, a brief overview of the procedure's different steps is given.
During the first step, the Excel spreadsheet containing the 932 articles and their meta-information is imported into Stata. The Stata data file thus comprises 932 observations, each row representing a specific article and each column representing a specific variable, such as Article Title, Article Abstract, and Article Source, i.e., the meta-information derived from WoS. The LDA topic model is applied to two of these variables, i.e., Article Title and Article Abstract. The second step consists of a preprocessing procedure to ensure the text corpus is optimal for the LDA topic modeling. This procedure consists of several steps of data denoising, which are executed in Stata. The data denoising process starts with combining Article Title and Article Abstract into one string variable in Stata, which represents the text that will be analyzed. Next, non-alphanumerical characters are eliminated from the text by implementing several Stata commands. The eliminated characters are periods (.), exclamation points (!), question marks (?), colons (:), semicolons (;), commas (,), brackets (()) and blanks ( ). The reason for removing the non-alphanumerical characters is that they can impede LDA's text clustering into the user-set number of topics. After the non-alphanumerical characters have been removed, stop words are eliminated from the article titles and abstracts, as these can also interfere with LDA's clustering, especially due to their abundant usage in all types of texts. For this removal procedure, a pre-defined list of global stop words is employed. The list of global stop words is given by Harth [6]. The removal of the stop words from the corpus of text is the final component in the data denoising procedure.
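The denoising steps can be sketched in a few lines. The review performs them with Stata commands; the Python version below is a minimal illustration under stated assumptions: the stop-word set is a tiny placeholder and not the list of Harth [6], lowercasing is added for convenience, and every non-alphanumerical character is replaced by a space so that neighboring words do not merge.

```python
import re

# Placeholder stop words for illustration only; NOT the list given by Harth [6].
STOP_WORDS = {"the", "of", "and", "in", "a", "by", "on", "for", "to", "is"}

def denoise(title, abstract, stop_words=STOP_WORDS):
    """Combine title and abstract, strip punctuation, drop stop words."""
    text = f"{title} {abstract}".lower()       # one string per article
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # remove non-alphanumerical characters
    return [w for w in text.split() if w not in stop_words]
```

For example, denoise("Raman study: pigments!", "The analysis of (blue) paint.") returns the tokens raman, study, pigments, analysis, blue, paint. Note that this simple pattern also splits hyphenated terms such as "X-ray" and drops non-ASCII letters, which may or may not match the behavior of the Stata procedure.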
After completing the preprocessing procedure, the LDA topic modeling of the text corpus is initiated in Stata by employing the ldagibbs command of Schwarz [4]. The number of topics into which the text corpus is categorized can be selected by the user. In this study, the default of ten topics is selected. This follows the review paper of Harth [6], which uses the same default option (together with other default parameters) to cluster scholarly studies on XRF analysis of painted heritage objects. Moreover, the normalize and mat_save options are specified. The normalize option guarantees that each of the produced variables contains the probability that the article fits a specific topic, whereas the mat_save option is used to save the word probability matrix, which is needed to produce the ten most frequent words within the ten topics.
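To make the role of these two outputs concrete, the sketch below shows, in Python rather than Stata, how per-article topic weights can be rescaled into probabilities and how a word probability matrix yields the most probable words per topic. It assumes that normalization means rescaling each article's topic weights to sum to one; the function names and the matrix layout (one row per topic, one column per vocabulary word) are illustrative and do not describe the internals of ldagibbs.

```python
def normalize_rows(rows):
    """Rescale each row of positive weights so that it sums to 1."""
    normalized = []
    for row in rows:
        total = sum(row)
        normalized.append([value / total for value in row])
    return normalized

def top_words(word_prob, vocab, n=10):
    """Return the n most probable words per topic.

    word_prob[k][v] is the probability of vocab[v] under topic k.
    """
    result = []
    for probs in word_prob:
        ranked = sorted(range(len(vocab)), key=lambda v: probs[v], reverse=True)
        result.append([vocab[v] for v in ranked[:n]])
    return result
```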
Lastly, LDA involves two main components of modeling to cluster the text corpus into ten topics. The first component is the probabilistic model, and the second component is an approximate inference algorithm. The probabilistic model describes the text corpus as a likelihood function. The principle underlying the probabilistic model is that similar words tend to co-occur in texts that consider the same topic. In other words, LDA uses the co-occurrences of words to describe each topic as a probability distribution over words and each article as a probability distribution over topics [4]. An approximate inference algorithm is needed because maximizing the likelihood function generated by the probabilistic model exactly is computationally intractable. In this review, the inference algorithm called Gibbs sampling is applied. Gibbs sampling is a Markov chain Monte Carlo (MCMC) algorithm, a class of methods widely used for Bayesian inference. It repeatedly draws random samples from conditional distributions, which makes it possible to approximate a distribution without completely knowing its mathematical properties [8].
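The interplay of the probabilistic model and Gibbs sampling can be illustrated with a compact collapsed Gibbs sampler. This is a textbook-style sketch in pure Python, not the ldagibbs implementation: the hyperparameters alpha and beta, the number of iterations, and the function name are assumptions chosen for illustration. Each word is repeatedly reassigned to a topic with probability proportional to (how often the topic occurs in the article) times (how often the word occurs in the topic).

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (theta, phi, vocab): article-topic and topic-word probabilities.
    alpha and beta are illustrative Dirichlet priors, not ldagibbs defaults.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]              # counts per article
    topic_word = [[0] * n_words for _ in range(n_topics)]   # counts per topic
    topic_total = [0] * n_topics
    assignments = []
    for d, doc in enumerate(docs):                          # random initialization
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assignments[d][i]
                doc_topic[d][k] -= 1                        # remove current assignment
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k                       # record the new draw
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
             for d, doc in enumerate(docs)]
    phi = [[(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
           for k in range(n_topics)]
    return theta, phi, vocab
```

Here theta corresponds to the per-article topic probabilities and phi to the word probability matrix. A production sampler would typically also discard burn-in iterations and average over several samples, which is omitted here for brevity.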
After the LDA process is completed, three types of output can be loaded in Stata for the analysis of the content of each topic and the identification of time trends. First, a list including the titles of the articles assigned per topic based on their probability score can be extracted. LDA creates ten new variables in the data file, one for each of the ten topics. Each variable contains the probability score of a specific article belonging to that topic. Thus, articles most likely belonging to a topic can be selected for further review. Second, the ten most frequent words used within each of the ten topics are enumerated by importing the information of the word probability matrix. Both types of output, i.e., title and word list, provide crucial information for analyzing the content of each topic and for assigning labels to the topics. The third type of output that is yielded in Stata is the time trends. Indicator variables are created for this output by employing the earlier calculated topic probability score. As such, the topic with the largest topic share within each article is determined and the average topic shares over time are plotted. Average topic share means the average share of a specific topic in articles published in a specific year. This output enables the comparison of attention to different topics over time, and thus the identification of time trends.
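The time trend computation can be sketched as follows. The text leaves open whether the yearly average is taken over the topic probabilities themselves or over the indicator variables for the dominant topic, so the illustrative Python function below supports both through a flag; the function and parameter names are assumptions and do not reproduce the Stata code used in the review.

```python
from collections import defaultdict

def dominant_topic(shares):
    """Index of the topic with the largest share in one article."""
    return max(range(len(shares)), key=lambda k: shares[k])

def yearly_topic_shares(years, theta, use_indicator=False):
    """Average topic share per publication year.

    years[i] is the publication year of article i, theta[i] its topic shares.
    With use_indicator=True each article counts fully toward its dominant
    topic, giving the fraction of articles per year led by each topic.
    """
    n_topics = len(theta[0])
    sums = defaultdict(lambda: [0.0] * n_topics)
    counts = defaultdict(int)
    for year, shares in zip(years, theta):
        if use_indicator:
            top = dominant_topic(shares)
            shares = [1.0 if k == top else 0.0 for k in range(n_topics)]
        for k, share in enumerate(shares):
            sums[year][k] += share
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}
```

Plotting the resulting values per topic against the year gives the kind of time trend curves discussed in this review.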

The Evolution of Literature on the Study of Pigments from 1999-2023
Basic bibliometric information on the evolution of literature on the analytical study of pigments from 1999 to 2023 can be acquired from the WoS extracted text corpus of 932 articles. This information includes the number of published articles, the main journals contributing to the subject, and hence the various fields contributing to the scholarly literature. Figure 1 shows the evolution in the number of published articles in scholarly journals from 1999 to 2023. The bar graph indicates that the number of published articles has been increasing from 2007 onwards. The body of literature on the subject has expanded rapidly in the past eight years, with the largest number of publications being reached in 2023 with a total of 119 articles. In addition to technological advancements, such as mobile non-invasive equipment [6], the observed increase in literature reflects the development of heritage science. The field resulted from a growing awareness about the role of natural scientific methods for the study of cultural heritage, which had emerged around 2006. A shift in terminology also accompanied this development. The terms "heritage" and "heritage science" became widely used around that time instead of the more traditional terms "restoration", "conservation science" and "preservation science". Another important development, which has contributed to the growth in scholarly literature, has been the establishment of journals dedicated to the field of heritage science in the past decade. This can be further derived from the number of published articles per journal. In the field of heritage science, the scientific journals that have published the most articles on the subject are Journal of Cultural Heritage (41 articles), Heritage Science (40 articles) and Heritage (31 articles).

The contribution to the literature corpus by the fields of conservation science and archaeometry is rather limited. This is remarkable given the fact that there are a significant number of journals in those fields, especially compared to the few journals devoted to heritage science. More precisely, the number of articles published by conservation science and archaeological journals is thirty-one for both. Most articles have been published by Studies in Conservation (11 articles), the International Journal of Conservation Science (7 articles), the Journal of the American Institute for Conservation (7 articles), the Journal of Archaeological Science: Reports (7 articles) and Archaeological and Anthropological Sciences (5 articles).

Analysis of the Topics
After the text corpus is clustered by LDA into ten topics, their content can be analyzed in detail. The content analysis of the ten topics consists of two components and requires human interpretation. The first component of the analysis involves the production of a list containing the most frequent words in each topic by using the word probability matrix in Stata. Table 1 contains this list, in which the ten most frequent words per topic are ranked based on their word probability score (highest to lowest). Afterwards, a preliminary label can be assigned to the topics. Several preliminary insights can be drawn from the word lists generated per topic. Table 1 reveals that at least five of the ten topics deal with the analytical study of pigments. This can be observed from the inclusion of the words "pigment" or "pigments" in the word lists of topics 1 (T1), 4 (T4), 5 (T5), 6 (T6) and 9 (T9). Additionally, the incorporation of words such as "painting" (T1, T6), "paints" (T9), "cultural" and "heritage" (T5), "analysis" (T1, T4, T5), "study" (T5, T9), "spectroscopy" (T1, T4), "Raman" (T4), or "hyperspectral" and "imaging" (T6) suggests that all these topics include analytical studies on pigments present in CH assets. Moreover, the inclusion of the word "palette" in combination with the words "painting", "manuscripts", "study" and "technical" in the word list of topic 10 (T10) shows that this topic possibly also clusters articles devoted to the same area of study.
Yet another preliminary observation can be made from the data acquired with the word probability matrix, particularly from the word list of T7. The presence of the words "madder", "organic", "natural", "synthetic" together with "spectrometry" and "chromatography" shows that the textual corpus extracted from WoS additionally includes articles devoted to the analytical study of natural and synthetic organic pigments and dyes. Or, to put it slightly differently, the term "pigment" employed in the search formula (TS = "heritage" AND TS = pigment*) refers to both natural and synthetic inorganic pigments, as well as natural and synthetic organic pigments and dyes.
For the group of topics involving the analytical study of pigments, the following initial labels are suggested based on the identified words per topic:
• Topic 1 (T1): The spectroscopic and microscopic study of pigments in paintings
• Topic 4 (T4): Raman spectroscopic analysis of pigment samples
• Topic 5 (T5): X-ray techniques dedicated to materials characterization (including pigments) in CH
• Topic 6 (T6): Imaging methods (including hyperspectral imaging) for the examination of paintings
• Topic 7 (T7): Chromatographic and spectroscopic methods for the identification of organic pigments and dyes
• Topic 9 (T9): The analytical study of paint and pigment degradation
• Topic 10 (T10): The technical study of paintings and illuminated manuscripts
The initial labels clearly indicate a difference between the content of the seven topics in terms of objects studied and/or analytical techniques used. Moreover, the label of T5 suggests that the scope of this topic is broader than the analytical study of pigments and probably includes physicochemical analyses of other CH materials. Besides these seven topics, a preliminary label can be assigned to each of the other three topics (T2, T3 and T8) created by the LDA topic model:
• Topic 2 (T2): Fungal deterioration and other types of biodeterioration of CH assets
• Topic 3 (T3): The analysis of mineral compounds contained in CH assets
• Topic 8 (T8): Scientific research for restoration and conservation of CH assets and materials
However, it is not yet clear how the articles included in these three topics are exactly related to the analytical study of pigments, nor which analytical techniques and methods are precisely used.
The next step of the content analysis is the validation of the initial labels of each topic by examining the content of the titles and abstracts of the ten highest-scoring articles per topic produced in Stata. The Stata output listing the titles of the ten highest-scoring articles per topic is included in the Supplementary Materials, Supplementary File S1 (top ten articles per topic based on topic probability score (Stata output)). For the content analysis, the full article is occasionally consulted when the article title and abstract do not provide enough information. In this way, the earlier proposed labels of the ten topics are cross-validated and refined if deemed necessary. Subsequently, the result of the content analysis and the refined labels per topic can be employed to make sense of the identified time trends per topic.
The cross-validation process commences with the analysis of the content of topics T1, T4, T5, T6, T7, T9 and T10, whose word lists imply that they cluster articles dedicated to the analytical study of pigments. Afterwards, the article titles and abstracts of topics T2, T3 and T8 are considered.

Topic 1 (T1)
Topic 1 (T1) clusters articles that focus on the characterization of painting materials and techniques of CH assets by means of spectroscopic and microscopic techniques. The ten articles can all be categorized as case studies and are conducted in an interdisciplinary manner, encompassing expertise from various fields. The fields involved are applied physics, earth science, material science, building engineering, art conservation, archaeology, cultural geography, and CH. The CH artefacts examined are polychrome carpentry [9], a polychrome wooden sculpture [10], gouache sketches [11], easel paintings [12,13], wall paintings [14][15][16][17] and painted limestone reliefs [18]. These CH assets represent different time periods, covering a broad time range from antiquity up to the modern era, and different cultures and art forms (Ancient Egyptian, Brazilian, Iranian, Italian, Moorish, overseas Chinese, Portuguese, and Spanish). The scope of T1 in terms of the painted CH artefacts studied is thus broader than the terms "paintings" and "painting" in the most frequent word list suggest (see Table 1). Examination of the article abstracts reveals that these case studies are concerned with the spectroscopic study of the stratigraphy of the different types of painted CH artefacts, although the emphasis is on pigment identification and characterization. This explains the presence of the words "pigments", "painting" and "techniques" in the most frequent word list. In sum, the main objective of the spectroscopic studies clustered under T1 is to identify and characterize pigments and painting methods, which includes the identification of compounds forming preparatory layers, binding media, or other paint materials. For example, in articles 1 and 5, this objective also comprises the identification of the gilding materials and techniques present in the examined polychrome carpentry [9] and easel painting [13]. Moreover, the aim of both articles is broader than the identification of painting materials and methods. Here the additional purposes are to detect non-original materials, such as later repaints and restorations, and to characterize degradation products caused by the alteration of some pigments, by environmental contamination [9], or by biological activity of lichens, fungi, or bacteria [13]. The physicochemical study of degradation products is also a major objective in article 8 in conjunction with the identification of pigments and binders [16].
The common objective of studying painting materials and techniques notwithstanding, the reasons underlying the extensive analytical campaigns of the ten studies of T1 are rather diverse, such as to support conservation treatments [9,16], to gain a better understanding of an artist's modus operandi [11,16,17], to consider authenticity issues [10,12], to identify geological sources of pigments [13,14], or to distinguish between paint materials used in ancient Roman and Arabic wall painting [15].
The analytical approach adopted in the articles under T1 is in general a micro-invasive one, which combines spectroscopic and microscopic techniques. However, a completely non-invasive method is used in three articles [12,13,15]. In one of these articles [15], a non-invasive approach is facilitated by the type of CH object examined. It concerns the investigation of small fragments of ancient wall paintings, which enables access to each layer forming the paint stratigraphy on a micrometer scale. In the other two articles [12,13], on the other hand, a non-invasive approach is chosen to avoid sample microextraction by using energy dispersive X-ray fluorescence (EDXRF) spectroscopy combined with Raman [13], as well as X-ray diffraction (XRD) and diffuse reflectance infrared Fourier transform (DRIFT) spectroscopy [12]. In both studies, EDXRF is utilized for the preliminary elemental analysis of individual pigments to yield chemical information from the surface layers. Raman spectroscopy is deployed, with or without XRD, to corroborate and supplement the initial EDXRF results, as well as to investigate the chemical composition of the exposed preparation ground. In article 5, Raman analyses are also conducted in order to identify degradation products [13]. Furthermore, DRIFT spectroscopy is included as an examination technique in article 3, which focuses on the spectroscopic study of an oil painting on copper. DRIFT is specifically employed for the characterization of the organic binding medium because strong fluorescence hampered the interpretation of the Raman spectra [12].
In the other seven studies, micro-analyses of cross-sectioned samples are conducted, and thus a micro-invasive approach is adopted to resolve compositional complexities. The sample microextraction is preceded by a non-invasive analysis of pigments and ground layer material of the CH artefact in five of these articles [9][10][11]14,18]. The non-invasive analysis is predominantly performed with portable EDXRF [9][10][11]14]. The results acquired with EDXRF are subsequently complemented with the analysis of cross-sectioned samples. In article 7, the micro-analysis of samples extracted from Ancient Egyptian limestone reliefs is preceded by non-invasive measurements conducted with µXRF, surgical microscopy, ultraviolet fluorescence (UVF) and visible-induced luminescence (VIL) imaging [18]. The latter technique allows the mapping and identification of the synthetic pigment Egyptian blue present in the Ancient Egyptian limestone reliefs. Moreover, in addition to qualitative data, semi-quantitative data on the chemical composition of pigments present in the reliefs' paint stratigraphy are gathered. More precisely, laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) is employed to determine concentrations of major and minor elements in different sample layers, such as the concentration of S and Sr in the white ground layer. An overview of the different analytical tools, and thus multi-analytical approaches, adopted in each article of T1 is given in Table 2.

Topic 4 (T4)
Similar to T1, T4 categorizes articles discussing spectroscopic techniques for the characterization of CH materials. However, the scope is limited to the spectroscopic analysis of pigment and paint samples, and the focus is different: T4 concentrates on the development of new methods and approaches to resolve specific issues in this study area. The expertise engaged in these studies mainly comes from the fields of chemistry, physics, conservation science, material science, and adjacent fields of applied sciences, such as electrical engineering. In three articles, scholars active in research institutions dedicated to CH and art conservation are also involved [19][20][21][22][23]. Most of these scholars have a training background in the natural and applied sciences, however. Hence, as is to be expected, this topic within the analytical study of pigments involves less expertise from the arts and humanities than T1.
The advancements discussed in the ten articles of T4 are related to the vibrational spectroscopic techniques of Raman spectroscopy and terahertz time-domain spectroscopy (THz-TDS). Seven articles within T4 concentrate on Raman spectroscopy [20][21][22][23][24][25][26][27], of which five articles focus particularly on the main limitation of fluorescence in Raman scattering experiments [21,[23][24][25]27]. The other two articles [20,26], in turn, deal with the issue of developing hybrid systems that combine Raman spectroscopy with other laser spectroscopic techniques. The laser spectroscopic tools combined in a hybrid instrument with Raman are laser-induced breakdown spectroscopy (LIBS) and laser-induced fluorescence spectroscopy (LIF). The rationale behind developing such hybrid instruments is that complementary information on both the molecular and elemental composition of artists' pigments and paint samples can be gathered in a more efficient and faster way. More precisely, the combined usage of Raman spectroscopy and LIBS ensures the identification of inorganic materials in pigment and paint samples, while the addition of LIF spectroscopy to the hybrid LIBS-Raman instrument allows the identification of organic constituents at the same time.
Within the set of five articles dealing exclusively with the issue of fluorescence in Raman analysis, different methods for spectra acquisition and processing are introduced and tested on both pigment and paint samples [21,[23][24][25]27]. The underlying objectives are (a) to acquire fluorescence-free spectra, (b) to suppress fluorescence background, (c) to enhance the Raman signal, or (d) to obtain molecular information from the subsurface of layered paint samples. Finally, one of these articles simultaneously considers the related issue of conducting Raman scattering experiments in situ under ambient light [21].
Besides advancements in Raman spectroscopy, T4 contains three articles discussing limitations related to the application of THz-TDS in CH. The considered limitations are the following: the use of the reflection THz-TDS configuration on pigment samples and painted CH objects [28], the lack of measurement protocols [19], and the absence of complete and high-quality THz spectral databases for artists' pigments [19][20][21][22]. Article 8, for instance, deals with the lack of THz spectral databases by acquiring reference spectra of traditional Korean pigments [22], whereas article 2 presents a new measurement method and protocol to improve existing THz spectral databases of ancient and modern artists' pigments [19]. The new method of measurement is based on the study of freestanding oil-paint samples at room temperature. The aim here is to develop an alternative to the traditional sample embedding method of polyethylene-mixed samples, which often results in low-quality spectra and thus affects the quality of THz spectral databases.
In brief, T4 comprises articles that discuss the development and improvement of vibrational spectroscopic techniques for pigment and paint identification and characterization.Hence, it can be concluded that developments in vibrational spectroscopy is a topic within the scholarly literature on the analytical study of pigments.An overview of the issues addressed, the different types of test samples studied, and the developed and tested research approaches and techniques is listed in Table 3. T5 considers applications of X-ray based techniques for characterization of CH materials, particularly inorganic pigments, and their contributions to the fields of CH, conservation science and archaeometry.Seven of the ten articles included in T5 present a state of investigation by focusing on different X-ray techniques [29][30][31][32][33][34][35].The main authors of the articles are researchers from natural sciences (e.g., chemistry, physics, engineering, material science, mineralogy, and crystallography).The co-authors of these studies are practitioners active in museums, and in their conservation departments, as well as in national and international research institutions for CH preservation, art conservation, archaeometry and anthropology.Hence, T5 involves close collaboration between academia and practice.
Article 2 gives an overview of the contribution of XRD techniques to CH.It contains a summary of the different types of information that can be generated by XRD methods, as well as a succinct overview of methodological trends in the application of these methods on CH materials in the early 2000s [30].This article demonstrates that the characterization of CH materials with XRD methods is not restricted to acquiring qualitative information about the elemental composition of inorganic compounds (e.g., pigments, corrosion products of metals).On the contrary, quantitative information on the structure and phase content of complex mixtures and materials can also be acquired.This type of information, for example, facilitates the study of fading processes of inorganic pigments, such as Prussian blue [32].
In terms of the X-ray techniques assessed, T5 also incorporates discussions of the importance and the impact of access to synchrotron radiation facilities (e.g., the European Synchrotron Radiation Facility) for CH scientists (article 4). Articles 3 [36], 9 [37] and 10 [38], on the other hand, discuss different applications of X-ray techniques in CH. Article 9 deals with the adaptation of EDXRF instrumentation for educational experiments to train students from the natural sciences, conservation science and archaeology [37]. Given the distinct content of this article, it can be argued that it does not really fit into T5 but represents a separate topic within the examined body of literature, i.e., educational X-ray based experiments to train future CH specialists. Articles 3 and 10, in turn, introduce novel portable X-ray based instrumentation for the non-invasive study of CH materials. Thus, neither article provides a state of investigation of X-ray based techniques for the characterization of CH materials. For example, article 10 presents a novel low-cost mobile XRF scanner for 2D elemental mapping of CH objects, such as easel paintings [38]. To conclude, T5 clusters articles that consider the application, development and contribution of X-ray based techniques for CH.

Topic 6 (T6)
T6 consists of a collection of articles dedicated to the application of spectroscopic imaging techniques for surface chemical mapping of easel paintings. The ten articles with the highest probability score under T6 are neither case studies nor comprehensive reviews of spectroscopic imaging techniques. They are concerned instead with addressing limitations and challenges encountered with the application of spectroscopic imaging in CH by testing and developing experimental methods and theoretical models. Based on the analysis of the articles, the experts engaging with this area of study are mainly scholars from computer science, imaging science and chemistry. However, scholars and practitioners of conservation science, art history and the fine arts also participate in this type of research. They mainly provide their expert knowledge to produce the reference mock-up paintings for testing novel applications for spectral imaging of CH.
The limitations and challenges considered are related to the optical techniques of hyperspectral imaging (HSI) [39][40][41][42][43][44], multispectral imaging [45], and thermal spectral imaging [46]. Studies devoted to the remote-sensing technique of HSI form the largest group among the ten articles under T6. Among the limitations and challenges discussed for HSI are the key problems of data reduction of HSI datasets by means of advanced dimensionality reduction techniques [42], the visualization and automatic classification of hyperspectral reflectance data for pigment identification and mapping [42,44,47], as well as the unmixing of this type of data for pigment identification [40,41]. Another issue addressed is the effect of HSI acquisition parameters, such as focus distance and signal-to-noise ratio, on the accuracy of pigment classification [43]. A summary of all the limitations and challenges discussed and addressed in the articles concentrating on spectral imaging methods is given in Table 4.
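To make the unmixing problem mentioned above more concrete, a minimal sketch of linear spectral unmixing for a single pixel is given below. It is an illustration only and not the method of any reviewed article: it assumes a linear mixing model with exactly two known endmember spectra whose abundances sum to one, and the function name and reflectance values are hypothetical.

```python
def unmix_two_endmembers(pixel, end_a, end_b):
    """Estimate abundances of two endmembers in one pixel spectrum.

    Assumes pixel ~ a * end_a + (1 - a) * end_b (linear mixing, sum-to-one).
    The least-squares estimate of a is clipped to the interval [0, 1].
    """
    diff = [x - y for x, y in zip(end_a, end_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("endmember spectra must differ")
    num = sum((p - y) * d for p, y, d in zip(pixel, end_b, diff))
    a = min(1.0, max(0.0, num / denom))
    return a, 1.0 - a


# Hypothetical reflectance values at four wavelengths (illustrative only).
pigment_a = [0.10, 0.20, 0.60, 0.80]
pigment_b = [0.50, 0.40, 0.20, 0.10]
mixed = [0.7 * a + 0.3 * b for a, b in zip(pigment_a, pigment_b)]

abundances = unmix_two_endmembers(mixed, pigment_a, pigment_b)
```

Real paint layers often mix non-linearly and involve more than two pigments, which is one reason why the reviewed studies test more advanced unmixing approaches.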
Finally, it must be added that article 10 does not properly fit into the scope of T6. This article deals with the issue of color correction of digital images of CH assets [48]. As such, it better fits within topic 8 (T8), as will be discussed further in this article.

Topic 7 (T7)
The CH materials studied in the ten articles clustered by LDA under topic 7 (T7) are organic materials present in CH assets or employed in CH practices. The organic materials examined can be characterized as natural and synthetic pigments and dyes, such as indigo, carthamin red, cochineal, dragon's blood, madder, shellfish purple, and eosin Y. In general, the ten articles are concerned with the analytical study of these organic materials. However, different aims can be discerned, i.e., dye/pigment identification, authentication, structure elucidation, and the understanding or improvement of synthesis processes, but also the development and optimization of research methods adopted in this study area [49,50]. The researchers responsible for these studies are chiefly chemists, conservation scientists, and environmental scientists, who are working at academic and cultural institutions.
The most common research technique employed in this set of ten articles is high-performance liquid chromatography (HPLC), which is the most widely adopted method for the identification of organic pigments and dyes. The technique is used together with photodiode-array detection in six articles for the separation, identification, and structure elucidation of natural organic dyes [51][52][53][54][55]. In three of these six articles, mass spectrometry (MS) is applied simultaneously to increase the analytical capabilities of the hyphenated technique [52,53,56]. In article 7, mass spectrometry is also implemented in the research protocol but in combination with a laser-based ionization technique, i.e., negative-mode laser desorption/ionization [49]. The objective of the article is to assess the applicability of such laser-based ionization techniques for the identification of shellfish purple in archaeological ceramics.
Other spectroscopic techniques are employed, specifically for the study of plant organic dyes and pigments extracted from madder. In article 6, the color compounds from an aqueous extract of madder, a traditional dye of the Monpa people, are investigated by means of phytochemical analysis and UV-visible and FTIR spectroscopy [57]. In article 8, UV-visible and FTIR spectroscopy are also part of a multi-analytical approach, which further includes HPLC-PAD, colorimetry, and EDXRF [54]. This multi-analytical approach is developed with the specific aim of studying nineteenth-century synthesis methods for madder-based red lake pigments by Winsor and Newton.
In contrast to these multi-analytical studies, the study conducted in article 10 is based solely on surface-enhanced Raman spectroscopy (SERS) [50]. In this article, SERS is utilized for the investigation of the first synthetic organic dyes employed in Japanese woodblock prints, and in nineteenth-century prints and paintings by Henri Matisse and Vincent Van Gogh. This method does not involve a separation step and is thus applied as an alternative to the more commonly used HPLC. The main objective of article 10 is the development and testing of a new sample pretreatment method based on nitric acid to enhance the sensitivity of SERS for identifying synthetic organic colorants in minute and heterogeneous samples.
In sum, T7 clusters articles that engage both chromatographic and spectroscopic methods for the identification and characterization of natural and synthetic organic dyes and pigments.
Table 5 gives an overview of the different organic pigments and dyes examined, the analytical techniques utilized, as well as the different objectives of the ten articles in T7.

Topic 9 (T9)
T9 is concerned with the analytical study of aging and degradation processes of paint films, pigments and organic binding media used in historical paintings. Two types of studies can be distinguished within the set of ten articles of T9. The first type of study is concerned with the characterization of these aging and degradation processes [58][59][60][61][62][63][64][65]. This is done through (a) the identification of degradation mechanisms and degradation pathways of pigment and binding compounds [60], and (b) the systematic study of the roles of light exposure [64,65], humidity [62] and paint materials [59,61,63] in these processes. The second type of study assesses the impact of conservation treatments on the optical and structural properties of pigments and paint films of CH assets [66,67]. For example, article 1 assesses the impact of gamma irradiation procedures for biodeterioration treatments of organic CH assets on the crystal structure and optical properties of historical pigments [66].
In terms of expertise involved in T9, conservation scientists affiliated with conservation departments of both universities and museums play a key role in the investigation of aging and degradation processes of historical paintings. However, the field of computer science seems to be gaining importance in this study area. This is mainly due to the application of computational modeling for the investigation of ageing and degradation processes of pigments and binders [58,64]. In article 3, the degradation process of the white pigment zinc oxide (ZnO) in oil paint is elucidated by density functional theory (DFT) modeling by considering its interaction with degradation products of the oil binder [58]. Besides computational models, GC-MS and FTIR spectroscopy are adopted to evaluate the role of the interaction between oil binder and pigments in the ageing and degradation processes [59]. In article 5, for instance, both techniques are employed for the multi-analytical study of the ageing characteristics of oil binders in anti-corrosive armor paint [60]. In article 8, FTIR spectroscopy is used without any other analytical techniques to examine the UV ageing process of proteinaceous paint binders, as well as the role of non-proteinaceous painting materials (i.e., lipids from linseed oil, terpenoid compounds from varnish, inorganic pigments) in this process [63]. For this purpose, the FTIR spectra are subjected to a principal component analysis (PCA) to further study the complex ageing process of the proteinaceous paint binders.
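The use of PCA on spectra described above can be illustrated with a minimal sketch. It extracts only the first principal component by power iteration on mean-centered data, which is a simplified stand-in for a full PCA; the function name and the toy absorbance values are hypothetical and do not come from the cited article.

```python
import math


def first_principal_component(spectra, iterations=200):
    """Return (loading vector, scores) of the first principal component.

    spectra: list of equally long lists, one spectrum per sample.
    Uses power iteration on X^T X of the mean-centered data matrix X.
    """
    n, m = len(spectra), len(spectra[0])
    means = [sum(row[j] for row in spectra) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in spectra]

    # Non-uniform start vector; it must not be orthogonal to the component.
    vec = [float(j + 1) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]

    for _ in range(iterations):
        scores = [sum(r[j] * vec[j] for j in range(m)) for r in centered]
        new = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:
            break  # no variance left in the data
        vec = [x / norm for x in new]

    scores = [sum(r[j] * vec[j] for j in range(m)) for r in centered]
    return vec, scores


# Hypothetical absorbances at three bands for four ageing steps:
# band 0 is constant, band 1 grows and band 2 shrinks with ageing.
spectra = [
    [1.0, 0.2, 0.9],
    [1.0, 0.4, 0.7],
    [1.0, 0.6, 0.5],
    [1.0, 0.8, 0.3],
]
loading, scores = first_principal_component(spectra)
```

In this toy case the scores order the samples along the ageing sequence, and the loading vector shows which bands drive that variation. The sign of a principal component is arbitrary, so only the ordering and relative magnitudes are meaningful.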
Other analytical techniques are employed in the articles of T9 [59,61,62,63,64,65], which are devoted to the analytical study of the degradation processes of pigments and paint films instead of paint binders. For the study of the roles of substrates, such as primed canvas, in the complex fading process of Prussian blue, a different set of analytical techniques is engaged [61]. These tools are scanning electron microscopy (SEM), XRD, spectrophotometry, X-ray absorption near edge structure (XANES) and Raman spectroscopy, while, in article 10, UV-Vis spectrophotometry and liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QToF-MS), together with electrochemistry techniques, are used for the investigation of the photodegradation of Eosin Y [65].
All the studies categorized into T9 are performed on paint samples, which have been prepared in the laboratory according to historical recipes. However, article 4 forms an exception [59]. This study also includes the examination of historical samples to analyze the ageing and degradation reactions of oil-based binding media in armor paint. Table 6 provides an overview of the different types of paint materials studied, the different analytical methods used for their study, as well as the research objectives addressed in the articles of T9.

Table 6 (excerpt). Research objective: the characterization of the photodegradation of Eosin Y under different illumination and in oxic and anoxic conditions; identification of different degradation pathways. Analytical methods: UV-Vis spectrophotometry; liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QToF-MS); electrochemistry techniques.

Topic 10 (T10)
Topic 10 (T10) comprises articles that concentrate on the chemical characterization of materials and techniques of easel paintings, illuminated manuscripts, and hand-written books. The inclusion of illuminated manuscripts and hand-written books notwithstanding, it must be emphasized that the topic is dominated by the examination of easel paintings. In total, eight of the ten articles in T10 present the results of a (multi-)analytical campaign conducted on easel paintings with various types of supports (i.e., canvas, panel, cardboard and copper) [68][69][70][71][72][73][74][75]. The easel paintings examined cover a wide range of time periods, with the oldest painting dating from 1450 and the most recent one dating from the mid-1950s.
The experts predominantly engaging with this study subject are physicists, chemists and art conservators, even though the studies clustered under T10 yield chemical information that is pertinent not only to the field of art conservation but also to art history. While the studies of T10 consider objects and issues of art historical interest and gather information that can contribute substantially to this study field, art historians are only involved as authors in two articles [71,74]. One of these articles presents an optimized non-invasive research protocol for the study of easel paintings in the collection of the Universitat de Valencia [74]. The article's objectives are twofold, namely (1) to acquire more information about these paintings' material composition and their makers' working methods to address art historical questions, and (2) to develop a preventive conservation plan for the entire collection. The second paper centers on a key problem in art history, namely the attribution of art [71]. Additionally, two other articles in T10 deal with the issue of authenticity [76], or the related question of attribution [77]. However, no art historians are engaged in either of these studies, or, at least, not as authors. Yet, art historians are consulted in another article, which sets out to identify the palette of the Argentine painter Pio Collivadino (1869-1945) and contextualize it within the major contemporary European artistic movements [69]. This is to gain a better understanding of the relation between the painting practice of Argentine and European artists of the time. Although the results are interpreted within an art historical context, art historians are only consulted, together with art conservators, for the sampling of the paintings, as is indicated by the experimental section of the article.
For the chemical characterization of materials and techniques of easel paintings, mobile and non-invasive imaging techniques are mainly utilized. Chiefly, scanning macro-X-ray fluorescence (MA-XRF) is employed to map the elemental distributions of the painting materials, in particular inorganic pigments, at the paint surface and subsurface [68,72,73]. The tool is used to gain information about the painter's palette and technique, as well as the painting's state of conservation. In article 7, for example, the scanning technique specifically enabled the assessment of the painting's condition by distinguishing between the original paint layers and later overpaints [73]. As such, most of these scanning campaigns are undertaken on the occasion of, or in preparation for, a comprehensive conservation treatment [68,70,71,73,74,75]. Other imaging techniques adopted for the study and characterization of the materials and techniques of easel paintings are the well-established traditional methods: visible, raking light, and ultraviolet-induced fluorescence (UVIF) photography, and infrared reflectography [70,74,75]. In addition, spectroscopic techniques such as portable EDXRF [70,74], Raman spectroscopy [69], and SEM-EDX [71,75] are employed to identify the painter's pigment palette. The latter two methods are executed on cross-sectioned samples. Hence, a micro-invasive approach is implemented in three of these eight case studies on easel paintings. An overview of the different analytical approaches and techniques adopted in these eight articles is given in Table 7.
Within the set of these eight articles, there is one study (i.e., article 10) where the entire build-up of the painting is characterized, including the oil binding medium with µFT-IR [75]. The painting examined was made by the contemporary Italian artist Remo Brindisi. The canvas support and priming contain non-conventional industrial materials, such as plasticized PVC and acrylate polymers. The latter materials are prone to fast ageing and thus have a significant impact on the painting's state of preservation. Given the key role of the canvas support and priming in the degradation of the paint surface, the painting's stratigraphy was examined in depth. Therefore, article 10 better fits the scope of the previous topic discussed: T9.
The two studies concerned with the technical study of illuminated manuscripts and hand-written books, dating from the fifteenth and sixteenth centuries, both employ a non-invasive approach by using Raman spectroscopy [76] and MA-XRF [77] on site. Because the non-invasive and in situ analysis of ancient manuscripts is a major issue in the field of CH, article 8 introduces a lightweight MA-XRF scanner, which is easy to handle and transport [77]. The applicability of the scanner is tested on illuminated choir books held by the Abbey of San Giorgio Maggiore in Venice. The common aim of both studies is the non-destructive characterization of inks, pigments, and supporting materials (paper and parchment) to gain a broader understanding of their history of use. In both articles, attention is given to issues of attribution or authenticity. In article 4, an assessment of the condition of the manuscripts is also conducted. This is done by examining the degradation processes of supporting materials, as well as restoration materials and techniques [76].

Remaining Unclear Topics
The analysis of the articles clustered under the topics T2, T3 and T8 gives valuable information about their connection with the analytical study of pigments in CH. This connection could not be derived earlier from the identified words.

Topic 2
The ten articles of T2 address the issue of biodeterioration of historic buildings [78][79][80], CH storage and conservation facilities [80,81], paper documents [82,83], stone heritage buildings and sculptures [84][85][86], and fresco paintings [87]. The objective of the ten articles is the identification of the biological agents responsible for the biodeterioration and the assessment of their degrading abilities. The main biological agents identified in these articles are fungal species, which produce acids, extracellular enzymes, and pigments. This suggests that the physiological features of fungi, which include the production of pigments, fulfill a key role in the biodeterioration processes of CH assets.

Topic 3
T3 contains articles that consider various subjects and thus do not form a single coherent topic. More precisely, different subjects can be distinguished based on the different CH materials analyzed, or the different issues addressed, in the articles. The first two articles included in T3 examine the chemical composition of colored glazes of fifteenth- and sixteenth-century tiles by means of spectroscopic techniques [88,89]. The objective of both articles is to identify pigments and different glazing technologies. Article 1 also aims to characterize the mineralogical composition of the glazes, which is an objective shared with article 9 [90]. The latter article presents the results of the mineralogical characterization and chemical analysis of excavated ceramic sherds dating from the fifteenth century.
Article 3 explores a completely different topic, i.e., the study of the chemical composition of weathering patinas (i.e., mineral phases) developed on the surface of medieval stained-glass windows [91]. Articles 4 [92] and 7 [93] discuss a related issue, which is the identification of air pollutants and indoor and outdoor atmospheric aerosols and their role in the blackening and discoloration of pigments such as hematite. Article 5 also deals with the issue of air pollutants and blackening [94]. However, the focus is not on CH assets but on speleothems, which suffer from black discoloration caused by black carbon resulting from fossil-fuel combustion and biomass burning. Similarly, article 10 is concerned with a natural heritage asset rather than a CH asset [95]. This study presents findings on the phyto-pigment composition within the southern Great Barrier Reef World Heritage Area. Finally, articles 6 and 8 are devoted to a specific topic, which is the geochemical analysis of archaeological pigments to identify geological sources [96,97]. In sum, T3 is not an identifiable topic, given the different research subjects and issues addressed in the ten articles with the highest probability score.

Topic 8
Contrary to T3, the articles classified by LDA under T8 form a clear and coherent topic. This can be derived from the common research objective, which is the development of restoration and preservation methods. Most of the articles introduce new computer-aided methods to virtually restore CH assets based on digital image processing technology, as well as the information acquired by their analytical study [98][99][100][101][102][103][104]. For instance, six articles discuss the procedure of the virtual restoration of polychromed imperial gates belonging to wooden churches located in Romania [99][100][101][102][103][104]. The first step in the procedure is the physicochemical study of the wooden support and painting materials (ground, pigments) composing the imperial gates with FTIR and EDXRF. In two of these six articles, both analytical techniques are supplemented with GC-MS [99] or differential scanning calorimetry (DSC) [101]. During the second step of the proposed procedure, a digital 3D model is made of the imperial gates using laser scanning technology. Lastly, the digital 3D model is virtually restored and converted into an interactive 3D model.
In article 5, digital image processing technology is employed to assess the physical restorations of colored concrete heritage [105]. A Color Concrete Restoration Method (CCR-method) is presented for the chromatic design and application of restoration mortars on colored concrete surfaces. The CCR-method mainly consists of the chromatic characterization of the colored concrete surfaces by image processing, the tinting of the restoration mortar by adding pigments, and the final assessment of the restoration. Article 6 is likewise concerned with decorative coatings technology and its preservation. However, both the aim and the research approach are different. The technology discussed in article 6 is called brita lavada, which is a traditional coating mortar from Madeira (Portugal) [106]. The mortar is traditionally composed of cement, local Madeira basalt gravel and black pigment. The purpose of this article is to analyze the physical properties of the mortar and its durability as an eco-efficient decorative coating.
Finally, article 8 is concerned with the development of conservation materials for paper-based CH assets [107]. It introduces an efficient preparation method of two-dimensional zeolite nanosheets to acquire a multifunctional protective agent for paper-based relics. This conservation agent enables deacidification and anti-aging without damaging the original structure of the paper and the color of alkali-sensitive pigments.

The Final Labels of the Topics
The analysis of the ten topics provided more detailed information about the content of each topic. It disclosed that all topics generated by LDA are distinct from each other in terms of content. For instance, the initial labels suggested a possible overlap between T1 and T10, with both topics focusing on the study of pigments in paintings. However, the analysis of the content of both topics indicated that there is a difference in focus, as well as in the research methods implemented (see Tables 2 and 7). T1 focuses on the study of the stratigraphy of painted CH assets, and thus various painting materials, by using spectroscopic techniques on a macro- and microscale, while T10 concentrates on the identification of artists' pigments present at the painted surface by employing mainly non-invasive imaging techniques, such as MA-XRF scanning. In sum, there is no overlap between the ten topics. Furthermore, it was also found that the articles included in T3 do not relate to one specific topic. T3 is thus not considered an identifiable topic in this review. Based on the information acquired from the analysis, the preliminary labels of the nine remaining topics are refined and reformulated. An overview of the preliminary labels and the final labels is shown in Table 8.

Table 8 (excerpt). Preliminary and final labels of the topics.
T9. Preliminary label: The analytical study of paint and pigment degradation. Final label: The analytical study of aging and degradation of paint films, pigments and organic binders, and the roles of light, paint materials and conservation treatments in these processes.
T10. Preliminary label: The technical study of paintings and illuminated manuscripts. Final label: The technical study of pigments and painting methods employed by historical and contemporary painters and illuminators.

Time Trends
Having now established the topics, in this part the time trends in average topic share are plotted and discussed. For full transparency, each time trend per topic is given in the Supplementary Materials (Figures S1-S6). The plots start from 2008 because the number of articles in the years before that does not warrant an average topic share analysis, as there are simply too few to draw meaningful conclusions. In terms of overall trends, T7 has a relatively stable topic share over time. T1, T3, T4 and T5 all experience a downward trend, whereas T2, T6, T8, T9 and T10 experience an upward trend.
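The notion of an average topic share per year can be made explicit with a short sketch. It assumes that each article comes with a publication year and a vector of topic probabilities that sums to one, as produced by a topic model; the function name and the three-topic toy data are hypothetical and only illustrate the averaging step, not the Stata workflow used for this review.

```python
from collections import defaultdict


def average_topic_share(records, n_topics):
    """Average the topic probabilities of all articles per publication year.

    records: iterable of (year, [p_1, ..., p_K]) pairs, one per article.
    Returns {year: [mean share of topic 1, ..., mean share of topic K]}.
    """
    sums = defaultdict(lambda: [0.0] * n_topics)
    counts = defaultdict(int)
    for year, probs in records:
        if len(probs) != n_topics:
            raise ValueError("each article needs one probability per topic")
        for k, p in enumerate(probs):
            sums[year][k] += p
        counts[year] += 1
    return {
        year: [s / counts[year] for s in sums[year]]
        for year in sorted(sums)
    }


# Hypothetical topic probabilities for three articles and three topics.
records = [
    (2008, [0.6, 0.3, 0.1]),
    (2008, [0.4, 0.5, 0.1]),
    (2023, [0.2, 0.3, 0.5]),
]
shares = average_topic_share(records, n_topics=3)
```

Plotting each topic's mean share against the year gives a time trend of the kind discussed in this section. With few articles in a given year the average becomes unstable, which is the reason the plots start from 2008.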
T1 and T5 (see Figures 2 and 3) have the highest topic shares throughout the period, although it is noteworthy that both have experienced a significant decline over time. They still remain the highest by a wide margin, but this could change in the future as other topics gain more traction and/or continue their upward trend (especially T6 and T10). The rise of these topics could be attributable to the increasing importance of mobile imaging spectroscopy techniques (i.e., the subject of T6) in the CH field for the non-invasive and in situ study of two-dimensional (2D) artistic objects, especially easel paintings [108]. While imaging techniques have been part of the traditional methodologies to study painting materials and techniques since the previous century, some more recent developments have contributed to the current increase in popularity of imaging spectroscopy techniques. One of these technological developments is the introduction of new advanced imaging techniques in the CH field, such as Terahertz time-domain imaging and MA-XRF scanning, in 2006 and 2013, respectively [108,109], but also the improved accuracy of HSI, which can be used with several devices with different sensors and camera models [107], together with major advancements in image processing based on multivariate techniques and artificial intelligence [110]. Moreover, most of these novel imaging tools, such as the mobile M6 Jetstream MA-XRF scanning instrument, have become commercially available in the past decade. This development has resulted in increased access for various stakeholders within the cultural heritage field, and thus contributed to the increase in publications.
T6 and T10, which both include articles that utilize imaging spectroscopic techniques to study painting materials, have clearly captured some of the market share of T1 and T5. As can be seen from Figures 4 and 5, both have increased over time and more than doubled their topic share. This upward trend can be further explained in terms of research ethics in CH, which promote non-invasive approaches to safeguard the material integrity of valuable and vulnerable CH assets. Instrumental advancements of the last three decades, which also include the development of handheld instruments such as EDXRF and Raman spectrometers, have made the study of CH assets in both a non-invasive and comprehensive manner more feasible. Another trend, which will probably further contribute to this increasing market share of T6 and T10, is the in situ conservation treatment of easel paintings in museums.
The upward trend of T6 and T10 notwithstanding, T1 and T5 still have a topic share about twice as large in 2023 (down from a share about six times as large in 2008). As such, micro-invasive approaches and lab-based devices still play a prominent role in the physicochemical characterization of pigments and painted CH assets, as well as in the advancement of the study field.

Discussion and Conclusions
The aim of this review was to answer the following research question: what are topics and time trends of the past three decades in the analytical study of pigments contained in CH assets? The LDA topic model was used as a machine learning-based review approach to identify ten topics and time trends in the share of these topics. Based on the results obtained by LDA, nine topics could be identified through an analysis of the content of the titles and abstracts of the ten articles with the highest probability score in each topic. Three of the nine topics (T4, T6 and T10) are concerned with the identification and characterization of inorganic pigments. This is performed through the analytical study of these pigments (T10), or by advancing the analytical and imaging tools and methods for this type of study (T4 and T6). Moreover, the analysis of T2 and T7 revealed that the term pigments within the textual dataset of 932 articles does not refer solely to natural and synthetic inorganic pigments. It also refers to organic pigments and dyes, as well as pigments produced by fungi. Additionally, the content analysis of other topics (T1, T5 and T9) identified by LDA disclosed that the spectroscopic examination of pigments is often accompanied by the study of the whole material build-up of a painted CH asset. This is to gain a deeper understanding of (1) the painting techniques used for their production, (2) the CH asset's condition, and (3) the degradation processes of the materials enclosed in them. Finally, the text corpus of 932 articles comprised a topic (T8) dealing with the development of restoration and preservation methods of painted CH assets, mainly built heritage and its digital restoration and preservation.
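The selection of the ten articles with the highest probability score per topic can be sketched as follows. The sketch assumes that a fitted topic model has already assigned each article a vector of topic probabilities; the function name, article identifiers and probabilities are hypothetical, and the actual selection for this review was produced as Stata output.

```python
def top_articles_per_topic(doc_topic, n_top=10):
    """Rank articles by topic probability and keep the best n_top per topic.

    doc_topic: {article_id: [p_1, ..., p_K]} from a fitted topic model.
    Returns {topic_index: [article_id, ...]}, highest probability first.
    """
    n_topics = len(next(iter(doc_topic.values())))
    ranking = {}
    for k in range(n_topics):
        ordered = sorted(doc_topic, key=lambda d: doc_topic[d][k], reverse=True)
        ranking[k] = ordered[:n_top]
    return ranking


# Hypothetical probabilities for three articles and two topics.
doc_topic = {
    "article_A": [0.70, 0.30],
    "article_B": [0.20, 0.80],
    "article_C": [0.55, 0.45],
}
ranking = top_articles_per_topic(doc_topic, n_top=2)
```

Because every article receives a probability for every topic, an article can appear in the top list of a topic that it only partly addresses. This is consistent with the observation that some top-ten articles fit better under a different topic.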
Besides topics, time trends were identified by using LDA topic modeling. These time trends quantify and visualize which topics have seen increased or decreased shares in attention over time. The analysis of the time trends indicated that T1 (the spectroscopic and microscopic study of the stratigraphy of painted CH assets) and T5 (X-ray based techniques for CH, conservation science and archaeometry) have the highest topic share in the reviewed body of literature on the analytical study of pigments between 2008 and 2023. Their leading position has remained stable over time, although the share of both topics has declined.
Based on the results acquired from the topic modeling process, several conclusions can be drawn about the study area's state of investigation. First, the topics generated by LDA together with the analysis of the ten articles per topic disclosed that historical painted CH assets, especially easel paintings, are the most widely studied objects. Second, the analytical study of pigments in CH is commonly conducted in an interdisciplinary manner. However, it is dominated by scholars from the fields of chemistry and physics, and to a lesser extent, by conservation scientists and art conservators. This is quite understandable when it comes to instrumental advancements in the study area, or when conservation issues are tackled. The active involvement of art conservators can also be explained by the fact that analytical campaigns are mostly conducted on the occasion of a comprehensive conservation treatment. At the same time, these analytical campaigns, as this review demonstrates, yield a lot of information about the painting materials and techniques of CH assets, which is pertinent to the fields of archaeology and art history. Nonetheless, the research communities identified by the current review show that the involvement of art historians and archaeologists in the study area is still rather limited. This is particularly noteworthy, as the interpretation of the acquired information by art historians and archaeologists, and by scholars of related fields as well, could not only put these data into a broader context to answer questions inherently related to these objects' history, but also advance the study field by developing new research questions, or by contributing to the advancement of research methodologies, as is demonstrated by a couple of the articles included in this analysis.
There are also several interesting research avenues in relation to the method used in this paper. Topic modeling can help to analyze a large corpus of text and is thus particularly promising as a review tool in an ever-increasing stream of literature. Still, its applicability is not limited to literature review [110]. Any text can be analyzed using topic modeling and related text mining tools. Future research could use the technique to study, for instance, conservation reports in relation to pigment degradation and identify topics particularly relevant to conservators [6].
Relatedly, while this article focuses on topics and trends over time, the creation of a topic probability score per article and per topic allows further modeling. Meta-regression, for example, is part of the review toolbox and aims to identify sources of variation in effect sizes from articles. Meta-regression could similarly be utilized to build regression models that include a range of meta-information variables per article and test whether those variables are associated with higher probability scores per topic. For example, are articles in specific journals or specific disciplines more likely to focus on a specific topic? What about the author team: are specific skills present in the author team more likely to relate to a specific topic? What is the impact of institutions and research funding? Meta-regression could be the next step in this approach, to statistically identify variables that may explain why an article is more likely to focus on a specific topic [111][112][113][114][115].
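As a rough illustration of this idea, the sketch below regresses the probability score of one topic on a single article-level variable using ordinary least squares. Everything in it is an assumption made for illustration: the helper `ols_simple`, the binary indicator `is_chem_journal` (whether an article appeared in a chemistry journal), and the made-up probability values. Because topic probabilities are bounded between 0 and 1, a model designed for proportions may be more appropriate in practice; the linear form is used here only for simplicity.

```python
from statistics import mean


def ols_simple(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical meta-information: 1 if the article appeared in a chemistry
# journal, 0 otherwise (made-up values, not data from the review).
is_chem_journal = [1, 1, 1, 0, 0, 0]
# Hypothetical probability scores of the same six articles for one topic.
topic_prob = [0.60, 0.50, 0.70, 0.20, 0.30, 0.10]

intercept, slope = ols_simple(is_chem_journal, topic_prob)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

With a single binary predictor, the intercept equals the mean topic probability of the reference group and the slope equals the difference in mean topic probability between the two groups. A fuller model would add further variables (discipline, author team, institution, funding) and report uncertainty for each coefficient.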
Finally, topic modeling is one approach to literature review that is especially suited to a large corpus of text. Other review techniques, including integrative or problematizing reviews, could help answer more specific research questions using a subset of the literature [1].

Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/heritage7070174/s1, File S1: Top ten articles per topic based on topic probability score (Stata output); Figure S1: Time trend of topic 2 (Identification of fungi and the assessment of their degrading abilities) based on average topic share (2008-2023); Figure S2: Time trend of topic 3 (Not identifiable topic) based on average topic share (2008-2023); Figure S3: Time trend of topic 4 (Vibrational spectroscopic techniques: their development and improvement for pigment identification and characterization) based on average topic share (2008-2023); Figure S4: Time trend of topic 7 (Chromatographic and spectroscopic methods for the identification and characterization of natural and synthetic organic dyes and pigments) based on average topic share (2008-2023); Figure S5: Time trend of topic 8 (Restoration and conservation methods, materials, and treatments: their developments and improvement for CH) based on average topic share (2008-2023); Figure S6: Time trend of topic 9 (The analytical study of aging and degradation of paint films, pigments and organic binders, and the roles of light, paint materials and conservation treatments in these processes) based on average topic share (2008-2023).

Funding: No funding is reported.

Heritage 2024, 7

Figure 1. The number of published articles on the analytical study of pigments present in heritage assets.

In addition to heritage science, the study domains of analytical chemistry and physics have contributed the most to this growing body of literature. Both scientific fields have facilitated the adoption of scientific methods in cultural heritage in the last two decades, mainly through the introduction of new instruments and approaches, which enabled the of pigments in paintings

T1. Final label: The spectroscopic and microscopic study of the stratigraphy of painted CH assets.
T2. Preliminary label: Fungal deterioration and other types of biodeterioration of CH assets. Final label: Identification of fungi in CH assets and the assessment of their degrading abilities (including pigment production).
T3. Preliminary label: The analysis of mineral compounds in CH assets. Final label: Not identifiable topic.
T4. Preliminary label: Raman spectroscopic analysis of pigment samples. Final label: Vibrational spectroscopic techniques: their development and improvement for pigment identification and characterization.
T5. Preliminary label: X-ray techniques dedicated to materials characterization (including pigments) in cultural heritage. Final label: X-ray based techniques for CH, conservation science and archaeometry: their application, development, and contribution to the characterization of CH materials (including pigments).
T6. Preliminary label: Imaging methods (including hyperspectral imaging) for the examination of paintings. Final label: Spectral imaging techniques: their development and improvement for surface chemical mapping of easel paintings, wall paintings, and illuminated leaves.
T7. Preliminary label: Chromatographic and spectroscopic methods for the identification of organic pigments and dyes. Final label: Chromatographic and spectroscopic methods for the identification and characterization of natural and synthetic organic dyes and pigments.
T8. Preliminary label: Scientific research for restoration and conservation of CH assets and materials. Final label: Restoration and conservation methods, materials, and treatments: their developments and improvement for CH.
T9

Figure 2. Time trend of topic 1 (The spectroscopic and microscopic study of the stratigraphy of painted CH assets) based on average topic share (2008-2023).

Figure 3. Time trend of topic 5 (X-ray based techniques for CH, conservation science and archaeometry) based on average topic share (2008-2023).

Figure 5. Time trend of topic 10 (The technical study of pigments and painting methods) based on average topic share (2008-2023).

Table 1. Ten most frequent words per topic (T).

Table 2. Overview of CH assets studied, research methods and analytical techniques used in T1.

Table 3. Overview of the studied test samples, the considered issues, and the tested research approaches in T4.

Table 4. Overview of the studied CH material, the considered issues, and the adopted research approaches in T6.

Table 5. Overview of the CH material studied, the analytical techniques adopted, and the different research objectives of T7.

Table 6. Overview of the different types of paint materials studied, the different analytical techniques used for their study, as well as the research objectives of T9.

Table 7. Overview of the studied CH objects, the research aims, approaches and techniques used in T10.

Table 8. Overview of the preliminary and final labels per topic.