Are Major a Posteriori Dietary Patterns Reproducible in the Italian Population? A Systematic Review and Quantitative Assessment

Although a posteriori dietary patterns (DPs) naturally reflect actual dietary behavior in a population, their specificity limits generalizability. Among other issues, the absence of a standardized approach to analysis has further hindered discovery of genuinely reproducible DPs across studies from the same/similar populations. A systematic review on a posteriori DPs from principal component analysis or exploratory factor analysis (EFA) across study populations from Italy provides the basis to explore assessment and drivers of DP reproducibility in a case study of epidemiological interest. For the first time to our knowledge, we carried out a qualitative (i.e., similarity plots built on text descriptions) and quantitative (i.e., congruence coefficients, CCs) assessment of DP reproducibility. The 52 selected articles were published in 2001–2022 and represented dietary habits in 1965–2022 from 70% of the Italian regions; children/adolescents, pregnant/breastfeeding women, and the elderly were considered in 15 articles. The included studies mainly derived EFA-based DPs on food groups from food frequency questionnaires and were of “good quality” according to standard scales. Based on text descriptions, the 186 identified DPs were collapsed into 113 (69 food-based and 44 nutrient-based) apparently different DPs (39.3% reduction), later summarized into the 3 groups “Mixed-Salad/Vegetable-based Patterns,” “Pasta-and-Meat-oriented/Starchy Patterns,” and “Dairy Products and Sweets/Animal-based Patterns,” by matching similar food-based and nutrient-based groups of collapsed DPs. Based on CCs (215 CCs, 68 DPs, 18 articles using the same input lists), all pairs of DPs showing the same/similar names were at least “fairly similar” and ∼81% were “equivalent.” The 30 “equivalent” DPs reduced to 6 genuinely different DPs (80% reduction) that targeted fruits and (raw) vegetables, pasta and meat combined, and cheese and deli meats. Such reduction reflects the same study design, list of input variables, and DP identification method followed across articles from the same groups. This review was registered at PROSPERO as CRD42022341037.


Introduction
Following the dietary pattern (DP) approach, multiple related dietary components (food items, food groups, or nutrients) are synthesized into combined variables reflecting key dietary profiles in a population [1,2], while overcoming well-known multiple comparison issues [3].
A posteriori DPs [3] are defined by using multivariate statistical methods (i.e., principal component analysis [PCA], exploratory factor analysis [EFA], and cluster analysis [4]) and are therefore advantageous in naturally reflecting actual dietary behavior in a population and related study- or population-specific context (e.g., geography/climate, socioeconomic status, food supply, ethnic background, culinary tradition) [5]. However, their specificity limits generalizability, especially when compared with the a priori option (i.e., comparing subjects' diet against evidence-based benchmark diets) [6].
The absence of a standardized approach to analysis (e.g., definition of input variables and their preprocessing, DP identification method, and DP labeling), poor information reporting, and subjective DP labeling (based on supposed similarities with previously published DPs) have further limited fair comparisons among sets of a posteriori DPs [7] and still hindered discovery of genuinely reproducible DPs across studies from the same or similar populations [8,9] (i.e., cross-study reproducibility [7,10]).
A few pioneering [11,12] and more recent [13–19] articles have explored either qualitatively or quantitatively the cross-study reproducibility of DPs derived from PCA or EFA, which are by far the most commonly derived a posteriori DPs in nutritional epidemiology [3]. Following a qualitative approach, the assessment of cross-study reproducibility emerged from a narrative synthesis based on text description and/or visual inspection of loadings of potentially similar DPs. Congruence coefficients (CCs) between factor loadings and correlation coefficients between factor scores have also been used to quantitatively evaluate reproducibility of apparently similar study-specific DPs [13,15,16]. Independently of the different cut-offs used, the CC has proved to be an effective measure of reproducibility for PCA/EFA-based DPs across studies [13,15,16]. Additionally, the potential effectiveness of more rigorous statistical approaches has been under investigation [17].
The Italian diet is traditionally recognized as a variant of the Mediterranean diet characterized by the abundance of fruit and vegetables, wheat, legumes, and olive oil [20,21]. However, per capita weekly consumption data revealed that typical Mediterranean-style foods were consumed less than expected in 2019 [22]. A Life Cycle Inventory analysis suggested that, while intakes of milk/yogurt and legumes were in line with the Mediterranean nutritional model, as estimated by using current dietary reference values [23], fruits, vegetables, pasta, bread, and extra-virgin olive oil showed lower (24%-51%, depending on the food item/group) intakes, compensated by higher (78%-918%) intakes of meat, and higher (580%) intakes of sugar, sweets, snacks, or alcohol-free beverages [22]. While waiting for novel findings from official nation-wide food consumption surveys [24], the most recent one dating back to the 2005-2006 "Istituto Nazionale di Ricerca per gli Alimenti e la Nutrizione-Studio sui Consumi Alimentari in Italia" (INRAN-SCAI) [25], a systematic review of the otherwise scattered scientific evidence on Italian dietary behavior may help fill this gap, by summarizing recent evidence in the light of older evidence. As recent country-specific dietary guidelines recognized the effective use of DPs as their first evidence base [2,26], a systematic review on all and more recent DPs may contribute to inform future research on DP identification in the Italian population and the development of the next Italian dietary guidelines.
Within the movement supporting reproducible research in science [27], the current article builds on the first 2 systematic reviews on reproducibility and validity of PCA/EFA-based DPs in nutritional epidemiology [7,10] and explores the cross-study reproducibility of PCA/EFA-based DPs in a case study of epidemiological interest, namely Italy. In detail, for the first time to our knowledge, we systematically collected existing evidence on PCA/EFA-based DPs identified in Italian free-living individuals, with a focus on the DP identification process and its consistency across included articles. We also investigated DP cross-study reproducibility, to assess whether major DPs are consistently identified within Italy, by proposing (1) a qualitative assessment of reproducibility of all available and most recently identified DPs, as based on similarity plots built on original text descriptions and factor loadings; and (2) a quantitative assessment of reproducibility of subsets of DPs, as based on the CCs applied on the same list of input variables.
As a third research aim, we compared the results from the qualitative and quantitative assessments of DP reproducibility, to identify possible drivers of agreement and discrepancies. This not only informs DP assessment in the Italian population but also future research utilizing a posteriori DP identification methods. A companion article will examine whether the identified DPs, grouped according to their reproducibility, are consistently related to disease outcomes, determinants, or correlates of interest, if any, as described in the original articles included in this systematic review.

Methods
This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines [28]. The review protocol was registered in the International Prospective Register of Systematic Reviews database (registration no: CRD42022341037).

Eligibility criteria
Articles were considered eligible for inclusion if they (1) were (original) full-text articles in peer-reviewed journals; (2) enrolled human subjects living in Italy; and (3) identified DPs based on PCA and/or EFA (indicated as PCA-based, EFA-based, or PCA/EFA-based DPs in the following) on dietary data, independently of any additional analysis on health outcomes, determinants, or correlates. Articles were excluded if (1) they did not provide original data, or they were case reports, in vitro and in vivo animal studies, conference abstracts or posters; (2) the reference population lived outside Italy or, in international studies, it was not possible to distinguish the Italian-specific DPs, which are of interest in the current review; (3) results concerned single nutrients, single food items, or single food groups; (4) the term DP was used to identify dietary attitudes, perceptions, or patterns of meals; (5) DPs were identified using the a priori approach, the mixed-type approach, or the a posteriori approach but not following PCA or EFA; (6) PCA or EFA were applied on dietary behaviors; and (7) PCA or EFA were applied on lifestyle variables, including diet, to derive lifestyle patterns (details in Supplementary Methods). No restrictions were imposed on year of publication, population characteristics, or participants' health status.

Search strategy
An electronic literature search was conducted in parallel by 2 authors (RB and MT) on December 21, 2022 using 3 electronic databases: MEDLINE/PubMed, Embase, and Cochrane (CENTRAL and Reviews). The search strategy used both keywords and controlled vocabulary terms around the fields of "dietary patterns," "factor analysis," "principal component analysis," and "Italy." No language filters were used. No reference was made to potential health outcomes, determinants, or correlates of interest, as long as PCA/EFA-based DPs were identifiable in Italy. Details on search strings are provided in Supplementary Methods. We used the EndNote 20 software program (Thomson Reuters, New York, NY, USA) for the electronic management of the review process.

Article selection
After duplicates were removed, titles and abstracts of the remaining articles were screened for eligibility. Subsequently, all eligible full-text articles were retrieved, screened, and included in the systematic review when appropriate. The reference lists of the articles identified during this process were also examined by hand search to further identify potentially relevant articles. Each of the previous steps was carried out in parallel by 2 authors (RB and MT); any disagreements between reviewers were resolved by discussion and consensus with a third investigator (VE).

Data extraction
Using a predefined Excel spreadsheet, data extraction was performed independently by 2 investigators (RB and MT). Data extraction was checked by 2 other investigators (VE and MS), and a third one (MF) was involved in resolving any potential disagreement. Information extracted from each study included the following: (1) general characteristics of the studies; (2) study design; (3) dietary assessment tool used; (4) DP identification method; (5) number of DPs, proportion of variance explained, name, and composition; (6) statistical methods used to relate the identified DPs to disease outcomes/determinants/correlates; and (7) main results on the relationship between identified DPs and disease outcomes/determinants/correlates (corresponding to those statistical models adjusted for all the available confounders, if models were fitted).
The current article is focused on the description of the PCA/EFA-based DPs identified in Italy, with a focus on their identification process and on their potential cross-study reproducibility. A companion review will be focused on the relationship between identified DPs and disease outcomes/determinants/correlates, by providing details on the statistical methods used to assess this relationship.

Assessment of study quality
For each article that met the inclusion criteria, study quality was independently evaluated by 2 reviewers (RB and MS) by using the Quality Assessment Tools from the National Institutes of Health National Heart, Lung, and Blood Institute [29]. Any disagreements were resolved by discussion and consensus with a third reviewer's grade (VE). Involved researchers used the available study rating tools on the range of items provided by each tool (range: 0-14 for cohort, cross-sectional studies, or trials; 0-12 for case-control studies) to judge each study quality [29]. To better identify mid-high-quality studies, we added an extra category, "very good," to the originally suggested "poor," "fair," and "good" [29]. We categorized total scores into 4 levels in such a way that 25% (corresponding to 3 points) of items' positive answers were included in each category. Owing to the lack of previous evidence on reproducibility of DPs in Italy, we chose not to exclude studies based on their quality. Therefore, all the retrieved studies were considered in the analyses.

Narrative synthesis and qualitative and quantitative assessments of reproducibility of DPs
We first performed a narrative synthesis of the findings from the included studies in terms of study design, population characteristics, dietary assessment tool, DP identification methods, and text description of the identified DPs. Second, we performed a qualitative assessment of the reproducibility of all available DPs, as based on similarity plots built on original text descriptions and factor loadings, when available; we referred to factor loadings to assess the relative importance of dominant food groups or nutrients, in case of very rich descriptions of DPs. Third, we performed a quantitative assessment of reproducibility of DPs, as based on the CCs calculated on the same lists of input variables. The CC (−1 ≤ CC ≤ 1) is the preferred index for measuring similarity of PCA/EFA-based DPs [30,31]. In the absence of any recent and reliable information on Italian DPs, we followed a more conservative approach than the most similar systematic review on PCA/EFA-based DPs from Japan [13]. In detail, we opted for (1) calculating CCs over smaller but more comparable groups of articles sharing the same list of input variables (i.e., either nutrients or food groups), to avoid extra subjectivity in defining a common input list and potential artifacts possibly deriving from imputation of new loadings based on the original ones [13]; (2) adopting a higher cut-off (CC: 0.85 vs. 0.80 [13]) for "fair similarity" of DPs, so that 0.85 ≤ CC ≤ 0.94 indicates "fair similarity" [15,16] in our application; (3) adopting a specific cut-off (CC: 0.95) for "equivalence" of DPs, so that CC ≥ 0.95 indicates "equivalence" [15,16]; and (4) evaluating similarity of DP pairs over the entire CC distribution and not only on the median [13]. The quantitative assessment of DP reproducibility was conducted with the R software [32] and its package "psych" [33]. When needed, corresponding authors were contacted (twice, 15 days apart per protocol) to provide or confirm the information on PCA/EFA loadings needed to calculate CCs. Finally, we carried out a sensitivity analysis (including both a qualitative and quantitative assessment of DP reproducibility) on the most recently identified DPs (i.e., those based on dietary information collected at least in part over 2013-2022), to assess whether any shift away from typical Mediterranean-style habits can be tracked.
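As an illustration of the quantitative step, the following is a minimal Python sketch of the CC between two loading vectors defined on the same list of input variables, computed as the sum of cross-products of the loadings divided by the square root of the product of their sums of squares. It is not the code used in this review, which relied on R and the "psych" package. The loading values, the study names, and the label "below fair similarity" are hypothetical; the 0.85 and 0.95 cut-offs are those stated above, with values between 0.94 and 0.95 assigned to "fairly similar" as an assumption.

```python
import math
from itertools import combinations


def congruence_coefficient(x, y):
    """Congruence coefficient between two loading vectors on the same inputs."""
    if len(x) != len(y):
        raise ValueError("loading vectors must share the same input variables")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("a loading vector has zero norm")
    return numerator / denominator


def classify(cc, fair=0.85, equivalent=0.95):
    """Apply the cut-offs described in the text (third label is hypothetical)."""
    if cc >= equivalent:
        return "equivalent"
    if cc >= fair:
        return "fairly similar"
    return "below fair similarity"


# Hypothetical loadings of one DP on four shared input variables in three studies.
loadings = {
    "study_A": [0.80, 0.75, 0.10, -0.05],
    "study_B": [0.78, 0.70, 0.15, 0.00],
    "study_C": [0.60, 0.20, 0.65, 0.30],
}

# Evaluate every pair, so the whole CC distribution can be inspected.
results = {}
for (name1, load1), (name2, load2) in combinations(loadings.items(), 2):
    cc = congruence_coefficient(load1, load2)
    results[(name1, name2)] = (round(cc, 3), classify(cc))

for pair, (cc, label) in results.items():
    print(pair, cc, label)
```

Keeping all pairwise CCs, rather than a single summary value, mirrors the choice of evaluating similarity over the entire CC distribution and not only on the median.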

Article selection process
Figure 1 shows the PRISMA flowchart of the article selection process. The electronic literature search detected 4601 records. After 734 duplicates were removed and 3675 records were excluded by title/abstract screening, 193 full-text articles (including 1 article from the reference lists of the retrieved articles) were considered eligible for a detailed analysis. Of these, 52 (all in English language) remained after exclusion criteria were applied and were summarized in the current review [11,12,. Reasons for exclusion are described in Figure 1.

Main characteristics of the included studies
Figure 2 summarizes study design, dietary assessment tools, disease outcomes/determinants/correlates of interest, and the DP identification process used in the 52 selected articles (see Supplemental Table 2 for additional details).
The number of DPs described in each article ranged from 2 to 6 (food-based DPs: 2-6, nutrient-based DPs: 3-5), with a median of 4 DPs per article. When reported, the percentage of total variance explained by the retained components/factors varied from 6.6% (3 factors, 46 food groups) [55] to 82% (2 factors, 17 food groups) [61,62], with a median percentage of 45.5%. Seventeen articles showed percentages over 75%, with most of them (15 articles) identifying nutrient-based DPs (Supplemental Table 3).

FIGURE 1. Flow diagram of the study selection process [28]. EMBASE, Excerpta Medica Database; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Qualitative assessment of DP reproducibility: original descriptions
Overall, 186 DPs were identified across all the included articles (food-based DPs: 102; nutrient-based DPs: 84). Except for 15 DPs without any label, the matching of the remaining 171 DPs on original names allowed us to identify DPs named as "Vitamins and Fiber" (14 articles, from case-control studies on diet and cancer), "Starch-rich" (13 articles, from case-control studies on diet and cancer), "Animal Products" (13 articles, from case-control studies on diet and cancer), "Prudent" (11 articles, from a research group from Sicily, EPIC-Elderly, ORDET, and "Neonatal Environment and Health Outcomes" birth cohort), "Pasta and Meat" (10 articles, from Moli-sani and EPIC-Elderly), "Western" (9 articles, from a research group from Sicily, ORDET, and "Risk Of Cardiovascular diseases and abdominal aortic Aneurysm in Varese"), "Eggs and Sweets" (8 articles, from Moli-sani), "Olive Oil and Vegetables" (8 articles, from Moli-sani), as well as "Animal Unsaturated Fatty Acids" ("AUFA") and "Vegetable Unsaturated Fatty Acids" ("VUFA") (7 and 5 articles, respectively, from case-control studies on diet and cancer) (Supplemental Table 3).
To compensate for subjective DP labeling, we referred to text descriptions and loadings in original articles to collapse in Figure 3 the 186 identified DPs (expressed with original names) into 113 apparently different DPs (39.3% total reduction), of which 69 were food-based and 44 were nutrient-based DPs.

Food-based DPs
We organized the 69 food-based DPs into "Mediterranean-style" and "Western-style" macro-areas (Figure 3). The Mediterranean-style macro-area included 3 different groups of DPs that we defined as "Mixed-Salad," "Healthy-Protein Foods and Side Dish," and "Traditional" DPs. The "Mixed-Salad" group (in green) included DPs based on olive oil, raw (and sometimes cooked) vegetables (DPs named "Salad Vegetables") [11,12,59,60], with additional presence of legumes and fish [58], soup and

1 The DIETSCAN project included one Italian cohort, the ORDET one, which recruited women only; it was therefore classified as "nonpregnant women only" instead of "general adults (males and females)."
2 The Mamma & Bambino birth cohort was also pooled together with MAMI-MED in another study (Magnano San-Lio et al. [65]).
Although alcoholic beverages have been previously identified in the "Traditional," "Pasta-and-Meat-oriented," and "Unhealthy Foods and Snacks" groups as consumed at mealtime, one article identified an "Alcohol" DP alone, likely because the DIETSCAN project provided DPs based on a parallel analysis of international studies [11].
Within the "Vegetable-source Fatty Acids" group (in lilac), most DPs from the same research group were labeled "VUFA" and were all characterized by linoleic acid, α-linolenic acid, and vitamin E [37–39,41–44,46]. Pregnant women additionally loaded high on MUFAs and lycopene [72]. A different classification of fats allowed us to identify vegetable fat as an additional dominant nutrient in 2 articles from the same research group [34,36]. The joint presence of vegetable and animal sources of fatty acids mainly characterized the "Unsaturated Fats" [40], the "VUFA" [35], and the "Fat-rich" [81] DPs in adults, as well as the "Fats" DP in children [47]. Finally, 1 article identified a "Fats Pattern" but did not provide further specification on the type of fats; however, the presence of a "Vegetal Oil Pattern" in the same article allowed us to interpret the former "Fats Pattern" as belonging to the "Animal-source Fatty Acids" group and the latter "Vegetal Oil Pattern" as belonging to the "Vegetable-source Fatty Acids" group [80].

FIGURE 3. Qualitative assessment of reproducibility for all the available dietary patterns: dietary patterns identified using principal component analysis or exploratory factor analysis in Italy from 1965 to 2022, in groups based on original text descriptions and loadings. ALA, α-linolenic acid; AUFA, animal unsaturated fatty acids; DHA, docosahexaenoic acid; DP, dietary pattern; DPA, docosapentaenoic acid; EPA, eicosapentaenoic acid; FA, factor analysis (factor name from original articles); LA, linoleic acid; NDMA, N-nitrosodimethylamine; PC, principal component (analysis) (principal component names from original articles); RAE, retinol activity equivalent; VUFA, vegetable unsaturated fatty acids.
1 Dietary patterns that look similar (based on original loadings and text description) were placed close to one another and consistently indicated with the same color code. When dietary patterns were virtually identical, we synthesized them as one cell. Dietary patterns left in white were too far from the others to be indicated with a color code. Variants of the same color indicate different subgroups of dietary patterns within the same group, with loadings showing modest but nutritionally relevant differences across color-specific subgroups. Results were separately displayed for food-based (left) and nutrient-based (right) patterns and for adults and children/adolescents (consistently indicated in violet). Food-based and nutrient-based patterns were juxtaposed based on correlation coefficients between nutrient-based dietary patterns and selected food groups, as provided in most of the original articles. Arrows linking the different groups indicate stronger (solid line) and weaker (dashed line) similarities between food-based and nutrient-based dietary patterns.

Food-based and nutrient-based DPs: an overall picture
Based on correlation coefficients between nutrient-based DPs and selected food groups provided in the original articles [38–45], we identified similarities between the following groups of nutrient-based and food-based DPs (Figure 3, solid line): 1. "Mixed-Salad" and "Vegetable-based Patterns," 2. "Pasta-and-Meat-oriented" and "Starchy Patterns," 3. "Dairy Products and Sweets" and "Animal-based Patterns." Similarities were less clear between the "Healthy-Protein Foods and Side Dish" and "Animal-source Fatty Acids" groups and the "Unhealthy Foods and Snacks" and "Vegetable-source Fatty Acids" groups, respectively (Figure 3, dashed line). As the "AUFA" DP showed fish together with red meat, liver, unspecified seed oil, olive oil, and eggs (ordered according to frequency), it generally showed a healthy source of proteins, but no side dishes. Food groups correlated with the "VUFA" DP included unspecified seed oils, together with red meat, specified seed oil, and olive oil, which might target fried foods potentially present in the "Unhealthy Foods and Snacks" group, but other relevant food groups (i.e., processed meat, soft drinks, or sugar and candies) did not show up.
When integrating corresponding nutrient- and food-based DPs, the "Vitamins and Fiber"/"Olive Oil and Vegetables" DPs were equivalent in 98% of the CCs, the "Animal Products"/"Eggs and Sweets" DPs in 92% of the CCs, and the "Pasta and Meat"/"Starch-rich" DPs in 71% of the CCs.

Qualitative and quantitative assessment of DP reproducibility: a comparison
In the comparison between Figures 3 and 4 [35,37-45,49-51,53-55,66,67], we observed that:
1. For the "Animal Products" and "Vitamins and Fiber" DPs, different cells in Figure 3 were indicated to be all "equivalent" based on CCs, so nuances in Figure 3 did not translate into genuinely different DPs in Figure 4 [35,37-45,49-51,53-55,66,67];
2. For the "AUFA" DP, ~76% of CCs pointed to "equivalence," with all the "fairly similar" evaluations related to the bladder cancer study [45]; however, the 2 cells identified in Figure 3 did not reflect this finding, as the "AUFA" DP for bladder cancer was not separate from all the other DPs and "equivalence" was identified between bladder and esophageal cancers [39,45], whose DPs, however, were in 2 different cells;
3. For the "Starch-rich" DP, the same 3 dominant nutrients (represented with 1 cell in Figure 3) resulted in an "equivalent" DP in 67% of the CCs only, with all "fairly similar" evaluations given by gastric and bladder cancer studies [35,45];
4. For the "VUFA" DP, ~61% of CCs pointed to "equivalence," with all the "fairly similar" evaluations related to the pancreatic and gastric cancer studies (which also showed "equivalence" between the corresponding "VUFA" DPs); this finding was reflected in part by Figure 3, where gastric- and pancreatic-cancer-related DPs [35,40] were in different cells compared with the other "VUFAs," but not in the same cell;
5. For the "Pasta and Meat" and "Olive Oil and Vegetables" DPs on both available food-group lists, the DPs presented in Figure 3 were materially confirmed, as all CCs suggested "equivalence," except for 1 in the "Olive Oil and Vegetables" DP on the 43 food groups [49–51,53];
6. For the "Eggs and Sweets" DP, the DP presented in Figure 3 was confirmed on the 46 food groups [54,55], but not on the 43 food groups [49–51,53], where only 33% of CCs suggested "equivalence" between DPs with the same name; most differences were related to the DPs identified for the nutrition knowledge and mass media exposure [50,51] articles, which were, however, "equivalent";
7. The 2 DPs from the research group from Sicily [66,67] were indicated in different cells in Figure 3 and were consistently indicated as "fairly similar" in Table 1 [11,12,35,37-45,49-51,53-55,59,60,63-68,74,75,79,80].

TABLE 1
Quantitative assessment of dietary pattern reproducibility for those dietary patterns identified on the same list of input variables: summary statistics on congruence coefficients 1 between loadings of pairs of apparently similar dietary patterns 2
Multicentric case-control studies on diet and cancer at several sites, articles presenting the same list of 28 nutrients as input variables [35,37–45]
2 Dietary patterns identified within the ORDET cohort [11,12,59,60] were not compared with one another because the full list of factor loadings was no longer available from the corresponding authors we were in contact with; similarly, dietary patterns identified in most articles from the research group from Sicily [63–65,68] were not compared because the full list of factor loadings was no longer available from the corresponding authors; upon contact with the corresponding author, we were able to confirm that dietary patterns obtained from 2 articles from Calabria [79,80] were identified by using exactly the same study population and therefore the comparison is meaningless; finally, dietary patterns obtained from 2 articles from the Salus in Apulia Study [74,75] were not compared because the number of food groups was different across articles.
3 Three articles [35,40,44] did not contribute to the congruence coefficient-based analyses as the Animal Unsaturated Fatty Acids dietary pattern was not identified in those articles; among the dietary patterns here named Animal Unsaturated Fatty Acids, the 2 from [39,42] were originally named Other PUFAs and Vitamin D.
4 One article [45] did not contribute to the congruence coefficient-based analyses as the Vegetable Unsaturated Fatty Acids dietary pattern was not identified in that article; among the dietary patterns here named Vegetable Unsaturated Fatty Acids, the one from [40] was originally named Unsaturated Fats.
5 Minor inconsistencies were detected in the names of the food groups across the 2 articles. In the current analysis, vegetable oils in [66] was considered equivalent to plant oil in [67]; sugar, sweets in [66] was considered equivalent to sweet and processed sugar in [67].
with EFA applied over food groups obtained from FFQ-based information. Within the qualitative assessment of DP reproducibility by using similarity plots (based on original text descriptions and loadings), we identified similarities across food-based and nutrient-based groups of DPs, i.e., between the "Mixed-Salad" and "Vegetable-based Patterns" groups, between the "Pasta-and-Meat-oriented" and "Starchy Patterns" groups, and between the "Dairy Products and Sweets" and "Animal-based Patterns" groups. Within the quantitative assessment of DP reproducibility by using CCs (215 CCs comparing pairs of DPs among the 68 DPs identified in 18 articles which referred to the same input data lists), pairs of DPs indicated with the same/similar names were all "fairly similar" and ~81% of them were "equivalent." Among them, the "Vitamins and Fiber"/"Olive Oil and Vegetables" DPs were equivalent in 98% of the CCs, the "Animal Products"/"Eggs and Sweets" DPs in 92% of the CCs, and the "Pasta and Meat"/"Starch-rich" DPs in 71% of the CCs.
The lack of a standardized approach to DP identification, the subjective labeling of DPs, and generally poor information reporting have severely limited the ability to genuinely assess reproducibility of a posteriori DPs in different study populations from the same country [9,13,84]. This is especially critical nowadays for Italy, where the most recent nation-wide survey dates back to the INRAN-SCAI 2005-2006 [25]. The current review may help address both issues, by popularizing the good practice of assessing factorability, internal consistency, and internal reproducibility of identified DPs [10], by highlighting difficulties in using qualitative criteria for DP comparison, and by proposing a quantitative evaluation of reproducibility based on CCs.
Checks on matrix factorability make it possible to assess whether the correlation structure is amenable to PCA/EFA [85]. They are especially useful in food-based PCA/EFA, because the correlation structure is generally weaker. Although they are available in standard statistical software, their use must be increased, to avoid meaningless applications of PCA/EFA. Additional checks on DP internal reproducibility beyond the easiest split-half approach may provide reassurance about the similarity of DPs under different statistical options, thus revealing the role of subjective decisions in the final PCA/EFA solution [85].
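To make the notion of a factorability check concrete, the following minimal Python sketch computes Bartlett's test of sphericity, one commonly used check. Its choice here is an assumption, as the text above does not name a specific test. The statistic is −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom, where R is the correlation matrix of the p input variables and n is the sample size. The correlation matrix and sample size below are hypothetical.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix is singular or not positive definite")
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    dof = p * (p - 1) // 2
    return chi_square, dof


# Hypothetical correlation matrix among three food groups, 500 participants.
corr = [
    [1.00, 0.30, 0.10],
    [0.30, 1.00, 0.20],
    [0.10, 0.20, 1.00],
]
statistic, dof = bartlett_sphericity(corr, n_obs=500)
print(f"chi-square = {statistic:.2f} on {dof} degrees of freedom")
```

A large statistic relative to the chi-square distribution with the stated degrees of freedom suggests that the variables are correlated enough for PCA/EFA to be meaningful; with weak correlations, as is often the case for food groups, the determinant approaches 1 and the statistic approaches 0.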
Although DPs are frequently named following a quantitative cut-off applied after rotation, their labeling is still very subjective. In addition, as the label generally needs to be short, names often do not adequately convey what the underlying principal component/factor represents [6]. This was evident in our systematic review, where DPs with the same names did not always show a similar dietary composition, and DPs with similar loadings were given different names. We therefore provided the reader with Figure 3, which summarized the 186 identified DPs into 113 apparently different ones, based on original text descriptions and loadings. However, Figure 3 is not as effective in synthesizing Italian dietary behavior as one would expect. This is due in part to the need to integrate nutrient-based and food-based DPs in the same picture; although each of the 2 options has its pros and cons [2], matching of food-based and nutrient-based DPs is an extra step of analysis that requires subjective decisions. In addition, within each group, many likely similar DPs (e.g., those identified by different nuances of the same color) still need to be somehow summarized, to distinguish true differences from negligible ones or artifacts/noise.
To compensate for these issues, we proposed to quantify with the CCs [14,15,84] similarities between DPs provided in articles that are based on the same list of input variables. In the absence of any recent and reliable information on Italian DPs, we followed the strictest possible approach and provided the reader with benchmark CCs representing the same lists of input variables. In the current systematic review, however, individual research teams did generally adopt the same list of input variables across multiple articles. Therefore, while starting from the same list of variables, we obtained companion study designs, similar inclusion criteria, and dietary assessment tools, a similar preprocessing of input data, and similar DP identification methods. This is what it is reasonable to expect when the same research team develops experience in the application of the same approach over time; however, we could not separate out the contribution of study design and statistical analysis to the cross-study reproducibility of the corresponding DPs.
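To make the quantity concrete, the sketch below computes a congruence coefficient between two loading vectors defined over the same list of input variables, using the standard cosine-type formula (sum of cross-products divided by the square root of the product of the sums of squares). The classification cut-offs (0.85 and 0.95) are an assumption based on commonly cited thresholds and are not taken from this section. The function names and the loadings are hypothetical.

```python
import math


def congruence_coefficient(loadings_a, loadings_b):
    """Congruence coefficient between two loading vectors on the same variables."""
    if len(loadings_a) != len(loadings_b):
        raise ValueError("both patterns must use the same list of input variables")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b))
    if norm == 0:
        raise ValueError("loading vectors must not be all zeros")
    return cross / norm


def classify(cc, fair=0.85, equivalent=0.95):
    """Label a CC; the default cut-offs are assumed, not taken from the review."""
    if cc >= equivalent:
        return "equivalent"
    if cc >= fair:
        return "fairly similar"
    return "not similar"


# Hypothetical loadings of two DPs on the same five food groups.
pattern_study_1 = [0.72, 0.65, 0.10, -0.05, 0.30]
pattern_study_2 = [0.68, 0.70, 0.15, 0.02, 0.25]

cc = congruence_coefficient(pattern_study_1, pattern_study_2)
label = classify(cc)
```

Because the coefficient is only defined for vectors of equal length and matching order, this sketch makes explicit why the comparison is restricted to articles that share the same list of input variables.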
In this very conservative set-up, we were able to collapse the 68 DPs under evaluation into 13 genuinely different DPs. Although based on ~35% of included articles only, we believe that the "Vitamins and Fiber"/"Olive Oil and Vegetables" DPs, the "Animal Products"/"Eggs and Sweets" DPs, and the "Pasta and Meat"/"Starch-rich" DPs do effectively summarize the overall Italian dietary behavior expressed in the studies under evaluation in this part of the analysis.
The qualitative assessment added nuances to the quantitative-based representation of the Italian diet. In detail, we identified 3 groups of DPs that we named "Mixed-Salad"/"Vegetable-based Patterns," "Pasta-and-Meat-oriented"/"Starchy Patterns," and "Dairy Products and Sweets"/"Animal-based Patterns." In line with foods typical of the Mediterranean diet, the "Mixed-Salad" or "Vegetable-based Patterns" groups are composed of DPs loading high on raw vegetables and olive oil, with fruit also contributing strongly to the "Vegetable-based Patterns" group. The "Pasta-and-Meat-oriented"/"Starchy Patterns" groups represent the internationally known Italian diet, based on main courses like lasagna, Bolognese pasta, and stuffed pasta; this DP could also encompass pasta/rice eaten at lunch and meat eaten at dinner, together with bread and wine. Finally, the "Dairy Products and Sweets"/"Animal-based Patterns" groups capture the use of cheese, milk, eggs, and sweets, with red and processed meat, butter/margarine, and mayonnaise also loading high on the "Dairy Products and Sweets" group.
Based on 3-day dietary records, the most recent available nation-wide survey INRAN-SCAI 2005-2006 [25] had confirmed results from older surveys that emphasized a large contribution to the overall diet of typical Mediterranean foods, including olive oil to fats, wine to alcoholic beverages, and bread/pasta/pizza to cereals. In 2005-2006, meat was consumed by 99% of the sample, with an alarming average for red meat of ~100 g/day/capita (raw weight) compared with 418 g/day/capita of fruit and vegetables, the latter in line with Food and Agriculture Organization/World Health Organization recommendations. In line with INRAN-SCAI 2005-2006, recently published consumption trends of available food groups (corrected for waste) over 2000-2017 [86] revealed no important changes in cereals, legumes, pork meat, poultry, eggs, and sugars compared with a relevant decline for animal fat, beef meat, and fruits and vegetables, albeit the last two to a lesser extent. However, while looking at DP reproducibility over recently collected (i.e., last 10 y) dietary data (20 articles), the variety of specific subpopulations under investigation did not allow us to assess whether the trends identified (e.g., the "Mixed-Salad" group is no longer prevalent, the "Pasta-and-Meat-oriented" or the "Traditional" groups are less frequently followed than in the past) are generalizable to the overall Italian population. The current sensitivity analysis cannot, therefore, confirm the putative shift of current Italian DPs from more traditional habits, including fruit and raw vegetables, legumes, pasta with meat and tomato sauce, to deli meat, ready-to-eat and/or energy-dense foods.
The current systematic review has strengths and limitations. First, it is based on a nonnegligible number of articles (in line with the systematic review from Japan [13]) and allowed for tracking of Italian dietary habits over a reasonably long time period, with most of the articles covering the last 20 y. Second, it provided graphical summaries of results, synthesizing results on the DP identification process and the qualitative and quantitative assessments of DP reproducibility. Third, being first to our knowledge, we compared qualitative and quantitative evaluations of DP cross-study reproducibility. Among limitations of this systematic review, we acknowledge that it mostly included cross-sectional studies/cross-sectional analyses of cohort studies and case-control studies (73% of the included articles). Moreover, 9 research groups were responsible for ~83% of articles, and 6 Italian regions, including Sardinia and Trentino-South Tyrol, were not covered by any publication, thus reducing the possibility of identifying nuances in dietary behavior likely useful in defining Italian dietary guidelines. Even though most studies were of "good quality," reporting of statistical analysis methods and of results was poor in several articles. In the absence of published factor-loading matrices, contacts with the corresponding authors were sometimes unsuccessful, preventing the inclusion of the article in the quantitative assessment of DP reproducibility. Although simple to calculate, CCs look at pairs of DPs; when sets of 5-10 similar DPs are under comparison, this implies evaluating 10-45 CCs and it might therefore be difficult to obtain one clear picture of reproducibility. In addition, we could only apply CCs to distinct lists of nutrients and food groups, thus limiting our ability to provide a global quantitative assessment of DP reproducibility. Finally, although the high CCs obtained did reflect similarities in study design and statistical analysis, we cannot exclude that overlapping of study participants artificially inflated the CCs. In particular, we acknowledge that CCs calculated on the Moli-sani study referred to the same original study population, even if the corresponding DPs were identified over the specific subpopulations under investigation in each article and sample sizes generally differed substantially across these articles.
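The figure of 10-45 CCs for sets of 5-10 DPs follows from counting unordered pairs, n(n - 1)/2. A minimal sketch of this arithmetic is given below; the function name and the pattern labels are hypothetical.

```python
from itertools import combinations
from math import comb


def n_pairwise_ccs(n_patterns):
    """Number of distinct DP pairs, i.e., CCs to evaluate, among n_patterns DPs."""
    return comb(n_patterns, 2)


# Counts for sets of 5 to 10 similar DPs.
counts = {n: n_pairwise_ccs(n) for n in range(5, 11)}

# Enumerating the pairs for a hypothetical set of 5 DPs.
hypothetical_dps = ["DP_A", "DP_B", "DP_C", "DP_D", "DP_E"]
pairs = list(combinations(hypothetical_dps, 2))
```

The count grows quadratically with the number of DPs, which is why a pairwise measure becomes harder to summarize as the set of similar DPs grows.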
In conclusion, the current systematic review of evidence on 186 PCA/EFA-based DPs identified in Italy confirmed that labeling of DPs is still not performed with sufficient accuracy, even when a quantitative cut-off is followed. Although a degree of subjectivity exists, a qualitative assessment of DP reproducibility, by using graphs built on text descriptions and corresponding loadings, may inform further quantitative assessments performed by using CCs. However, further analyses are needed to better assess why discrepancies, if any, were found between qualitative and quantitative assessments of DP reproducibility. The quantitative assessment of DP reproducibility was carried out following very strict criteria; in particular, we restricted the analysis to articles using the same list of PCA/EFA variables. Although this choice depicts the best-case scenario of consistent study design and analysis, future quantitative assessments of DP reproducibility should include all available articles, to test how much CCs are reduced when calculated on DPs from independent research groups.

FIGURE 2. General characteristics of the studies included in the systematic review and main steps in the dietary pattern identification process: a summary of findings from the systematic review. DAFNE, Data Food Networking; DIETSCAN, Dietary Patterns and Cancer; EFA, exploratory factor analysis; EPIC, European Prospective Investigation into Cancer and Nutrition; FFQ, food frequency questionnaire; GIFt, gestational intake of food toward healthy outcomes; IDEFICS, Identification and prevention of Dietary- and lifestyle-induced health EFfects In Children and infantS; NEHO, Neonatal Environment and Health Outcomes; ORDET, Ormoni e Dieta nell'Eziologia del Tumore della Mammella; PCA, Principal Component Analysis; ROCAV, Risk Of Cardiovascular diseases and abdominal aortic Aneurysm in Varese. 1 The DIETSCAN project included one Italian cohort (the ORDET one), which recruited women only and was therefore classified as "nonpregnant women only" instead of "general adults (males and females)". 2 The Mamma & Bambino birth cohort was also pooled together with MAMI-MED in another study (Magnano San-Lio et al. [65]).

FIGURE 5. Sensitivity analysis: qualitative assessment of reproducibility for the most recently identified (i.e., latest 10 years of dietary data collection) dietary patterns: dietary patterns identified using principal component analysis or exploratory factor analysis in Italy from 2013 to 2022, in groups based on original text descriptions and loadings. ALA, alpha-linolenic acid; AUFA, animal unsaturated fatty acids; DHA, docosahexaenoic acid; DP, dietary pattern; DPA, docosapentaenoic acid; EPA, eicosapentaenoic acid; LA, linoleic acid; PC, principal component (analysis) (principal component names from original articles); RAE, retinol activity equivalent; SFA, saturated fatty acid(s); VUFA, vegetable unsaturated fatty acids. 1 Dietary patterns that look similar (based on original loadings and text description) were placed close to one another. When dietary patterns were virtually identical, we synthesized them as one cell. Results were separately displayed for food-based (left) and nutrient-based (right) patterns and for adults and children/adolescents (consistently indicated in violet). Food-based and nutrient-based patterns were juxtaposed based on correlation coefficients between nutrient-based dietary patterns and selected food groups, as provided in most of the original articles.