The sex difference in tumor incidence is related to the female condition: models for Europe and Italy.

A remarkable aspect of cancer distribution in Europe is the large spatial variability of the male-female incidence ratio, from no difference up to 50%. Given the evidence of the predominantly environmental origin of cancer, we studied the ability of a set of socioeconomic indicators of the female condition to model the spatial variability of the sex difference in tumor incidence at two different scales: between countries (Europe) and between provinces (Italy). The sex difference in tumor incidence correlated with female socioeconomic condition indicators to the same extent (|r| ≈ 0.73) in both situations, but in opposite directions. In the European study, the greater the social equality between the sexes, the lower the differential tumor incidence, whereas the between-provinces Italian study showed the opposite result. We also investigated the relation of the female condition indicator with other social and cultural descriptors of the same populations, and we suggest explanatory models linking female condition and pathology at the continental and local scales. Overall, our analysis supports the predominantly environmental origin of cancer and stresses the importance of relating cancer patterns to societal determinants. Our analysis also suggests that the sex difference in tumor incidence is a very useful probe for exploring the socioeconomic and cultural correlates of cancer in human populations. We emphasize the need for a thorough analysis of the empirical correlations highlighted in ecologic studies.

The evidence for the predominantly environmental origin of cancer is extremely large and varied. Classical studies have shown that cancer incidences vary to a large extent from one geographic area to another, and that people who migrate to a country with different habits acquire, in one or two generations, the pattern of tumors that characterizes their new place (1). Similar conclusions were obtained by studies that focused on cancer incidence in twins. For example, a recent paper has demonstrated that for most cancers, identical twins (i.e., those with identical genes) are no more likely to have similar cancers than fraternal twins (i.e., those with only 50% genetic similarity) (2); similar evidence was provided by a previous study (3). Other investigators have studied whether the genetic differences among European populations reflected similar differences in the incidence of the various types of tumors: Very little correlation existed between genetic settings and cancer incidence (4,5).
Overall, all these results agree in pointing out that the environment is the major determinant of cancer; a commonly shared opinion is that the environment is responsible for at least 50-80% of cancers (1). On the other hand, this is only one facet of the more general influence that societal changes exert on the health of a population, as demonstrated, for example, by a series of studies on mortality and differential sex mortality ratios (6)(7)(8)(9). The complexity of social, economic, and demographic factors and living conditions, and the reciprocal influences among them, are demonstrated eloquently by historical research as well: The temporal sequence of economic recession, food shortage, epidemics, and an increase in mortality has been described in classic historical works (10). An unfortunate current example of this is the situation in some countries of the former Soviet Union (7,11).
Within the above perspective, we have studied the geographic variability of the sex difference in tumor incidence in Europe, 1988-1992. Together with a large variation of global tumor incidence from area to area, Europe shows a concomitant large variation of sex ratios in tumor incidence, ranging from the absence of any difference (e.g., Denmark) to 50% difference in Calais, France (see Table 1). This variability looks too high to be caused by any plausible genetic difference among the European populations in terms of male-female biology; in our opinion it calls for a global environmental explanation, possibly involving a large range of the socioeconomic and cultural factors that have shaped European differences in lifestyle during its history. In particular, we focused on the available socioeconomic descriptors that best point to the changing role of the female population in European societies.
We presented a preliminary analysis of the geographic distribution of the sex difference in tumor incidence in Europe in a previous paper (12). Here we show that the relation between sex difference in tumor incidence and the socioeconomic description of female condition follows different models in Europe and in Italy. Moreover, we show that this conflict between contradictory results can be solved only by assuming a broad perspective on the history of the studied populations. Thus we need an interdisciplinary effort that combines humanistic and naturalistic competences to successfully approach ecologic epidemiology studies (13).

Data and Methods
Cancer incidence data. The areas analyzed are those covered by the cancer registries present in the 1988-1992 International Agency for Research on Cancer (Lyon, France) compilation (14). The statistical index we used to formalize the cancer incidence data was the normalized difference between male and female global tumor incidence (∆N), with PM representing the whole incidence of tumors in males and PF the whole incidence of tumors in females (normalized per 1,0000 inhabitants). Table 1 lists the cancer registries with their ∆N values.
In the study on Europe, the average value of ∆N for each country was computed from the available local cancer registries.
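The country-level averaging described above can be sketched as a simple group mean. This is only an illustration: the function name `country_means` and the tuple layout are assumptions, the Ragusa and Varese values are the ones quoted later in the text, and the Denmark row is a placeholder standing in for the statement that Denmark shows no sex difference.

```python
from collections import defaultdict
from statistics import mean


def country_means(registries):
    """Average delta-N over the registries of each country.

    `registries` is an iterable of (country, registry_name, delta_n) tuples.
    """
    by_country = defaultdict(list)
    for country, _name, delta_n in registries:
        by_country[country].append(delta_n)
    return {country: mean(values) for country, values in by_country.items()}


# Ragusa and Varese values are quoted in the text; the Denmark entry is a
# hypothetical placeholder, not a value taken from Table 1.
example = [
    ("Italy", "Ragusa", 0.226),
    ("Italy", "Varese", 0.417),
    ("Denmark", "hypothetical registry", 0.0),
]

if __name__ == "__main__":
    for country, value in country_means(example).items():
        print(country, round(value, 4))
```

Note that in the paper the Italian registries are averaged only for the European analysis; the province-scale model uses the 13 Italian values individually.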
Socioeconomic data. We collected socioeconomic data relative to the female condition (11 variables) (Table 2) for the 95 Italian provinces, including the 13 areas relative to the Italian cancer registries (see Table 1). We summarized all data relative to the 95 Italian provinces by applying principal component analysis, and we used the scores relative to the 13 cancer registries for the subsequent analyses.
For the European analysis on the female condition, the data collected refer to 37 European countries. These include 16 countries for which incidence data were available (see Table 1), plus Albania, Austria, Belgium, Bosnia, Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Luxembourg, Macedonia, Malta, The Netherlands, Portugal, Romania, Russia, Switzerland, Ukraine, and Yugoslavia. The European variables are listed in Table 3. We summarized the socioeconomic variables relative to these countries by principal component analysis, and used the resulting scores for the 16 countries with cancer incidence data in the subsequent correlation analyses.
To put the constructed indicators of the female condition in a correct perspective, we contrasted them with several general socioeconomic indicators that describe the European countries and the Italian provinces, respectively. We derived and discussed these general socioeconomic indices in detail in a previous work (15).

Principal component analysis (PCA).
We used PCA to reduce the variables listed in Tables 2 and 3 into summary scores; in the subsequent step, we used the summary scores for the female condition in the various correlation analyses.
The theory of principal components states that every symmetric covariance or correlation matrix relating p variables x1, x2, …, xp can be transformed into particular linear combinations by rotating the matrix into a new coordinate system. This rotation is produced by multiplying each of the original variables by appropriate coefficients. The original matrix is rotated such that the axis defined by the first principal component (PC1) is aligned in the direction of greatest variance. This procedure is repeated until a set of p orthogonal (uncorrelated) components is obtained, arranged in descending order of variance. In this transformation, none of the information contained within the original variables is lost, and the derived components can be manipulated statistically in the same way as the original variables. Moreover, the transformation is useful because most of the significant total variance is concentrated within the first few uncorrelated PCs, whereas the remaining PCs mainly contain noise (16,17).
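As an illustration of the rotation described above, the following minimal sketch extracts only the first principal component of a correlation matrix. It uses power iteration rather than a full eigen-decomposition, which is a simplification chosen here to stay within the Python standard library and is not necessarily the procedure used for Tables 2 and 3. The function names and the indicator values are hypothetical.

```python
import math


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def pca_first_component(rows, iterations=500):
    """Return (explained fraction, unit weight vector, scores) for PC1."""
    z = standardize(rows)
    corr = correlation_matrix(z)
    p = len(corr)
    # Uneven start vector, so it is unlikely to be orthogonal to PC1.
    v = [1.0 + 0.1 * j for j in range(p)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * corr[i][j] * v[j]
                     for i in range(p) for j in range(p))
    scores = [sum(row[j] * v[j] for j in range(p)) for row in z]
    # The trace of a correlation matrix equals p, the number of variables.
    return eigenvalue / p, v, scores


# Hypothetical indicator table: one row per area, one column per variable.
indicators = [
    [52.0, 8.1, 3.0],
    [48.5, 7.4, 4.2],
    [35.0, 4.9, 9.8],
    [30.2, 4.1, 11.5],
    [44.0, 6.8, 5.1],
    [28.7, 3.5, 12.9],
]

if __name__ == "__main__":
    fraction, weights, scores = pca_first_component(indicators)
    print("explained fraction:", round(fraction, 3))
    print("weights:", [round(w, 3) for w in weights])
    print("scores:", [round(s, 3) for s in scores])
```

The scores returned for each row play the role of the summary scores used in the correlation analyses; selecting the rows that correspond to cancer registry areas mirrors the step of fitting on all areas and then keeping only the scores of the areas with incidence data. Columns with zero variance are not handled in this sketch.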

Results
The strategy adopted in this study was the following: The female condition in Europe (country-based analysis) and in Italy (province-based analysis) was separately parameterized by summary indicators obtained by PCA of several variables selected for their relevance to the female condition. As a further check, the specificity of the summary indicators (PCs) of the female condition was controlled against general socioeconomic descriptors relative to the same areas. After this check, the female condition indicators were contrasted with the sex differences in tumor incidence (∆N values).
We had demonstrated previously the time invariance of the incidence data between the 1985-1988 and 1988-1992 periods (4) and the existence of a marked country effect in the tumor distribution in Europe (12). The country effect for ∆N (analysis of variance) scored an F value of 59.5, corresponding to p < 0.0001. This reassured us about the correctness of using incidence data averaged at the country level for the European analysis. We used the incidence data relative to the 13 Italian locations as such in the province-scale model, which we elaborated separately from the analysis of Europe as a whole.
The Italian study. The female condition in the Italian provinces was described by 11 indicators selected from the 1991 Census data (18); these included a range of different aspects (cultural conditions, welfare, occupation, health). We condensed this information into summary indicators by applying PCA, which produced three components (ITFEM1-ITFEM3), collectively explaining 73% of total variance. ITFEM1 alone explained 43.5% of total variability.
The correlation matrix between original variables and factors (factor loadings matrix) is reported in Table 2. The inspection of the factor loadings indicated that the first factor (ITFEM1) pointed to the general level of economic development and, more importantly, to the female occupation and the percentage of female graduates, but it was inversely correlated with the percentage of illiterate females and the rate of unemployment of young females. The first factor was by far the most relevant component (more than 40% of explained variability) of the local differences in terms of female condition. The second factor (ITFEM2) was positively correlated with the rate of infant deaths and negatively correlated with the number of kindergartens for each live birth. Thus, this factor described the level of medical and social assistance in a given area. The third factor was negatively correlated with the divorce rate and positively correlated with the average number of members per household. This factor can be considered a descriptor of the female relational condition (i.e., the changing approach of women to family and extrafamily matters). Thus the first PC (ITFEM1) can be considered the best summary score of the advancement of the female condition in the different Italian areas (provinces).
We checked the specificity of ITFEM1 by contrasting it with other general socioeconomic indicators of Italy. Using PCA in a previous work (15), we analyzed 36 general descriptors of the Italian society and we derived two summary indicators: ITDEM1 and ITDEM2. ITDEM1 is a general indicator of economic development and follows the well known north-south gradient that characterizes many aspects of Italian society; the second component, ITDEM2, was related to the urban-nonurban character of the provinces studied. The correlation coefficient between ITFEM1 and the first component of general (not sex-related) socioeconomic indicators (ITDEM1) was relatively weak (r = 0.47), though statistically significant (p < 0.001). ITFEM1 therefore conveys some important sex-related specific information not simply assimilable to the economic development.
The next step of the analysis was to compare the sex difference in cancer incidence (∆N) with the three factors (ITFEM1-ITFEM3) describing the female socioeconomic condition. Notwithstanding the small sample size (13 provinces), the ∆N values of the Italian provinces were sufficiently spread out, ranging from around 20% sex difference (Ragusa: ∆N = 0.226) to 40% (Varese: ∆N = 0.417). The Pearson correlation coefficients of ∆N with the socioeconomic components were, respectively, r (ITFEM1, ∆N) = 0.729, r (ITFEM2, ∆N) = -0.261, and r (ITFEM3, ∆N) = -0.306. Only the correlation between ITFEM1 (already selected as summary descriptor of the female condition) and ∆N was statistically significant (p < 0.005). Figure 1 reports the observed correlation.
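The correlation step above can be sketched as follows. The values r = 0.729 and n = 13 come from the text; the choice of a two-sided t test on the Pearson coefficient is an assumption, because the test actually used is not stated. The p value is obtained by numerically integrating the Student t density with Simpson's rule, again only to avoid external libraries.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0 with n paired observations (|r| < 1)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(r, n, steps=2000):
    """Two-sided p value for r; `steps` must be even (Simpson's rule)."""
    df = n - 2
    t = abs(t_statistic(r, n))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_area = total * h / 3.0  # P(0 < T < t)
    return max(0.0, 1.0 - 2.0 * central_area)


if __name__ == "__main__":
    # r and n as reported for ITFEM1 versus delta-N in the Italian provinces.
    print("t =", round(t_statistic(0.729, 13), 3))
    print("p =", two_sided_p(0.729, 13))
```

With only 13 points, a check of this kind makes clear how large a coefficient must be before it reaches conventional significance, which is why only the ITFEM1 correlation is singled out in the text.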
As a further check for the specificity of the observed relations, we performed a similar analysis having as pathologic end point the observed incidence of AIDS cases (both sexes) in the 95 Italian provinces. In the case of AIDS, a disease with a natural history completely different from cancer, the female condition components were not correlated with the relative incidence of the pathology, whereas ITDEM1 (as well as, marginally, ITDEM2) was significantly related to AIDS incidence. This result further demonstrates the specificity of the female condition indicator as predictor of the sex difference in cancer incidence.
The positive sign of the relationship between ITFEM1 and ∆N should be noted; its meaning is discussed in the Discussion section. Table 4 summarizes the main results presented above.
European study. The female condition in the various European countries is described by 15 variables. A PCA of this socioeconomic data set produced four PCs (EUFEM1-EUFEM4), which collectively explain 86% of variance (Table 3). Component 1 (EUFEM1) alone explains 46% of total variability. Inspection of the variables maximally loaded on EUFEM1 shows that this component summarizes the advancement of the female socioeconomic condition occurring in the last decades in the most developed countries: In fact, women's attainment of top positions (percentage of female ministers, loading = 0.821) goes hand in hand with the per capita GNP (loading = 0.941) and the increase of mother's mean age at birth (loading = 0.858). EUFEM2 is related to the birth and fertility rates, and represents the relative extent and timing of the demographic transition (contraction of the population increase) experienced by developed societies in recent decades. EUFEM3 points to a (probably spurious) relation between population density and female occupation rate, whereas EUFEM4 mainly describes problems linked to unemployment and migration, with no sex connotation.
We compared EUFEM1 (the best summary score for the female condition in Europe) with three general socioeconomic indicators relative to the European countries (EUDEM1-EUDEM3), which we had obtained in a previous study (15). EUFEM1 scored Pearson correlation coefficients of 0.66 and 0.65 with EUDEM1 and EUDEM2, respectively. This result indicates that, unlike in Italy, the female condition indicator for the rest of Europe was largely coincident with the information carried by socioeconomic indicators that are not directly sex-related.

Articles • Female condition and cancer in Italy and Europe
Environmental Health Perspectives • VOLUME 109 | NUMBER 7 | July 2001

Both EUFEM1 and EUDEM1 scored a significant correlation with ∆N: r = -0.74, p < 0.001 and r = -0.80, p < 0.001, respectively, whereas EUDEM2 was not significantly correlated with ∆N.
None of the above composite indices was able to predict the global incidence of infectious diseases, thus confirming the specificity of the measured demographic and socioeconomic indices for tumor pathology. Table 5 summarizes the above correlations. Figure 2 displays the relation between EUFEM1 and ∆N. It shows that the earlier and more pronounced the advancement of female condition (higher values of EUFEM1, Northern Europe), the lower the sex differences of tumor incidence (lower values of ∆N; inverse relationship between EUFEM1 and ∆N). This is exactly what should be expected by the simplest line of reasoning: Progress toward socioeconomic equality between the sexes equalizes lifestyles and thus lowers differences in pathology profiles. This simple picture, though it holds at the European macro scale (between-countries variability), was contradicted at the micro scale studied (Italian within-country variability), where we found a positive relationship between ∆N and ITFEM1 (Table 4, Figure 1). The incidence of tumors in males is uniformly greater than that in females, both in Italy and in all of Europe (see Table 1): High ∆N values therefore point to a relatively better condition of females compared with males. This trend implies that in Europe the socioeconomic advancement of women meant that they lost the relative benefits of a cancer incidence lower than that of men. In Italy, the positive relation observed between ∆N and ITFEM1 implies that the highest differential between sexes in terms of pathology parallels the improvement of female socioeconomic conditions, which is the exact opposite of what we observed in Europe as a whole.

Discussion
The Italian result is, at first sight, paradoxical: Given that progress in the female condition usually is interpreted as reduced socioeconomic differences between sexes, we should observe a parallel reduced heterogeneity in terms of pathology and thus a negative correlation between ITFEM1 and ∆N, as in the rest of Europe. But if we consider the peculiarities of Italian history (19), the observed results are less paradoxical. In Italy, the progressive emancipation of women has followed industrialization, which in turn, until the last 34 years, has practically involved only the northern-central part of the country. This implies that ITFEM1 is an indirect index of the relative precocity and intensity of industrialization. In fact, Table 2 shows that where industrialization was more intense and prolonged, the percentages of occupation are more homogeneous between sexes and, in general, more women were able to enter the labor force.
However, the emancipation of Italian women lagged two or three generations behind industrialization. Only 30 years ago, it was common for the husband to be employed outside the home and the wife to be busy with domestic duties. Thus, men in the more industrialized Italian areas experienced environmental and lifestyle conditions quite different from those of the female population, given the relatively serious health hazards linked to industrial work (now drastically reduced within approximately one generation) and the concomitant diffusion of such unhealthy habits as cigarette smoking. Conversely, agricultural work provided a much more homogeneous environment for both sexes. Given the latency time of tumor induction (20 years on average) and the fact that our data refer to the 1988-1992 period, our results should be interpreted as a consequence of the different timings of industrialization and female emancipation within the same areas. For Italy, therefore, we are observing a transient phenomenon linked to the over-60s population, the last generation that experienced the industrial environment before female emancipation. This interpretation is strengthened by the extent of the correlation between global cancer incidence in males (PM) and females (PF) at the two scales: Although they are highly correlated at the scale of the Italian provinces (r = 0.94), they are independent at the scale of the European countries (r = -0.10) (Tables 4 and 5). This points to the coexistence of specific national models, which can be different from the European model.
Obviously, the value of the above conclusions depends on the reliability of the measures we used to define the elusive concept of the female condition. In both cases we analyzed with PCA a set of variables related both to the type of society and to specific characteristics of the female population. We were unable to retrieve the same variables in the existing public databases, so the sets of variables used in the two cases were different. However, the female condition indicators derived for Italy (ITFEM1) and for Europe (EUFEM1) have similar meaning: They point to affluent societies (northern Italy, Nordic countries), with high percentages of employed females (Italy), high female presence in the service industry (Europe), and high female presence in government (Europe). Moreover, both indices are negatively correlated with infant mortality. Thus, we think that both indices are valid measures of the female condition in the two contexts.
Both the European and Italian models proposed are informative, but their differences must be considered in attempting to explain observed empiric correlations. The observed scale effect restricts the generalizability of ecologic studies and points to the need for interdisciplinarity in interpreting them. It is impossible to derive a comprehensive (biologic?) theory that includes society, individuals, and cells because different phenomena and mechanisms act at different levels. Thus, when performing ecologic studies we are dealing with empiric evidence calling for an operational and not a biologically mechanistic interpretation. What is needed is the possibility of expressing the empiric models in terms of operationally modifiable variates, to make a consequent public health intervention possible.
On a more theoretical ground, we share the vision of S. Levine (20):
[There] is no single correct scale of investigation.… [The] pattern exists at all levels and on all scales, and recognition of this multiplicity of scales is fundamental to describing and understanding ecosystems.… [T]here can be no "correct" scale level of aggregation.… [A] central challenge in … theory must be an elaboration of … how scales relate, and how the measurement and dynamics of scale phenomena vary across scales.… We must recognize explicitly the multiplicity of scales … and develop a perspective that looks across scales and that builds upon a multiplicity of models rather than seeking a single "correct" one.
As a matter of fact, the investigations aimed at linking different size and temporal scales seem to be the most critical goal of basic research today (21,22).
In this respect, the indicators of the female condition lose specificity in going from the more detailed Italian picture to the coarser-grained European picture. In the Italian provinces data, ITFEM1 is only loosely correlated with the more general socioeconomic variables, whereas EUFEM1 is largely reconstructible from the socioeconomic descriptors of the European countries. In fact, this is a general characteristic of every type of empiric correlation: The coarser the grain of the representation, the less specific the picture. This is the obvious consequence of collapsing all the local models into an average general model, which reflects only what is common to the various local models. This also makes the various scales of analysis largely independent from one another. This very general feature of systems analysis (23) is driven in this case by the different historical determinants that shaped the female condition variability at the macro scale (economic development) and at the micro scale (time delay between economic development and female emancipation). The need to consider simultaneously different scales of phenomena makes the debate surrounding the "real biomedical explanation" of pathologic events largely devoid of immediate applicative interest, whereas practical health problems require an efficient interdisciplinary collaboration between scientists coming from different fields.
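The independence of scales described above can be illustrated with a toy calculation. The numbers below are entirely synthetic and are not taken from Tables 1-5; they are constructed so that the relation between two variables is positive inside every group but negative between the group averages, which is the kind of sign reversal observed between the Italian and European models.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Synthetic groups: (indicator values, outcome values) for local units.
groups = {
    "A": ([1.0, 2.0, 3.0], [10.0, 11.0, 12.0]),
    "B": ([4.0, 5.0, 6.0], [6.0, 7.0, 8.0]),
    "C": ([7.0, 8.0, 9.0], [2.0, 3.0, 4.0]),
}

# Micro scale: correlation among the local units of each group.
within = {name: pearson_r(xs, ys) for name, (xs, ys) in groups.items()}

# Macro scale: correlation among the group averages only.
mean_x = [sum(xs) / len(xs) for xs, _ in groups.values()]
mean_y = [sum(ys) / len(ys) for _, ys in groups.values()]
between = pearson_r(mean_x, mean_y)

if __name__ == "__main__":
    print("within-group r:", within)
    print("between-group r:", between)
```

Averaging removes the within-group structure entirely, so the macro-scale coefficient carries no information about the sign of the local relations.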
Our results confirmed the hypothesis that the sex difference in cancer incidence in Europe is largely attributable to differences in lifestyle and environmental factors. This suggests that the sex difference in cancer incidence can be a useful probe for environmental factors in epidemiologic studies.