Which technology diversification index should be selected?: Insights for diversification perspectives

Abstract The extensive list of diversification indices gives researchers a wide range of options. The absence of consensus on the selection of a suitable technology diversification index, however, may lead to justified allegations of a lack of objectivity. In this study, we focus on the case of technology diversification using patent data to derive empirical implications for selecting a suitable diversification index. To establish the content validity of a diversification index, three cases were tested: cross-section, single time period, and multiple time periods. As a result, diversification indices are separated into two groups: HHI, Gini-Simpson, 1/HHI, and Entropy for PC1 and Variety and Rao-Stirling for PC2. In this context, technology diversification can be explained by two perspectives of diversification: balance-centered and hetero-centered diversification. Balance-centered diversification treats the proportions of elements as the target of interest, while hetero-centered diversification refers to the variety and disparity of diversification, which focuses on the degree of differentiation among elements. In applicant-level technology diversification studies, using both diversification perspectives is recommended.

ABOUT THE AUTHOR Keungoui Kim is currently a post-doctoral research fellow at Réseaux Innovation Territoires et Mondialisation, Univ. Paris-Sud. His research interests include technology innovation, diversification strategy, and network analysis.

PUBLIC INTEREST STATEMENT
This paper discusses the problem of selecting a technology diversification index using patent data. In order to address the content validity of selecting proper diversification indices, a list of diversification indices with their diversity properties and three different time-domain cases were examined. Based on the results, technology diversification is divided into two perspectives of diversification: balance-centered and hetero-centered diversification. Balance-centered diversification treats the proportions of elements as the target of interest, while hetero-centered diversification refers to the variety and disparity of diversification, which focuses on the degree of differentiation among elements. Accordingly, considering both diversification perspectives and using a representative index for each is recommended.

Introduction
In management and economics, diversification has been discussed in the context of research, business (Geringer, Tallman, & Olsen, 2000; Singh, Gaur, & Schmid, 2010; Stephan, 2002; Woerheide & Persson, 1992), geography (Geringer et al., 2000; Stephan, 2002), and technology (Chiu, Lai, Lee, & Liaw, 2008; Kim, Lee, & Cho, 2016; Kook, Kim, & Lee, 2017; Lin & Chang, 2015; Wang, Ning, & Prevezer, 2015). In particular, technology diversification has been regarded as an important issue in areas such as technology management and technology portfolio analysis because of the contribution of technology to economic performance and growth. As the advent of technology convergence among different industries and technologies highlights the value of technological competitiveness, the study of technology diversification can contribute to understanding diversification activities and establishing a diversification strategy.
Along with the importance of technology diversification, various diversification measurements were introduced and used in previous studies. The variety of available diversification indices provides a wide range of selection opportunities to researchers. At the same time, it implies the lack of a rigorous consensus for choosing an appropriate diversification index either in general or specific situations. In the case of ecology, Morris et al. (2014) pointed out the difficulty of quantifying biodiversity due to the multitude of proposed indices. The absence of consensus on the selection of a technology diversification index (TDI) may lead to justified allegations of a lack of objectivity. Since both interpretation and consequences can differ depending on the characteristics of the index used, a diversification index should be selected with the consideration of the validity of TDI.
There have been a few attempts to address this problem, but they were limited to certain situations and were not generally applicable. To establish the content validity of a TDI, the analytical perspective should not be restricted to limited cases; the evaluation should be carried out more rigorously across various cases. For instance, the result of diversification can differ according to the measurement, the analytic sample, and the target of interest. A more generalized understanding of diversification indices, therefore, can contribute to the various research areas where diversification is a point of interest.
In this study, the empirical implications of selecting a suitable TDI are derived using patent data. To establish the content validity of the selection of a TDI, three cases are examined (cross-section, single time period, and multiple time periods), with comparisons among three country groups of 18 countries. Due to the reliability and usability of patents, various cases can be studied and comparisons among them can be made to obtain a more generalized understanding. In addition, the diversification indices that are frequently used in previous studies are examined to determine which are better suited for measuring technology diversification. For the evaluation of diversification indices, Stirling's (2007) general framework for diversity properties is used as a reference for understanding the differences among indices. As a methodology, PCA is conducted to determine distinctive relations among the variables and derive the dominant variable of each principal component. Since the relationships between diversification indices do not always satisfy the mathematically predicted patterns (Nagendra, 2002; Stirling & Wilsey, 2001), performing PCA on real data can ensure that the conclusions are valid (Stirling, 2007).
The remainder of this chapter is laid out as follows. Section 2 reviews the content validity of TDI and its related indices in detail. Section 3 describes the data and methods used in this study. Lastly, the results and conclusions are set out in Sections 4 and 5.

The content validity of diversification index
The validity of an index can be studied from the perspective of either face or content validity. Face validity concerns whether a measure superficially appears to measure what it is supposed to measure, while content validity concerns whether a measure actually represents the construct it is intended to capture. Simply put, monthly income has face validity as an indicator of one's financial ability per month. Compared to face validity, however, content validity requires a rigorous evaluation of the measure. In general, content validity requires either recognition by an expert or statistical testing, as it goes beyond superficial appearance. Robins and Wiersema (2003) pointed out the importance of using an appropriate diversification index by emphasizing the content validity of a diversification measure.
The main reason for assessing the content validity of a diversification index is that, without a prior assessment, results may be inconsistent because they are sensitive to the index used. Although the result can differ depending on the diversity measurement (Morris et al., 2014), there is no sufficient consensus on the selection of a diversification measure (Robins & Wiersema, 1995). The sensitivity of diversification measurements may cause contradictory results and unexpected confusion depending on the choice of a diversification index. Robins and Wiersema (2003) examined the content validity of an entropy index and a concentric index to see whether these indices are suitable for use as indicators of portfolio relatedness. Their results showed that the concentric index positively influenced dominant business focus (the relative size of dominant businesses) and negatively influenced pure diversification (the number of businesses); the entropy index exhibited the opposite characteristics. From these findings, they concluded that these sorts of sensitivities can create ambiguities in strategy research. Woerheide and Persson (1992) evaluated five diversification indices for measuring unevenly distributed stock portfolios: the complement of the Herfindahl index, Rosenbluth's index, the exponential of entropy index (Marfels, 1971), the comprehensive concentration index (Marfels, 1971), and the entropy index (Hart, 1971). Among these, the first was found to be the most adequate for general use. Therefore, rather than simply adopting one index, a prior assessment for selecting a proper index is needed.

Technology diversification
Technology diversification refers to the degree of a firm's technology diversity. Although there is consensus on the definition and the sources of technology diversification, there has not been an in-depth discussion on the TDI. In the case of technology diversification, Variety, the Herfindahl-Hirschman index (HHI), the modified HHI, and Entropy are some of the generally used indices (Table 1). These indices are widely used in various areas of research and detailed explanations of them will be provided in the following section.
Table 1.
... Kim et al., 2016)
Technology field
Joint occurrence of possible pairs (Breschi, Lissoni, & Malerba, 2003)
1/HHI (Leten et al., 2007)
1-HHI (Bas & Patel, 2005; Garcia-Vega, 2006)
HHI (Fan, Li, & Yang, 2017; Gambardella & Torrisi, 1998; Kim et al., 2009)
Entropy (Stephan, 2002; Wang et al., 2015)
As described in Table 1, patent data are used as analytic material for measuring technology diversification. As an effective and valid indicator for firm-level technological activities, patents allow the grasping of a firm's technological activities rooted in a formal R&D organization (Pavitt, 1988). In relevant studies, patents are regarded as a form of technological output that explains a firm's innovative capabilities (Kim, Lim, & Park, 2009; Kim, Jung, Hwang, & Hong, 2018; Lee & Kang, 2015). Since patent data span 80 patent offices worldwide from the 1970s, comparisons across countries and over a wide range of time periods are possible.
In general, the IPC is used as a target of technology diversification. Argyres (1996) suggested that patents assigned to different areas of technology can be observed as different technological applications. In this sense, the technology classifications assigned to a patent can be used to distinguish a firm's technological applications. Jaffe (1989) viewed technology as consisting of a number of distinct "technological areas" and used this approach to characterize a firm's technological position. Technology classifications provided by patents offer detailed information on the area of technology, which is relevant for assessing a firm's technological activities (Stephan, 2002). In this sense, a patent is acceptable material as it includes the IPC, which is a type of technology classification that distinguishes different technologies in a hierarchical order. Even though the IPC is selected as a diversification target, its measurement level has to be determined. As described in Table 1, technology diversification is measured either with reference to a sub-class or technology field. The sub-class refers to the hierarchical level of the IPC and technology field refers to a bundle of related industry classifications from the IPCs. Accordingly, the criteria for the technology field may differ depending on either references or countries. Since the sub-class level of the IPC can be used without restrictions across different countries, it is adopted as a diversification target instead of the technology field.

Diversity properties
The values measured by each diversification index differ mainly because the indices differ in their points of interest. In this vein, Stirling (2007) proposed a general framework for diversity that can be applied in the field of science, technology, and society. He classified three basic properties of diversity: the number of elements (Variety), the distribution of elements (Balance), and the difference between the elements (Disparity). All these properties are necessary but individually insufficient, as each property constitutes the other two (Stirling, 1998; 2007). Thus, it is more reasonable to assume that the selection of a diversification index should consider multiple aspects rather than only one particular index. In this study, a total of six diversity indices that are widely used in technology diversification studies are selected: Variety, HHI, 1-HHI (the Gini-Simpson index), 1/HHI, Entropy, and Rao-Stirling (Table 2). Although it was not mentioned in the previous section, Rao-Stirling is included to cover the disparity of diversification. Variety, which is the simplest and the most intuitive diversification index, counts the number of entities (MacArthur, 1965). As its name indicates, it only considers the variety of diversification. In economics, it is often used as a simple enumeration of firms or products (Cohendet, Llerena, & Sorge, 1992; Kauffman, 1992; Saviotti & Mani, 1995). For example, the number of distinct IPC subclasses owned by a firm represents the richness of its technology diversification.
Instead of counting a total number, using the proportion of each target of diversification is a more general approach used by most indices. The HHI is one of the most widely adopted indices for the measurement of technological concentration or diversification (Berry, 1971). In Stirling's (2007) framework, it covers the variety and balance properties of diversification. This index was initially adopted in the management literature to observe firms' diversification (Geringer et al., 2000; Sambharya, 1995), and has also been used for technology diversification (Gambardella & Torrisi, 1998). The results of the HHI are quite intuitive, as its formula sums the squared proportions of all elements. For example, an applicant with equally distributed proportions of IPC subclasses has a lower HHI than one with concentrated technologies.
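As a minimal sketch of the two measures described so far, the following Python snippet computes Variety as a count of distinct sub-classes and the HHI as the sum of squared proportions. The function names and the two applicants are hypothetical and serve only as an illustration; they are not taken from the data used in this study.

```python
from collections import Counter


def shares(subclasses):
    """Proportion of each IPC sub-class among an applicant's classifications."""
    counts = Counter(subclasses)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


def variety(subclasses):
    """Variety: the number of distinct sub-classes."""
    return len(set(subclasses))


def hhi(subclasses):
    """HHI: the sum of squared proportions (1.0 means fully concentrated)."""
    return sum(p ** 2 for p in shares(subclasses).values())


# Hypothetical applicants; the sub-class codes are placeholders for illustration.
balanced = ["A61K", "G06F", "H04L", "B60R"]
concentrated = ["A61K", "A61K", "A61K", "G06F"]

print(variety(balanced), hhi(balanced))          # 4 0.25
print(variety(concentrated), hhi(concentrated))  # 2 0.625
```

The evenly spread applicant receives the lower HHI, which is the intuition stated above: a lower HHI corresponds to less concentration.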
Rather than simply adopting the HHI, modified HHIs have been introduced (Bas & Patel, 2005; Garcia-Vega, 2006; Lin & Chang, 2015; Quintana-García & Benavides-Velasco, 2008). One modification is the complement of the HHI, the 1-HHI, under which a concentrated firm's value is low while a diversified firm takes a higher value. A firm with a higher value is thus assumed to be more diversified. Another modification of the HHI measures diversification by taking its inverse (Leten, Belderbos, & Van Looy, 2007). The implications of the inverse HHI are similar to those of the 1-HHI, but it emphasizes the apparent differences in diversification. For instance, HHI values of 1/4 and 5/8 become 6/8 and 3/8 when transformed into the 1-HHI. When they are converted into the inverse HHI, they become 4 and 8/5; the difference between them is greater than under the previous indices.
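The arithmetic of the two transformations can be checked with exact fractions. The sketch below is only an illustration; the function names are hypothetical, and the two input values are the HHI values of 1/4 and 5/8 used in the example above.

```python
from fractions import Fraction


def gini_simpson(hhi_value):
    """Complement of the HHI (1 - HHI)."""
    return 1 - hhi_value


def inverse_hhi(hhi_value):
    """Inverse of the HHI (1 / HHI)."""
    return 1 / hhi_value


for hhi_value in (Fraction(1, 4), Fraction(5, 8)):
    print(hhi_value, gini_simpson(hhi_value), inverse_hhi(hhi_value))
# 1/4 3/4 4
# 5/8 3/8 8/5
```

The gap between the two applicants is 3/8 under both the HHI and the 1-HHI, but 12/5 under the inverse HHI, which is why the inverse form makes differences in diversification more visible.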
Similar to the HHI, Entropy also covers the variety and balance of diversification. The concept was first introduced in connection with the second law of thermodynamics. In thermodynamics, entropy represents the amount of energy that can no longer be reused. An increase in entropy represents chaos at the molecular level where the possibility of transforming energy into work is low. The concept of entropy used in diversification is somewhat similar, but more related to information theory. In information theory, entropy, also known as the Shannon index, deals with the uncertainty (or imbalance) of information (Shannon, 1948). An increased value of entropy indicates an increase in information uncertainty. For instance, if a firm obtains new information, it means that the understanding of its own overall information decreases. Here, the degree of information uncertainty depends on the proportion of new information to overall information. In the case of diversification, an increased value of entropy implies an increase in diversification in the sense of information uncertainty. Unlike the HHI, however, entropy gives more weight to smaller proportions so that it can highlight the imbalance among components.
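A minimal sketch of the entropy measure, assuming the standard Shannon form (the negative sum of each proportion times its logarithm) with the natural logarithm; the base of the logarithm is an assumption, as it is not stated here. The function name and the example applicants are hypothetical.

```python
import math
from collections import Counter


def entropy(subclasses):
    """Shannon entropy of the sub-class proportions (natural log is an assumption)."""
    counts = Counter(subclasses)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical applicants; the sub-class codes are placeholders for illustration.
balanced = ["A61K", "G06F", "H04L", "B60R"]
concentrated = ["A61K", "A61K", "A61K", "G06F"]

print(round(entropy(balanced), 3))      # 1.386
print(round(entropy(concentrated), 3))  # 0.562
```

In contrast to the HHI, a higher value here indicates a more diversified applicant, and an applicant with a single sub-class has an entropy of zero.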
Lastly, the Rao-Stirling index not only measures the proportion of each entity, but also the Euclidean distance between the entities (Stirling, 2007). Unlike previous indices, it considers disparity in addition to variety and balance. In order to determine the disparity among elements, it conceives of the distance between them in a so-called disparity space (Solow, Polasky, & Broadus, 1993). Disparity space is a unique n-dimensional space where n refers to the number of attributes of elements. Here, the attributes of elements can either be cardinal, interval, or binary terms. In this case, the attributes of elements are obtained by the cardinal terms of an applicant's IPC classes. After normalization and weighting, the Euclidean distance between elements can be scaled to reflect distances in disparity space (Kruskal, 1964).
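The following sketch illustrates the idea under stated assumptions: the index is taken as the sum of distance times the product of proportions over unordered pairs of distinct elements, with the Euclidean distance computed from coordinates in a disparity space. Summing over both orderings of each pair instead would simply double the value. The function name, the element labels, and the coordinates are hypothetical.

```python
import math
from itertools import combinations


def rao_stirling(proportions, coordinates):
    """Sum of d_ij * p_i * p_j over unordered pairs of distinct elements.

    proportions: {element: share}, coordinates: {element: point in disparity space}.
    Both inputs are illustrative assumptions, not the study's actual data layout.
    """
    total = 0.0
    for i, j in combinations(sorted(proportions), 2):
        distance = math.dist(coordinates[i], coordinates[j])  # Euclidean distance
        total += distance * proportions[i] * proportions[j]
    return total


# Two applicants with identical variety and balance but different disparity.
proportions = {"X": 0.5, "Y": 0.5}
near = {"X": (0.0, 0.0), "Y": (1.0, 0.0)}
far = {"X": (0.0, 0.0), "Y": (3.0, 4.0)}

print(rao_stirling(proportions, near))  # 0.25
print(rao_stirling(proportions, far))   # 1.25
```

Because the two applicants share the same proportions, Variety, the HHI, and Entropy cannot tell them apart; only the distance term separates them.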

Data
For the empirical analysis, the Worldwide Patent Statistical Database of the European Patent Office (EPO) is used. The EPO patent data have the advantage of minimizing home-country bias as firms are headquartered in different countries (Schmoch, 1999). In previous studies, technology diversification has been discussed in three different cases: cross-section, single period, and multiple period (Table 3). The cross-sectional case observes technology diversification at a certain time. The single-period case observes technology diversification within a specified time period. The multiple-period case separates the time periods. For comparison, three data sets are constructed for the three cases, including the three groups of countries (high-, mid-, and low-level) classified by their level of patent applications. For the high-level patent application group, the top six patent application countries are included: the United States (US), Germany (DE), China (CN), the European Patent Office (EP), Japan (JP), and Korea (KR). For the mid-level patent application group, Sweden (SE), the Russian Federation (RU), Italy (IT), Brazil (BR), and the Netherlands (NL) are selected. For the low-level patent application group, Bulgaria (BG), Malaysia (MY), Serbia and Montenegro (YU), the Eurasian Patent Organization (EA), Slovenia (SI), and the Philippines (PH) are included. Here, YU is intentionally included to see how diversification indices behave where the number of patent applications dramatically decreases. Case 1 uses cross-sectional data in 2015, and 10,000 applicants are randomly selected from each country. Case 2 covers a single time period of ten years. Case 3 covers multiple five-year time periods. Here, the empirical cases of the US, the NL, and the PH are chosen. For the latter two cases, applicants who have patents in the beginning and end years are selected. Overall descriptions of the sample data for each case are set out in Table 4.

Technology diversification calculation
Prior to the calculation of TDIs, the data are organized by the IPC sub-classes assigned to each applicant's patents. In order to maintain consistency among countries, technology diversification is measured with IPC sub-classes for all cases. The sub-class level of the IPC is considered as the unit of technology for the selection of new technologies (Kim, 2013). Basically, technology diversification is measured annually and its accumulated form is used for cases with multiple time periods (Figure 1). One of the requirements for a TDI is that it should be capable of capturing either changes or variations. With this rule, the changes in applicant-level technology diversification can be observed while the overall technological information is secured. Under this condition, diversification indices are derived with the equations mentioned in the previous section.
Table 3 (time period; region; reference)
... 1986-1990, 1991-1995, 1996-2000, 2006-2011; China; Wang et al. (2015)
1984-1991; US and Europe; Gambardella and Torrisi (1998)
Multiple periods
1983-1987, 1988-1992, 1993-1997; US, Europe, Japan; Stephan (2002)
1988-1990, 1994-1996; Europe; Bas and Patel (2005)

Principal component analysis (PCA)
PCA is a statistical procedure for reducing the dimensions of a data set. Here, the dimensions of a data set refer to the measurement type. It uses an orthogonal transformation to convert a set of possibly correlated variables into a new set of linearly uncorrelated variables. By doing so, the following goals can be achieved (Abdi & Williams, 2010). First, the most important information from the data table can be extracted. In addition, the size of the data can be compressed while retaining important information. Lastly, the patterns of similarity in the observations and variables can be analyzed. Given these advantages, PCA has been widely used in big data analysis and in index studies (Morris et al., 2014). One of the key conditions for PCA is a large sample size (Chao & Wu, 2017). For instance, if the number of observations is less than the number of variables, PCA can be influenced by the outliers in the database and the results will not be consistent (Hastie, Tibshirani, & Friedman, 2009). In this study, the result of PCA is reliable as it uses a large data sample with universally accessible patent data.
In index studies, PCA is used for two different purposes. The first is to directly propose a new index using the results of the PCA (Chao & Wu, 2017; Filmer & Pritchett, 2001; Vyas & Kumaranayake, 2006). The outcome of the PCA, the principal component, is a linear combination of variables where the coefficient of each variable indicates its importance or weight with respect to a given principal component. By summing the products of the input variables and the PCA coefficients, a PCA-based index is proposed. This is one of the data-driven procedures of index mining or a systematic search for optimal variable aggregation (Chao & Wu, 2017). However, a principal component has limitations when used as a new index. In general, a new index is proposed with a generalized equation and an explanation of how the equation is designed with scientific sense. The result of PCA, however, does not guarantee consistent results over time, and it is not always plausible to explain the weight and direction of variables.
Figure 1. Technology diversification.
Secondly, PCA is used to provide a guideline for selecting indices by taking into consideration their similarities and differences (Godshalk & Timothy, 1988; Morris et al., 2014). As indices that are designed for specific purposes share a certain level of similarity, determining the differences among actual measurements can provide a considerable amount of information for selecting a diversification index. Compared to using a simple correlation test, PCA has the advantages of dealing with a large number of variables and comparing them in hierarchical order with respect to the relative importance of principal components. In this sense, principal components obtained from PCA can address the problem of collinearity (Chao & Wu, 2017). With this approach, the unobserved properties that are shared among indices can be observed and used as a reference for selecting a diversification index. In this study, therefore, PCA is used to select a diversification index by analyzing the similarities and differences among indices.
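To make the procedure concrete, the sketch below extracts the first two principal components from a small set of variables. It is a hand-rolled illustration, not the software used in this study: it assumes PCA on the correlation matrix (standardized variables) and finds the components by power iteration with deflation, which stands in for a library routine. The function names and the toy data are hypothetical.

```python
import math


def standardize(columns):
    """Center each variable and scale it to unit sample variance."""
    result = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / (len(col) - 1))
        result.append([(x - mean) / sd for x in col])
    return result


def correlation_matrix(columns):
    """Correlation matrix of the variables (one list of values per variable)."""
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def matvec(matrix, vec):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def leading_components(matrix, k=2, iterations=500):
    """First k (eigenvalue, coefficient vector) pairs by power iteration with deflation."""
    size = len(matrix)
    work = [row[:] for row in matrix]
    components = []
    for _ in range(k):
        vec = [float(i + 1) for i in range(size)]  # arbitrary, non-symmetric start
        for _step in range(iterations):
            nxt = matvec(work, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        eigenvalue = sum(v * w for v, w in zip(vec, matvec(work, vec)))
        components.append((eigenvalue, vec))
        # Deflation: subtract the component just found before searching for the next.
        work = [[work[r][c] - eigenvalue * vec[r] * vec[c] for c in range(size)]
                for r in range(size)]
    return components


# Toy data (hypothetical): index_a and index_b move together, index_c is unrelated.
index_a = [1.0, 2.0, 3.0, 4.0]
index_b = [2.0, 4.0, 6.0, 8.0]
index_c = [1.0, -1.0, -1.0, 1.0]

components = leading_components(correlation_matrix([index_a, index_b, index_c]))
for eigenvalue, coefficients in components:
    share = eigenvalue / 3  # the trace of a 3x3 correlation matrix is 3
    print(round(share, 3), [round(abs(c), 3) for c in coefficients])
```

In this toy case the first component should load almost entirely on the two perfectly correlated variables and the second on the unrelated one, which mirrors how indices sharing a property are expected to group on the same principal component.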

Selecting technology diversification index
Once principal components are derived, diversification indices are classified into either PC1 or PC2. In previous studies, the importance value (IV) of each index is measured to determine which is best able to differentiate principal components (Wilsey, Chalcraft, Bowles, & Willig, 2005). Here, IV synthesizes information based on the importance of each principal component and generates a value representing the overall ability of each diversification index to distinguish principal components (Morris et al., 2014). Previous studies, however, proposed using at least two measures; they failed in deriving an ideal index (Heino, Mykrä, & Kotanen, 2008;Stirling & Wilsey, 2001;Whittaker, 1972) as selecting a proper index requires the consideration of diverse situations and conditions. Thus, a comparison between principal components precedes index selection.
Following previous studies, the comparison among principal components is limited to PC1 and PC2, which explain the majority of the variation in the data. Between PC1 and PC2, the absolute values of the coefficient of each diversification index are compared. The coefficient of each variable in a principal component is also regarded as the weight of that variable. In other words, the coefficient of a diversification index shows its level of importance with respect to that principal component. In this sense, the coefficients of a diversification index can be used to determine the principal component that the index most affects. For instance, if the absolute value of the coefficient of Variety in PC1 is greater than that in PC2, it is assumed that PC1 is more influenced by Variety than PC2 is. In this manner, each diversification index can be classified. One of the most important features of an index is whether it can show differences well. Among the diversification indices assigned to each principal component, the one with the largest standard deviation (SD) is selected as the representative index. Rather than using all usable indices with similar meanings, selecting a representative index is recommended, as in any other analysis involving a large data set and heavy calculation. A larger SD, that is, a wider spread of values, implies that an index is better able to show the variation in the data. This selection criterion is promising, as the variable with the largest SD can be regarded as the one that explains the changes well compared to the others.
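The two-step rule described above (assign each index to the component with the larger absolute coefficient, then pick the index with the largest SD within each group) can be sketched as follows. The function names, the index labels, and all numbers are made up for illustration only; they are not the coefficients or SDs reported in this study.

```python
def assign_components(coefficients):
    """Assign each index to the component on which its absolute coefficient is larger.

    coefficients: {index_name: (pc1_coefficient, pc2_coefficient)}
    """
    return {name: "PC1" if abs(pc1) >= abs(pc2) else "PC2"
            for name, (pc1, pc2) in coefficients.items()}


def representative_indices(coefficients, sd):
    """Within each component group, keep the index with the largest (normalized) SD."""
    best = {}
    for name, component in assign_components(coefficients).items():
        if component not in best or sd[name] > sd[best[component]]:
            best[component] = name
    return best


# Hypothetical inputs: four unnamed indices with invented coefficients and SDs.
coefficients = {
    "index_a": (0.60, -0.10),
    "index_b": (-0.55, 0.20),
    "index_c": (0.15, 0.70),
    "index_d": (0.30, -0.65),
}
sd = {"index_a": 0.20, "index_b": 0.35, "index_c": 0.40, "index_d": 0.25}

print(assign_components(coefficients))
print(representative_indices(coefficients, sd))  # {'PC1': 'index_b', 'PC2': 'index_c'}
```

Comparing absolute values matters because the sign of a coefficient only reflects the direction of an index (for example, a concentration measure versus a diversification measure), not how strongly it is tied to the component.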

Empirical studies
For each case, the results are summarized in Tables 5-7. First, the absolute values of the coefficients of the indices were compared to assign each index to PC1 or PC2. For each index, the principal component on which it relies more is reported. Second, the normalized SD of each index is given in parentheses. Within each principal component group, the index with the largest SD is shown in bold type. The descriptive statistics are included in the Appendix.

Case 1: cross-section
In case 1, applicant-level technology diversification in 2015 is measured and compared across the countries in each group. YU is excluded as there were no patent applications in 2015 due to its separation in 2006. The result is noteworthy because the PCA results for the countries within each of the three groups are very similar. In this cross-sectional data set, the Gini-Simpson and the Rao-Stirling are shown to be the representative indices for PC1 and PC2, respectively. In the high-level patent application group, the 1/HHI is more affected by PC2 in CN. This may be due to the relatively higher average number of IPC sub-classes per applicant and active IPC sub-classes in CN, as the 1/HHI is more correlated with Variety and the Rao-Stirling. Although the separation between PC1 and PC2 is apparent in the high-level patent application group, the mid- and low-level patent application groups show a few differences in their results, as Variety is allocated to PC1 in BE, IT, the NL, the RU, the EA, the PH, and SI. However, these small differences do not influence the overall result, as the representative indices are the same for all groups.

Case 2: single time periods
In case 2, applicant-level technology diversification between 1996 and 2015 is measured. Here, the PCA result is consistent with case 1; the same indices are allocated to the same principal components. In the mid-level patent application group, the result is almost the same except that the 1/HHI of IT is classified into PC2; in the low-level group, the only exception is that Variety of the PH is classified into PC1. However, these can be regarded as minor differences, as the representative indices for each principal component remain the same. In summary, the PCA results of case 2 are similar to those of case 1; the Gini-Simpson and the Rao-Stirling are the most important indices for PC1 and PC2 (Table 6). Once again, applicant-level technology diversification in a single time period can be analyzed from two different perspectives.

Case 3: multiple time periods
In case 3, technology diversification is measured across four different time periods. Among the 18 countries, the US, the NL, and the PH are selected. Similar to the previous cases, a clear distinction between PC1 and PC2 is observed. In most cases, the HHI, the Gini-Simpson, the 1/HHI, and Entropy are spread along PC1 while only Variety and the Rao-Stirling mainly change with PC2. This result is consistent with the previous cases: for analyzing diversification within a single country over multiple time periods, two different aspects of diversification indices can be used.

Discussion
The results yield several findings that improve the understanding of technology diversification.
Firstly, a clear distinction among diversification indices is observed in all cases. As a result of the PCA, the diversification indices are separated into two groups based on principal components: the HHI, the Gini-Simpson, the 1/HHI, and Entropy for PC1 and Variety and the Rao-Stirling for PC2. This is broadly consistent with Stirling's (2007) framework, as PC1 represents balance and PC2 represents both variety and disparity. Since theories or existing evidence supporting index outcomes are important (Chao & Wu, 2017), proposing two diversification perspectives is reliable and supported by Stirling's (2007) framework. In this context, diversification can be explained using two perspectives of diversification: balance-centered and hetero-centered diversification (Figure 2). Balance-centered diversification refers to the balance of diversification where the proportions of elements are the targets of interest. On the other hand, hetero-centered diversification refers to the variety and disparity of diversification, which focuses on the degree of differentiation among elements. In addition, the variables with the largest SD are shown to be the Gini-Simpson and the Rao-Stirling. Since a larger SD implies that the variations in the indices are more observable, the Gini-Simpson and the Rao-Stirling are recommended for observing PC1 and PC2.
To supplement the selection of these two indices, a comparison among diversification indices for each diversification perspective is conducted. Figure 3 shows the time-series graph of balance-centered diversification in the US. Here, the size of each dot indicates the average number of IPC subclasses used for patent applications. Although all these diversification indices show similar variations, the Gini-Simpson appears to reflect diversification changes and reality most accurately. From 2000 to 2001, a great shift in technology diversification occurred, increasing the use of numerous IPC subclasses. Compared to other indices, the Gini-Simpson increased by a greater value, indicating that it detects diversification changes more clearly. Another interesting point is the period between 2008 and 2009 when the financial crisis occurred; the Gini-Simpson showed a more reasonable outcome as it shows the smallest changes in that time period.
In the case of hetero-centered diversification, Figure 4 shows the time-series graph of hetero-centered diversification where the size of each dot indicates the total number of unique IPC subclasses. Compared to Variety, the Rao-Stirling describes hetero-centered diversification more clearly. For instance, the Rao-Stirling experienced a greater increase between 2000 and 2001 and no changes between 2008 and 2009. Since Variety can only determine the number of IPC subclasses used for a patent application, the level of heterogeneity for diversification is underestimated, a characteristic which is highlighted by the Rao-Stirling. In addition, although little or no increase in diversification was expected in 2008 and 2009, Variety increased while the Rao-Stirling remained the same. This shows that the two representative diversification indices for each diversification perspective not only capture the variation well but also reflect reality in a more reasonable sense.
With these two perspectives of diversification indices, interesting findings can be obtained. If diversification is discussed with a single diversification index, only limited information about diversification is used. For instance, comparing the diversification of applications in each country only tells us the degree to which it increases or decreases. The two perspectives of diversification clarify whether the applicants in each country achieved diversification through balance or through heterogeneity. Figure 5 shows the result of the empirical analysis based on the two perspectives of diversification indices, with the normalized value of each diversification perspective averaged over each five-year time period. Here, the size of each point indicates the average usage of IPC subclasses per patent. For the high-level patent application group, a clear increase in both diversification perspectives is observed. In the case of DE and EP, slight decreases in 2006 and 2011 are observed as a consequence of the financial crisis. The countries accustomed to using numerous IPC subclasses in patent applications (all except CN) tend to move more horizontally, seeking balance-centered diversification. On the other hand, CN's technology diversification is more focused on hetero-centered diversification. This is a reasonable finding, as CN is known to have achieved dramatic increases in technological development in a very short period of time. These findings reveal differences in the trajectories of technology diversification according to the maturity of technological application.
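The averaging step behind a two-perspective plot of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the normalization method is not specified in the text, so min-max scaling is assumed here, and the function names and the input layout (a mapping from year to a Gini-Simpson and Rao-Stirling pair) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def min_max_normalize(values):
    """Rescale values to [0, 1]; min-max scaling is an assumption, not the study's stated method."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def five_year_profile(yearly, start_year, period=5):
    """Average normalized (balance, hetero) values over consecutive periods.

    yearly maps year -> (gini_simpson, rao_stirling) for one country.
    Returns {period_start_year: (mean_balance, mean_hetero)}.
    """
    years = sorted(yearly)
    balance = min_max_normalize([yearly[y][0] for y in years])
    hetero = min_max_normalize([yearly[y][1] for y in years])

    buckets = defaultdict(list)
    for year, b, h in zip(years, balance, hetero):
        bucket_start = start_year + ((year - start_year) // period) * period
        buckets[bucket_start].append((b, h))

    return {
        start: (mean(b for b, _ in points), mean(h for _, h in points))
        for start, points in sorted(buckets.items())
    }
```

Plotting the first value of each pair on the horizontal axis and the second on the vertical axis would then give one trajectory per country, with horizontal movement indicating balance-centered diversification and vertical movement indicating hetero-centered diversification.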

Implications
In this study, the selection of a TDI has been empirically analyzed by covering the cases and indices most frequently used in previous studies. As a result, two diversification perspectives are proposed. This result not only supports Stirling's (2007) framework empirically but also provides generalized insights on technology diversification. The implications of this study for both practitioners and researchers in the field of technology management are as follows.
For academic research, this study provides a reliable resource for selecting a TDI. Because the indices differ in calculation and characteristics, it is important to select a diversification index that matches the researcher's intended meaning of diversification and avoids the issue of subjective selection. The two technology diversification indices suggested by this study have obtained the content validity needed for use in the field of technology management, because they were derived through efforts to generalize the result by adopting different cases and diversification indices. Given the absence of consensus on selecting a proper TDI, this study can be used as a reference for further relevant studies. It can also be used as a reference for a more systematic mode of index selection, in particular by those interested in the quantitative aspects of science. More importantly, the two diversification perspectives proposed in this study can be used to drive a more extended discussion on the effect of technology diversification. Previously, technology diversification was discussed with only a single property of diversity. Instead, technology diversification can be discussed from either the balance-centered or the hetero-centered perspective. Whether used separately or together, technology diversification measured by the Gini-Simpson and the Rao-Stirling allows researchers to consider different diversity properties.
For practitioners, the two technology diversification perspectives and their representative indices are expected to be useful for technology development strategy. Referring to a diversification index without understanding the underlying diversity property may lead to faulty decisions. By recognizing the difference between balance-centered and hetero-centered diversification, decision makers can evaluate their own level of technology diversification more clearly and establish a more specific direction for their technology.

Limitations and directions for future research
While this study provides a guideline for selecting a TDI, it has a couple of limitations. First, technology diversification was measured using the IPC subclasses of patents. The IPC level not only defines the unit and meaning of technology but also affects the sample size and the PCA result. Although the use of IPC subclasses is supported by previous studies, the differences caused by the choice of IPC level should be considered. Second, technology and industry effects were ignored. The heterogeneity among technology sectors and industries may call for additional discussion on diversification measurement. In light of these limitations, further examination of TDI selection at different IPC levels is needed. Moreover, a comparison of cases across different technologies and industries could also be discussed. Beyond technology diversification, this approach can also be applied to other studies or measurements used in the field of management.

Descriptive statistics of case 2 (Low-level patent application group)