Knowledge Spillovers in ICT Industry of India: Evidence from the Firm’s Patent Citation Behavior

This study examines knowledge spillovers from foreign multinational corporations in the Indian Information and Communication Technology (ICT) industry using patent citation data from 2013 to 2017. Patents granted by the US patent office to applicants located in India are collected along with their citation counts, and the citation counts are analyzed with negative binomial regression. The results show that foreign ICT firms cite patents with broader technology scope more often than patents with narrower technology scope. The results also confirm a positive relationship between knowledge spillovers and geographical localization, which implies that the first inventors of the cited and citing patents tend to share the same geographical region. However, the coefficient on technological similarity is negative. Further, US firms cite both US and non-US patents, whereas non-US firms cite more non-US patents and fewer US patents. This reveals that the knowledge flow patterns of US and non-US firms are significantly different.
JEL Classification: O31; O32; O33; O34


INTRODUCTION
Knowledge creation and diffusion are two essential features of the knowledge-based economy 1. The key sources of innovative knowledge are learning-by-doing, accumulation of human capital, research and development (R&D) or patent activity, and spillovers generated by any institution, firm, or country. [1] Knowledge spillovers 2 occur when knowledge created by an institutional setting for a particular project generates additional opportunities for its application in other similar settings. [2] Knowledge spillover 3 is crucial for economic growth, [3] urban development, [4,5] and the growth of high technology industries. [6] The literature on economic growth states that knowledge spillovers between advanced and less advanced countries are key determinants of cross-country convergence. [7] It means that innovations in one sector or one country often build on the knowledge created by innovations in another industry and country.
1 An economy that creates, uses and disseminates technical knowledge for its growth and development.
2 The terms knowledge spillovers and knowledge flows are used interchangeably in the existing literature and so in this study.
3 Knowledge spillovers mean that "firms can acquire information created by others without paying for that information in a market transaction, and the creators (or current owners) of the information have no effective recourse, under prevailing laws, if other firms utilize information so acquired" (Grossman and Helpman, 1991: p.16). [8]
Knowledge spillover has been examined through patent citations 4 between a citing and a cited patent at the country level, [9,10] the industry or firm level, [11,12] and the university level. [13,14] Recently, Rassenfosse and Seliger [15] analyzed patent data as a source of knowledge spillovers by measuring R&D collaboration, technology sourcing, and technology transfer between developed and developing nations. They found that knowledge flows from East Asia occur more frequently and are concentrated in information and communication technologies (ICT) 5.
Further, they observed that the USA and Canada had traditionally larger patenting activity with Asia than Europe. It implies that North America is more likely to benefit from the reverse knowledge flows than Europe. This is due to the large shift toward R&D collaboration and technology sourcing with Asian countries in the ICT sector and computer technology. [15] The present study analyzes the patent citation data as a knowledge spillover source in the Indian context. We examined the patent citation behavior of foreign multinational corporations
4 Patent citations refer to prior patents that bear similarities to the technology for which protection is sought. A citation of Patent X by Patent Y means that X represents a piece of previously existing knowledge upon which Y builds.
5 The ICT industry is a combination of the Information Technology (IT) industry and the Telecom industry.
This implies that the first inventor of both cited and citing patents share the same geographical region. However, the result of technological similarities is negative. Thus, this article contributes to the literature on cross-border knowledge spillovers in two ways. Firstly, it highlights the facts about knowledge spillovers between developed and developing economies i.e., from the rest of the world to India. Secondly, it captures information on patent citations as a potential source to analyze knowledge spillovers rather than to document their existence in counts.
Based on the above background, the rest of the paper begins with an overview of the Indian ICT industry in Section 2. Section 3 discusses the existing literature on the use of patent citation data in the context of knowledge spillovers and presents the hypotheses of the present study. Section 4 describes the methodology. Section 5 describes the collection of patent data and the variables used in the analysis. Section 6 reports the empirical results. Finally, Section 7 concludes the study.

The role of Foreign MNCs in India's ICT industry growth
In the past 20 years, the global economy has been transforming from an investment-driven economy into an innovation-driven economy. India, as an emerging economy, is also transforming into a knowledge-based economy, a term often taken to mean specifically the ICT industry or high-tech industries. In India, ICT is one of the most vibrant sectors among all industries and has expanded rapidly over the past few years. In 2018, the Software and IT sector of India accounted for US$ 117 billion of the R&D spending, with recorded growth of 18-19 per cent over 2017. Around US$ 1.6 billion is spent annually on workforce training and growing R&D. [25] The Indian ICT industry is focused on software as a priority sector and is dominated by foreign firms. The government of India started liberalizing the rules for foreign investors in the mid-eighties. In 1985, Texas Instruments became the first US software company to come to India, establishing its office in Bangalore. In the 1990s, the Indian government introduced several economic reforms, permitting 100 percent foreign equity capital in the ICT sector. [26] Presently, almost all top ICT companies worldwide are investing significantly in India and playing an overwhelming role in its economic growth. Various MNCs have established R&D centers in India to take advantage of the highly skilled and low-cost R&D talent pool, a conducive Intellectual Property Rights (IPR) policy, [27] robust academic and research infrastructure, low cost of operations, and various liberalized schemes.
Moreover, the government launched various attractive schemes and promising policies for the science and technology sector, viz. Digital India, Invest India, Startup India, Make in India, etc. The aim is to increase private sector investment (MNCs) in the Indian ICT industry in terms of their technology scope and geographical localization of knowledge. A similar approach is also applied by Lukach and Plasmans [16] and Yu and Wu. [17] Rassenfosse and Seliger [15] observed that China, followed by India, has become a prime place in Asia for R&D collaboration and a destination for MNCs' innovative activities, specifically in the ICT sector. After the economic reforms of 1991 in India, MNCs' presence and their market share in the Indian ICT industry started rising. A study by Mani [18] stated that ICT is one of the most innovative and R&D-intensive industries in the country and has become the second-largest in terms of patent holding after the pharmaceuticals industry. Despite the increasing importance of developing countries like India in the global technological network, most studies have examined knowledge spillovers in the context of developed or highly industrialized economies. [19,20] Only a few studies have addressed the innovation behavior of the Indian ICT industry, and their findings are inconsistent. [21,22]
Existing literature shows that knowledge spillovers can be examined through patents in two ways. The first type of knowledge spillover can be captured through patents granted to foreign firms by the Indian patent office. In recent years, foreign firms' patents in India have seen an upward surge. [23] Consequently, market competition among the firms has intensified. The second type of knowledge spillover occurs when a patent is granted by a foreign patent office to a firm located in India (hereafter, these patents are called Indian patents). Such patents are also known as foreign-owned patents with Indian inventors.
This study is based on the latter approach, which examines the citing patterns of US and non-US firms in India using United States Patent and Trademark Office (USPTO) [24] backward citation data. We chose USPTO patent citation data because the USPTO is one of the major filing destinations for firms active in India and its data are freely available. The empirical estimation of knowledge spillovers is based on 50 foreign multinational corporations (MNCs) in the Indian ICT industry (36 US firms and 14 non-US firms), using the negative binomial count data model. Citation data include characteristics of the citing and cited patents that can be used to examine the citation behavior of firms. For example, to see whether a firm cites patents from a particular technology or a particular location, attributes such as technology class and inventor/assignee country can be considered.
We found that foreign ICT firms cite more patents with higher technology scope compared to patents with smaller technology scope. It implies that in India, foreign ICT firms are responsible for knowledge spillovers with diversified technology. The result also confirms the positive relationship between knowledge spillovers and geographical localization.
The generic term used in various reports and newspaper articles to describe these centers is Global Capability Centers (GCCs). These centers focus on engineering and product development in the fields of artificial intelligence, machine learning, and data analytics, which in turn helps in solving various business problems. By 2020, over 1,250 MNCs worldwide had set up GCCs in India, up significantly from 981 in 2010. The total number of GCCs is more than 1,750, comprising both 'Back-office IT services' and 'R&D and engineering services'. [28] The major share of GCCs dealing in ICT is located in Bengaluru, Hyderabad, and Delhi NCR. These GCCs generated engineering and R&D revenue of $15.7 billion in 2019. MNCs choose to operate as wholly-owned subsidiaries, rather than through third-party outsourcing, if innovation is involved. The Indian IT industry policy has been much more encouraging for domestic ICT firms. However, it offers substantial benefits to overseas companies too. Foreign software companies can invest in India with 100% shares if they have software export operations. India has the largest market share in the global services sourcing industry, at around 55% (Invest India, GoI 2020). [25]

Patent statistics
A study by Mani [29] argues that foreign MNCs in India own all the patents in both 4G and 5G mobile technology. Foreign MNCs contributed over 2400 patents (71.53 percent) out of 3355 patents that have been granted to India at USPTO in 2015. [30,31] The share of the ICT industry worldwide is the largest in USPTO patents granted, i.e., 37% of all USPTO patents in 2016. [32] India's performance at USPTO in the ICT sector is shown in Figure 1. India's performance in holding US patents for computer applications has increased progressively.
Additionally, outbound foreign patenting activity from India has increased rapidly. In 2016, a high proportion (45.5%) of total patent applications of Indian origin were filed abroad. The bulk of patent applications filed abroad from India over the last decade were destined for the USPTO. [32] Thus, this study attempts to explore the linkages between the above facts and investigate the knowledge spillovers from US and non-US firms to the Indian ICT industry, using USPTO patents and their backward citation data as a proxy variable.
Grosse [33] indicated that MNCs from developed countries always need a relatively low-cost location for R&D to increase technical activities or develop products for other developed countries. He noticed that the R&D expenditure by US MNCs on their Indian subsidiaries rose from US$ 1.37 thousand in 2010 to US$ 3.22 thousand in 2015. Mani [18] stated that the R&D activities MNCs undertake in India account for a significant contribution to their worldwide patent portfolios. Further, Mani [34] found that non-residents own more than 80 per cent of the patents that are in force in India at any particular point in time and of the patents granted during any particular year. A comparison of patents owned by Indian and foreign assignees in the ICT sector at the USPTO is shown in Table 1.

LITERATURE REVIEW
Knowledge spillover is an intrinsic part of innovation, as learning from innovation and the feedback effect enhance further innovation in the economy. [35] Criscuolo and Verspagen [36] applied various econometric models to understand the probability of citations by examiner and inventor and observed that patent citations differ across borders. Using patent citation data, they found that geographical distance negatively impacts knowledge spillovers in Europe and the US. Further, they argue that cognitive distance, time, and strategic factors significantly affect citing behavior. They identified that inventor citations are more closely related to the patented technology, and therefore only inventor citations should be considered for measuring knowledge spillovers. Using European patent citations, Duguet and Macgarvie [37] have shown that the strength and statistical significance of the relationship between patent citations and knowledge flow vary across geographical regions.
Hall et al. [38] used statistics from an analysis of inventors to show that patent citations are correlated with knowledge spillovers and work as a proxy for them. Jia et al. [39] used a patent citation network to measure international knowledge flow. Using patent citation data, Tijssen [40] provided empirical evidence that nation-specific and sector-specific factors relate domestic and cross-border science and technology linkages to knowledge flows. Bacchiocchi and Montobbio [41] examined cross-border technology diffusion through knowledge flows using patent citation data. Similarly, Maurseth and Verspagen [42] traced knowledge flow through patent citations between various European regions. They showed that technology flows are industry-specific and confined by geography, language, and international borders. Caballero and Jaffe [43] and Jaffe et al. [13] measured knowledge flow through citation data by creating a citation function that describes the use of a previously generated idea to produce a new idea. Jaffe and Trajtenberg [44] examined a set of "potentially cited" patents whose primary inventor resided in the US, Great Britain, France, Germany, or Japan. Hu [45] examined the knowledge spillovers from foreign firms to local inventors in Singapore using US patent citation data.
MNCs' role in knowledge spillover has been discussed extensively in the literature. [46,47] MacGarvie's [47] study reveals that a 10 percent increase in the foreign direct investment (FDI) flow between countries leads to a 3 percent increase in cross-country patent citations. Keller and Yeaple [48] estimated technology spillover through imports and FDI for US manufacturing firms. The outcome of the study suggests that FDI positively influences the productivity of domestic firms. However, some authors are of the view that only a few MNCs carry knowledge spillover. Iwasa and Odagiri [46] separate research-oriented firms from sales-oriented firms. Their study finds that a high level of technological progress contributes to innovation among research-oriented firms; however, the same does not hold for sales-oriented firms.
Patent citation information is used as a proxy for knowledge linkages to capture the nature and determinants of knowledge flow across firms, inventors, or technological groups. An important question about MNCs is whether these companies are crucial in terms of technology inflows. In any case, if MNCs are capable of providing incentives to developing countries by any means of knowledge transfer, then they should be encouraged to do so.
A study on patent citations by Trajtenberg et al. [49] found that patents originating from universities cite fewer patents, and that the patents they cite are themselves less cited. However, Von Wartburg et al. [50] found that patents with higher technological value cite more references. In a more in-depth analysis of backward citations, Liu et al. [51] corroborate the positive correlation between backward citations and patent value proposed by Von Wartburg et al. [50] They found that the higher the number of backward citations, the more strongly the patent will be defended in court. Overall, the results are unclear on whether the number of backward citations reveals a patent's importance. Thus, in this study, we include the backward citations (antecedents) of the patented invention as an indicator of knowledge spillovers. Based on the above discussion, we raise the following hypothesis:

Empirical strategy
Patent citation is commonly used as a proxy for knowledge spillovers. [52] This paper's objective is to analyze the patterns and determinants of knowledge flow for US and non-US ICT firms in India. In this context, knowledge flows are measured by the patent citations of the Indian patents granted by the USPTO. The Indian patents are owned by various MNCs from the US and other Western and Asian countries. However, this study focuses only on MNCs active in India in the ICT sector to capture knowledge spillover through the citing behavior of the firms. Since we have taken only ICT patents, this study rules out cross-industry heterogeneity that might influence the citation numbers. We have also used each firm's headquarters to control for the firm's location, which might influence the firm's citation count. To examine the research question, this study applies a count data model.
The count data model is mostly used in health economics, industrial organization (number of entrants in the market), and the technology management field (number of patents). The foundation of the count data model is the Poisson regression model. [53] However, the Poisson regression model is based on the equality of mean and variance, which is also called the equidispersion property of the Poisson distribution. [54] The equidispersion property is often violated in practice; therefore, researchers apply a negative binomial (NB) regression model, which is a more suitable choice for such count data. The NB model does not impose the restriction that the Poisson model does. As demonstrated in various studies, the negative binomial model is a more general form of count data model than the Poisson model. [55] This study applies a negative binomial regression model, which relaxes the equidispersion property of the Poisson model, to analyze the knowledge flow in terms of the patent citing patterns of the firms, i.e., citing similar-technology patents, localization, and the inventor's regional variability. In the model, λ_i is the number of citations by patent i. Dummy variables TM_ik and GL_ih stand for the technological match and geographical localization. To capture firm heterogeneity, we include four dummies, USUS, USNUS, NUSUS, and NUSNUS, which stand for US firms citing US-origin patents, US firms citing non-US patents, non-US firms citing US patents, and non-US firms citing non-US patents, respectively. Additionally, we control for the regional heterogeneity of cited patents. The regional heterogeneity is denoted by EU (Europe), AN (Asia), AM (America), OC (Oceania) 6 and AF (Africa).
To understand whether patents on broader technologies are cited more, we include the technology scope of cited patents (TS) as a variable in the model. Figure 2 shows that the frequency distribution of patent citations is highly right-skewed, a shape that both the Poisson and negative binomial distributions can accommodate. However, in this study, the data are over-dispersed (the variance is larger than the mean), and therefore the NB model is applied. The negative binomial regression accommodates both numeric and categorical variables.
6 Oceania is the collective name for the islands scattered throughout most of the Pacific Ocean. Oceania has a diverse mix of economies, including the highly developed and globally competitive financial markets of Australia and New Zealand. It includes 14 independent countries and a number of dependent territories.
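The specification described in this section can be sketched in LaTeX as follows. This is only an illustrative reconstruction under stated assumptions: it uses the standard log-link negative binomial (NB2) form, interprets λ_i as the expected citation count of patent i with y_i as the observed count, and introduces the coefficient symbols β, γ, δ and the dispersion parameter α purely for illustration. NUSNUS and AF are left out as the reference categories, following the empirical results section.

```latex
% Sketch only: the coefficient symbols and the NB2 variance form are
% assumptions; the variables are those named in the text.
\begin{align}
\ln \lambda_i &= \beta_0 + \beta_1\,TS_i + \beta_2\,TM_{ik} + \beta_3\,GL_{ih} \nonumber\\
&\quad + \gamma_1\,USUS_i + \gamma_2\,USNUS_i + \gamma_3\,NUSUS_i \nonumber\\
&\quad + \delta_1\,EU_i + \delta_2\,AN_i + \delta_3\,AM_i + \delta_4\,OC_i, \\
\operatorname{Var}(y_i \mid x_i) &= \lambda_i + \alpha\,\lambda_i^{2}, \qquad \alpha \ge 0 .
\end{align}
```

Under this form, α = 0 recovers the Poisson equidispersion property (variance equal to the mean), while α > 0 allows the variance to exceed the mean.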

Data and variable description
As per WIPO, [27] the patent applications received by offices from resident and non-resident applicants are referred to as office data, whereas applications filed by applicants at a national/regional office (resident applications) or at foreign offices (applications abroad) are referred to as origin data. Furthermore, the WIPO report presents patent statistics based on origin, i.e., the residence of the first-named applicant. In addition, the USPTO [30] report on 'patent counts by country, state, and year' states that its patent counts are reported by the origin of a patent, where patent origin is determined by the residence of the first-named inventor. Following both documents, we use patent data reported by origin. The technology classification of patents is based on the International Patent Classification (IPC) developed by WIPO, where five major technology groups are divided into 36 sub-technology groups. The technology field of patents is assigned on a subclass basis, i.e., 4-digit IPC class 7.
The patent data consist of the patents granted by the USPTO to the Indian affiliates of foreign firms in the ICT industry from 2013 to 2017. The data were extracted using two filters: assignee name (company name) and applicant country ('IN', the country code of India). Patents that were invented in India but assigned to foreign firms are referred to as foreign-owned patents. Thus, our data set includes all patents that had the legal address of at least one inventor in India but were not assigned to any Indian institution. We extracted the patent data with backward citations for each patent to examine the direction of knowledge flow. We applied several stages to arrive at our final sample of 50 firms from different countries. It consists of 36 US firms and 14 non-US firms. The non-US firms belong to countries such as China, Taiwan, Ireland, the Netherlands, Finland, Japan, South Korea, and Germany (see appendix 1). In the first stage, we referred to the data given on the USPTO's website as Patenting by Ownership Location (State and Country), breakout by Organization and Domestic (US) Inventor Share. It lists the companies that were located in India and received 5 or more utility patents from 2011 to 2015. It includes both Indian and foreign firms from different industries. Further, we picked all the foreign firms from the ICT industry and scraped patent grant data with citations for each company. We prepared a list of 74 companies. It was then validated against the company registration information available in MCA21 (a database of the Ministry of Corporate Affairs, [56] Government of India). Later, the list was reduced to 50 because several companies had insufficient information and were removed during data cleaning. The description of independent variables used in this study is given in Table 2.
7 For details, please refer to https://www.wipo.int/export/sites/www/ipstats/en/statistics/patents/pdf/wipo_ipc_technology.pdf
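As a rough illustration of how the explanatory variables could be derived for one citing-cited patent pair, consider the Python sketch below. It is not the authors' procedure. The record layout and field names are hypothetical, and three definitions are assumptions made only for this example: technological match (TM) is taken to mean that the two patents share at least one 4-digit IPC subclass, geographical localization (GL) that their first inventors share a region code, and technology scope (TS) as the number of distinct 4-digit IPC subclasses on the cited patent.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class PatentRecord:
    """Hypothetical minimal record; field names are illustrative only."""
    us_origin: bool                  # citing: firm headquartered in the US; cited: US-origin patent
    region: str                      # first-inventor region code: "EU", "AN", "AM", "OC" or "AF"
    ipc_subclasses: FrozenSet[str]   # 4-digit IPC subclasses, e.g. {"G06F", "H04L"}


def pair_variables(citing: PatentRecord, cited: PatentRecord) -> Dict[str, int]:
    """Build one regression row for a citing-cited patent pair."""
    ownership = ("US" if citing.us_origin else "NUS") + ("US" if cited.us_origin else "NUS")
    # NUSNUS is the omitted reference category, so only three ownership dummies are kept.
    row = {name: int(ownership == name) for name in ("USUS", "USNUS", "NUSUS")}
    # AF is the omitted reference region for cited patents.
    for code in ("EU", "AN", "AM", "OC"):
        row[code] = int(cited.region == code)
    # Assumed definitions (not confirmed by the source):
    row["TM"] = int(bool(citing.ipc_subclasses & cited.ipc_subclasses))  # share at least one subclass
    row["GL"] = int(citing.region == cited.region)                       # same first-inventor region
    row["TS"] = len(cited.ipc_subclasses)                                # breadth of the cited patent
    return row


# Made-up example pair (not data from the study).
citing = PatentRecord(us_origin=True, region="AN", ipc_subclasses=frozenset({"G06F", "H04L"}))
cited = PatentRecord(us_origin=False, region="EU", ipc_subclasses=frozenset({"H04L", "H04W"}))
print(pair_variables(citing, cited))
```

Dropping one category from each dummy set keeps the design matrix free of perfect collinearity with the intercept, so each remaining coefficient is read relative to the omitted group.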

Empirical results
To examine the knowledge flow pattern in terms of the patent citing behavior of foreign ICT firms, this study applies a negative binomial regression analysis. The results for the NB model are presented in Table 3. Before applying the NB model, this study verifies whether the data are over-dispersed. The estimated dispersion parameter alpha is greater than zero, with a P-value significant at the 0.01 level, which confirms that the dependent variable is over-dispersed. This implies that the NB model fits better than the Poisson regression model. The result for the first hypothesis suggests that firms in the ICT sector are more likely to cite patents with higher technology scope. This implies that a patent with larger technology scope has a higher probability of being cited by the foreign ICT firms in India, which leads to knowledge spillovers in diversified technologies in India. The result also confirms the positive relationship between knowledge flows and geographical localization. This implies that the first inventors of the cited and citing patents share the same geographical region. However, the result for technological similarity is negative. This shows that MNCs in India do not tend to cite patents from the same technology field; that is, the cited knowledge comes not from similar technology but from diverse technology. Another interesting finding of this paper is that US firms in India cite both US and non-US patents. However, non-US firms in India cite fewer US patents (Figure 3).
To avoid the dummy variable trap, we remove the reference category (one in each set). These are 'non-US firms citing non-US patents' (NUSNUS) in the ownership category and 'cited patents belonging to Africa' (AF) in the regional category. Consequently, the estimated coefficient of each dummy variable is interpreted relative to the reference group, which is omitted from the model.
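The over-dispersion check mentioned above can be illustrated with a small Python sketch. It is a simplified, moment-based diagnostic under stated assumptions and not the test reported in Table 3: it compares the sample variance of the citation counts with their mean and backs out a rough dispersion value from the NB2 variance function Var(y) = mu + alpha * mu^2. The function name and the example counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Return the sample mean, sample variance, variance-to-mean ratio,
    and a method-of-moments estimate of the NB2 dispersion parameter."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean citation count is zero; dispersion is undefined")
    v = variance(counts)  # sample variance (n - 1 denominator)
    # NB2 variance function: Var(y) = m + alpha * m**2, so alpha = (v - m) / m**2.
    # The estimate is floored at zero, the Poisson (equidispersion) case.
    alpha = max(0.0, (v - m) / m ** 2)
    return {"mean": m, "variance": v, "ratio": v / m, "alpha": alpha}


# Made-up backward-citation counts for ten patents (not data from the study).
example_counts = [0, 1, 1, 2, 3, 4, 6, 9, 15, 29]
summary = dispersion_summary(example_counts)
print(summary)
```

A variance-to-mean ratio well above one, or equivalently a moment-based alpha above zero, points toward the negative binomial model. A formal decision would rest on a test of alpha = 0 from the fitted model, as the study reports.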
In the case of the regional distribution of citations, we find that most citations come from the US and Oceania, whereas citations from Europe and Asia are found to be insignificant (the reference category is Africa). Thus, MNCs working in India acquire knowledge mostly from their own headquarters countries.
The results can be summed up by stating that India's patenting activity in the ICT sector comes mostly from foreign subsidiaries. Since patenting activity by Indian firms is limited, it hardly influences knowledge creation in India. Thus, in order to improve the quality of patents in India, policymakers need to think about incentivizing R&D in the ICT sector. Improvement in the innovation and patenting activities of domestic ICT companies may give them a strong foothold in the technology market. Future studies can be designed to examine the impact of foreign subsidiary patenting on the learning curve of domestic firms.
Figure 3 presents the knowledge flow patterns of US and non-US firms in the ICT industry, where non-US firms cite more non-US patents. However, US firms in India cite both US and non-US patents. It shows that knowledge flows into US firms in India come from both US and non-US firms, but in the case of non-US firms in India, knowledge flows from US firms are scarce. There is a possibility that, by citing more US patents, US firms gain an advantage and an increased probability of their patents being granted at the USPTO (as we used patent grant data). Since cited patents have already been through the USPTO's examination process, it may also take less time for patents citing them to complete the examination with a favorable outcome.

CONCLUSION
This study explores the patent citing behavior of foreign ICT firms in India. Backward citation is used as a proxy for knowledge spillover. The objective is to understand the citation pattern of the top 50 US and non-US based ICT firms in India. We observe that the top 50 India-based ICT firms patenting at the USPTO belong to the US, Europe, or Asia. Surprisingly, no Indian company falls in the list of the top 50 patenting companies. Thus, MNCs in the ICT industry are divided into two categories: US and non-US. The results obtained using the negative binomial model suggest that MNCs in India cite patents with higher technology scope. Hence, patents with larger technology scope (technology breadth) are more likely to be cited.
Similarly, the geographical localization variable is positive and significant. This implies that knowledge flow is quite localized among MNCs in the Indian ICT industry. However, the technology matching variable is negative and significant. It means that knowledge flow is not taking place from the same