Scientometric Analysis of the Application of Artificial Intelligence in Agriculture

Artificial Intelligence (AI) is considered a key element in addressing the challenges that the agricultural sector currently faces in food production and climate change, since AI is successfully helping to optimize human processes and tasks in several sectors. In this study, we present a scientometric analysis to answer the question: what is the academic overview of the application of artificial intelligence in agriculture? We use references indexed in Scopus, a scientometric methodology, and software tools to perform the research. Through document analysis, we identify that the countries with the highest number of publications are China, the United States, India, and Australia. The United States is the country with the most collaboration among authors and institutions. The institution with the highest number of published papers is China Agricultural University, and Gerrit Hoogenboom, from the University of Florida, is the leading author by number of publications. Finally, we identify that precision agriculture, smart farming, and smart sustainable agriculture refer to the application of artificial intelligence and information technologies in agriculture. We also identify that the Internet of Things (IoT) is an emergent topic and that decision support systems and machine learning are the transversal topics.


INTRODUCTION
Agriculture is a primary economic activity dedicated to the tillage or cultivation of the land. Its primary purpose is to obtain food for people and provide raw materials to industries. Agriculture is considered a crucial element for the economic growth of countries. World Bank statistics for 2018 indicate that the agricultural sector accounted for four percent of global Gross Domestic Product (GDP). In some developing countries, it may represent more than 25% of GDP. [1] However, the agricultural sector is facing a significant challenge. In addition to increasing food production, it must adopt more effective and sustainable production methods and adapt to climate change. [2] New technologies and Artificial Intelligence (AI) have been considered critical elements to face these challenges. AI is a branch of computer science related to the construction of smart entities that can perceive their environment and react appropriately to the information that they perceive. [3] AI has been used to create systems that optimize human processes or tasks, for instance, in healthcare through expert systems that help detect and monitor diseases. [4] In education, the development of intelligent systems enables innovative teaching and learning practices focused on the student's profile. [5] In business and manufacturing, systems can transform data into useful and valuable information that supports decision-making, [6,7] to name a few examples.
Regarding the application of AI in agriculture, there are several topic-specific literature reviews. [8,9] Those studies present and describe applications designed with emerging technologies (e.g., GPS, UAVs, cameras, and sensors), AI methods, and algorithms to gather the information necessary to understand variations in the soil and the crops. These methods allow more efficient decisions regarding the distribution of the seeds to be sown in a field, and even make it possible to predict the harvest's yield and make appropriate use of natural resources. However, to our knowledge, there is no scientometric analysis of AI applications in agriculture that provides researchers with a quantitative and qualitative analysis of the scientific production in this area. Such an analysis will enable researchers to identify niches of opportunity for research and to know the main topics and the emerging issues.
In this paper, we present a scientometric analysis to answer the following research question: What is the academic overview of applying artificial intelligence in agriculture? To answer this question, we address the following specific questions: (1) How are the publications distributed by year? (2) What types of papers are found in the review, and in which journals are they published? (3) From which countries are the authors that publish the most about the application of AI in agriculture? (4) Which institutions have the most extensive participation in the publication of papers on AI application in agriculture? (5) Which authors publish the most about the application of AI in agriculture? (6) What are the themes related to the application of AI in agriculture?

METHODOLOGY
To carry out the scientometric analysis, we use the methodology proposed by Michán and Muñoz-Velasco, [10] which consists of five stages, shown in Figure 1.

Recovery
In this stage, we choose the digital database(s) and perform the search, which consists of establishing a generic query using terms, logical operators (e.g., AND, OR), and criteria considered in the databases (e.g., language, type of article), followed by the selection of the literature that will constitute the dataset of studies.

Migration
This stage includes the extraction of metadata from the selected studies, the transfer of the information, and the loading of this information into a new database or software.

Analysis
This stage consists of answering the questions of interest through the quantitative processing of the literature. This is done with the ScientoPy and Bibliometrix software and with queries to the database. The quantitative strategies used are:

Interpretation
By contextualizing and interpreting the results, research trends can be established. Theoretical, methodological, or social influences and comparisons can be represented concerning a research group, institution, region, country, topic, discipline, or field of knowledge or study model.

RESULTS
This section presents our analysis results by stage of the methodology, as presented in Figure 1.

Information Sources, Search and Selection of literature
The Scopus database was selected to perform the scientometric analysis. Scopus is considered a high-quality, curated data source with multidisciplinary coverage for bibliometric and academic research. [11] The search was guided by the central question: What is the academic overview of AI application in agriculture? Based on this, we used the Medical Subject Headings (MeSH) database of the National Center for Biotechnology Information (NCBI) web portal to identify the terms related to the words "Artificial Intelligence" and "Agriculture." The MeSH database returned the following related terms: AI, artificial intelligence, computational intelligence, computer reasoning, computer vision systems, knowledge acquisition, knowledge representation, machine learning, agriculture, farming, and agricultural development.
The terms were used to form the following string: ((AI OR "artificial intelligence" OR "computational intelligence" OR "computer reasoning" OR "computer vision systems" OR "knowledge acquisition" OR "knowledge representation" OR "machine learning") AND (agriculture OR farming OR "agricultural development")). This string was applied to Scopus's topic search, including the paper title, abstract, and keywords. The search results yielded data from 1939 to 2020 and the research query was conducted on December 3, 2020. The retrieved data consists of various tags related to citation information, bibliographic information, abstract and keywords, funding details, conference information, and references.
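As an illustration of how a search string like the one above can be assembled from the two term groups, the following minimal Python sketch joins the terms of each group with OR and the groups with AND. The helper names (`quote`, `or_group`, `build_query`) are hypothetical and are not part of Scopus or of the tools used in this study.

```python
# Term groups as listed in the text. Helper names are illustrative only.
AI_TERMS = [
    "AI",
    "artificial intelligence",
    "computational intelligence",
    "computer reasoning",
    "computer vision systems",
    "knowledge acquisition",
    "knowledge representation",
    "machine learning",
]
AGRICULTURE_TERMS = ["agriculture", "farming", "agricultural development"]


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def or_group(terms):
    """Join the terms of one concept with OR, inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(*groups):
    """Join the concept groups with AND, inside outer parentheses."""
    return "(" + " AND ".join(or_group(g) for g in groups) + ")"


query = build_query(AI_TERMS, AGRICULTURE_TERMS)
print(query)
```

Keeping the terms in lists makes it easy to add or remove a synonym and regenerate the string without editing nested parentheses by hand.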
For the general question on the academic overview of AI application in agriculture, the search yielded 5,143 documents between 1939 and 2021. Of these, the documents dated 2021 (about 95 documents) were omitted because that year had not been completed. Therefore, 5,048 references were selected between 1939 and 2020.

Extraction, Cleaning and Loading
The metadata of the search made in Scopus was stored in the RIS format, a standardized tag format developed by Research Information Systems (RIS) that enables citation programs to exchange data.
Data curation consisted of identifying and eliminating duplicate and irrelevant records. Thirty-seven duplicate references were identified and removed, 15 references with the legend RETRACTED were eliminated, and 1,164 documents were considered unrelated to the topic of interest after reading the title and abstract. To perform this task, we used the tool Rayyan QCRI. This application helps researchers screen titles and abstracts and identify duplicates in systematic literature reviews. [12] At the end of the stage, the analysis dataset consisted of metadata from 3,832 documents.
The clean Scopus corpus was converted to a CSV file and loaded into ScientoPy (version 2.0.1), an open-source Python-based scientometric analysis tool, [13] and into Bibliometrix (version 3.0.2), an R tool that allows for quantitative research in scientometrics and bibliometrics. [14] Both tools were used because they offer different visualizations; for example, while ScientoPy offers a table listing the countries' production, Bibliometrix offers a graphical way of displaying this information (e.g., maps).
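The screening steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not reproduce how Rayyan QCRI detects duplicates, the record layout (a dictionary with a `title` key) is assumed, and the toy records are invented for the example. The last lines check that the counts reported in the text are consistent with one another.

```python
def normalize_title(title):
    """Lowercase and replace punctuation with spaces so near-identical titles match."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def screen(records):
    """Drop records flagged RETRACTED and keep the first copy of each title."""
    seen = set()
    kept = []
    for rec in records:
        title = rec["title"]
        if "RETRACTED" in title.upper():
            continue  # retracted records are eliminated
        key = normalize_title(title)
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        kept.append(rec)
    return kept


# Toy records, invented for illustration only.
records = [
    {"title": "Deep learning for crop yield prediction"},
    {"title": "Deep Learning for Crop Yield Prediction."},
    {"title": "RETRACTED: Smart irrigation with IoT"},
    {"title": "Expert systems for pest detection"},
]
kept = screen(records)
print(len(kept))

# Counts reported in the text: references selected, duplicates,
# retracted references, and documents judged off-topic.
retrieved = 5048
remaining = retrieved - 37 - 15 - 1164
print(remaining)
```

The relevance screening by title and abstract (the 1,164 excluded documents) is a manual judgment and is represented here only by its count.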

Analysis, Visualization and Interpretation
This stage consisted of executing the queries related to the questions of interest raised in the introduction in order to answer the main question on the overview of AI applications in agriculture. Both analysis and visualization were carried out with the ScientoPy and Bibliometrix software tools.
Each of the following sections presents the analysis, its visualization (table, map, or graph), and the interpretation of each result.

Publication and document growth analysis
Figure 2 shows the distribution by year of the papers related to the application of AI in agriculture. The first paper identified was published in 1982. In this paper, Tinney and Estes [15] present the development of a knowledge-based expert system for rice crop identification. Since 2002, the number of publications dealing with AI applications in agriculture has shown an increasing trend, so it continues to be an area of interest for academia and industry. In particular, many documents were published during the last two years (2019 and 2020), with 889 and 918 documents, respectively. Overall, 63.43% (2,431) of the publications on AI in agriculture were published between 2015 and 2020.
Table 1 shows the number of papers according to their type. Journal papers were the most frequent type, with 2,256 documents, representing 66.71% of the total number of documents. They are followed by conference papers, with 1,044 documents (27.24%), then 89 review papers (2.32%) and 54 book chapters (1.41%). Finally, 89 documents (2.32%) were classified as others: data papers, letter papers, editorial notes, and conference reviews. The dominant language in the documents was English, with 96.68%.
Table 2 describes the top ten sources where the papers are published. Most of them are conference proceedings, and only four journals were identified, of which Computers and Electronics in Agriculture is the journal with the largest number of publications on AI in agriculture.

Country, Institution and Author Analysis
The analysis identified publications from 168 of the 194 countries in the world. Figure 3 shows the ten most productive countries in terms of the number of documents published on AI application in agriculture. A total of 8,240 affiliations were identified, among which universities were the most active. Table 3 shows the top ten institutions with the highest number of documents identified. China Agricultural University leads, followed by Wageningen University and Research, the Netherlands.
A total of 12,223 authors were identified. Table 4 shows the ten authors with the most publications in the area. Gerrit Hoogenboom, of the Institute for Sustainable Food Systems, University of Florida, is the leading author in publications on AI applications in agriculture. His publications cover crop modeling, decision support systems, agrometeorology, climate change and variability, and food security. [17][18][19][20][21] With respect to collaboration between authors and institutions, Figure 4 shows that the country with the most collaboration is the USA, which has collaborations with 41 countries. Other countries with a significant number of collaborations are China, India, Australia, Spain, Canada, and France.
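One simple way to obtain a figure such as "collaborations with 41 countries" is to count, for each country, the distinct other countries that appear with it in the affiliations of the same document. The sketch below assumes that definition; the text does not state how the software computes collaboration, and the sample documents are invented for illustration.

```python
from collections import defaultdict


def partner_counts(documents):
    """Count distinct partner countries per country.

    `documents` is a list in which each item holds the countries found in
    the author affiliations of one document.
    """
    partners = defaultdict(set)
    for countries in documents:
        unique = set(countries)
        for country in unique:
            # Every other country on the same document is a partner.
            partners[country] |= unique - {country}
    return {country: len(found) for country, found in partners.items()}


# Invented example: four documents and their affiliation countries.
docs = [
    ["USA", "China"],
    ["USA", "India", "Australia"],
    ["China"],
    ["Spain", "USA"],
]
counts = partner_counts(docs)
most_collaborative = max(counts, key=counts.get)
print(most_collaborative, counts[most_collaborative])
```

Using sets means that repeated co-authorship between the same two countries is counted once, so the result measures the breadth of collaboration rather than its volume.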

Trend Analysis
To identify the topics and trends of the application of AI in agriculture, we performed an analysis of the author keywords with the Bibliometrix software, from which we obtained a word cloud and a thematic map (see Figures 5 and 6).
In Figure 5, we present a cloud of the author keywords that appear most often in the papers. The most important keywords were precision agriculture, Internet of Things (IoT), decision support systems, deep learning, remote sensing, and image processing. These words are related to approaches where artificial intelligence is applied in agriculture. For example, precision agriculture is defined by the International Society of Precision Agriculture as: "a management strategy that gathers, processes and analyzes temporal, spatial and individual data and combines it with other information to support management decisions according to estimated variability for improved resource use efficiency, productivity, quality, profitability and sustainability of agricultural production"; [22] while smart farming is defined as: "a development that emphasizes the use of information and communication technology in the cyber-physical farm management cycle". [23]
In Figure 6, we present a thematic map based on co-word network analysis and clustering. This map is based on the method proposed by Cobo et al. [24] for detecting, quantifying, and visualizing the evolution of a research field. A thematic map allows four typologies of themes to be defined according to the quadrant in which they are placed. Themes in the first quadrant (I) are known as the motor themes. They are characterized by both high centrality and high density, which means that they are well developed and essential for the research field. The motor themes identified were forecasting and sustainable development. We identify several research papers that present the use of algorithms and information and communication technologies (e.g., information from satellites) to forecast crop production or health, weather conditions, and soil moisture. [25][26][27][28] Likewise, some applications that use big data analysis to develop smart sustainable agriculture [29][30][31][32] consider the economic, social, and environmental dimensions. [33]
In the second quadrant (II), we find the highly developed and isolated themes, or niche themes. They have well-developed internal links (high density) but external links of only limited importance for the field (low centrality). We identify two themes: climatic change and artificial neural networks. Regarding climatic change, we identify papers that present intelligent systems that help to monitor livestock and crops, and satellite image studies to control pests and to plan sowing and harvesting. [34][35][36][37] Concerning Artificial Neural Networks (ANN), we identify research works that propose neural network architectures to predict plant growth, identify the state of maturity of fruits, and predict harvests and climatic change, [38][39][40][41][42] to mention some examples.
The emerging or declining themes are in the third quadrant (III). They have both low centrality and low density, meaning that they are weakly developed and marginal. We consider that the Internet of Things (IoT), which refers to the digital interconnection of everyday objects with the Internet, is an emergent topic. IoT is considered an essential element for enabling precision agriculture and smart farming. In contrast, support vector machines (SVM) are a topic that is losing interest due to the appearance of new algorithms.
Finally, in the fourth quadrant (IV), we find the basic and transversal themes. They are characterized by high centrality and low density. These themes are essential for a research field and concern general topics that are transversal to the field's different research areas. The transversal themes identified were decision support systems, the Internet of Things (IoT), and machine learning.
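The four quadrants described above amount to a two-way split on centrality and density. The following Python sketch shows that logic, assuming the cut point on each axis is the median of the observed values; the software may place the axes differently, and the theme names and scores are placeholders rather than results of this study.

```python
from statistics import median

# Quadrant labels keyed by (high centrality?, high density?), as in the text.
QUADRANT_LABELS = {
    (True, True): "I: motor theme",
    (False, True): "II: niche theme",
    (False, False): "III: emerging or declining theme",
    (True, False): "IV: basic and transversal theme",
}


def classify_themes(themes):
    """Assign each theme to a quadrant.

    `themes` maps a theme name to a (centrality, density) pair. The cut
    point on each axis is the median of that axis (an assumption).
    """
    centrality_cut = median(c for c, _ in themes.values())
    density_cut = median(d for _, d in themes.values())
    return {
        name: QUADRANT_LABELS[(c > centrality_cut, d > density_cut)]
        for name, (c, d) in themes.items()
    }


# Placeholder scores, not taken from the study.
themes = {
    "theme A": (0.9, 0.8),
    "theme B": (0.2, 0.9),
    "theme C": (0.1, 0.1),
    "theme D": (0.8, 0.2),
}
quadrants = classify_themes(themes)
for name, label in quadrants.items():
    print(name, "->", label)
```

Because the cut points depend on the whole set of themes, a theme that lies close to an axis can change quadrant when the corpus or the clustering parameters change.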

DISCUSSION
In this study, we perform a scientometric analysis to answer the following question: What is the academic overview of AI application in agriculture? To perform the analysis, we use the Scopus database and the software tools ScientoPy and Bibliometrix.
Artificial Intelligence (AI) is an area that emerged more than 60 years ago. [43] However, our results show that one of the first works on the application of AI in agriculture was published in 1982: the development of an expert system for crop identification. [15] Furthermore, we identified that since 2002 there has been an increase in the number of studies reporting AI applications in agriculture. These applications make use of sensors or devices to facilitate processes or tasks in agriculture, e.g., systems that help the farmer to detect pests, [44] systems that facilitate the estimation of the yield of a crop, [45] systems that help the farmer to make decisions on the appropriate use of water and soil, [46,47] and systems that allow the study of plant genetics. [48] Further, we identify that the precision agriculture and precision farming approaches have increased interest in using machine learning and the Internet of Things (IoT) with robots and uncrewed aerial vehicles (UAVs) to perform repetitive tasks such as counting and harvesting fruit [49,50] and pollination, [51] to execute dangerous tasks such as the application of chemicals in agricultural fields, [52][53] and to improve animal health and welfare and guarantee the safety of animal-derived products. [54][55][56]
Another finding was that China, the USA, and India are the most productive countries in documents related to AI in agriculture. This is because these countries invest many resources in food production, seeking to be self-sustainable in the primary sector. Our results also include other countries in which many documents are published, for example, the United Kingdom, Australia, Italy, France, and Canada. However, most of the authors from these countries appear in only one document. Also, the authors are usually from different institutions within the same country, which results in a large number of documents that do not accumulate under any single author or institution. Therefore, no institutions or authors with leadership in the subject of interest of this study are identified for these countries.
Regarding the topics, we identify that most applications use AI methods to analyze information from sensors or devices. Also, we identified that AI is an essential element of emerging areas that apply information technologies to agriculture, such as precision farming, smart farming, and smart sustainable agriculture.
Finally, a previous study reports similar research work, which aims to provide present, past, and future trends of applications of AI. [57] However, new terms or trends related to the application of AI in agriculture, such as "smart sustainable agriculture," "smart farming," "food production," and "climatic change," among others, were not addressed because its searches were based only on the terms "artificial intelligence" and "agriculture." In this study, a search for the words "artificial intelligence" and "agriculture" in the MeSH database identified eleven related terms, on which we based our analysis. This allowed a broader search spectrum for a more robust analysis of the results. Likewise, derived from this search, we worked with 3,832 papers from one database, while the previous study consulted two databases, resulting in 3,067 papers. In our study, we also present clusters of topics related to AI in agriculture and an analysis of the main, emergent, transversal, and isolated topics.

CONCLUSION
We present a scientometric analysis based on the 3,832 documents retrieved from Scopus. The study results enable researchers to identify that documents reporting the use of AI in agriculture have been published since 1982 and that the number of publications has been increasing since 2002. Most of the research documents have been published in journals and conference proceedings. The journal Computers and Electronics in Agriculture has been the most used to report these works. The countries with the highest number of publications are China, the United States, India, and Australia. The United States is the country with the most collaboration among authors and institutions. The institution with the highest number of published documents is China Agricultural University, and Gerrit Hoogenboom, from the University of Florida, is the leading author by number of publications.
Furthermore, we identified three approaches related to the application of artificial intelligence and information technologies in agriculture: precision agriculture, smart farming, and smart sustainable agriculture. Regarding those concepts, we consider that smart sustainable agriculture could serve as an umbrella concept for precision farming and smart farming, because it covers the systematization of agriculture using AI and information technologies while guaranteeing world food security, promoting healthy ecosystems, and supporting the sustainable management of land, water, and natural resources. Also, from the analysis, we can conclude that food production is still an opportunity to explore with AI.
Finally, a lesson learned from performing a scientometric study with software tools was that these tools help the researcher analyze document metadata faster and more easily. However, it is essential to mention that the researcher must also carry out a thorough treatment of the data to ensure that the tools produce accurate results. By treatment, we mean filtering the data (e.g., removing duplicate records or documents not related to the topic of interest), verifying that all records contain complete data, and checking the consistency of the nomenclature used (e.g., in the names of authors and institutions).
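Two of the data checks mentioned above, completeness of records and consistency of author names, can be sketched as follows. This is a minimal illustration under assumptions: the field names in `REQUIRED_FIELDS` are hypothetical, names are assumed to be written as "Surname, Given", and the sample records and name spellings are toy data for the example.

```python
import unicodedata
from collections import defaultdict

# Assumed field names; a real export may use different ones.
REQUIRED_FIELDS = ("title", "authors", "year", "affiliation")


def incomplete_records(records):
    """Return the positions of records with a missing or empty required field."""
    return [
        i for i, rec in enumerate(records)
        if any(not rec.get(field) for field in REQUIRED_FIELDS)
    ]


def name_key(name):
    """Reduce 'Surname, Given' to (surname, first initial), ignoring case and accents."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    surname, _, given = ascii_name.partition(",")
    return (surname.strip().lower(), given.strip()[:1].lower())


def name_variants(names):
    """Group spellings that share a key; only groups with several spellings are returned."""
    groups = defaultdict(set)
    for name in names:
        groups[name_key(name)].add(name)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


# Toy data for illustration.
records = [
    {"title": "Paper one", "authors": "Hoogenboom, G.", "year": 2020, "affiliation": "University of Florida"},
    {"title": "Paper two", "authors": "", "year": 2019, "affiliation": "China Agricultural University"},
]
names = ["Hoogenboom, G.", "Hoogenboom, Gerrit", "Michán, L.", "Michan, L."]

print(incomplete_records(records))
print(name_variants(names))
```

Grouping by surname and first initial can also merge different people who share both, so the groups are best treated as candidates for manual review rather than as automatic corrections.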