Data Analysis in Social Networks for Agribusiness: A Systematic Review

The ability of companies to react to changes imposed by the market can be aided by information acquisition and knowledge generation. Big data technologies, crowdsourcing, and Online Social Networks (OSN) are used for knowledge generation. These technologies have assumed a significant position in agribusiness in recent decades. This work investigates how social network analysis can promote agribusiness to provide a basis for future applications and evaluations. We adopted a hybrid systematic mapping to conduct the investigation. Two hundred twenty-three works that propose solutions for agribusiness were found and categorized. Results showed that the most used OSN is Twitter and revealed an increase in the number of studies in this area. The information obtained indicates how social media monitoring can complement traditional decision-making methods in managing and regulating agricultural systems. However, more studies in agribusiness using data analysis tools on social networks are required, considering the importance of social networks for marketing strategies. Based on our results, we discuss some challenges and research directions.


I. INTRODUCTION
Agribusiness is one of the strategic economic sectors worldwide. Its growth is critical and can bring benefits to the world population. One of the strategies to promote the sector is the use of data analysis techniques. According to [1], agribusiness still needs solutions for consumer market analysis to improve the sector (e.g. product adequacy, sales, and cost reduction).
Traditional market research is time-consuming, expensive, and at times incomplete and unrepresentative, considering the different characteristics of the consumer profile and the challenges of the agribusiness sector. Dealing with multiple stakeholders in market research projects often makes reliable, low-cost collection and communication of quality data complex and challenging. Thus, studies carried out at
The associate editor coordinating the review of this manuscript and approving it for publication was Barbara Guidi.
The ability of organizations to react to changes imposed by the market is directly related to the absorption of information and generation of knowledge applied to the organizations' processes. In this vein, modern industry increasingly needs intelligent tools to generate new knowledge [5]. This also applies to the agribusiness domain. Understanding consumer trends, perceptions, or preferences, economically and efficiently, is the focus of consumer science and manufacturing industries [6].
When treating large volumes of data, intelligent tools make use of big data analytics, allowing the discovery of correlations and knowledge derivation [7]. The emergence of Online Social Networks (OSN), such as Twitter, Facebook, and Instagram, has provided the research community with access to voluntary information that was previously unattainable and/or had a high access cost.
In this context, the use of crowdsourcing stands out. Crowdsourcing has become an important component in several research domains, such as climate change [8], natural disaster preparedness and monitoring [9], [10], conservation science [11] and urban sustainability [12]. The term refers to data collected and made available to researchers by non-professional people or organizations [13]. In opportunistic crowdsourcing, non-professionals generate data that are collected and shared by uploading information to web-based social networking sites. In such applications, users are producers and send information as web content to be used by researchers for purposes other than those intended by users.
Social Media Analysis (SMA) and social networking sites (e.g., Twitter, blogs, or forums) can be critical to achieving a better understanding of human interactions and to defining new business models for organizations and environmental management [14]. Gohfar Khan [15] defines SMA as ''the art and science of extracting valuable hidden insights from large amounts of semi-structured and unstructured social media data to enable informed and insightful decision-making''. SMA is concerned with developing and evaluating tools to collect, monitor, analyze, and visualize social media data to extract relevant information and patterns [16]. Thus, SMA is a growing area that encompasses a variety of modeling and analytical techniques (e.g., sentiment analysis, topic modeling, image analysis, and others) [15].
From a structural point of view, OSNs are complex structures composed of vertices, usually representing users, and edges, which represent some form of relationship between the vertices. The study of the different types of relationships between the vertices of OSNs is called Social Network Analysis (SNA) [4]. SNA is generally used to analyze a social network graph to understand its theoretical connections and properties and identify the relative influence of the network vertices. Thus, it allows the modeling of the dynamics and growth of the OSN, e.g., predicting new connections, detecting communities and their associated influences, and measuring network density, among others [17]. This type of analysis is commonly used to help carry out targeted marketing campaigns, such as campaigns to promote sales.
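As a minimal sketch of the graph view described above, the following Python snippet models a tiny network as vertices and edges and computes two of the properties mentioned: network density and a simple influence ranking. The user names and edges are hypothetical, and degree centrality is assumed here only as one of many possible influence measures; it is not taken from the reviewed studies.

```python
# Hypothetical undirected interaction edges between users (vertices).
edges = [
    ("ana", "bruno"),
    ("ana", "carla"),
    ("ana", "davi"),
    ("bruno", "carla"),
    ("davi", "eva"),
]


def build_adjacency(edge_list):
    """Map each vertex to the set of its neighbours."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def density(adjacency):
    """Share of all possible undirected edges that are actually present."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    edge_count = sum(len(neigh) for neigh in adjacency.values()) / 2
    return edge_count / (n * (n - 1) / 2)


def degree_centrality(adjacency):
    """Number of neighbours of each vertex, normalised by (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {v: 0.0 for v in adjacency}
    return {v: len(neigh) / (n - 1) for v, neigh in adjacency.items()}


adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
most_influential = max(centrality, key=centrality.get)

print(f"density = {density(adjacency):.2f}")
print(f"most connected user = {most_influential}")
```

In a marketing setting, the highest-ranked vertex would be a candidate for a targeted campaign, although real studies would typically combine several centrality measures on far larger graphs.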
From a business perspective, OSNs present an opportunity to reach a significant audience for market surveillance [18]. Using SNA and SMA yields information that can only be perceived through specific analyses. For example, detecting the most influential user in the network and evaluating their classified comments can mean finding the user most likely to reach a targeted marketing audience when presenting the product, or remedying the lack of information to help a company advertise its products [19]. This is where agribusiness can benefit from these analyses. This work proposes a Systematic Literature Mapping (SLM) to find more accurate and representative directions for agribusiness marketing, such as SNA/SMA techniques, that require less time and allow a forecast of market trends. In other words, this work explores how the use of SNA and SMA can support agribusiness.
According to [20], Systematic Literature Mapping (SLM) is an overview of primary studies on a specific topic that aims to identify subtopics that need more primary studies. This study uses a protocol following methodological steps to make the results more reliable. This mapping therefore focuses on research associated with the use of crowdsourcing in activities related to agribusiness. The main element of analysis uses data from online social media, in other words, ''websites and applications that allow users to create and share content or participate in social networks'' [21].
The research problem addressed in this paper is the need to support agribusiness in consumer market analysis to improve the sector. To tackle this problem, we investigate new strategies to improve marketing in agribusiness. Therefore, the main contribution of this paper is an SLM that points out research directions for agribusiness marketing that require less time and allow a forecast of market trends. We defined the following Research Question (RQ): ''How can the use of SNA and/or SMA support marketing improvement in agribusiness?''.
This article is organized as follows. Section II presents related works and other literature reviews on the same subject. Section III describes the research method. Section IV presents the results and discussion in detail. Section V discusses research directions, and Section VI presents the final considerations.

II. RELATED WORKS
This section highlights systematic reviews and mappings related to agribusiness and social networks. Before the SLM execution, a search was performed in March/October 2022, looking for other literature reviews dealing with the same topics. We retrieved a set of 20 secondary studies, which were analyzed by reading the title and abstract of each one. The selected secondary studies (30%) represent works whose context is social media, whether applied to agribusiness or just cited as an example of an application domain.
In [22], the authors present an overview of sentiment analysis in various domains. They identified gaps and presented guidelines for future research in the area. One of the areas that the authors mention as presenting opportunities is agriculture. An evaluation was also carried out on a Twitter dataset, comparing metrics of traditional methodologies with those of approaches such as the lexical approach, ontologies, and machine learning. The authors concluded that ontologies, support vector machines, and term frequency achieved high precision and provided better results. In [23], the authors discuss the current challenges of machine learning use in domains to help decision-makers plan their actions. Of the 79 papers analyzed, most use machine learning in the industrial sector (65%), with the agricultural sector representing 5% of the total. Thus, the agricultural sector needs more studies, considering the use of data analysis techniques.
VOLUME 11, 2023
In [24], the authors present a systematic literature review emphasizing social networks' importance in climate change interventions. They concluded that more studies focusing on social networks are needed. These studies can accelerate the adoption of climate-smart farming practices, helping farmers implement adaptive practices related to climate change and decision-making. In another article [25], the authors discuss the main aspects of social network analysis applied to the investigation of the social life of animals in the zoo. The paper reviews how SNA can be used to assess the social behavior of animals and highlights directions for future research. The authors conclude that using SNA can directly impact management decisions and help maintain animal welfare standards.
In [26] the authors explore the challenges of using decision support systems in Agriculture 4.0. This paper uses the systematic literature review technique to retrieve representative decision support systems, including their applications in climate change adaptation, water resources management, and food management. The paper does not discuss data collection methods in online social media and does not list the techniques used in these decision support systems.
Despite our search efforts, we did not find a mapping or systematic review that analyzes solutions that use social media, such as OSN, to support agribusiness. Can and Alatas [27] discuss several SNA methods and present a review of the state of the art in online social network analysis. However, they do not discuss its application in agribusiness. Some of the methods discussed in [27] can be directly applied in agribusiness, such as interest mining, stance detection, irony/sarcasm detection, role mining, topic/event detection and causality detection. Others, such as privacy preserving, can use proximity detection and anomaly detection. However, only interest mining and topic/event detection were investigated in the studies identified in this SLM. Therefore, new studies must be conducted in agribusiness to explore these new research possibilities and benefits.
These initial literature reviews motivated us to address this research gap and merge these three topics of interest: Agribusiness, SNA and SMA. Thus, our mapping study was conducted to examine the state of the art in the use of SNA and SMA to support marketing in agribusiness. Therefore, this SLM's main contribution is identifying, classifying, and analyzing agribusiness research using OSN. We also aim to reveal the state-of-the-art research dealing with OSN and/or SMA to support agribusiness.

III. METHODS AND MATERIALS
According to [20], an evidence-based solution could highlight the need to use secondary studies to investigate and gather evidence on a specific topic. Through a systematic mapping, we identified the importance of analyzing the use of OSN with SMA and SNA, to assist marketing in agribusiness.

A. SEARCHING STRATEGY
We adopt the hybrid SLM method presented in [28] in this work. In this method, the authors propose four strategies that combine database searches in digital libraries with backward snowballing (BS) and forward snowballing (FS), applied iteratively (BS*FS), in parallel (BS||FS), or sequentially (BS+FS and FS+BS). In addition, the authors conduct a comparative evaluation of traditional digital libraries to find the database with the most significant performance results. The authors considered the Scopus database the most consistent digital library in terms of accuracy. In addition, the library integrates other digital libraries into its search method, increasing the search reach. However, it is necessary to complement the library with the snowballing process.
The hybrid strategy adopted in this mapping is Scopus + BS||FS. In this strategy, an initial set of papers is obtained through Scopus. Then BS and FS are performed in parallel on the same initial set. In other words, articles obtained by BS are not subject to FS and vice versa. We introduced this strategy to increase accuracy without compromising recall [29].
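The parallel strategy described above can be illustrated with a small sketch. The citation data below are entirely hypothetical, and the functions are a simplified, single-round reading of BS||FS (not the procedure or tooling of [28]): backward snowballing follows the reference lists of the seed papers, forward snowballing collects the papers that cite them, and neither result is fed into the other.

```python
# Hypothetical citation data: paper -> list of papers it cites.
references = {
    "seed_a": ["old_1", "old_2"],
    "seed_b": ["old_2"],
    "new_1": ["seed_a"],
    "new_2": ["seed_b", "new_1"],
    "old_1": ["older_1"],
}


def backward_snowball(papers, references):
    """Papers found in the reference lists of `papers` (BS)."""
    found = set()
    for paper in papers:
        found.update(references.get(paper, []))
    return found - set(papers)


def forward_snowball(papers, references):
    """Papers that cite at least one paper in `papers` (FS)."""
    targets = set(papers)
    citing = {p for p, cited in references.items() if targets & set(cited)}
    return citing - targets


def parallel_snowball(seed, references):
    """BS||FS: both run on the same seed set, independently of each other."""
    return backward_snowball(seed, references), forward_snowball(seed, references)


seed_set = ["seed_a", "seed_b"]  # e.g., the papers retained from the database search
bs_candidates, fs_candidates = parallel_snowball(seed_set, references)

print("BS candidates:", sorted(bs_candidates))
print("FS candidates:", sorted(fs_candidates))
```

Note that `older_1` is not reached, because it would only be found by snowballing on a BS result. In practice, every candidate returned by either branch would still be screened against the inclusion and exclusion criteria.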

B. RESEARCH QUESTIONS
The following research question (RQ) was formulated to conduct the systematic mapping: ''How can the use of SNA and/or SMA support marketing in agribusiness?''. This RQ aimed to investigate the techniques used and where these solutions are applied so as to understand how the solutions can help in decision-making. This RQ was associated with four secondary questions, as shown in Table 1.
From the research questions, we used the PICOC method [30] to define the scope of work and the terms used in the search string, as illustrated in Table 2. The search string, the main terms of the RQs, synonyms, and acronyms were specified and validated with the assistance of an agribusiness expert and control articles [6], [31], [32], [33]. The identified terms form the following search string: . Two external researchers revised the protocol. Furthermore, we only cover the SNA/SMA studies applied to agribusiness. From the string, we obtained 234 publications for the period from 2017 to October 2022, after the application of the inclusion filters. The search filters used are detailed in Section III-C.

C. INCLUSION AND EXCLUSION CRITERIA
Due to the large number of documents retrieved from the digital library (Scopus), we used the inclusion and exclusion criteria to select only potentially relevant articles returned from Scopus. Then, we applied the criteria to eliminate papers not related to the goals of this mapping. The inclusion and exclusion criteria used in this work are shown in Table 3. We used the Parsifal 1 tool to support the mapping execution.
We selected studies at two levels: (i) a new search was performed, adding the inclusion criteria as filters in the Scopus advanced search; and (ii) a reviewer guided by the exclusion criteria performed a selection considering the title and abstract. It was verified whether both explicitly reference social media or social networking services in agribusiness. Records were kept at this level when there was doubt about their relevance, that is, when reading the title and abstract alone was insufficient for the evaluation. In those cases, additional sections of the articles were read.

D. QUALITY ASSESSMENT
According to the guidelines proposed in [20], researchers can develop a quality assessment for primary studies. The evaluation serves as a guide to understanding the results. Each article selected after the inclusion and exclusion criteria was evaluated considering the Quality Assessment Questions (AQ). These questions, presented in Table 5, were developed based on [34] and [35]. For each question, we assigned the value 1 if the answer was ''yes'', 0.5 if the answer was ''partially'', and no value if the answer was ''no''.
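The scoring rule above can be written down in a few lines. The sketch below is only illustrative: the study names and answers are invented, the number of questions (four) is an assumption because Table 5 is not reproduced here, and a ''no'' answer is treated as contributing zero to the total.

```python
# Score per answer, following the rule in the text; "no" is assumed to add 0.
ANSWER_SCORES = {"yes": 1.0, "partially": 0.5, "no": 0.0}


def quality_score(answers):
    """Sum the scores of one study's answers to the assessment questions."""
    return sum(ANSWER_SCORES[answer] for answer in answers)


# Hypothetical answers of two studies to four hypothetical questions.
assessments = {
    "study_x": ["yes", "partially", "no", "yes"],
    "study_y": ["partially", "partially", "yes", "yes"],
}

scores = {study: quality_score(answers) for study, answers in assessments.items()}

for study, score in sorted(scores.items()):
    print(f"{study}: {score} of {len(assessments[study])} points")
```

A total per study makes it straightforward to rank the selected papers or to set a minimum quality threshold, should a mapping protocol call for one.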
We established a predefined questionnaire (detailed in Table 4) together with the AQ questions (Table 5), and highlighted information such as the social media source used and the evaluation metrics to answer the questions. The publications were categorized and tabulated according to the questions, extracting the information related to the questionnaire. This technique helped us to detect and validate the data extraction results and settle any discrepancies. The questionnaire can be accessed online (see footnote 2).

IV. RESULTS
Based on the inclusion criteria (first level), 234 publications were returned from 2017 until October 2022. Figure 1(a) shows an increasing number of articles published between 2017 and 2020. This growth may be due to the growing popularity of OSNs over the years, which prompted researchers' interest. In 2020 there were 3.96 billion active users on OSNs worldwide, whereas in 2017 there were only 2.79 billion, an overall increase of approximately 42% (Figure 1(b)). However, fewer articles were published in 2021 and 2022 than in 2020. This could be due to the COVID-19 pandemic that started in 2020: according to [36], searches for studies related to the disease have increased.
To determine study eligibility, all publications that used the identified social media sources, based on the definition given in [21], were considered, including blogs, news, and other user content-sharing sites (e.g., online forums).
At the second level, described in Section III-C, few studies considered OSN a collection source for SNA and SMA in agribusiness. Considering publications that exclusively use SNA or SMA (∼23%), ∼69% used traditional data collection techniques, such as questionnaires and interviews, using OSN as a means of communication. Others (∼2%) did not use social media to collect data; they collected data through direct questionnaires. Therefore, these studies were discarded according to exclusion criterion 1 shown in Table 3. Finally, 12 papers were selected as the initial set of studies submitted to BS||FS.
A support tool 4 configured to use Scopus as its database was used in the snowballing process. The Forward Snowballing (FS) process resulted in 13 papers, and the Backward Snowballing (BS) process resulted in 408 papers. After applying the inclusion criteria, 12 papers remained in the FS set and 30 in the BS set. However, no papers were left in either set after applying the exclusion criteria. Thus, out of 234 articles, only ∼5% (12 papers) remained in the final set. This significant reduction can be explained by the number of false-positive studies captured in Scopus through the word ''agriculture'' and its variations in the search string. The process can be seen in Figure 2. The selected papers were analyzed and the data extracted to address the research questions.
RQ1. Which social networks are used for data collection in the SNA/SMA agribusiness research community?
To answer this question, we separated the social media sources for data collection from the questionnaire used in the Quality Assessment. Through this list, we identified the share of studies that use data from a single social media source (∼67%) and the share that use more than one source (∼33%). Thus, most researchers prefer to use only one social media source for their research. According to the extracted data, Twitter (∼67%) was the most used source, while YouTube, Facebook, WeChat and Sina Weibo were the least used (∼8%). The last two OSNs are popular in China and are growing 5. According to the Statista 6 ranking of the most popular OSNs worldwide in January 2022, WeChat and Sina Weibo are among the top ten. Because Twitter is an OSN used for sharing opinions, and its short texts are easily mined and processed through its free API, it is the most popular among the research community. Furthermore, some studies (∼27%) use social media analysis platforms such as Netbase 7 (∼18%), which has several OSNs as a data source (e.g., Twitter, Reddit, blogs, forums, and more), LikeAlyzer 8 (∼8%) for Facebook, Twitonomy (∼8%) for Twitter, and Meltwater 9 (∼8%), which also has several OSNs as a social media source. These analyses can be seen in Figure 3, which shows a ranking of the most used OSNs among the community of researchers in agribusiness.

RQ2. What are the SNA/SMA analysis techniques used in agribusiness studies?
To answer this question, we investigated which techniques the studies used to analyze social media and social networks. Studies using SNA (∼25%) were the minority. SNA was used as the sole means of analysis (∼8%) or combined with SMA (∼17%). In other words, studies that consider SNA tend to also use SMA to solve problems. These studies use textual analysis of the collected data and perform a topological analysis of the most cited words. Topological analysis was the only SNA method identified through the extraction form; it was used to model a keyword network and find trends [37]. The two works that used this type of analysis also used textual analysis: both first use textual analysis to find the most frequent keywords, and then use topological analysis to understand the relationships between keywords. As for the SMA methods, the following were identified: (i) time analysis, where a timeline of social media is made, identifying the number of publications during a period [6], [31], [38], [39]; (ii) textual analysis, where social media keywords were studied using natural language processing (NLP) [37], [38], [39], [40]; (iii) statistical analysis, where the studies used hypothesis tests, means and percentages [32], [41], [42], [43]; (iv) sentiment analysis, which aims to identify and extract subjective information from social media by combining NLP and machine learning techniques to assign weighted emotional scores [6], [31], [38], [39], [43], [44]; (v) geographic analysis, where media upload coordinates are used to map their geographic locations [38], [39]; and finally (vi) demographic analysis, where the age and gender of the social media user are estimated using their first and last name as input [6]. Figure 4 illustrates this analysis with a radar chart of the popularity of the SNA and SMA techniques.
The chart shows that geographic, topological, and demographic analyses were used in few works. These techniques deserve attention and can be explored further.
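The two-step pattern reported for the SNA studies (textual analysis to find frequent keywords, then topological analysis of how those keywords relate) can be sketched as follows. The posts, the whitespace tokenizer, and the frequency threshold are illustrative assumptions; the reviewed studies rely on their own NLP pipelines, which are not described here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical short posts standing in for collected social media texts.
posts = [
    "organic eggs price rising",
    "cage free eggs welfare",
    "organic milk price",
    "eggs welfare debate",
]


def tokenize(text):
    """Naive whitespace tokenizer (a real pipeline would use proper NLP)."""
    return text.lower().split()


# Textual step: count in how many posts each word appears and keep
# the words that occur in at least two posts as keywords.
doc_frequency = Counter(word for post in posts for word in set(tokenize(post)))
keywords = {word for word, count in doc_frequency.items() if count >= 2}

# Topological step: keywords are vertices; an edge links two keywords
# that appear in the same post, weighted by the number of such posts.
cooccurrence = Counter()
for post in posts:
    present = sorted(set(tokenize(post)) & keywords)
    cooccurrence.update(combinations(present, 2))

# Degree of each keyword: number of distinct keywords it co-occurs with.
degree = Counter()
for first, second in cooccurrence:
    degree[first] += 1
    degree[second] += 1

print("keywords:", sorted(keywords))
print("edges:", dict(cooccurrence))
print("degree:", dict(degree))
```

Keywords with a high degree sit at the centre of the discussion, which is one simple way a keyword network can point to trends.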

RQ3. What are the evaluation metrics used?
To understand how the methods proposed in the primary studies were evaluated, we analyzed the selected papers and extracted the metrics the researchers used. Most articles (∼83%) [6], [31], [32], [38], [39], [40], [41], [42], [43], [45] used statistical methods such as total publications, means, and percentages to describe the results and evaluate the proposal. However, none of them reported an experiment dedicated to evaluating the proposed method. Only one of the primary studies had a section describing the proposal evaluation [44]. This study used the following metrics: precision, recall, F1-score, and Area Under the Curve (AUC). These metrics were used to validate the machine learning models that classify media data by sentiment. Finally, only one study in the final set did not use any method to evaluate the proposed approach [46].
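For readers less familiar with the classification metrics mentioned above, the sketch below computes precision, recall, and F1-score for a binary sentiment classifier. The labels and predictions are made up for illustration, and the binary setting is a simplifying assumption; it does not reproduce the evaluation in [44]. AUC is omitted because it requires predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, positive="positive"):
    """Count true positives, false positives, and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, fp, fn


def precision_recall_f1(y_true, y_pred, positive="positive"):
    """Precision = TP/(TP+FP), recall = TP/(TP+FN), F1 = harmonic mean."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold labels and model predictions for eight posts.
y_true = ["positive", "positive", "positive", "negative",
          "negative", "positive", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "positive",
          "negative", "positive", "negative", "negative"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting these values alongside descriptive statistics would make results from different sentiment classifiers easier to compare across studies.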
RQ4. Which segments of the agribusiness sector are studied?
We analyzed the studies to understand the agribusiness subsectors where the SNA and SMA techniques are applied. We found that each work considered a specific segment in agribusiness. In [6], the authors studied the consumers' view on the production system for eggs and laying hens. The authors also exemplify how monitoring OSN can help decision-makers manage agribusiness marketing food systems through the proposed method. In [31], the authors analyzed agricultural markets over 27 months, searching for agricultural market-related keywords. They provide valuable insights about agricultural markets, including the public's view of the livestock industries and the risk of zoonotic diseases. In [32], the authors examine the engagement of companies in the olive oil sector in the OSNs and compare organic and non-organic operators. According to the authors, organic food products in Spain face commercial problems due to some factors, such as the considerable price differential between organic products and their conventional equivalents.
The study reveals statistically significant differences in the engagement and use of OSNs by non-organic and organic operators. In [41], the authors combine data from 13 sites in 11 low-income countries to study how various social capital scales relate to household food security outcomes among smallholders. The authors conclude that social network theory correlates household food security with multiple social capital scales, both within and outside the household. This social capital can be either a link (within groups) or a bridge (between groups), with different implications for how the structure of social capital affects food security. The article [45] presents the first content analysis of the Czech Twitter OSN in the context of agriculture in general. The authors identified a prevalence of tweets about biofuels, the rapeseed plant, and politics. Furthermore, they conclude that robot accounts created a significant proportion of tweets. In [42], the authors share their experience using the WhatsApp platform for communication and data collection to monitor and evaluate the sweet potato value chain. The article [46] drew only on data from public newspaper reports and a sample of the social networks used by urban food networks in Bristol, a city with a well-developed urban agriculture movement, to explore how urban agriculture activists used OSNs during 2015. The authors intended to inform debates on urban agriculture and contribute to discussions on its growth. In [40], the authors studied the influence of COVID-19 on China's agricultural economy. In [43], the authors address the problems that wild pigs cause for agriculture and the environment. Through SMA, the authors find evidence of a need for more information on best practices for safety, such as the risk of zoonotic diseases caused by wild pigs.
In addition, they describe the importance of understanding the influence of social media on people and the opportunities for management agencies, such as messages in public health campaigns. In [38], the authors explore Artificial Intelligence (AI) in agriculture. Based on SMA, the authors conclude that AI techniques in agriculture are viewed positively. In the work presented in [39], the authors focus on how SMA can support government authorities in predicting damages related to the impacts of natural disasters in urban centers. The study applies SMA to Twitter crowdsourced data using the keywords ''Disaster'' and ''Damages''. The study's methodological approach employs the social media analysis method and performs sentiment and textual analysis of Twitter messages.
Finally, the paper [44] uses public opinion in OSN to determine whether Smart Agriculture or Agriculture 4.0 is implemented in Indonesia. Therefore, in answer to RQ4, twelve segments were identified: eggs and laying hens, agricultural markets, olive oil, family food security, agriculture in general, sweet potato, urban agriculture, agricultural economy, intelligent agriculture, wild pigs, impacts of natural disasters in urban centers, and AI in agriculture. Several agricultural segments have not been explored in the OSN context, such as milk, its derivatives, and other commodities.
A few other relevant articles were not returned by the search string but deserve to be mentioned. Sapountzi and Psannis [47] provided an overview of the techniques and tools used to mine social networks, emphasizing text mining. Plageras et al. [48] investigate the technologies involved in using IoT for data processing and analysis in intelligent applications.

B. THREATS TO VALIDITY
This systematic mapping aimed to provide an overview of the literature regarding using OSN, identifying, categorizing, and analyzing SNA and SMA solutions in the agricultural domain. However, some threats to validity and limitations can influence the results.
The search string should be mentioned as a threat to construct validity (see Subsection III-B). We defined the main terms of the RQ, synonyms, and acronyms considered adequate to make the string as comprehensive as possible. However, some terms may not have been considered in the string. To address this, several tests were carried out with the terms, and as we carried out the searches, versions were generated to decide on the best string. In addition, experts reviewed them, and control papers were used. As a result, this threat was mitigated.
All conclusions and results of this SLM are traceable. However, biased data extraction from selected articles may threaten conclusion validity. In other words, we may have included papers in the final selection set that are false positives. We used a predefined data extraction form to mitigate this threat.
We only included articles that could be accessed using our university (UFJF) credentials. This restriction can also be considered a threat to validity. However, with the snowballing technique, this threat has been mitigated.
Regarding threats to internal validity, the SLM selection process (see Subsection III-A) was conducted by only one researcher, and it may not be easy to include all relevant publications in the research. Furthermore, it is challenging to ensure that all topic-related concepts and relevant articles have been included in this study, despite the care and effort taken. However, the snowballing technique was used, which helped include new relevant works.

V. RESEARCH DIRECTIONS
As social media and network analytics evolve in agribusiness, it is possible to identify the various social media sites and analytics platforms used. The most popular, if not the most representative, social media site is Twitter.
We observed that social media data are primarily collected using Twitter as a source, which is used in ∼64% of the selected studies. Thus, the Twitter platform is the most used by researchers. In general, data from OSNs are the most used, either as a single source of collection, in conjunction with another OSN, or through analysis platforms.
The most used analysis technique in agriculture is the machine learning technique in sentiment analysis, which is used in ∼50% of the selected studies. However, we find increasing activity in both time-based techniques and statistics.
Regarding evaluation metrics, we observed that most selected studies relied on statistical summaries such as totals, means, and percentages, while only one study reported precision, recall, F1-score, and AUC to evaluate its experiments.
After analyzing the selected studies, we identified some research gaps that are important to investigate. From the SLM results, it was possible to conclude that the data collected from the Twitter platform is the most used dataset in SNA and SMA applied to agribusiness compared to other social media platforms. This preference for Twitter is mainly due to the ease of access. However, with the growth in importance of other social media and mechanisms for extracting information concerning different media types such as photos and videos, it is necessary to advance the research. This deficiency limits the generalization of the results obtained through the analyses. Therefore, analyzing other important OSNs such as Instagram, YouTube and TikTok could bring important insights for agribusiness marketing.
Thus, we conclude that research directions should focus on using multiple OSNs combined with the application of information extraction and analysis techniques in the various segments of agriculture, especially the underexplored segments, for example the milk and dairy segments. We can also explore techniques such as topological analysis and use of semantics, which can assist in extracting information from new media such as augmented reality.
Integrating information from various OSNs, analyzing these data, correlating the findings, and using them to direct marketing strategies is an important research direction. Knowledge discovery from this correlation of information can help uncover implicit relationships between data. In addition, the use of traceability techniques to verify the integrity of information is also important.
All these research directions are important and should be investigated in future research. This mapping promotes strategies based on new techniques for acquiring knowledge to leverage agribusiness. In addition, the SLM results can help researchers better understand research directions related to the use of media and social network analysis techniques in OSNs applied to agribusiness.
Considering our RQ, "How can the use of SNA and/or SMA support marketing improvement in agribusiness?", the results of the SLM can provide directions to follow. Investigation using multiple OSNs is one such path. Machine learning is also identified as a technique to be used.
As a result of our investigation, we consider that the following techniques should be investigated: i) the use of multiple OSNs, combining their results in some way, ii) the use of intelligent data analysis techniques, emphasizing ML techniques, but also investigating other techniques such as semantic and structural analysis, given the structure of an OSN, iii) adequate visualization mechanisms to leverage the commercialization of specific products.
Analyzing the interaction between users in multiple OSNs contributes to identifying consumers with similar characteristics in different OSNs, using SNA techniques. It is also possible to investigate content trends for each community of each OSN, enabling the search for the most talked about information about products, using SMA techniques. As a result, with the combination of SMA and SNA techniques, agribusiness producers and researchers can identify specific product demands, and the consumers that must be reached.
More specifically, by combining sentiment analysis of the publications extracted from the OSNs with other analyses, such as community detection, we can identify products wanted by specific communities, which can inform strategies for future product launches and/or the discontinuation of other products. We can also stratify users by gender and generation, allowing the identification of user profiles that talk about a specific product, such as millennials who talk about vegan products. We can also generate a network of users with a significant probability of being related across communities, making it possible to identify the users most likely to disseminate a target product in another target community. Potential influencers capable of developing strategies to disseminate the product in other related communities can thus be identified. Organic strategies can also be traced, considering the stratification of influencers and the products they like the most, among other possible marketing strategies.
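The combination of community detection and sentiment analysis described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a method taken from the selected studies: communities are approximated by the connected components of a user interaction graph (a naive stand-in for modularity-based or other community detection algorithms), and each post is assumed to already carry a sentiment score in [-1, 1] produced by some classifier. All user names, products, and scores are hypothetical.

```python
from collections import defaultdict


def connected_components(edges):
    """Map each user to a component id of the interaction graph (naive community proxy)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    community_of = {}
    next_id = 0
    for start in sorted(adjacency):  # sorted for deterministic ids
        if start in community_of:
            continue
        community_of[start] = next_id
        stack = [start]
        while stack:  # depth-first traversal of one component
            user = stack.pop()
            for neighbor in adjacency[user]:
                if neighbor not in community_of:
                    community_of[neighbor] = next_id
                    stack.append(neighbor)
        next_id += 1
    return community_of


def mean_sentiment_by_community(posts, community_of):
    """Average the sentiment score for each (community, product) pair."""
    totals = defaultdict(lambda: [0.0, 0])
    for user, product, score in posts:
        if user not in community_of:
            continue  # user has no recorded interactions
        key = (community_of[user], product)
        totals[key][0] += score
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical interactions (e.g., replies or mentions) and scored posts.
edges = [("ana", "bia"), ("bia", "caio"), ("davi", "eva")]
posts = [
    ("ana", "vegan cheese", 0.8),
    ("bia", "vegan cheese", 0.6),
    ("caio", "whole milk", -0.2),
    ("davi", "vegan cheese", -0.5),
    ("eva", "whole milk", 0.9),
]
community_of = connected_components(edges)
summary = mean_sentiment_by_community(posts, community_of)
for (community, product), score in sorted(summary.items()):
    print(community, product, round(score, 2))
```

In this toy example, the same product receives opposite average sentiment in the two communities, which is the kind of signal that could guide where a product launch or a targeted campaign is more likely to succeed.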

VI. CONCLUSION
This paper presents research directions in data analysis practices and tools from social media sources, such as OSNs, related to agribusiness. We started with 234 publications returned by the search string in Scopus, which then went through an evaluation process during the SLM. As a result, 12 primary studies were selected.
Summarizing opinions, spotting trends, and analyzing sentiment are the most common functions offered by the applications described in these studies. This study focuses on the analytical approaches and techniques employed to develop such applications and on the social media sources used to collect the data. We identified that a combination of different analysis methods, such as temporal, sentiment, and topological analysis, is often used to achieve a specific analysis objective more accurately and with an overview of agribusiness.
As a contribution, this mapping study maps the state of the art in using SNA and SMA to leverage agribusiness. The study identified online social media sources, platforms, techniques for analyzing information, assessment metrics, and the agricultural segments covered by the primary studies.
The SLM showed that more primary studies on SNA and SMA in agribusiness are needed. For example, in the SNA context, few works use community analysis and influence analysis techniques. In addition, one of the untreated agricultural segments, which is of significant economic importance, is milk and dairy products. There is a need to assist marketing in the agricultural sector using analytical methods and opportunistic crowdsourcing.
The results obtained by the primary studies selected in this SLM, together with insights into online social media data analysis, reinforce how social media monitoring can complement traditional methods to inform agricultural producers and consumers about marketing opportunities and regulation of agribusiness.
In future work, we can conduct a new SLM to mitigate threats to validity. For example, we can involve two or more researchers in the data extraction process and increase the number of bibliographic sources used for data collection. We can also reduce the threats to the construct validity of the SLM by updating the search string terms, since new terms associated with technologies not yet explored in agribusiness may emerge over the years.
NEDSON D. SOARES received the degree in control and automation engineering from the Federal Center for Technological Education of Minas Gerais (CEFET-MG) and the master's degree in software engineering and database from the Universidade Federal de Juiz de Fora (UFJF). He is currently a Developer Analyst at Energisa SA, a Full Stack Developer at Digital Zootechnical Residence, and Embrapa Gado de Leite. He has more than ten years of experience in computing, robotics, and basic mechanical skills. His interests include machine learning, data mining, social network analysis, recommender systems, and semantic web technologies.
REGINA BRAGA received the bachelor's degree in computer science from the Federal University of Juiz de Fora, in 1991, and the master's and Ph.D. degrees in computer science and system engineering from the Federal University of Rio de Janeiro, in 1995 and 2000, respectively. She is currently an Associate Professor at the Universidade Federal de Juiz de Fora, working as a Permanent Member of the master's program of computer science. She has experience in computer science, with emphasis on software engineering and databases, specifically in topics such as software reuse, ontologies, data integration, ecosystems, and scientific workflows.
JOSÉ MARIA N. DAVID received the bachelor's degree in electrical engineering from the Military Institute of Engineering (IME), in 1983, and the M.Sc. and D.Sc. degrees in computer science from the Federal University of Rio de Janeiro, in 1991 and 2004, respectively. He is currently an Associate Professor at the Universidade Federal de Juiz de Fora, working on the undergraduate program in computer science. He is also a Permanent Member of the master's program of computer science. He has experience in computer science, focusing on software engineering, in subjects such as groupware, CSCW, CSCL, software ecosystems, and middleware.
Specialist in the consumer market for milk and dairy products and dairy consumption trends, she has spoken in the main forums of the sector, such as the World Dairy Summit, Dairy Vision, and the Annual National Workshop for Dairy Economists and Policy Analysts. Author of the book In the Age of the Consumer: a Vision of the Brazilian Dairy Market, she has published hundreds of articles analyzing the habits and behavior of consumers of milk and dairy products in Brazil. She was a recipient of the BM&F Award for the best thesis in the area of agricultural derivatives.
VICTOR STROELE received the B.S. degree in computer science from the Federal University of Juiz de Fora, in 2005, and the master's and Ph.D. degrees in systems engineering and computer science program from the Federal University of Rio de Janeiro, in 2007 and 2012, respectively. He is currently an Associate Professor II at the Universidade Federal de Juiz de Fora. He has experience in computer science, with emphasis on data mining and complex network, working mainly on the following themes, such as clustering algorithm, social network analysis, recommender systems, and informatics in education.