Machine learning application for sustainable agri-food supply chain performance: a review

The agri-food supply chain consists of activities in “farm-to-fork” order, including agriculture (i.e., land cultivation and crop production), production processes, packaging, warehousing systems, distribution, transportation, and marketing. Data analytics holds the key to ensuring future food security, food safety, and ecological sustainability, while emerging ‘smart’ technologies such as the internet of things, machine learning, and cloud computing can change production management practices. The current study presents a systematic review of machine learning (ML) applications in the agri-food supply chain. The resulting framework identifies the role of ML algorithms in providing real-time analytical insights to assist proactive, data-driven decision-making in the agri-food supply chain. It also guides researchers, practitioners, and policymakers on successful management to increase the productivity and sustainability of the agri-food sector.


Introduction
The food industry's success in today's global competition depends heavily on its use of information and communication technology. Tougher competition in the global market, higher customer expectations for products and services, and shorter product life cycles force companies to prioritize their supply chains to achieve competitive advantages that support business continuity [1]. Supply chain management (SCM) is complex because it spans scheduling, procurement of raw materials, production processes, and delivery of products to consumers. All SCM activities require a data management system to handle daily administration, processes, operations, and logistics [2], because data is an important asset for process design and operational control. Advances in information technology have changed SCM practices: SCM actors are required to process market information and make improvements quickly. Intelligent use of data has great potential for sustainable SCM practices, and manufacturers are expected to adopt new technologies capable of interpreting data quickly and intelligently. Traditional decision support systems cannot handle big data accurately. Therefore, in the era of big data, supply chain professionals are trying to implement smart supply chains.
Artificial intelligence (AI) methods are used to address these big data-related challenges. Machine learning (ML), a subset of AI, is widely used to identify hidden patterns in data [3,4]. ML algorithms can detect previously unknown patterns in data and direct researchers toward their goals, and the application of ML algorithms in the supply chain has attracted the interest of researchers. So far, researchers have mostly analyzed and interpreted big data using traditional methods. These methods are not capable of analyzing large and unstructured data, nor can they identify and predict the most influential factors, especially in the case of supply chain performance. Therefore, researchers have begun to replace traditional analytical methods with machine learning techniques. ML has many advantages, such as analysing big data, solving nonlinear problems, and recognizing patterns and predicting values.
However, to the best of our knowledge, few studies have specifically reviewed the status of machine learning applications in agri-food supply chain (AFSC) management. In this study, we present a systematic literature review on the application of ML in developing sustainable AFSCs. The agricultural sector is expected to adopt machine learning as a decision support tool in the future. This review can therefore serve as a reference for researchers and practitioners who want to understand the application of machine learning in AFSC.

Material and methods
The literature review was compiled using the systematic literature review (SLR) method. We used selected keywords to specify the conceptual boundaries, such as "decision trees", "artificial neural network", "clustering", "genetic algorithm", "Bayesian network", "random forest", "agri-food supply chain", "food industry", etc. Publications from the period between 2010 and 2020 were analyzed. In the end, 20 articles were selected for analysis and review.

Machine learning (ML)
Machine learning is one of the approaches most widely developed by researchers for predicting data. ML algorithms are designed to learn on their own, without explicit direction from the user. ML draws on statistics, mathematics, and data mining. It is expected to analyse data without being explicitly commanded or reprogrammed. ML builds prediction models from historical data [4]. The ML used in AFSC is classified into two groups, namely supervised learning and unsupervised learning.
Supervised learning algorithms use labelled data. They receive a set of inputs together with the correct outputs, so the algorithm can adjust the model to fit the desired results [5]. Supervised learning is categorized into two groups, namely regression and classification techniques [6]. The ML algorithms that are widely used to analyse SCM performance are decision trees, random forests, Bayesian networks, and regression analysis [7]. Random forest (RF) is an algorithm used for big data classification. The classification process in RF is based on training on the available sample data [8]. Decision tree algorithms split data into smaller subsets, where each subset corresponds to a "yes" or "no" response [9,10]. Bayesian networks are widely used to predict outcomes based on Bayes' theorem, namely by combining the class-conditional probabilities with the prior probabilities. Regression analysis is widely used to generate equations that describe the relationship between input and output parameters.
Unsupervised learning does not use labels in predicting target features/variables. Instead, it relies on the similarity of the attributes in the data. If the attributes and properties of the extracted features are similar, the data points are grouped together (clustering). The number of clusters is not limited. The groups serve as the model's labels, and new data to be predicted is matched with the group that has similar features [5]. In this review, the techniques grouped under unsupervised learning are artificial neural networks (ANNs), k-means clustering, principal component analysis, and deep learning. An ANN is an information processing technique inspired by the workings of the biological nervous system, namely brain cells in humans [11]. K-means clustering is widely used to divide data into k clusters. Deep learning, in turn, is known as deep ANN because it has many hidden layers [4,5,8,11].
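As a hedged illustration of the Bayesian calculation mentioned above, the following minimal sketch combines prior probabilities with class-conditional probabilities to obtain posterior probabilities. The two supplier classes, the "late delivery" evidence, and all probability values are hypothetical and are not taken from the reviewed studies.

```python
def posterior(priors, likelihoods):
    """Return P(class | evidence) for each class using Bayes' theorem.

    priors:      {class: P(class)}
    likelihoods: {class: P(evidence | class)}
    """
    # Unnormalised posterior: prior times class-conditional probability.
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # P(evidence), the normalising constant
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers, for illustration only.
priors = {"reliable": 0.8, "unreliable": 0.2}
likelihoods = {"reliable": 0.1, "unreliable": 0.6}  # P(late delivery | class)

result = posterior(priors, likelihoods)
```

With these assumed values, observing a late delivery shifts most of the probability mass to the "unreliable" class even though its prior is small.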
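The grouping idea behind k-means can be sketched in a few lines. The example below is a minimal version of Lloyd's algorithm under stated assumptions: the data points are hypothetical two-dimensional feature vectors, and the first k points are used as initial centroids to keep the run deterministic (practical implementations usually use random or k-means++ seeding).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=20):
    """Cluster 2-D points into k groups with Lloyd's algorithm."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical feature vectors forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.5, 8.0)]
labels, centroids = kmeans(points, k=2)
```

A new observation would then be matched to the cluster whose centroid is closest, which mirrors the matching step described in the text.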

Machine Learning in agri-food supply chain
AFSC covers all "upstream" to "downstream" activities from farm cultivation and crop production, processing, testing, packaging, storage, and marketing [9,10]. Effective AFSC activities require management activities and decision-making processes at strategic, tactical, and operational levels. AFSC is more complex than other supply chains because its products are perishable, seasonal, and subject to fluctuation, and because consumers are increasingly aware of product quality and safety [12]. AFSC includes four operations: production planning and control at suppliers, food processing at processors (producers), distribution at distributors, and consumption at retailers/customers (see Figure 1).

Figure 1.
Agri-food supply chain operations (designed by author).

Production planning and control.
Production planning and control includes all the activities needed to achieve efficient, effective, and economical operations [13]. At this stage, the activities carried out include production planning, supplier selection, predicting supply chain risks, stock market prediction, supplier segmentation, inventory management, demand forecasting, and performance prediction. Table 1 presents a list of different ML algorithms used for production planning and control.

Production planning.
The use of ML tools increases accuracy in production planning because they consider multiple constraints. Machine learning tools are more effective than manual methods, especially for companies that rely on build-to-order and make-to-stock production workflows. ML tools can reduce supply chain latency for components and parts [7].

Supplier selection.
Suppliers play an important role because they affect time, quality, and cost [7]. A potential support vector machine (P-SVM) combined with a decision tree (DT) can be used to solve the supplier selection problem. The P-SVM can construct a binary classifier and select features simultaneously.

Predicting supply chain risk.
One supply chain risk management (SCRM) study that uses machine learning is Bruzzone and Orsoni [14], in which supply chain risks such as production losses are predicted by artificial neural networks (ANNs). The ANN is given specific scenarios as inputs (production times, capacities, and quantities) and outputs (cost estimates). Based on the training data, the ANN learns the relationship between inputs and outputs to obtain cost estimates for different scenarios [15].

Inventory management.
The inventory models proposed to manage supply with one supplier and several retailers fall into two groups, namely centralized and non-centralized inventory models. The algorithms are built around customer demand [9]. The algorithms used to solve inventory management problems, especially for perishable products, are the Q-learning and SARSA methods.
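To make the Q-learning idea concrete, the sketch below applies tabular Q-learning to a toy perishable-inventory problem. It is an illustration under stated assumptions, not the model used in the cited work: the demand distribution, the prices and costs, and the rule that only one unsold unit stays fresh until the next period are all hypothetical.

```python
import random

MAX_CARRY = 1   # assumption: at most one unsold unit stays fresh for the next period
MAX_ORDER = 2   # assumption: at most two units can be ordered per period
PRICE, COST, WASTE = 5.0, 2.0, 1.0  # hypothetical unit revenue, order cost, spoilage cost


def step(stock, order, rng):
    """Simulate one period and return (reward, next_stock)."""
    available = stock + order
    demand = rng.choice([0, 1, 2])        # assumed uniform demand
    sold = min(available, demand)
    leftover = available - sold
    carried = min(leftover, MAX_CARRY)    # units still fresh next period
    spoiled = leftover - carried          # the rest perishes
    reward = PRICE * sold - COST * order - WASTE * spoiled
    return reward, carried


def q_learning(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table q[stock][order] of estimated long-run rewards."""
    rng = random.Random(seed)
    q = [[0.0] * (MAX_ORDER + 1) for _ in range(MAX_CARRY + 1)]
    stock = 0
    for _ in range(steps):
        # Epsilon-greedy action selection: explore occasionally, otherwise exploit.
        if rng.random() < epsilon:
            order = rng.randrange(MAX_ORDER + 1)
        else:
            order = max(range(MAX_ORDER + 1), key=lambda a: q[stock][a])
        reward, next_stock = step(stock, order, rng)
        # Q-learning update: bootstrap from the best action in the next state.
        target = reward + gamma * max(q[next_stock])
        q[stock][order] += alpha * (target - q[stock][order])
        stock = next_stock
    return q


q_table = q_learning()
# Greedy ordering policy: best order quantity for each stock level.
policy = [max(range(MAX_ORDER + 1), key=lambda a: row[a]) for row in q_table]
```

SARSA differs only in the update target: instead of the maximum over next actions, it uses the value of the action actually chosen in the next state.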

Lead time.
Purchasing lead time is the time interval between the placement of an order with the supplier and the delivery of the goods to the company. Forecasting the purchase lead time is an important task for the procurement department to plan, manage, and control the production process. ML regression is commonly used to estimate purchase lead times in the supply chain using real industry databases [21].
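A minimal sketch of regression-based lead time estimation is given below, assuming a single hypothetical predictor (order size) and ordinary least squares. The data values are invented for illustration; real applications would use many more features and records.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: order size (hundreds of units) and observed lead time (days).
order_sizes = [1.0, 2.0, 3.0, 4.0]
lead_times = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(order_sizes, lead_times)
predicted_lead_time = slope * 5.0 + intercept  # forecast for an order of size 5
```

Because the invented data lie exactly on a line, the fitted model reproduces it; with real procurement data the fit would only approximate the observations.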

Performance prediction.
Tree-based classification is widely used to maximize profits: it decides whether to accept or reject incoming orders based on production capacity and order requirements [22]. The most widely used methods for optimizing the prediction mix are the hybrid model and simulated annealing [23].
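The accept-or-reject decision can be illustrated with the simplest possible tree, a single split (a decision stump). The sketch below assumes one hypothetical feature, the ratio of an order's required hours to the free production capacity, and invented labels; the cited studies use richer trees and features.

```python
def fit_stump(values, labels):
    """Find the threshold t with the fewest errors for the rule 'accept if value <= t'."""
    best_threshold, best_errors = None, len(values) + 1
    for t in sorted(set(values)):
        # Count orders whose predicted decision disagrees with the recorded one.
        errors = sum((v <= t) != lab for v, lab in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold, best_errors


# Hypothetical history: load ratio of each order and whether it was accepted.
load_ratios = [0.2, 0.4, 0.6, 0.8, 1.1, 1.3]
accepted = [True, True, True, False, False, False]

threshold, errors = fit_stump(load_ratios, accepted)


def accept_order(load_ratio):
    """Apply the learned one-split rule to a new order."""
    return load_ratio <= threshold
```

A full decision tree repeats this kind of split recursively on further features, such as due date or profit margin.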

Food processing.
Food processing is the activity of physically or chemically processing raw materials into finished or semi-finished food. It combines raw materials to produce food products that can be marketed and consumed. ML applications developed for the food processing stage cover process control, scheduling, and quality control [24].

Process control.
Process control aims to regulate and improve operating processes to minimize non-conforming situations. Deep neural networks (NNs) and linear regression can be used to predict throughput time. The system is regulated by workload control to set reliable due dates and thereby reduce the percentage of tardy jobs [29].

Scheduling.
ML techniques are highly effective at improving scheduling accuracy by balancing build-to-order and make-to-stock production workflows. Genetic algorithms are used flexibly to solve job shop scheduling problems, while two NNs are used for machine allocation and priority assignment [15]. In a real-time scheduling system, NNs, SVM, and decision trees (based on bagging) are integrated, and a genetic algorithm can be used for supplier selection in AFSC [25].

Quality control.
Quality control (QC) aims to ensure that product quality is maintained and improved. It is carried out by regularly testing the final product against its specifications. The ML tools that can be used are hierarchical clustering and SVM [26]. In addition, a convolutional NN can classify failure map patterns on wafer products; the tool is integrated into an automatic real-time monitoring system [2].

Distribution.
Distribution is the stage at which goods or services move from producers to consumers. ML can be used at the distribution stage for distribution planning, transportation, and food delivery. Distribution and transportation are important SCM activities that impose heavy costs on organizations. Implementing ML in distribution and transportation systems can improve the delivery time of products to the right customers by generating better delivery routes and exploring consumer behaviour [27]. ML can help decide delivery routes, predict food demand, plan the supply of raw materials, and plan logistics [28]. The delivery route problem can be addressed with ML by optimizing the location of the delivery agents. Information about current and upcoming traffic conditions is sent to the agents synchronously so that they know the best route. Ensuring efficient, on-time delivery makes it easier to fulfil a constant stream of orders and to deal with problems such as a shortage of delivery agents or late deliveries.
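The delivery route problem itself can be illustrated with a simple baseline. The sketch below uses the nearest-neighbour heuristic, which is a classical routing heuristic rather than an ML model and is not the method of the cited studies; it is shown only to make the problem concrete. The depot and stop coordinates are hypothetical, and straight-line distance stands in for real travel time.

```python
import math


def nearest_neighbour_route(depot, stops):
    """Visit stops greedily, always moving to the closest unvisited stop."""
    route = []
    current = depot
    remaining = list(stops)
    while remaining:
        nxt = min(remaining, key=lambda s: math.dist(current, s))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route


# Hypothetical depot and customer coordinates (e.g., kilometres on a grid).
depot = (0.0, 0.0)
stops = [(5.0, 5.0), (1.0, 0.0), (2.0, 2.0)]

route = nearest_neighbour_route(depot, stops)
```

An ML-based system would replace the straight-line distance with predicted travel times, for example learned from traffic data, and re-plan as conditions change.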

Consumption.
ML techniques such as deep learning and ANN are used to predict consumer demand, buying behaviour, and consumer perception [30]. Bayesian networks have been used to forecast consumers' buying behaviour for different food products and to perform quality checks [4]. Factors that influence consumer purchasing behaviour for imported ready-to-eat food can be analysed using ANNs and logistic regression techniques.
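As a hedged sketch of how logistic regression relates a factor to purchasing behaviour, the example below fits a one-feature model by gradient descent. The feature (a standardised score for some consumer attribute) and the purchase labels are hypothetical and chosen only to keep the example small.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=200):
    """Fit P(purchase | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the average log-loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: standardised attribute score and purchase decision (1 = bought).
scores = [-2.0, -1.0, 1.0, 2.0]
bought = [0, 0, 1, 1]

w, b = fit_logistic(scores, bought)
prob_high = sigmoid(w * 2.0 + b)    # predicted purchase probability for a high score
prob_low = sigmoid(w * -2.0 + b)    # predicted purchase probability for a low score
```

The sign and size of the fitted weight indicate how strongly the factor is associated with purchasing, which is how such models are used to identify influential factors.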

Machine learning for AFSC sustainability
The purpose of analysing agri-food data using ML algorithms is to develop efficient AFSCs. The use of ML in AFSC contributes not only to economic benefits but also to environmental and social performance (see Figure 2). Many companies struggle to manage their production systems due to increasing market uncertainty, while emerging 'smart' technologies such as the internet of things, cloud computing, and machine learning offer a way to transform traditional production management systems into real-time systems.
The model presented adopts an incremental approach that companies with limited resources can apply to improve their production planning and control process in the context of industry 4.0 and sustainability. The results show that make-to-order companies obtain greater benefits from a smart product strategy, while make-to-stock companies gain greater profits by implementing a smart process strategy [13]. At the food processing stage, ML algorithms are used for process control, scheduling, and quality control. The use of ML provides benefits for social sustainability, such as reducing poverty and increasing economic growth while improving public health [31,32]. It thus increases social sustainability through the development of active local markets that meet customer and market demands. At the distribution stage, the use of ML has resulted in sustainable social and economic performance improvements through improved food safety, quality, economic savings, and customer satisfaction. The use of ANN in demand forecasting leads to economic savings and increased customer satisfaction. Similarly, optimized vehicle routing and fleet management contribute to fuel economy, which improves environmental and economic performance in last-mile deliveries.

Figure 2.
Frameworks of the potentials of machine learning for AFSC sustainability (designed by author).

Figure 2 presents a summary outline of the potential of ML algorithms for developing sustainable AFSCs. The findings of the review illustrate that the benefits of ML in AFSCs span the three foundational elements of sustainability (economic, environmental, and social). Practitioners can use the framework developed in the current study, as it explains how ML algorithms contribute to the overall efficiency of AFSCs and mitigate some of the challenges faced by the agri-food industry. The proposed framework will guide practitioners in choosing appropriate AFSC resources and AFSC management approaches to improve sustainable performance.

Conclusions
The ML algorithms used to develop sustainable AFSCs are grouped into two categories, namely supervised learning (Bayesian network, regression, and decision trees) and unsupervised learning (ANN, deep learning, GA, and clustering). The main contribution of the study is the ML-AFSC performance applications framework, which will guide academics and practitioners in understanding the current state of the literature in the agri-food domain. This framework is based on a literature review and has not been tested empirically. Therefore, future research can validate the framework empirically. In addition, researchers can explore the application of ML to AFSC in different regions.