Computer vision applied to food and agricultural products

Computer vision (CV) has been applied for years to automate many human activities. It is one of the key technologies for the modernization of the agri-food industry towards the fourth industrial revolution (Industry 4.0). In the agricultural sector, CV systems are applied to automate or obtain information from many agricultural tasks such as planting, cultivation, farm management, disease control, weed control or robotic harvesting. It is also widely used in postharvest to automate and obtain objective information in processes such as quality control and evaluation, damage detection, classification of fruits or vegetables in commercial categories or composition analysis. One of the main advantages is the ability of this technology to obtain information in regions of the spectrum that are invisible to the human eye. An example is the case of hyperspectral imaging systems. These systems generate a large amount of data that needs to be processed efficiently, creating robust and repeatable statistical models that allow the technology to be implemented at an industrial level. To achieve this, it is necessary to couple CV systems with advanced artificial intelligence tools such as machine learning or deep learning. The objective of this work is to review the latest advances in CV systems applied to food and agricultural products and processes.


INTRODUCTION
The main objectives of future agriculture are to increase productivity and food quality, reduce operations costs, and optimize input use. Therefore, the development of computer vision and its application in the development of non-destructive methods, precision agriculture, etc., enables us to automate and accelerate field, harvest, and post-harvest operations, essentially creating a new branch of Industry 4.0 called Agriculture 4.0. This type of agriculture integrates data and information to monitor field activities by applying remote and proximal sensing (PALLOTTINO et al., 2019).
Computer vision supports a range of activities. Current research addresses fruit counting in orchards (despite occlusion), plant disease detection, and defect detection in fruits. CV also improves the capacity of robots to determine the fruit harvest point, and it is now also used in supermarkets to identify fruit and estimate its weight.
In addition, consumer interest in food quality and safety is increasing, primarily owing to international food trade, which requires rapid and non-destructive inspection methods (OK et al., 2019). Similarly, the prediction of quality parameters, identification of adulteration and variety, discrimination of origin, etc., are activities of interest in the evaluation of agri-food products but are currently based on offline and destructive techniques (WANG; SUN; PU, 2017).
The development of sensors has enabled us to obtain big data in a non-destructive manner, reducing analysis costs and time. Several sensors that detect and monitor electromagnetic waves combined with new techniques of image processing, machine vision, and computer science are used to build smart systems for Agriculture 4.0.
Based on this, this review presents state-of-the-art computer vision systems for proximal sensing (food and agricultural products close to or in contact with sensors), including the current types of systems, processing, and applications.
Classification techniques include statistical techniques (STs), neural networks (NNs), support vector machines (SVMs), and fuzzy logic (FL) (MAHENDRAN; AJAY VINO; ANANDAKUMAR, 2016). Machine learning techniques, particularly deep learning, have been used with significant success in computer vision.
Furthermore, the recent improvements in deep learning, such as image classification, object detection, tracking, and image manipulation, enable new explorations of more complex and autonomous machine applications such as self-driving vehicles, humanoids, and drones.

Acquisition Systems
A typical digital image is obtained by recording radiant energy in the visible spectrum into a 2D array of numbers. An example of image formation is the conversion of visible light (absorbed, reflected, and scattered) into a camera's electrical signals (ABDULLAH, 2016). Here, the acquisition system consists of an illumination source, a camera, a frame-grabber for analog-to-digital conversion, a computer, and a monitor to visualize the information.
Recently, the electromagnetic spectrum used in research has been expanded to increase the range of machine vision applications. Initially, only cameras in the visible light range were used; more recently, research has been conducted on camera systems that enable the observation of various parts of the electromagnetic spectrum. Examples include systems such as computed tomography (CT), magnetic resonance imaging (MRI), nuclear magnetic resonance (NMR), single-photon emission computed tomography (SPECT), positron emission tomography (PET), infrared and radio cameras (ABDULLAH, 2016), multispectral and hyperspectral cameras, biospeckle, and THz cameras.
Various cameras, ranging from the successful charge-coupled device (CCD) cameras to those using complementary metal-oxide-semiconductor (CMOS) technology, have been used. Using a single-chip CCD, monochrome imaging for sensing visible (Vis) or near-infrared (NIR) electromagnetic waves can be obtained. Color images can also be acquired using a single-chip CCD by modifying the CCD device's pixels for red, green, and blue (RGB) color acquisition. A three-chip CCD camera can also be used for color image acquisition (MAHENDRAN; AJAY VINO; ANANDAKUMAR, 2016).
CCD cameras are widely employed in the analysis of food and agricultural products, facilitating the acquisition of exterior characteristics of objects such as color, shape, size, texture, and surface damage (MAHAJAN; DAS; SARDANA, 2015).
Visible and NIR spectroscopy enables the evaluation of the chemical composition and internal structure of agricultural products. These systems are frequently composed of an infrared source, a wavelength isolator, a detector, and a data processor. The most common sources are tungsten, halogen, and quartz-halogen lamps. Ren et al. (2020) used a Vis-NIR spectrograph with wavelengths ranging from 350 to 1100 nm, a spectral resolution of 5 nm, and two 150-W halogen lamps to acquire hyperspectral data of tea.
An X-ray CT system comprises an X-ray tube, a beam collimator, and a detector. The images are formed from the penetration and attenuation of high-energy X-ray photons (CAKMAK, 2019).
The acquisition system for Raman spectroscopy is based on an excitation source with a wavelength in the visible-to-infrared range, a wavelength separator, and a detector such as a CCD.

Color imaging
Advances in artificial vision enable us to obtain new knowledge and increase the efficiency and objectivity of inspection processes. This is because of the increase in the camera capabilities that enable obtaining higher resolution images (even in regions of the spectrum that are invisible to the human eye), a high capacity of computers to process data at high speed, and the evolution of system storage and communications. The automation level has increased exponentially in recent years, while equipment prices have decreased, enabling the creation of practical and complex applications, such as those related to the inspection of agricultural products (CUBERO et al., 2011).
Color cameras are the most widely used for computer vision because they capture images similar to those perceived by the human eye. The technology for acquiring these images is relatively inexpensive and very advanced, and some highly developed techniques to process information from these types of images exist. Color is an important quality characteristic for consumer acceptance, either aesthetic or linked to functional attributes and the stage of product development (PATHARE; OPARA; AL-SAID, 2013). In nature, the perceived color is primarily determined by different types of pigments such as chlorophylls, carotenes, xanthophylls, and anthocyanins, which offer information on the type and state of the plants and their fruits (WALSH et al., 2020). For example, color is used to estimate the ripeness or some internal quality parameters of fruits. Nevertheless, as color is a subjective human perception, tools to measure, quantify, and compare colors are required. These tools are color spaces, which are mathematical models representing colors (DE-LA-TORRE et al., 2019; PALLOTTINO et al., 2019). Frequently, the color space selected in digital images is RGB, which is native to cameras and computers. However, other color spaces, such as CIELAB or hue, saturation, and value (HSV), are also widely used as they attempt to represent human perception (DE-LA-TORRE et al., 2019). Le Nguyen et al. (2020) assessed the quality of sweet cherries by measuring the color of the surface, as color is closely related to parameters such as anthocyanin concentration, sweetness, and fruit-specific flavor. The hue component was correlated with soluble solids content (SSC), firmness, respiration rate, and weight loss, achieving R² values greater than 0.92 in all scenarios. The estimation of the internal quality of pomegranates using the color of the peel was investigated by Fashi et al. (2019). The aril color and size could be predicted with an R² of 0.94 using artificial neural networks (ANNs).
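As an illustration of how a color space acts as a mathematical model of color, the following minimal sketch converts 8-bit RGB values into HSV coordinates using the Python standard library. The function name `rgb_to_hsv_degrees` and the sample peel colors are assumptions introduced here for illustration; they are not taken from any of the cited studies.

```python
import colorsys

def rgb_to_hsv_degrees(r, g, b):
    """Convert 8-bit RGB values to (hue in degrees, saturation, value)."""
    # colorsys expects channel values scaled to the range [0, 1].
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

# Hypothetical peel colors (8-bit RGB), chosen only for illustration.
samples = {
    "light red": (230, 40, 50),
    "dark red": (120, 10, 30),
}

for name, rgb in samples.items():
    hue, sat, val = rgb_to_hsv_degrees(*rgb)
    print(f"{name}: hue={hue:.1f} deg, saturation={sat:.2f}, value={val:.2f}")
```

Because hue is an angle, reddish colors sit close to both 0 and 360 degrees, so averaging hue over a fruit surface generally calls for circular statistics rather than a plain arithmetic mean.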
Huang et al. (2018) evaluated the internal quality of mangoes by integrating textural information obtained using a CCD camera with color information provided by a colorimetric sensor array. The changes in these parameters over time were related to hardness and total soluble solids (TSS) content. Color indices are some of the most commonly used tools to describe and quantify the color of fruits such as citrus fruits or tomatoes. Hadimani and Mittal (2019) compared the traditional citrus color index (CUBERO et al., 2018; VIDAL et al., 2013) with the CIELAB coordinates a* and b*, obtaining better results to describe the color of mandarin cv. "Kinnow" fruit. They also analyzed the relationship between the fruit's exterior peel color and its internal characteristics. Bello et al. (2020) related color indices based on RGB coordinates to quality parameters and maturity stages of tomatoes. In a similar study, Costa et al. (2020) combined color information in RGB, CIELAB, and HSV coordinates to predict the physicochemical quality properties of coffee fruits cv. "Robusta". Cherry, immature, and over-ripe coffee fruits were correctly classified in 100% of the scenarios. One of the principal applications of color measurement is the estimation of the maturity stage. This property was determined by Santos Pereira et al. (2018) for papaya by analyzing twenty-one color features based on the RGB, CIELAB, and HSV color spaces.
Once images are acquired, they are processed to obtain useful information. This task requires the development of efficient, robust, repeatable, rapid, and accurate processing algorithms. The analysis of these images provides information on the color, texture, or external properties as well as defects of objects. Among the essential steps of this process are segmentation, which consists of dividing the images into regions of interest (ROIs), and the extraction of characteristics to obtain the desired information from the regions or objects found (RUSS; NEAL, 2018). Segmentation can be performed using different approaches. Some are based on locating regions by searching for textures, boundaries, or colors, while others classify individual pixels based on prior training. Sharif et al. (2018) used a technique based on the multiclass SVM for citrus disease classification. After segmentation, color, textural, and geometric features were used to analyze images of oranges with a variety of peel defects. Color and textural features were also used by Zhang et al. (2020) to segment images of apple orchards and detect apples with colors similar to those of the leaves. Segmentation can be performed using a supervised method, in which the user must supply some prior knowledge to the model, or an unsupervised method, in which no user intervention is required. Tian et al. (2019) developed an unsupervised segmentation method based on the k-means technique to segment diseased tomato plant leaves. Among the principal features observed in fruit inspection, those related to the quality perceived by consumers, such as size and color, have been the most studied. Liu et al. (2019a) designed a classifier based on computer vision to grade tomatoes based on their color, diameter, and shape using different image processing algorithms. Volume is used less often as a marketing criterion but can be used as a weight estimator.
The volume of mangoes was estimated by Mon and Zaraung (2020) from dimensions, including the length, obtained through the processing of 2D images. Morphological features from color images have also been extracted and used to individually detect kiwis arranged in clusters. Here, calyx detection played an important role in separating and identifying individual fruits (FU et al., 2019).
Because of the ease of imitating the human eye, the development of rapid and efficient algorithms, and the processing power of computers, these systems have been used to analyze agricultural products on inspection lines in real time, for instance, for mangoes (IBRAHIM et al., 2016), peaches (LI et al., 2016), apples (UNAY et al., 2011), mandarins (BLASCO et al., 2009b), and pomegranate arils (BLASCO et al., 2009a). In these electronic sorters, the fruit travels at a very high speed on a conveyor belt. When the fruit passes under a camera, several images are captured while the fruit rotates so that most of its surface is captured. All systems must be synchronized to capture the images at the exact moment and deliver the fruit to the outlet corresponding to the category decided by the inspection software (ALEIXOS et al., 2002).
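The unsupervised segmentation approach mentioned above can be sketched as a k-means clustering of pixel colors. The following is a minimal illustration in NumPy, not the method of any cited study: the function name `kmeans_segment`, the deterministic farthest-point initialization, and the synthetic two-color image are assumptions made here for clarity.

```python
import numpy as np

def kmeans_segment(image, k=2, n_iter=20):
    """Cluster the pixels of an (H, W, C) image into k groups by color."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(float)

    # Assumed initialization: start from the first pixel, then repeatedly
    # add the pixel that lies farthest from the centers chosen so far.
    centers = [pixels[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(pixels - ctr, axis=1) for ctr in centers], axis=0)
        centers.append(pixels[dist.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign every pixel to its nearest center (Euclidean distance).
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean color of the pixels assigned to it.
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

    return labels.reshape(h, w), centers

# Synthetic example: left half "leaf" green, right half "lesion" brown.
demo = np.zeros((4, 4, 3), dtype=np.uint8)
demo[:, :2] = (0, 200, 0)
demo[:, 2:] = (120, 60, 20)
demo_labels, demo_centers = kmeans_segment(demo, k=2)
print(demo_labels)
```

In practice, clustering is often performed in a color space such as CIELAB or HSV rather than RGB, and the resulting label map is usually cleaned with morphological operations before features are extracted.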

Hyperspectral Systems
As stated earlier, systems based on color images are widely used in the industry to estimate the external characteristics of products. However, some internal damages or specific organoleptic characteristics are not visible and cannot be detected using traditional systems. Knowing the composition or internal properties of fruits or anticipating internal damage increases the added value and removes defective products from the production chain, increasing the batch's overall quality. Properties such as soluble solids content, acidity, and texture are some of the parameters used to determine the maturity of fresh products. Among the optical detection technologies, hyperspectral imaging (HSI) has emerged as a potential tool for the non-destructive analysis of the internal quality and safety of agri-food products (LORENTE et al., 2012; LU et al., 2020). HSI combines the advantages of spectroscopy to capture chemical composition (CORTÉS et al., 2019; WALSH et al., 2020) with the advantages of imaging to obtain spatial information (Figure 1) (JIA et al., 2020).
The information captured by these systems is organized in a 3D matrix (known as a hypercube): two axes contain spatial information, expressed as the line (X) and the sample (Y), and the third dimension (λ) contains spectral information. Therefore, for a specific pixel (x, y), its corresponding vector of spectral values can be obtained over the wavelength range of the study (LI et al., 2014). Another critical advantage of HSI technology is its ability to acquire information from spectral regions that the human eye cannot see, such as ultraviolet, NIR, and infrared, generating specific fingerprints according to the composition or condition under evaluation (SIMKO; JIMENEZ-BERNI; FURBANK, 2015).
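The hypercube organization described above can be illustrated with a three-dimensional NumPy array. This is a minimal sketch under stated assumptions: the array dimensions, the wavelength grid, the random data, and the helper names (`pixel_spectrum`, `band_image`, `mean_roi_spectrum`) are hypothetical and serve only to show how spatial and spectral slices are extracted.

```python
import numpy as np

# Hypothetical hypercube: 4 lines (x), 5 samples (y), 6 spectral bands.
wavelengths_nm = np.linspace(450.0, 950.0, 6)
rng = np.random.default_rng(0)
hypercube = rng.random((4, 5, 6))

def pixel_spectrum(cube, x, y):
    """Return the spectral vector of the pixel at spatial position (x, y)."""
    return cube[x, y, :]

def band_image(cube, wavelengths, target_nm):
    """Return the 2D image of the band closest to the requested wavelength."""
    idx = int(np.argmin(np.abs(np.asarray(wavelengths) - target_nm)))
    return cube[:, :, idx]

def mean_roi_spectrum(cube, mask):
    """Average the spectra of all pixels selected by a boolean (x, y) mask."""
    return cube[mask].mean(axis=0)

roi = np.zeros(hypercube.shape[:2], dtype=bool)
roi[1:3, 1:4] = True
print(pixel_spectrum(hypercube, 2, 3).shape)
print(band_image(hypercube, wavelengths_nm, 640.0).shape)
print(mean_roi_spectrum(hypercube, roi).shape)
```

The mean spectrum of a region of interest is the typical input to the exploration, classification, and regression methods discussed later in this review.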
The acquisition of hyperspectral images can be slow, depending on the hardware used. Multispectral systems are simpler and faster implementations of hyperspectral systems in which a relatively small number of bands is captured. Several technologies for capturing hyperspectral images exist. Among the most used systems are liquid crystal tunable filters (LCTFs) and imaging spectrophotometers (GÓMEZ-SANCHIS et al., 2014). An LCTF is an electronically controlled optical filter that permits a selected wavelength to pass through and blocks others. Thus, images in the entire spectral range can be obtained by selecting different wavelengths. The main advantages of LCTF-based systems are their higher spatial resolution and image quality. In contrast, the process of image acquisition is slow, and the spectral resolution is limited. These systems were used by Munera et al. (2021) and Munera et al. (2019a) to assess the internal quality of loquats and pomegranates, respectively. Thus, the internal components and some properties that are key to the fruit's marketing can be estimated. The residual astringency of persimmon after a deastringency treatment was determined by Munera et al. (2017, 2019b); to avoid possible fraud, Munera et al. (2018) discriminated externally identical varieties of nectarine with different internal qualities. The detection of non-visible rottenness caused by fungi in citrus fruits (FOLCH-FORTUNY et al., 2016) and the maturity of mangoes cv. "Manila" (VÉLEZ-RIVERA et al., 2014) have also been investigated using these systems. The combination of LCTF with structured-illumination reflectance imaging (SIRI) was developed by Lu and Lu (2017) to detect defects in apples.
An LCTF system and a pushbroom imaging spectrometer were combined by Fan et al. (2018) to detect external damage on blueberries. Pushbroom imaging spectrophotometers acquire spectral data line by line and require the object to move beneath a camera while the image is being captured. In a camera with a matrix CCD sensor, while the camera captures the spatial information in one line of the CCD, the spectral information is projected onto the corresponding column. These systems are the most used, as they enable the capturing of moving objects and, therefore, in-line inspections. This type of system was used in the ranges 400-1000 and 900-1700 nm by Tsouvaltzis et al. (2020) to detect chilling injuries in eggplants. Fernandes et al. (2015) used an imaging spectrograph in the range of 380-1028 nm to determine anthocyanin content, sugar content, and acidity in grape berries. This technology has also been used to determine the internal quality of fruits such as oranges. However, because of the large amount of data generated by these systems and the relatively long acquisition time of hyperspectral images, these systems have not yet been implemented in the industry to conduct in-line quality control of the products, although the first steps are already being taken (VÁSQUEZ et al., 2018).

NON-STANDARD TECHNIQUES OF COMPUTER VISION SYSTEMS

Biospeckle
Biospeckle is a non-invasive technique that is widely used to assess biological systems. This phenomenon is based on the interference of coherent electromagnetic waves after reflection from a surface on which a dynamic process occurs. If this process occurs in plant or animal tissue, the organelle size, cellular structure, cell growth and division, and biochemical reactions will affect the observed results.
Biospeckle has been applied in several fields of research, such as the automatic analysis of wastewater contamination (VIANA; PIRES; BRAGA, 2017), the characterization of plant tissue cultures (SCHOTT et al., 2020), the monitoring of blood flow (ZHANG et al., 2019), the assessment of seed quality (SINGH et al., 2020; VIVAS et al., 2017), the evaluation of fermentation processes, and applications ranging from the health field to agricultural products (AMARAL et al., 2017; HUMEAU-HEURTIER et al., 2012; YOUSSEF et al., 2019).
Different image processing techniques are used to obtain information using biospeckle. Some algorithms return numerical results, such as the moment of inertia (MI) and absolute value difference (AVD) (ANSARI; NIRALA, 2016a; CARDOSO; BRAGA, 2014). Graphical results are also obtained, for example, with laser speckle contrast analysis (LASCA), motion history image (MHI), generalized difference, and Fujii (RABAL; BRAGA, 2009). This technique has been developed and combined with AI for applications in agriculture and postharvest, such as the identification of chilling and freezing disorders in oranges; identification of bruising, maturation, and ripening in fruits and vegetables; and identification of defects and damages in fruits (MINZ; NIRALA, 2014; RAHMANIAN et al., 2020; WU; ZHU; REN, 2020). Minz and Nirala (2014) used biospeckle to measure biological activity in apples, pears, and tomatoes, applying generalized difference and parameterized Fujii. Amaral et al. (2017) applied biospeckle to assess the sorption behavior of freeze-dried passion fruit. Wu et al. (2020) proposed a method for defect detection in apples based on laser backscattering imaging and convolutional neural networks (CNNs); the method could effectively, nondestructively, and automatically identify the defect regions with a recognition rate of over 90%. Arefi et al. (2017) used biospeckle combined with texture descriptors and ANNs to assess mealiness in apple fruits.
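To make the graphical methods named above more concrete, the following minimal sketch computes Fujii and generalized difference maps from a temporal stack of speckle images. It assumes the commonly used definitions (Fujii as the sum over consecutive frames of the absolute intensity difference divided by the intensity sum, and generalized difference as the sum of absolute differences over all frame pairs); the function names, the `eps` guard against division by zero, and the synthetic stack are assumptions of this illustration.

```python
import numpy as np

def fujii_map(stack, eps=1e-12):
    """Fujii activity map from a (T, H, W) stack of speckle images."""
    stack = np.asarray(stack, dtype=float)
    diff = np.abs(stack[1:] - stack[:-1])   # |I_k - I_(k+1)|
    total = stack[1:] + stack[:-1]          # I_k + I_(k+1)
    # eps avoids division by zero where both frames are completely dark.
    return (diff / (total + eps)).sum(axis=0)

def generalized_difference_map(stack):
    """Generalized difference map: sum of |I_k - I_l| over all frame pairs."""
    stack = np.asarray(stack, dtype=float)
    gd = np.zeros(stack.shape[1:])
    for k in range(stack.shape[0] - 1):
        gd += np.abs(stack[k + 1:] - stack[k]).sum(axis=0)
    return gd

# Synthetic stack: one "active" pixel flickers, the others stay constant.
frames = np.ones((4, 2, 2))
frames[:, 0, 0] = [1.0, 3.0, 1.0, 3.0]
print(fujii_map(frames))
print(generalized_difference_map(frames))
```

Both maps highlight pixels whose intensity changes over time, which is the behavior expected in regions of higher biological activity; the generalized difference is insensitive to frame order, whereas Fujii weights changes relative to the local intensity.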
In biospeckle applications, the acquisition system is frequently composed of a laser source, lens, and CCD camera, which are considered simple and low-cost equipment (Figure 2). Hardware and software for biospeckle technology have been studied to improve image processing, the portability of the equipment, and new techniques for obtaining information. Pieczywek et al. (2017) developed a method for the real-time evaluation of biospeckle using a live video stream with the Fujii method. Rivera and Braga Jr. (2020) compared biospeckle data for three different frequency bands of speckle signals and different light intensities. Catalano et al. (2019) performed image acquisition and created apps for image processing on a smartphone for biospeckle analysis. Rivera et al. (2019) created a new method to obtain biospeckle information by employing sound to monitor biological systems.
Despite the wide range of research, Pandiselvam et al. (2020) identified the following challenges in using the biospeckle technique: the lack of a standard in applications and the need for commercial equipment for dedicated use. They also mentioned the limited laser penetration, which means that the technique cannot be used to assess the internal parts of agricultural products. Other limitations are the interference of light and sound vibration, which limit the use of the technique in the field. Pieczywek et al. (2018) compared the biospeckle technique with visual inspection, hyperspectral imaging, and the chlorophyll fluorescence detection method in the early detection of bull's eye rot in apples. They used three different laser wavelengths: 473, 532, and 830 nm. To obtain the information, they used the correlation coefficient, the Fujii index, the moment of inertia, and frequency analysis. Biospeckle exhibited a high level of performance in disease detection compared with hyperspectral imaging and chlorophyll fluorescence. They concluded that biospeckle has considerable potential as a diagnostic tool for detecting apple diseases at an early stage of their development. In comparison with visual inspection, hyperspectral imaging, and chlorophyll fluorescence, the authors indicated the following advantages of biospeckle: a more straightforward experimental setup, low cost, and less time-consuming data processing.

Terahertz (THz) Image Systems
The emerging THz technology uses the relatively unexplored range of the electromagnetic spectrum from 100 GHz to 30 THz (LIN; SUN, 2020; LIU et al., 2016; WANG et al., 2019). It has attracted interest owing to the following characteristics: low photon energy; deep penetration; molecular resonance responses that carry ample physical and chemical information on biomolecular interactions; non-ionizing radiation; and the principle that different materials have different spectral fingerprints, which can be employed for identification, particularly for foods (JIANG; GE; OK et al., 2014; REN et al., 2019).
THz technology applications can be categorized into four main groups: sensing, imaging, spectroscopy, and communication, characterized by their non-destructive nature (REN et al., 2019). THz spectroscopy and imaging have attracted increasing interest for food quality and safety control, agricultural product analysis and quality inspection, and the inspection of stored food (LIU et al., 2016; WANG; SUN; PU, 2017). However, owing to the low efficiency of THz energy sources and detectors and, consequently, the difficulty of building efficient instrumentation in this wavelength range, this spectral region was largely ignored until the mid-1990s (GOWEN; O'SULLIVAN; O'DONNELL, 2012).
THz technology can penetrate food materials deeper than other optical sources can and does not promote molecular motion such as rotation or vibration; similarly, it interacts weakly with nonpolar materials such as Teflon, polyethylene, and polytetrafluoroethylene. Both these properties make the use of THz waves a promising technology for the noninvasive and non-destructive evaluation of food packaging and manufactured products. Equipment is the main obstacle to the widespread adoption of THz, as the cost of THz technology is higher than that of other imaging technologies such as hyperspectral (UV-Vis or NIR range), RGB cameras, and X-rays. However, new developments in laser technologies and integrated optics and their application in THz systems have made this technology more accessible for low-cost systems with high performance (REN et al., 2019; WANG; SUN; PU, 2017).
THz images, using continuous-wave or pulsed systems, are acquired by raster scanning, that is, by moving the sample along the x and y dimensions and recording the THz signal for each spatial position. The scheme of a THz image system is shown in Figure 3. These systems can operate in the transmission or reflection modes and in the time or spatial domains. However, this operation is time-consuming, depending on the system characteristics and the desired spatial resolution (GOWEN; O'SULLIVAN; O'DONNELL, 2012; OK et al., 2019).
Different applications of THz image systems have been tested for food and agricultural product analysis, being used in the detection of foreign bodies (SHIN; CHOI; OK, 2018), determination of compound (JIANG; GE; ZHANG, 2020), pesticide, and antibiotic residues in agri-food products, characterization of edible oils and genetically modified food, etc. (WANG et al., 2019). Therefore, some studies on foodstuffs were conducted to test the water content of plant leaves (REN et al., 2019); the early detection of germinated wheat grains (JIANG et al., 2016); detection of foreign bodies in noodle flour (LEE et al., 2012)

PROCESSING OF DATA FROM IMAGES
Ultimately, the spectral images are treated to obtain spectral signatures corresponding to vegetative states, the concentration of compounds, the presence of microorganisms, etc. The extraction of these profiles from a specific area of the image utilizes a segmentation process or manual selection of the ROI.
The relationship between the obtained profiles and the quality parameters is analyzed to extract features from the profiles and develop prediction or classification models. These models, applied to new profiles, enable us to predict quality parameter conditions or obtain chemical images if they are applied to the entire image. The following subsections detail the main techniques used to model relationships.

Data exploration
The information content of images can be summarized into profiles, which commonly involve a high number of variables per point, many of which may be irrelevant. Consequently, it is necessary to decide whether to examine the complete information or to remove the non-relevant variables, which reduces the cost and time of data analysis (OBLITAS et al., 2020).
Among the primary methods used, principal component analysis (PCA) is the most common. It creates new variables (principal components) as the product of the eigenvectors and the spectral vector, attempting to represent most of the variability in the data set using a small number of factors. Another important group of data exploration methods comprises those that perform variable selection, such as those grouped under cluster analysis (CA). Hierarchical cluster analysis (HCA) is one of the most used methods; it explores the organization of variables into groups and creates a hierarchy through dendrograms and nested cluster diagrams.
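The construction of principal components as the product of eigenvectors and spectral vectors can be sketched as follows. This is a minimal illustration based on the singular value decomposition in NumPy; the function name `pca`, its return values, and the synthetic spectra are assumptions of this example, and the code assumes the data are not constant (otherwise the explained-variance ratio is undefined).

```python
import numpy as np

def pca(spectra, n_components):
    """Project (n_samples, n_bands) spectra onto their principal components."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                      # mean-center each band
    # Rows of Vt are the eigenvectors of the covariance matrix of Xc.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components]                 # (n_components, n_bands)
    scores = Xc @ loadings.T                     # eigenvector x spectral vector
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return scores, loadings, explained[:n_components]

# Synthetic spectra that vary along a single direction plus an offset.
t = np.linspace(-1.0, 1.0, 20)
demo_spectra = np.outer(t, [1.0, 2.0, 3.0]) + 5.0
demo_scores, demo_loadings, demo_ratio = pca(demo_spectra, n_components=1)
print(demo_ratio)
```

In this synthetic case a single component captures essentially all the variability, which mirrors the goal of representing most of the information in a hyperspectral profile with a small number of factors.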

Classification techniques
Classification problems involve determining a mathematical model that can recognize samples belonging to specific classes. In images, the aim is to recognize objects or pixels with standard features and separate them into different classes, thereby segmenting the image.
According to Oblitas et al. (2020), classification techniques can be grouped into three main categories: based on distance (of pixels with similar features), such as k-nearest neighbors (KNN); on probability (of a pixel belonging to any class), such as linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), or unequal class models (UNEQ); and on experience (knowledge indicating a pixel belongs to any class), such as the ANN. Moreover, classification methods can be supervised when previous knowledge on the problem is supplied to the algorithm or unsupervised when the algorithm does not require any intervention. Most classification algorithms used in agri-food inspection are supervised because they must involve previous training steps. These are rapid, but because of the vast variability present in these products, they require frequent retraining.
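As a simple instance of the distance-based category described above, the following sketch implements a k-nearest neighbors classifier in NumPy. The function name `knn_predict`, the two-dimensional feature vectors, and the class labels "sound" and "defect" are hypothetical and used only for illustration.

```python
import numpy as np

def knn_predict(train_X, train_y, query_X, k=3):
    """Classify each query vector by majority vote among its k nearest
    training samples (Euclidean distance)."""
    train_X = np.asarray(train_X, dtype=float)
    query_X = np.asarray(query_X, dtype=float)
    train_y = np.asarray(train_y)
    predictions = []
    for q in query_X:
        dist = np.linalg.norm(train_X - q, axis=1)
        nearest = train_y[np.argsort(dist)[:k]]
        values, counts = np.unique(nearest, return_counts=True)
        predictions.append(values[counts.argmax()])
    return np.array(predictions)

# Hypothetical pixel features (for example, two color coordinates).
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
classes = ["sound", "sound", "sound", "defect", "defect", "defect"]
print(knn_predict(features, classes, [[0.5, 0.5], [10.5, 10.5]], k=3))
```

This is a supervised method: the labeled training samples constitute the prior knowledge supplied to the algorithm, and they would need to be updated when the variability of the product changes.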
According to Cai et al. (2018), LDA is one of the most popular supervised methods for food analysis; it estimates the multivariable probability density functions for each class. LDA begins with the estimation of the location and dispersion parameters. It has been applied in laboratory studies such as those of Rahman et al. (2018) on vegetable tissue micrographs for microstructure classification and Shafiee et al. (2016) on determining honey adulteration, in which different classifiers were compared and LDA achieved an accuracy of over 90%, as well as in remote sensing, such as in the study of Furlanetto et al. (2020) for vegetable identification.
Based on variable selection, different techniques can be applied, such as partial least squares-discriminant analysis (PLS-DA) for object classification in hyperspectral images (ZHANG et al., 2018) and discrimination of polyethylene films (BONIFAZI; CAPOBIANCO; SERRANTI, 2018).
Two techniques specifically used in classification problems are SVMs and ANNs, both of which are widely used in the field of pattern recognition for linear and non-linear classification scenarios (LI et al., 2020). Their applications involve images in the RGB format (CASTRO et al., 2019; JIANG et al., 2020a), multispectral imaging (YU et al., 2020), and hyperspectral imaging (SHAFIEE et al., 2016). Kang et al. (2020) compared LDA, SVMs, and softmax regression to classify serogroups of Escherichia coli. Liu et al. (2019b) evaluated least-squares support vector machines (LS-SVMs), backpropagation neural networks (BPNNs), and random forest (RF) to predict the content of aflatoxin in soybean oil.

Regression techniques
In food engineering, another type of analysis concerns predicting the concentration of compounds of interest in foods; regression methods are used for these tasks. As with classification, the underlying relationship can be linear or non-linear, and the models must handle this characteristic (OBLITAS et al., 2020).
Some commonly used methods are linear regression (LR) and multilinear regression (MLR), which use the predictor variables without transformation, and principal component regression (PCR) and partial least squares regression (PLSR), which first transform the variables into components (principal components in PCR, latent variables in PLSR).
PLSR is one of the most widely used chemometric techniques, for instance, to extract data from hyperspectral images, owing to its capacity to reduce dimensionality in complex systems (JIA et al., 2020). The equation for this technique can be summarized as Y = βX + e, where Y is the matrix of the predicted variable, β is the matrix of beta coefficients, X is the matrix of measured variables, and e is the model error (VÁSQUEZ et al., 2018). Its applications have ranged over a variety of image types such as multispectral satellite images (MALLAH NOWKANDEH; NOROOZI; HOMAEE, 2018), multispectral images from unmanned vehicles (GUO et al., 2020), laboratory multispectral imaging (YU et al., 2020), hyperspectral images (XU et al., 2021), and thermal images (ELSAYED et al., 2017).
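To make the relation Y = βX + e concrete, the sketch below fits a PLS model with a single latent component and a single response (often called PLS1) in plain Python. It is a simplified illustration and not the procedure of any cited study: the function names, the three hypothetical wavelengths, and the numeric values are assumptions, and practical work would normally rely on a chemometrics library and several components chosen by cross-validation.

```python
import math

def pls1_one_component(X, y):
    """Fit a one-component PLS1 model; returns (beta, x_mean, y_mean)."""
    n, p = len(X), len(X[0])
    # Mean-center the predictors and the response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximum covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centered sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centered response on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    # Regression coefficients expressed in the original variable space.
    beta = [q * wj for wj in w]
    return beta, x_mean, y_mean

def pls1_predict(x, beta, x_mean, y_mean):
    """Predict the response for one sample x."""
    return y_mean + sum((x[j] - x_mean[j]) * beta[j] for j in range(len(x)))

# Hypothetical reflectance at three wavelengths (rows are samples) and a
# hypothetical measured concentration for each sample.
spectra = [
    [0.6, 0.7, 0.8],
    [0.7, 0.9, 1.1],
    [0.8, 1.1, 1.4],
    [0.9, 1.3, 1.7],
]
concentration = [12.0, 14.0, 16.0, 18.0]

beta, x_mean, y_mean = pls1_one_component(spectra, concentration)
print(pls1_predict([1.0, 1.5, 2.0], beta, x_mean, y_mean))
```

The three columns of `spectra` are deliberately collinear, which is the situation in which compressing the predictors into one latent component loses no information.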

Advanced techniques (Machine learning)
Machine learning is a subdivision of computer science applied for pattern recognition and computational learning in AI (SWAMYNATHAN, 2019). It is based on training models, such as neural networks, that enable machines to learn from a database and make predictions. The main advantage of this learning is that performance improves as the model is exposed to new and larger databases. The categories of machine learning are supervised, unsupervised, and reinforcement learning.
Deep learning is a subfield of machine learning whose algorithms aim to bring machine intelligence closer to the human level, making them capable of solving any problem in a specific subject. Deep learning has been applied successfully to solve computer vision, audio processing, and text mining problems (SWAMYNATHAN, 2019). Examples of deep learning are CNNs, which are advantageously applied in image classification.
Applications based on deep learning have increased over the last decade owing to the significant advances in AI and the increase in computing power since the arrival of graphics processing units (GPUs). These advances have been motivated by two reasons: the increase in available data (the famous Big Data) and the application of machine learning methods that are key to companies such as Facebook, Google, or LinkedIn. In recent years, a revolution in machine learning with the emergence of deep learning algorithms has occurred. Among them, deep convolutional neural networks (DCNNs) are currently the state of the art in computer vision applications. Until the emergence of deep neural models, multilayer neural models with more than two hidden layers were considered useless. In the 2000s, no significant research was conducted that used more than two hidden layers. These models had two main problems: a) the initialization of the parameters and b) overfitting. Therefore, the fruit inspection systems that have been developed using these techniques have not been implemented.
Currently, the emergence of deep multilayer and DCNN models has solved these problems. DCNNs are flexible algorithms that have been used successfully in the inspection problems of processed foods (KATO et al., 2019) or fresh fruit (ASHRAF et al., 2019; GIEFER et al., 2019; STEINBRENER; POSCH; LEITNER, 2019). All references found date from 2018 or later, which indicates the novelty of the techniques. However, examples of the use of DCNNs in fresh fruit inspection using hyperspectral images have not been found in the literature. This is because the acquisition and labeling of images by an expert can be very tedious, and algorithms based on deep learning require many images for training. Their depth implies many parameters, which, as with other models, requires many labeled samples, and the variability of fruits increases this requirement further.

COMPUTER VISION APPLICATIONS IN FOOD AND AGRICULTURAL PRODUCTS
Digital Image Systems applied to Agriculture 4.0
Recent studies presented digital images coupled with machines and unmanned aerial vehicle (UAV) devices to monitor crops in the field, recognize crop productivity, support robot localization, and enable the sustainable use of natural resources such as water, agrochemicals, and fertilizers. Andújar et al. (2019) compared aerial imagery with on-ground detection using an RGB-depth camera and Microsoft Kinect v.2, and they observed that UAVs are affordable and can encompass a larger surface area for vineyards. Abdelghafour et al. (2019) applied proximal imaging to describe the canopy structure of plants for precision viticulture. Their pipeline consisted of foreground extraction based on color information, pixel-wise feature extraction with texture captured by the local structure tensor, pixel-wise classification, and spatial regularization. This system with optical sensors enables the analysis of agronomic data as an automated and non-intrusive technique. Furthermore, the information obtained on plant productivity is essential for precision fertilization and irrigation.
Pinto de Aguiar et al. (2020) utilized feature extraction on vineyard images to locate vine trunks for robot localization and mapping. They used low-power and low-cost equipment such as Google's USB Accelerator and NVIDIA's Jetson Nano. Google's USB Accelerator is compatible with TensorFlow Lite, which is used in mobile and portable equipment, and can perform image classification, object detection, and semantic segmentation. A small version of You Only Look Once (YOLO) was used to identify vine trunks in real time. YOLO is used to detect objects in full images and, when applied to a webcam, can detect moving objects (REDMON et al., 2016).

Plant Disease Detection for Smart Farming
The integration of digital images, several sensors, the Internet of Things, deep learning, robust algorithms, UAVs, and smartphones is increasingly enabling the detection of diseases in plants in the field. The inclusion of different types of data enables smarter systems to perform field operations. Ashok et al. (2020) proposed a method with 98% accuracy to detect disorders in tomato plant leaves through image processing. Detecting plant leaf diseases early is essential to improving production and avoiding crop losses. The initial step was pre-processing using a Gaussian filter, followed by feature extraction using the sub-band coefficients of the discrete wavelet transform and the correlation computed from the grey level co-occurrence matrix (GLCM). A CNN was then used to extract features from the pixel values, and it was evaluated against the training image dataset. Kulkarni (2018) proposed training a CNN model to identify the type of crop and detect diseases in a public dataset composed of normal and damaged crop leaves. MobileNet and InceptionV3 models were used, and accuracies of 99.62% and 99.74% for crop type and 99.04% and 99.45% for crop disease were obtained, respectively.
Militante et al. (2019) used deep learning techniques to identify and recognize sugarcane diseases. The study consisted of training and testing a deep learning model on a dataset of 13,842 sugarcane images of disease-infected and healthy leaves. The model could detect healthy and unhealthy leaves, classify diseased leaves, and achieve 95% accuracy with 60 epochs. The methodology consisted of capturing an image dataset using a camera, pre-processing the images, obtaining features from the resized images with convolutional and pooling layers, and using fully connected layers for classification. The main advantage of these techniques is the ability to extract information from large amounts of heterogeneous data. Thus, they are useful for processing hyperspectral images. In this context, Polder et al. (2019) used fully convolutional networks (FCNs) to detect potato virus Y (PVY) in the field. They arranged the camera in a measurement box installed in front of a tractor that drove along the rows of potato fields at a constant speed to capture images. The FCN performed well in predicting the PVY-infected plants despite limited training data. The detection of infected plants was between 75% and 92% (recall values).
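As an illustration of the GLCM correlation feature mentioned above, the following sketch builds a normalized co-occurrence matrix for one pixel offset and computes its correlation. It is a generic textbook formulation, not the implementation used in the cited work; the function names, the horizontal offset, and the 4 x 4 test image with four grey levels are assumptions.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized grey level co-occurrence matrix for one pixel offset.

    image: 2D list of integer grey levels in the range 0..levels-1.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Count the pair only if the neighbor lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_correlation(p):
    """Correlation between the grey levels of pixel pairs described by p."""
    idx = range(len(p))
    mu_i = sum(i * p[i][j] for i in idx for j in idx)
    mu_j = sum(j * p[i][j] for i in idx for j in idx)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in idx for j in idx)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in idx for j in idx)
    if var_i == 0 or var_j == 0:
        # A constant image has no grey-level variation to correlate.
        raise ValueError("correlation is undefined for zero variance")
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in idx for j in idx)
    return cov / math.sqrt(var_i * var_j)

# Small hypothetical image quantized to four grey levels.
leaf_patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(leaf_patch, levels=4)
print(glcm_correlation(p))
```

Values close to 1 indicate that neighboring pixels tend to share similar grey levels, which is typical of smooth, homogeneous texture.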
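Since the detection results above are reported as recall values, a short sketch of how recall and precision are derived from detection counts may be helpful. The counts below are invented for illustration and are not taken from the cited study.

```python
def recall(true_positives, false_negatives):
    """Fraction of truly infected plants that the detector found."""
    return true_positives / (true_positives + false_negatives)

def precision(true_positives, false_positives):
    """Fraction of plants flagged as infected that really were infected."""
    return true_positives / (true_positives + false_positives)

# Hypothetical field row: 50 infected plants, 46 detected, 4 missed,
# and 10 healthy plants wrongly flagged.
row_recall = recall(true_positives=46, false_negatives=4)
row_precision = precision(true_positives=46, false_positives=10)
print(row_recall, row_precision)
```

In disease control, a missed infected plant is usually costlier than a false alarm, which is one reason why recall is often the figure reported.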
Castelao Tetila et al. (2017) proposed a system that used simple linear iterative clustering for segmentation to identify plant leaves in UAV images and described foliar features, including color, texture, shape, and gradient.
Using highly robust algorithms, Zhao et al. (2020) addressed the automatic identification of crop diseases in images obtained from the field using deep learning. They used the Internet of Things to collect contextual information as useful features in a modern recognition system to identify crop diseases. Contextual features such as the season, geographic location, temperature, and humidity were fused with visual features in a state-of-the-art crop disease recognition method.
For automatic tractor piloting, a control system based on binocular vision was developed. This system enabled the machine to identify the path and performed well in in-field operation. Zhang et al. (2018) applied this system to cotton field management.
To reduce chemical inputs in vineyard crops, Kerkech et al. (2020) created a method for disease detection in vineyards, applying deep learning to UAV images. The method used two sensors and combined visible and infrared images. This information was input into a fully convolutional neural network that assigned each pixel to one of four classes: shadow, ground, healthy, or symptomatic. The technique achieved detection rates of over 92% at the grapevine level and 87% at the leaf level, which is promising for computer vision applications. Rahaman et al. (2019) created a smartphone app to obtain and process images of grapevines to detect nutritional disorders using an SVM.

3D Reconstruction
The 3D reconstruction of agricultural products and food has facilitated the automation and application of autonomous machinery activities in the field. With this technique, reconstructing, identifying, and estimating the volume of fruits, parts of plants, weeds, insects, and pests is possible. Robots and agricultural machines can also identify paths and obstacles in the field.
In robotics, the most commonly used methods to convert distances into 3D points are time-of-flight systems and triangulation techniques. Time-of-flight systems measure the time a signal takes to reach a surface and return to the emitter; the laser measurement system is an example. In triangulation techniques, the distances are estimated by matching parts of a scene across two different views. The two views must be calibrated to determine the distance (LIU; LEE; CHAHL, 2017).
Algorithms that describe 3D features can be divided into two classes: global feature-based and local feature-based. The first uses a set of features with the geometric properties of an integral 3D object. The second uses features with characteristics of the local region points. The local feature-based method can use a local reference frame. It can also use a histogram or statistics of a normal or curvature to model a feature descriptor (LIU; LEE; CHAHL, 2017). Gao et al. (2019) studied 3D reconstruction of watermelon using a multimedia traceability system. Using sequential pictures captured around a watermelon from different angles, they matched feature points using a scale-invariant feature transform algorithm to generate a sparse and a dense point cloud using the structure-from-motion method and the multi-view stereo method, respectively. Texture mapping and model meshing were conducted using the Poisson surface reconstruction approach. The 3D reconstruction provided parameters such as shape, color, texture, geometric size, and volume. This experiment demonstrated that 3D reconstruction can be used to calculate size and volume with a relative error of approximately 1%, indicating that the volume measurement can be used to detect fruits with atypical densities and pull them out from the traceability system in the production line.
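Both ranging principles reduce to short formulas, sketched here under simplifying assumptions: the time-of-flight signal travels at the speed of light, and the two triangulation views come from a calibrated, rectified stereo pair, so that depth follows from the pixel disparity. The function names and the numeric values are illustrative assumptions.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def tof_distance(round_trip_time_s):
    """Distance to a surface from the round-trip time of a light pulse."""
    # The pulse travels to the surface and back, hence the division by two.
    return SPEED_OF_LIGHT * round_trip_time_s / 2.0

def stereo_depth(focal_length_px, baseline_m, disparity_px):
    """Depth of a point matched in two rectified views (pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    # Depth is inversely proportional to the disparity between the views.
    return focal_length_px * baseline_m / disparity_px

# Hypothetical readings: a 20 ns round trip, and a stereo rig with a
# 700-pixel focal length, a 0.12 m baseline, and a 42-pixel disparity.
print(tof_distance(20e-9))
print(stereo_depth(700.0, 0.12, 42.0))
```

The inverse relation between depth and disparity explains why triangulation loses precision for distant objects, where a one-pixel matching error corresponds to a large change in depth.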
Tao and Zhou (2017) created an automatic system for robot perception in the 3D space, giving trajectory calculation and strategic planning to pick only the fruit in the field.

Determination of Quality Parameters: Mechanical Properties, Composition, and Appearance
Computer vision systems have been used over the last two decades in several studies to predict quality parameters and, based on this, to classify foodstuffs for collecting, processing, and storage (LU et al., 2020). The main quality parameters for agri-food products include flavor, TSS, titratable acidity, sugar content, color, appearance, and firmness. To assess these parameters, a variety of systems have been used, such as traditional RGB, multispectral, hyperspectral, terahertz, and Raman imaging, or those that produce intensity images such as fluorescence imaging, laser-light backscattering, and X-ray imaging. Table 1 summarizes some of the recent works related to the quality parameters of food, using different non-destructive optical technologies. Cakmak (2019) presented a review of non-destructive techniques for the quality assessment of agricultural products. The author argued that Raman and surface-enhanced Raman spectroscopy (SERS) are essential techniques for evaluating agri-food chemical properties.
MSI and HSI systems, together with NIR spectroscopy, are used to determine TSS, moisture content, titratable acidity, sugar content, and firmness of fruits and vegetables. Moreover, NIR spectroscopy has also been used in in-field and portable equipment.

Identification of Defects in Fruits and Vegetables
The primary aim of detecting defects in vegetables is to provide high-quality products for the customer and ensure reasonable prices for the market. The most frequently found defects are mechanical damage, morphological disorders, internal defects, pathological disorders, and physiological disorders, which may be visible or latent and internal (NTURAMBIRWE; OPARA, 2020).
Bruising is a typical form of damage that occurs during harvest and post-harvest handling. Its detection in fruits is primarily performed by manual inspection, which is time-consuming and error-prone. Traditional computer vision has been used for bruise detection, but with limited applications. To increase the capacity of computer vision to identify bruises in vegetables, Du et al. (2020) proposed combining computer vision with new imaging techniques, such as biospeckle, fluorescence imaging, structured-illumination reflectance imaging, hyperspectral/multispectral imaging, X-ray imaging, MRI, and thermal imaging. The vision system can also incorporate deep learning methods, ANNs, and CNNs. For future research, the authors proposed that studies should focus on reducing equipment cost and on miniaturization.
Some essential and recent research papers on the identification of fruit damage are described below.
Andrushia and Trephena (2019) created a computer vision technique to automatically diagnose surface diseases on mango fruits, adopting an artificial-bee-colony-optimized feature set. The processing phases consist of removing the background, extracting the color, shape, and texture features, selecting the features with a metaheuristic approach, and classifying the fruits as good or diseased. Marino et al. (2020) proposed potato defect classification using an unsupervised deep-domain-adaptation method based on adversarial training. Chithra and Henila (2020) proposed a new algorithm to obtain images of the defective area of apple fruits in the sorting task. This task is essential to increase the speed and throughput of the sorting process, helping farmers to separate healthy fruit accurately and reduce costs in post-harvest operations. The image processing included rapid global thresholding, a discrete wavelet transformation to obtain statistical and texture features, a naive Bayesian classification model, k-means clustering to segment the damaged area, and an algorithm to calculate the area and perform sorting decisions. Athiraja and Vijayakumar (2020) identified banana diseases at a much earlier stage using computer vision and machine learning. They performed pre-processing and image standardization, used color, shape, and texture features for feature extraction, and finally applied classification techniques.
Nturambirwe and Opara (2020) presented a review of the novel machine learning approaches applied to diverse sensors to identify damage to agricultural products. They argued that despite the high potential of vision systems to detect internal and external defects in horticultural products, some limitations exist, such as the speed of data processing and acquisition for some techniques, technical limitations, and the expense of some types of equipment. They also indicated limitations related to the interaction of the object with the sensor, such as low contrast in X-ray images of fruit soft tissue and the limited infrared penetration in opaque and broad skin fruit. As a recommendation for future research, the authors suggested standardizing procedures of confirmed efficacy and making them feasible for broad application. Deep learning algorithms enable feature extraction and the accurate detection of mechanical damage in the early development stages. However, the research on deep learning applications should be expanded to other upcoming techniques such as thermography, radiography, and magnetic resonance.

Vegetables Identification and Classification
Many research papers on the recognition and classification of agricultural products are available. This section addresses recent research, processing methods, and the machine learning applied. Rojas-Aranda et al. (2020) presented an image classification method based on CNNs applied to three classes of fruits, inside and outside plastic bags. The input features were RGB color, RGB histogram, and RGB centroid from K-means clustering. Anurekha and Sankaran (2020) performed mango classification by employing a genetic-based ANN combined with a fuzzy inference system. The image processing consisted initially of removing noisy images from the input dataset. Subsequently, feature extraction and feature selection were performed using a genetic algorithm. The selected features were used to train the neural network, and the system was used for classification and grading with an accuracy of 99.18%. Belan et al. (2016) presented an automatic system for the classification of common beans in Brazil based on skin color, applying a multilayer perceptron neural network. Siswantoro et al. (2020) employed MPEG-7 descriptors and classifiers (naive Bayesian, k-nearest neighbor, linear discriminant analysis, and decision tree) to distinguish fruits from Indonesia with 97.80% accuracy. With MPEG-7 descriptors, color and texture descriptors are obtained directly from the pixels without pre-processing or segmentation, and this system can be used both in general stores and food corporations.

Classification of Ripening Stages
The selection of fruits according to their ripening stage enables post-harvest activities to be conducted automatically, accelerating the production and packaging stages, and reducing repetitive human activities. Thus, some studies reported the application of computer vision for the classification of fruit ripening stages. Mazen and Nashat (2019) used an automatic computer vision system to determine the ripening stages of bananas by employing an ANN-based framework and features based on color, skin spots, and Tamura statistical texture. Compared with other methods (SVM, naive Bayes, KNN, decision tree, and discriminant analysis classifiers), the proposed system exhibited the highest recognition rate (97.75%). Jiang et al. (2020b) developed an identification method for tomato maturity by combining color and physicochemical indices. The color was obtained by an image processing program based on modified K-means clustering, and traditional techniques were used to evaluate parameters such as firmness, soluble solid content, and sensory attributes. A multinomial logistic regression with kernel clustering was developed to analyze the data, with accuracies of 84.58 and 90.42%.
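The color clustering step can be sketched with the standard (unmodified) K-means algorithm, which alternates between assigning pixels to the nearest color centroid and recomputing each centroid as the mean of its pixels. This is not the modified variant of the cited study; the function names, the RGB values, and the choice of initial centroids are assumptions made for illustration.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance in RGB)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means; returns the final centroids and per-point labels."""
    centroids = [tuple(float(v) for v in c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: group every pixel with its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its pixels,
        # keeping the old centroid if a cluster received no pixels.
        centroids = [
            tuple(sum(ch) / len(members) for ch in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

# Hypothetical RGB pixels from a tomato image: three greenish, three reddish.
pixels = [
    (40, 160, 50), (50, 170, 60), (45, 150, 55),
    (200, 40, 30), (210, 50, 40), (190, 45, 35),
]
# Assumed initialization: one green pixel and one red pixel as seeds.
centers, labels = kmeans(pixels, [pixels[0], pixels[3]])
print(centers, labels)
```

The resulting centroids summarize the dominant colors of the fruit surface and can serve as compact color features for a subsequent maturity classifier.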

CONCLUSION AND FUTURE PERSPECTIVES
1. The current interest in using computer vision systems in agriculture requires obtaining and processing images faster, using new algorithms for pre-processing, feature extraction, and modeling relationships, together with advances in machine learning, always with more robust and intelligent vision systems. A related trend, aimed at reducing processing requirements, is the use of smaller and cheaper hardware;
2. According to each type of application, the sensors will also evolve to be more robust, smaller, and cheaper. In the various machines for in-field activities, post-harvest operations, and sorting, there is a tendency to combine several sensors in a single piece of equipment. This combination makes machines easier and faster to operate and appropriate for several applications. Thus, the combination of data from several sensors enables machines to be more autonomous and intelligent. Machines for harvesting, sowing, and pesticide application can use vision data combined with global positioning system information and weather data to perform their tasks. Fruit sorting machines can combine sensor data at different wavelengths of the electromagnetic spectrum to more accurately detect information such as chemical components, ripeness, and damage to fruits and vegetables. Similarly, non-destructive techniques may inspect both the surface and the inside of the products, including improved 3D vision techniques that enable reconstructing fruits even with occlusion;
3. Finally, new developments in data science and AI have a decisive effect on computer vision and, thereby, on Agriculture 4.0. Machines are increasingly able to obtain complete information on materials in a non-invasive and non-destructive manner and facilitate the reduction in costs and labor to obtain and analyze food.