Dry fruit image dataset for machine learning applications

Dry fruits are convenient and nutritious snacks that can provide numerous health benefits. They are packed with vitamins, minerals, and fibre, which can help improve overall health, lower cholesterol levels, and reduce the risk of heart disease. Because of these benefits, dry fruits are an essential part of a healthy diet. In addition to their health advantages, dry fruits have high commercial worth. The global dry fruit market was valued at an estimated USD 6.2 billion in 2021 and is projected to reach USD 7.7 billion by 2028. The appearance of dry fruits is used to a great extent for assessing their quality, which requires clean, appropriately labelled, high-quality images. Hence, this dataset is a valuable resource for the classification and recognition of dry fruits. With 11,500+ high-quality processed images representing 12 distinct classes, this dataset is a comprehensive collection of different varieties of dry fruits. The four dry fruits included in this dataset are Almonds, Cashew Nuts, Raisins, and Dried Figs (Anjeer), along with three subtypes of each. This makes a total of 12 distinct classes of dry fruits, each with its own features, shape, and size. The dataset will be useful for building machine learning models that can classify and recognize different types of dry fruits under different conditions, and can also be beneficial for dry fruit research, education, and medicinal purposes. Due to their nutritional value and health advantages, dry fruits have been consumed for a very long time. One of the best strategies to improve general health is to include dry fruits in the diet.

© 2023 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)

Value of the Data
• Food identification: A dataset including images of many dry fruit varieties could be useful for identifying foods, such as for quality assurance in food production or consumer education.
• Machine learning: A dataset of images of dry fruits could be used to train machine learning models for tasks such as dry fruit recognition, classification, and detection.
• Marketing and advertising: A dataset of high-quality images of dry fruits could be valuable for marketing and advertising purposes, such as for use in product catalogues, advertisements, and packaging design.
• Research purposes: A dataset of images of dry fruits could be valuable for research purposes, such as studying the morphological and structural characteristics of different types of dry fruits.

Objective
A high-quality image dataset of major dry fruits under different lighting conditions and backgrounds will be useful for training machine learning models and assessing their efficacy in tasks such as image classification and quality assessment. The dry fruit dataset will aid in building machine learning models that automate operations such as sorting and quality control in the dry fruit business in real time, which will be helpful to stakeholders, i.e., food industries, wholesalers, and end consumers.

Data Description
The Dry Fruit Image Dataset was created to include high-quality images of major dry fruits that are consumed and exported. It covers four types of dry fruit, namely Almond, Cashew, Dried Fig, and Raisin. Each type of dry fruit is further categorized into three major subclasses. Almond has the subclasses Regular, Sanora, and Mamra. Cashew has the subclasses Regular, Special, and Jumbo. Raisin has the subclasses Black, Grade 1, and Premium. Fig has the subclasses Small, Medium, and Jumbo. Hence, the dataset contains a total of 12 different classes. Fig. 1 describes the detailed directory structure of the dataset. The dry fruits were photographed under two lighting conditions, artificial light and natural light, and against four backgrounds: white, black, green, and human palms. The appearance of dry fruits, which also affects their marketability, can be used to judge their quality to a large extent [4]. Although there are many datasets on fruits and vegetables, those working on machine learning models and/or apps need a dataset on dry fruits because of the numerous health benefits they offer [1, 2, 3, 10]. Machine learning models can correctly categorize and identify the type of data when a good dataset is available [5, 6, 7, 8, 11, 12]. The dataset can be used for cutting-edge dry fruit-related research, education, and medical applications such as spotting fungal infections in dry fruit [9].
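The four types and three subclasses per type described above can be enumerated programmatically. The following is a minimal sketch, not the authors' tooling: the names `SUBCLASSES`, `class_names`, and `count_images` are hypothetical, and the folder layout (one folder per class, named after the class, containing the image files) is an assumption, since the actual structure is only given in Fig. 1.

```python
from collections import Counter
from pathlib import Path

# Types and subclasses as listed in the text; labels follow the
# "<type> <subclass>" naming used in the article (e.g. "Almond Mamra").
SUBCLASSES = {
    "Almond": ["Mamra", "Regular", "Sanora"],
    "Cashew": ["Jumbo", "Regular", "Special"],
    "Fig": ["Jumbo", "Medium", "Small"],
    "Raisin": ["Black", "Grade 1", "Premium"],
}

# File extensions treated as images (an assumption about the stored format).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def class_names():
    """Return the 12 class labels, one per (type, subclass) pair."""
    return [f"{fruit} {sub}" for fruit, subs in SUBCLASSES.items() for sub in subs]


def count_images(root):
    """Count image files per immediate parent folder under `root`.

    Assumes each image sits in a folder named after its class; the real
    layout shown in Fig. 1 may differ.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            counts[path.parent.name] += 1
    return dict(counts)
```

Comparing the keys returned by `count_images` on a local copy against `class_names()` is a quick way to check that all 12 classes are present before training.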

Experimental design
The process of data acquisition for dry fruits is presented in Fig. 3. The dry fruit images were captured using the high-resolution rear cameras of two mobile phones, an Apple iPhone 13 and a Motorola Moto G40 Fusion. In all, 11,500+ images were captured and then stored in folders according to their category and classification.
Four different backgrounds, two lighting conditions, and various angles were used for capturing the images of dry fruit. Images were pre-processed using a Python script and Microsoft Power Automate. The final image dimensions of 512 × 512 make it easier to build object classification models. Table 1 describes the data acquisition steps. Dry fruits were purchased from the wholesale market of Pune, India, from February to March. Dataset creation and pre-processing were done in April. LEDs were positioned at a 45° angle relative to the surface of the background setup, one on each side. Using a Python script, all the original photos, which were 3042 × 4032 in size, were shrunk to 512 × 512 and then given new names using the Microsoft Power Automate tool. Table 2 describes in detail the classes, the number of images taken, and the lighting conditions and backgrounds in which the images were taken, and Table 3 describes the specifications of the artificial LED light setup.
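The resizing step described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: it assumes the Pillow library is available, the function names `scale_factors` and `resize_folder` are hypothetical, and the text does not say whether the photos were stretched or cropped to reach a square shape. The sketch stretches them, which scales width and height by different factors.

```python
from pathlib import Path

SOURCE_SIZE = (3042, 4032)  # original photo size as reported in the text
TARGET_SIZE = (512, 512)    # processed image size as reported in the text
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed file formats


def scale_factors(src_size=SOURCE_SIZE, dst_size=TARGET_SIZE):
    """Return the (horizontal, vertical) scale applied by a direct resize."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return dst_w / src_w, dst_h / src_h


def resize_folder(src_dir, dst_dir, size=TARGET_SIZE):
    """Resize every image under src_dir to `size`, mirroring sub-folders.

    Keeps the original file names; the renaming done with Microsoft
    Power Automate in the article is not reproduced here.
    Returns the number of images written.
    """
    from PIL import Image  # Pillow, imported lazily (assumed dependency)

    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    written = 0
    for src_path in sorted(src_dir.rglob("*")):
        if not src_path.is_file() or src_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        dst_path = dst_dir / src_path.relative_to(src_dir)
        dst_path.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(src_path) as img:
            # Direct resize to a square: aspect ratio is not preserved.
            img.convert("RGB").resize(size).save(dst_path)
        written += 1
    return written
```

With the reported sizes, `scale_factors()` gives roughly 0.168 horizontally and 0.127 vertically, so a direct resize compresses the longer side more than the shorter one. A center crop before resizing would avoid that distortion at the cost of discarding the image borders.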

Fig. 1 depicts the directory structure of the Dry Fruit Image Dataset, and a few sample images from the dataset are shown in Fig. 2.

4. Materials or Specifications of the Image Acquisition System
5. Method
After a survey of the local stores and the wholesale market, all twelve classes of dry fruits, i.e., Almond Mamra, Almond Regular, Almond Sanora, Cashew Jumbo, Cashew Regular, Cashew Special, Fig Jumbo, Fig Medium, Fig Small, Raisin Black, Raisin Grade 1, and Raisin Premium, were purchased in Pune, India. Data collection took place in February and March. In the VIIT lab, images were taken under a variety of lighting, background, and angle conditions. The dataset comprises images taken under two primary light sources: natural sunlight and artificial light. Natural sunlight served as the natural light source, with sunlight angles ranging from 60° to 120°. Additionally, two LEDs were employed as the artificial light sources. These

Table 1
Steps of data acquisition.

Table 2
Dry fruit dataset details.

Table 3
Artificial light specification.