Article

Proposal for a Crop Protection Information System for Rural Farmers in Tanzania

1
Interdisciplinary Graduate School of Agriculture and Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
2
Faculty of Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
*
Author to whom correspondence should be addressed.
Agronomy 2021, 11(12), 2411; https://doi.org/10.3390/agronomy11122411
Submission received: 15 October 2021 / Revised: 12 November 2021 / Accepted: 22 November 2021 / Published: 26 November 2021
(This article belongs to the Section Farming Sustainability)

Abstract

Crop protection information, such as how to control emergent and outbreak crop diseases and pests, as well as the latest research, regulations, and quality control measures for pesticides and fertilizers, is important to farmers. Rural smallholder farmers in Tanzania have traditionally relied on government agricultural officers who visit them in their villages to provide this crop protection information. However, these officers are few and cannot reach all the farmers on time. This means that farmers fail to make critical farming decisions on time, which can lead to low crop productivity. In this study, we aim to provide farmers with reliable and instant crop protection information by developing a system based on the Short Message Service (SMS) and the Web. This system automatically replies to farmers’ requests for the latest crop protection information in the Swahili language through SMS on a mobile phone or a Web system. The findings reveal that our proposed system can provide farmers with crop protection information at lower cost (500 times cheaper) than the existing Tigo Kilimo system. Furthermore, our proposed system’s deep learning model is effective in understanding and processing Swahili natural language SMS queries for crop protection information with an accuracy of 96.43%. This crop protection information will help farmers make better critical farming decisions on time and improve crop productivity.

1. Introduction

In Tanzania, approximately 75% of the labor force is employed in the agriculture sector [1]. Most Tanzanians in the agriculture sector are smallholder farmers (farmers who own and cultivate pieces of land between 2 and 10 hectares in size) who live in rural areas and depend on crop farming as their primary economic activity [2,3]. Most of these rural farmers have access to basic farming knowledge that is useful in their farming activities, such as how and when to prepare farms, plant crops, treat common crop diseases, use fertilizers, and harvest and store their crops [4]. However, access to crop protection information, such as how to control emergent and outbreak crop diseases, weeds, and pests, has remained a challenge for them. To meet this need, the Government of Tanzania has employed professional agricultural officers who visit these rural farmers in their villages to provide them with crop protection information and, in some cases, basic farming knowledge (e.g., for young and new farmers). However, due to various challenges, such as the small number of agricultural officers and poor transportation infrastructure, these officers are not able to reach all the farmers on time, meaning that some crop protection information reaches farmers too late, or not at all if agricultural officers do not visit them [4,5]. As a result, farmers fail to make critical farming decisions on time, sometimes leading to low crop productivity and poor crop quality. For instance, a farmer might fail to control an emergent disease and lose all their crops, or a farmer might use a fake pesticide (one that research has shown to be ineffective and that the government has banned) and have a poor harvest; likewise, a farmer might use a fake fertilizer and also suffer a poor harvest. Therefore, to address this issue, technological solutions that can provide these farmers with relevant and timely crop protection information are needed.
Several systems have been developed to provide farmers with agricultural information. For instance, because most of the rural farmers in Tanzania have access to low-end mobile phones that have basic functionalities such as voice calling and text SMS, several companies have established Unstructured Supplementary Service Data (USSD) systems to provide farmers with basic farming knowledge on their mobile phones. In the USSD system, a farmer dials a code into his or her mobile phone and the system returns a menu, after which the farmer selects and submits a menu item and the system returns another menu. This process continues until the farmer receives the final information in the SMS format. For instance, in Tanzania, the Tigo and Zantel mobile operators launched USSD-based systems called Tigo Kilimo [6,7,8] and Zantel Kilimo (mentioned in [7]), respectively, which provide farmers with basic farming knowledge. Although these USSD systems are deployed in Tanzania, they are expensive due to the high cost of accessing the USSD service [7]; they are difficult to use, especially for rural farmers due to menu navigation and timeout problems [6]; and they can provide irrelevant and complex information that is not useful and difficult for the rural farmers to understand [6].
In contrast, other studies have integrated Web and SMS to develop systems that provide farmers with agricultural information. For instance, in our previous work [9], we developed and evaluated an SMS- and Web-based system, which allows rural farmers in Tanzania to request market prices of crops and contact crop buyers by SMS; then, the system automatically replies to farmers with market prices and crop buyer contact information by SMS. This allows farmers to avoid accessing market prices and crop buyers through the middlemen who normally exploit them. Although the findings showed that the system helped farmers to access market prices instantly, sell crops at better prices, and improve incomes, the system provides only market price information and not crop protection information. Furthermore, this system allows farmers to write SMS queries in keyword format and thus lacks the flexibility of writing SMS queries in natural language. In another study, Mutuku et al. [10] proposed a system in which agricultural officers in Kenya use a Web system to forward research information that is received by farmers via SMS. However, in this system, farmers can only receive information, and they cannot request it; as a result, farmers can receive irrelevant information. In another study, Ninsiima [11] proposed a system in which rural farmers in Uganda use local language to send agricultural questions by SMS to a Web system, after which agricultural officers answer the farmers’ questions and send answers back to farmers through SMS. Although the findings of this study showed that use of the local language increases the chance of system adoption, this system lacks flexibility and availability because farmers’ questions are interpreted and answered by human beings.
Other studies have proposed mobile applications to provide farmers with agricultural information. For instance, Iraba et al. [12] developed a mobile application to provide basic farming knowledge to rural farmers in South Africa. Other studies have developed chatbots that respond to farmers’ natural language queries for agricultural information through artificial intelligence. For instance, Jain et al. [13] proposed an audio-based Android conversation application called FarmChat that uses IBM’s Watson conversation service to provide farmers in India with agricultural information in Hindi. Mostaço et al. [14] developed a chatbot called AgronomoBot using IBM’s Watson conversation service, which allows users to request data acquired from a wireless sensor network deployed in a vineyard. Naman et al. [15] proposed a mobile application chatbot called AgriBot that uses a sentence embedding model to provide agricultural information to farmers in English. Although these mobile applications can be useful to farmers, they require farmers to have smartphones and Internet access, which are expensive and unaffordable for the majority of rural smallholder farmers, especially in Tanzania. Furthermore, most of the chatbots use foreign languages (e.g., English and Hindi), which are difficult for most Tanzanian rural farmers, who understand only Swahili, to use.
Several existing systems provide farmers with access to agricultural information. However, most of these systems have weaknesses that make them unsuitable for providing crop protection information to Tanzanian rural farmers. Therefore, the objective of this study was to develop a suitable information system that will provide rural farmers in Tanzania with cheaper, timely, and relevant crop protection information that will help them to make critical farming decisions on time and hence improve crop productivity. Our proposed crop protection information system uses SMS to overcome the challenges of USSD; ensures farmers can access affordable and relevant crop protection information on any kind of mobile phone without the difficulties of menu navigation and timeouts; and uses Swahili, the national language in Tanzania, which they can easily understand and use. Furthermore, farmers with no access to the Internet can use SMS to request crop protection information. The proposed system uses SMS querying in which SMS requests from farmers are automatically processed and replied to by the system to ensure system availability.
Because of the current travel restrictions related to the global COVID-19 pandemic, we could not travel to Tanzania to deliver the implemented system to users so that they could test it, use it, and provide feedback for improvement. We therefore limited the scope of this study to developing a prototype of the system that provides farmers with crop protection information in Swahili and evaluating its accuracy in understanding and processing farmers’ queries for crop protection information. Based on this, we intended to answer two key research questions in this study: First, what system design can automate the processing of, and response to, Swahili queries for crop protection information via SMS and Web platforms? Second, to what extent is the system accurate in understanding and processing Swahili natural language queries for crop protection information?

2. Materials and Methods

2.1. Study Design

Note that the first author (I.G.T.) is Tanzanian. Before developing the proposed system, we needed to collect user requirements and information needs from farmers in Tanzania. Kyimo ward, which is located in Rungwe district, Mbeya region, Tanzania, was chosen as the study area because it is one of the leading banana-cultivating wards in Mbeya region, and Mbeya is one of the leading banana-cultivating regions in Tanzania. The first author prepared a structured questionnaire guide in Swahili to facilitate interaction with farmers. The questionnaire included questions to collect primary data from farmers, including biographic data (such as age and gender), user requirements, information needs, and sample Swahili natural language queries. Because of the current travel restrictions related to the global COVID-19 pandemic, we could not travel to Tanzania. Therefore, the first author asked two research assistants, with whom he had worked previously, to conduct a household survey with 100 banana farmers (banana is one of the major food crops in Tanzania) selected through purposive sampling in Kyimo ward. Before the household survey was conducted, research permits were obtained from the local ward government. The first author gave each of the two research assistants a transport allowance equivalent to 43 US dollars to cover their travel when visiting farmers. The research assistants used the questionnaires to interact with the farmers. The survey was conducted over two months, from 12 July 2021 to 12 September 2021, to collect user requirements, information needs, and sample natural language queries from farmers.
After data collection in Tanzania was completed, the collected sample natural language queries were analyzed using a deep learning model, while other data, such as farmers’ preferred way of requesting crop protection information via SMS (short format or natural language), were analyzed using descriptive statistics tools such as cross-tabulation. Fisher’s exact test was used to assess the significance of the association between age group (40 years or below versus above 40 years) and preferred way of requesting crop protection information via SMS (short format or natural language). Table 1 shows the timetable of the field work.
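As an illustration of the association test described above, the following is a minimal sketch of a two-sided Fisher’s exact test on a 2 × 2 cross-tabulation (age group by preferred query format), written in plain Python. The function name and the counts are assumptions made for illustration; the counts are hypothetical and are not the values reported in Table 2.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one (small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts (NOT the values in Table 2):
# rows = age group (40 or below, above 40),
# columns = preferred format (short format, natural language).
p_value = fisher_exact_two_sided(40, 10, 12, 38)
print(f"p = {p_value:.3g}")
```

With counts of this shape, a p value below 0.05 would indicate that the preferred query format is associated with age group, which is the kind of conclusion the study draws from its own data.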

2.2. Requirements Analysis

Key information needs collected from farmers are described below.
  • Information on emergent and outbreak banana diseases: Latest information on emergent and outbreak banana diseases and their control measures, as well as latest research information on emergent and outbreak banana diseases.
  • Information on banana pesticides: Latest information issued by the government of Tanzania on the quality of pesticides, including bans on low-quality pesticides, as well as latest research information on effectiveness and side effects of pesticides.
  • Information on banana fertilizers: Latest information issued by the government of Tanzania on the quality of fertilizers, including bans on low-quality and fake fertilizers, and latest research information on effectiveness and side effects of fertilizers.
Users also requested features such as system accessibility through text SMS and Web systems, short response time, system availability 24 h a day, and system security. Furthermore, farmers requested that the system allow them to send SMS queries for crop protection information in Swahili both in short format and in natural language, to ensure flexibility in requesting crop protection information via SMS. The majority of young farmers (aged 40 years or below) preferred short and simple SMS queries, while the majority of older farmers (aged above 40 years) preferred writing SMS queries in natural language without any format or restrictions, as shown in Table 2. Fisher’s exact test showed a significant association (p value < 0.05) between the age of farmers and the preferred way of accessing crop protection information via SMS.
To meet user requirements, we made several key design decisions. For instance, to meet young farmers’ requirement of accessing crop protection information through simple and short SMS queries, we included functionality for keyword-based SMS queries, in which users follow a fixed format: a keyword followed by single words, separated by spaces, that represent the crop protection information being requested. To meet older farmers’ requirement of accessing crop protection information through natural language SMS queries, we included functionality for natural language-based SMS queries that uses a deep learning model, so that users can ask any question regarding crop protection information in their own way without following any format or restriction.
To include functionality to send SMS queries in natural language, the system needed a deep learning model that can predict the type of crop protection information requested by a farmer from the sentence written by the farmer in the natural language SMS query. This required collecting samples of text-based Swahili natural language queries from farmers for training the deep learning model and testing its accuracy in predicting the crop protection information requested. A total of 2100 samples of natural language queries (refer to Data S1 in the Supplementary Materials) were collected from the 100 banana farmers (each farmer asked 21 natural language questions in Swahili) in Kyimo ward, Rungwe district, Mbeya region, Tanzania, from 12 July 2021 to 12 September 2021.
We used Unified Modeling Language diagrams to analyze user requirements. For instance, Figure 1 shows a use case diagram with functionalities for different system users. For example, the key responsibility of government agricultural officers is to update the system with dynamic crop protection information from different sources, such as fertilizer and pesticide quality control directives and regulations from the Ministry of Agriculture in Tanzania [16], and research articles from the leading agricultural research institutes, such as the Tropical Pesticides Research Institute [17] and Sokoine University of Agriculture, as well as government agricultural quality control bodies, such as the Tanzania Bureau of Standards [18]. Figure 2 shows a sequence diagram with the steps by which a farmer uses SMS to request crop protection information from the system. First, the farmer must decide the type of SMS query that he or she wants to send, namely, whether it is a simple (keyword-based) query or a natural language query. Then, the SMS from a registered farmer is received by the system, and the SMS content is extracted and inspected. All keyword queries need to start with a special keyword as the first word in the SMS to differentiate between keyword and natural language queries. If it is a keyword query, the keyword in the SMS is extracted and compared against stored keywords. If a match is found between a keyword in the SMS and a keyword stored in the system, the corresponding query for that keyword is executed. If no match is found, a keyword error message is generated. If it is a natural language query, the SMS content is fed into a deep learning model that has already been trained (training is performed only once; there is no retraining) to predict the correct response. Finally, the response from the executed query, the predicted response, or the error is sent back to the farmer via SMS.

2.3. Deep Learning Model

During requirements collection, farmers mainly requested three types of crop protection information (crop diseases, pesticides, and fertilizers), as mentioned in Section 2.2. In natural language queries, there is no specific format for writing queries to request crop protection information. Farmers are free to write questions and formulate sentences any way they want, and no format or predefined words are specified; in other words, farmers choose the words they use to request crop protection information. The following is a simple realistic example. Three different farmers want to request crop protection information on outbreak banana diseases, so they write Swahili natural language queries in their own way without following any format (Swahili uses the same alphabet as English except for the letters “q” and “x”). For example, farmer 1 writes “nahitaji taarifa kuhusu magonjwa ya mlipuko ya ndizi na jinsi ya kuyadhibiti”, which means “I need information on outbreak banana diseases and how to control them”; farmer 2 writes “naomba kujua magonjwa ya mlipuko ya ndizi na uthibiti wake”, which means “I want to know banana outbreak diseases and their control measures”; and farmer 3 writes “jinsi ya kuthibiti magonjwa ya mlipuko ya ndizi”, which means “how are banana outbreak diseases controlled?” In this case, all three farmers want the same information on outbreak banana diseases and their control measures, although they have written sentences with different words. Therefore, we proposed a deep learning model (refer to Figure 3) that can read the natural language query sentence word by word and predict the type of crop protection information that the particular farmer wants.
The deep learning model is used to recognize patterns in the sequence of words in the natural language query and correctly predict the label (type of crop protection information requested). We assume that each natural language query from farmers can be associated with only one of three labels (the three types of crop protection information) and that only one crop (banana) is considered. The deep learning model is explained below.
  • Text data cleaning: Because we used word embeddings in our deep learning model, we decided not to use classical text cleaning methods, such as case normalization and removal of punctuation or stop words. This is because word embedding encodes each individual word into a dense vector that captures information about its relative meaning in the training data. This implies that different forms of a word, differing in spelling, case, or punctuation, will automatically be learned to be similar in the embedding space.
  • Preprocessing: The deep learning model needs training data to learn to correctly predict the type of crop protection information requested by farmers, and test data for evaluating its effectiveness. For training and testing purposes, each datum needs a label. Analysis of all 2100 natural language queries from farmers showed that farmers requested three main types of crop protection information (outbreak diseases, pesticides, and fertilizers). This means that each natural language query collected could be categorized into only one of those three types of crop protection information. The authors labeled each sample natural language query to identify the type of crop protection information requested. This led to three labels (classes) represented as abbreviations and integers (in parentheses): “MG” (0) for information on outbreak and emergent crop diseases, “DW” (1) for information on pesticides, and “MB” (2) for information on fertilizers. Each sample Swahili natural language query was labeled with one of the three labels. Deep learning models only process numeric tensors (vectors) as input and not the raw text, so we first transformed the raw text in the natural language queries into integer indices. We used a dataset of 2100 samples of Swahili natural language queries collected from farmers (Table 3 shows 15 randomly selected samples, with their corresponding English translations shown in Table 4; the whole dataset is provided in the Supplementary Materials). The dataset of all 2100 sample Swahili natural language queries was shuffled and tokenized into individual words, which were then represented as integer indices by assigning a unique integer to each unique word (token). Only the 10,000 most common words were taken into account to avoid very large input vector spaces. Then, each Swahili natural language query was represented as a sequence (list) of integers based on the words it contains.
Because we needed to pack the sequences into a single tensor for deep learning processing, all the sequences needed to have the same length. Because each sample Swahili natural language query contained fewer than 100 words, we decided that each sequence should have 100 integers, with sequences having fewer than 100 integers padded with zeros to reach 100 integers. The labels were converted from class vector (integers) to a binary class matrix.
  • Embedding layer: The embedding layer maps the integer indices, which stand for specific Swahili words, into dense vectors that can be processed by the deep learning model. Word embeddings are low-dimensional floating-point vectors that are learned from the dataset of Swahili natural language queries and associate a Swahili word with a vector. The objective was to map Swahili words into a geometric space where geometric relationships between word vectors indicate the semantic relationships between the words. For instance, the Swahili word “kudhibiti”, which means “to control”, and the Swahili word “udhibiti”, which means “control measure”, are expected to have similar word vectors. The embedding layer takes two arguments: the number of possible words (10,000) and the embedding_dimensionality. The input to the embedding layer is a 2D tensor of integers of shape (samples, length_of_sequence), where each entry is a sequence of integers. Here, “samples” is the number of sequences, which is 2100, and “length_of_sequence” is 100. The output of the embedding layer is a 3D floating-point tensor of shape (samples, length_of_sequence, embedding_dimensionality). Initially, Swahili word vectors were randomly chosen by the embedding layer; however, as the training process proceeded, these Swahili word vectors were adjusted via backpropagation [19] to be more informative.
  • LSTM layers: To understand the type of crop protection information requested by a farmer in the natural language query, we chose Long Short-Term Memory (LSTM) [19], a special type of recurrent neural network, because of its high performance in processing sequential data, such as text and time-series data [20,21,22]. The LSTM network processes the Swahili natural language query word by word by iterating (looping) through the words of the sequence while keeping a state (memory) of the words it has seen so far and saving information for later use to prevent older signals from gradually vanishing (vanishing gradients) during processing, resulting in better understanding of the meaning of the Swahili natural language query. Important arguments in the LSTM layer include output_dimensionality (number of units), which indicates the dimensionality of the LSTM layer output space; the dropout rate, which indicates the fraction of units to be dropped during linear transformation of the inputs; and the recurrent_dropout rate, which indicates the fraction of units to be dropped during linear transformation of the recurrent state. The dropout and recurrent_dropout rates are used to prevent overfitting and are applied to the input units of the layer and the recurrent units, respectively. The LSTM layer processes batches of sequences: it takes a 3D input tensor of shape (batch_size, timesteps, input_features) and returns either the full sequence of successive outputs for each timestep (a 3D tensor of shape (batch_size, timesteps, output_features), as in the case of the first LSTM layer in our proposed model) or only the last output for each input sequence (a 2D tensor of shape (batch_size, output_features), as in the case of the second LSTM layer in our proposed model).
The batch_size indicates the number of samples to be processed in batch, timesteps corresponds to the length_of_sequence, input_features indicates the dimensionality of the input feature space, and output_features indicates the dimensionality of the output feature space.
  • Dense layer: This is the top layer; it contains a SoftMax activation [23] and is trained to classify the Swahili natural language query into one of the three labels. For each natural language query, the model returns a probability distribution vector with three probability scores that sum to 1. Each probability score is the probability that the natural language query belongs to one of the three labels. SoftMax activation is shown in Equation (1), where the exponential function is applied to each element s_i of the LSTM output score vector S, and then the exponential values are divided by the sum of all exponential values to ensure that the elements of the output probability distribution vector f(S) sum to 1.
  • Loss function: Categorical cross-entropy loss H (refer to Equation (2)) measures the difference between the probability distribution output by the model, f(S), and the true probability distribution t of the label. The RMSprop optimizer [24] minimizes the loss during training. By minimizing this loss and learning the appropriate model weights, the model is trained to output scores that are close to the true scores.
    $$f(S_i) = \frac{e^{s_i}}{\sum_{j=1}^{3} e^{s_j}}, \tag{1}$$
    $$H = -\sum_{i=1}^{3} t_i \log\left(f(S_i)\right). \tag{2}$$
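The preprocessing, SoftMax, and loss steps in the list above can be sketched in plain Python as follows. This is a minimal illustration, not the authors’ implementation: the function names, whitespace tokenization, the reservation of index 0 for padding, and padding at the front of the sequence are all assumptions, and the score vector passed to SoftMax is made up.

```python
import math
from collections import Counter

MAX_WORDS = 10_000  # vocabulary cap stated in the text
MAX_LEN = 100       # fixed sequence length stated in the text
LABELS = {"MG": 0, "DW": 1, "MB": 2}  # diseases, pesticides, fertilizers


def build_vocab(queries, max_words=MAX_WORDS):
    """Assign integer indices to the most common words (assumption: 0 = padding)."""
    counts = Counter(word for q in queries for word in q.split())
    ranked = counts.most_common(max_words - 1)
    return {word: i for i, (word, _) in enumerate(ranked, start=1)}


def encode(query, vocab, max_len=MAX_LEN):
    """Turn a query into a fixed-length integer sequence, zero-padded at the front."""
    ids = [vocab[w] for w in query.split() if w in vocab][:max_len]
    return [0] * (max_len - len(ids)) + ids


def one_hot(label, num_classes=3):
    """Convert an integer class label into a binary class vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def softmax(scores):
    """Equation (1); subtracting the maximum only improves numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Equation (2): categorical cross-entropy between target t and f(S)."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


queries = [
    "nahitaji taarifa kuhusu magonjwa ya mlipuko ya ndizi na jinsi ya kuyadhibiti",
    "jinsi ya kuthibiti magonjwa ya mlipuko ya ndizi",
]
vocab = build_vocab(queries)
x = encode(queries[1], vocab)             # 92 zeros followed by 8 word indices
probs = softmax([2.0, 0.5, -1.0])         # made-up scores for one query
loss = cross_entropy(one_hot(LABELS["MG"]), probs)
print(len(x), probs, loss)
```

Because the target vector is one-hot, the loss reduces to the negative log of the probability assigned to the true label, so it approaches 0 as the model becomes more confident in the correct class.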

2.4. K-Fold Cross-Validation

It is important to tune hyperparameters to construct an effective deep learning model. During the training of our deep learning model, we kept adjusting hyperparameters, such as the number of layers, size of layers, and number of epochs, by observing the performance of the model on the validation data set. We used K-fold cross-validation [25], where the available data are divided into K partitions. For each partition, the deep learning model is trained on the remaining K − 1 partitions and evaluated on that particular partition. The validation score for the deep learning model is then the average of the K validation scores obtained. In our case, we used 4-fold cross-validation, meaning we chose 4 as the value of K. The 2100 sample Swahili natural language queries were first divided into a general training set (1680 queries) and a test set (420 queries). Then, we further partitioned the general training set into training and validation sets by cross-validation; each of the 4 partitions had 420 queries. Then, during the cross-validation experiment, for each partition, the deep learning model was trained on the remaining 3 partitions and evaluated on that particular partition.
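The split sizes described above can be reproduced with a short plain-Python sketch. The helper names, the fixed random seed, and the placeholder scoring function are assumptions for illustration; in the study, the scoring step would train and evaluate the deep learning model.

```python
import random


def train_test_split(samples, test_size, seed=0):
    """Shuffle a copy of the samples and hold out `test_size` of them for testing."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each fold is the validation set once."""
    fold_size = n_samples // k  # any remainder samples always stay in training
    for fold in range(k):
        start, stop = fold * fold_size, (fold + 1) * fold_size
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, val_idx


def cross_validate(train_pool, k, fit_and_score):
    """Average the validation score over the K folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(train_pool), k):
        train = [train_pool[i] for i in train_idx]
        val = [train_pool[i] for i in val_idx]
        scores.append(fit_and_score(train, val))
    return sum(scores) / len(scores)


samples = list(range(2100))  # stand-ins for the 2100 labeled queries
train_pool, test_set = train_test_split(samples, test_size=420)
folds = list(k_fold_indices(len(train_pool), k=4))


def dummy_score(train, val):
    # Placeholder for "train the model on `train`, return accuracy on `val`".
    return len(val) / (len(train) + len(val))


mean_score = cross_validate(train_pool, 4, dummy_score)
print(len(train_pool), len(test_set), [len(v) for _, v in folds], mean_score)
```

The test set of 420 queries is never touched during cross-validation, so it remains available for a final, unbiased accuracy estimate once the hyperparameters are fixed.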

2.5. System Design

2.5.1. System Architecture

The system was designed based on a 3-tier architecture (refer to Figure 4). In the case of the keyword SMS query, the stored procedures in MySQL and Ozeki NG SMS Gateway [26] interpret the SMS keyword query, authenticate the user (farmer), execute SQL queries to retrieve crop protection information from the MySQL database, and send a reply automatically back to the farmer by SMS. In the case of natural language SMS query, the stored procedures in MySQL and the Ozeki NG SMS Gateway authenticate the farmer and execute SQL queries that activate the MySQL trigger to pass the SMS natural language query to the Python script of the deep learning model. This model in turn predicts the correct label, and then the crop protection information for that label is retrieved from the MySQL database and sent back to the farmer by SMS. Authentication ensures security, while automatic reply ensures availability and short response time. In the Web system, farmers use a Web browser to interact with the Web system, confirm their authentication, and search for crop protection information by a form with dropdown menus, after which the system processes the request and displays the crop protection information. The use of Hypertext Transfer Protocol Secure (HTTPS) in Web requests, authentication, and PHP sessions ensures security while the Web requests are automatically processed and answered by PHP scripts, an Apache Web server, and MySQL to ensure availability. In the Web system, farmers can access crop protection information with text and pictures, but in SMS, pictures are removed and only text is accessed.

2.5.2. Keyword Query Processing

To successfully request crop protection information through a keyword SMS query, a farmer needs to compose an SMS on a mobile phone with three words separated by spaces and send it to the system’s phone number. The first word should be the keyword, the second word should be the name of the crop, and the third word should be the type of crop protection information requested. An incoming SMS from a farmer is received by an Android-based Ozeki SMS SMPP application [27], which then passes the SMS to the Ozeki NG SMS Gateway for comparison of the keyword against keywords stored in the system. If the keyword in the farmer’s SMS matches one of the stored keywords, the corresponding MySQL stored procedure for that keyword is called with the second and third words in the SMS as parameters. Finally, crop protection information is retrieved from the MySQL crop protection information table and sent back to the farmer via SMS by the Ozeki NG SMS Gateway. For instance, to request crop protection information on outbreak and emergent banana diseases, a farmer needs to write the following SMS query: “KQUSHAURI Ndizi Magonjwa”, where the first word “KQUSHAURI” is the keyword, the second word “Ndizi” is the Swahili name for the banana crop, and the third word “Magonjwa” is the Swahili word for “diseases”. For crop protection information on pesticides, the farmer writes “KQUSHAURI Ndizi Dawa”, where the Swahili word “Dawa” refers to “pesticides”. For crop protection information on fertilizers, the farmer writes “KQUSHAURI Ndizi Mbolea”, where the Swahili word “Mbolea” means “fertilizer”. Note that the names “Ndizi”, which means “banana fruit”, and “Mgomba”, which means “banana plant/tree”, both have a general meaning of “banana” when it comes to requesting crop protection information among banana farmers in Tanzania and are always used synonymously.
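The three-word format described above can be illustrated with a small Python parser. This is only a sketch of the matching logic: in the described system this work is done by the Ozeki NG SMS Gateway and MySQL stored procedures, not by Python. The function name, the case-insensitive matching, the acceptance of “Mgomba” as a crop name in keyword queries, and the reuse of the MG/DW/MB labels for keyword queries are all assumptions.

```python
KEYWORD = "KQUSHAURI"

# Swahili crop names mapped to a crop identifier (assumption: both accepted).
CROPS = {"ndizi": "banana", "mgomba": "banana"}

# Third word of the SMS mapped to the type of crop protection information.
INFO_TYPES = {"magonjwa": "MG", "dawa": "DW", "mbolea": "MB"}


def parse_keyword_query(sms):
    """Return (crop, info_type) for a valid keyword query, or None on error."""
    parts = sms.split()
    if len(parts) != 3 or parts[0].upper() != KEYWORD:
        return None  # not a well-formed keyword query
    crop = CROPS.get(parts[1].lower())
    info_type = INFO_TYPES.get(parts[2].lower())
    if crop is None or info_type is None:
        return None  # unknown crop or information type
    return crop, info_type


for sms in ("KQUSHAURI Ndizi Magonjwa", "KQUSHAURI Ndizi Dawa", "KQUSHAURI Ndizi"):
    print(sms, "->", parse_keyword_query(sms))
```

A `None` result corresponds to the case in which the system would generate a keyword error message for the farmer.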

2.5.3. Natural Language Query Processing

To successfully request crop protection information by a natural language SMS query, a farmer writes the question in free form in the SMS, in whatever way she or he prefers, and sends it to the system. The incoming SMS query from the farmer is received by the Android Ozeki SMS SMPP application, which passes the query to the Ozeki NG SMS Gateway, which in turn calls the MySQL stored procedure responsible for natural language queries. The MySQL stored procedure then executes the SQL query that inserts the natural language query into a MySQL request table, after which a MySQL trigger immediately calls and executes the Python script of the deep learning model with the natural language query as the parameter. The deep learning model (which has already been trained and does not need retraining) takes the natural language query as input and predicts the label for that query. Finally, the crop protection information corresponding to that label is retrieved from the MySQL crop protection information table and inserted into the MySQL response table. The Ozeki NG SMS Gateway polls the MySQL response table every second, so a newly inserted response is sent back to the farmer almost instantly via SMS. To avoid sending the same latest crop protection information to the same farmer redundantly, the system first checks whether that information has already been sent to the farmer and, if so, does not send it again. The flowchart in Figure 5 shows the algorithm for processing both keyword and natural language SMS queries once they arrive in the system.
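The flow above (predict a label, look up the information, skip it if already sent, queue the reply) can be sketched in Python as follows. Everything here is a stand-in chosen for illustration: the in-memory list replaces the MySQL response table, the `classify` function is a naive word lookup and not the trained LSTM model, the placeholder reply texts are not real crop protection content, and keying the duplicate check on the pair (farmer, label) is an assumption about how the check is implemented.

```python
def classify(query):
    """Naive stand-in for the trained model; NOT the paper's LSTM classifier."""
    q = query.lower()
    if "mbolea" in q:
        return "MB"
    if "dawa" in q:
        return "DW"
    return "MG"


# Placeholder texts only; the real table holds information registered by officers.
INFO_TABLE = {
    "MG": "<disease information>",
    "DW": "<pesticide information>",
    "MB": "<fertilizer information>",
}


class NaturalLanguageHandler:
    def __init__(self, info_table, classifier):
        self.info_table = info_table
        self.classifier = classifier
        self.already_sent = set()  # (farmer_id, label) pairs answered so far
        self.response_queue = []   # stands in for the MySQL response table

    def handle(self, farmer_id, query):
        """Queue a reply for the query, or return None if it was already sent."""
        label = self.classifier(query)
        if (farmer_id, label) in self.already_sent:
            return None  # same information already delivered to this farmer
        self.already_sent.add((farmer_id, label))
        reply = self.info_table[label]
        self.response_queue.append((farmer_id, reply))  # gateway would poll this
        return reply


if __name__ == "__main__":
    handler = NaturalLanguageHandler(INFO_TABLE, classify)
    print(handler.handle("farmer-1", "Dawa za magonjwa ya ndizi zimefanyiwa utafiti?"))
```

Keeping the duplicate check next to the queue insert means a repeated question never reaches the gateway, which matches the stated goal of not sending the same information twice.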

2.6. Prototype Implementation and Testing

A prototype of the proposed system was successfully implemented with all functionalities. Because of the current travel restrictions around the world, we could not travel to Tanzania to deliver the system to farmers and agricultural officers in person. Therefore, to test the functions of the system, the first author assumed the role of agricultural officer and registered one type of crop protection information in the system for testing purposes. The first author also asked a fellow student to assume the role of a farmer and send SMS queries to request crop protection information from the system. The test case was banana Xanthomonas wilt, an epidemic disease that has broken out in East African countries, including Tanzania [28,29], and that was new to farmers. Through the Web system, the first author registered the symptoms, control measures, and preventive measures of banana Xanthomonas wilt in Swahili based on information provided by a recent study [28]. This crop protection information was prepared in text format so that farmers could access it through SMS (text message) on their low-end mobile phones. After registration, the farmer could access the crop protection information through an SMS keyword query (refer to Figure 6a), through an SMS natural language query (refer to Figure 6b), or through the Web system if he or she had access to the Internet with a computer or smartphone (refer to Figure 7).

3. Results

3.1. Cost of Accessing Crop Protection Information

Because we could not travel to Tanzania and deliver the system to farmers, we relied on secondary data to evaluate the cost of using our system. Currently in Tanzania, the cost of using SMS is extremely low. For instance, the cost of 1 SMS transmission with an SMS bundle from the Vodacom mobile network operator is 0.2 Tanzanian shillings (Tshs) [30]. In our proposed system, a farmer can use any mobile operator that he or she likes. Taking this 0.2 Tshs as the cost of requesting and receiving one set of SMS messages for crop protection information in our proposed system, accessing crop protection information in our proposed system is 500 times cheaper (100 / 0.2 = 500) than requesting and receiving basic farming knowledge in the Tigo Kilimo USSD system, which costs on average 100 Tshs per SMS message when making a USSD request, according to a recent evaluation study [7].
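The comparison rests on a single division of the two unit prices quoted above. A minimal sketch of that arithmetic, with a hypothetical helper name and exact fractions used to avoid floating-point rounding:

```python
from fractions import Fraction

SMS_COST_TSHS = Fraction("0.2")   # one SMS with a Vodacom SMS bundle [30]
USSD_COST_TSHS = Fraction("100")  # average cost per Tigo Kilimo USSD request [7]


def cost_ratio(expensive, cheap):
    """Return how many times more expensive the first price is than the second."""
    return expensive / cheap


if __name__ == "__main__":
    print(cost_ratio(USSD_COST_TSHS, SMS_COST_TSHS))
```

The ratio depends entirely on the assumption stated in the text that one request and its reply cost 0.2 Tshs in total; a farmer without an SMS bundle would see a different figure.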

3.2. Performance of Deep Learning Model

3.2.1. Hyperparameter Tuning Experiments

During the training process, we tuned the deep learning model by changing several hyperparameters. We used 4-fold cross-validation to evaluate the performance (accuracy) of the model on validation data and then made changes to the hyperparameters accordingly to try to improve the validation accuracy of the model. We repeated this process several times to find the best hyperparameters for our deep learning model. For instance, Figure 8 shows the average validation accuracy in a 4-fold cross-validation experiment for each epoch during the training process. In this way, we obtained the following best hyperparameters: embedding_dimensionality of 32, 2 LSTM layers, output_dimensionality of 32 for the first LSTM layer and 64 for the second LSTM layer, dropout rate of 0.5, recurrent_dropout rate of 0.5, batch_size of 16, learning rate of 0.001 for the optimizer that minimizes the loss (RMSprop), and 100 epochs.
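The 4-fold procedure can be sketched without any deep learning library. In the sketch below, the function names are hypothetical, the folds are contiguous slices (which assumes the queries were shuffled beforehand), and `train_and_score` stands for whatever routine trains a model on the training indices and returns its validation accuracy. With the 1680 training queries mentioned in Section 3.2.2, each fold validates on 420 queries and trains on the remaining 1260.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread any remainder evenly
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


def cross_validate(n_samples, k, train_and_score):
    """Average the validation score returned by train_and_score over k folds."""
    scores = [
        train_and_score(train, validation)
        for train, validation in k_fold_indices(n_samples, k)
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for train, validation in k_fold_indices(1680, 4):
        print(len(train), len(validation))
```

Averaging over folds gives one validation accuracy per hyperparameter setting, which is the quantity compared between settings during tuning.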

3.2.2. Final Training Experiments

After tuning the hyperparameters, we configured the deep learning model according to the best hyperparameters and conducted the final experiment by training the model on the whole general training set (1680 queries). Figure 9 shows the final training accuracy of the model. The deep learning experiments were conducted with Keras version 2.3.1 as the deep learning library and TensorFlow version 2.0.0 as the backend on a Windows 10 desktop computer with a 3.60 GHz Intel(R) Core(TM) i7 processor and 16 GB of RAM.
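For orientation, the reported hyperparameters could be wired into a Keras model roughly as follows. This is a sketch under several assumptions and not the authors' code: the vocabulary size and number of labels are left as parameters because they are not stated here, the dropout values are applied through the LSTM layers' own `dropout` and `recurrent_dropout` arguments, the output layer is assumed to be a softmax over the labels, and categorical cross-entropy is assumed as the loss. TensorFlow is imported inside the function so that the hyperparameter table can be read without it installed.

```python
# Values taken from Section 3.2.1; the dictionary keys are hypothetical names.
HYPERPARAMS = {
    "embedding_dim": 32,
    "lstm_units": (32, 64),      # first and second LSTM layer
    "dropout": 0.5,
    "recurrent_dropout": 0.5,
    "batch_size": 16,
    "learning_rate": 0.001,      # for the RMSprop optimizer
    "epochs": 100,
}


def build_model(vocab_size, num_labels, hp=HYPERPARAMS):
    """Build a two-layer LSTM classifier from the hyperparameter table."""
    from tensorflow import keras          # lazy import: needs TensorFlow installed
    from tensorflow.keras import layers

    first_units, second_units = hp["lstm_units"]
    model = keras.Sequential([
        layers.Embedding(input_dim=vocab_size, output_dim=hp["embedding_dim"]),
        layers.LSTM(first_units, dropout=hp["dropout"],
                    recurrent_dropout=hp["recurrent_dropout"],
                    return_sequences=True),  # pass the full sequence to the next LSTM
        layers.LSTM(second_units, dropout=hp["dropout"],
                    recurrent_dropout=hp["recurrent_dropout"]),
        layers.Dense(num_labels, activation="softmax"),  # assumed output layer
    ])
    model.compile(
        optimizer=keras.optimizers.RMSprop(learning_rate=hp["learning_rate"]),
        loss="categorical_crossentropy",  # assumed; expects one-hot labels
        metrics=["accuracy"],
    )
    return model
```

Training would then pass `batch_size=HYPERPARAMS["batch_size"]` and `epochs=HYPERPARAMS["epochs"]` to `model.fit`, with the queries tokenized and padded to integer sequences beforehand.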

3.2.3. Accuracy on Test Set

To evaluate the performance of our deep learning model, we measured its accuracy on a dataset that it had not seen before (the test set of 420 queries). Our deep learning model obtained an accuracy of 96.43%. This indicates that the model predicts the label of a previously unseen Swahili natural language query with high accuracy.
By comparison, the evaluation of the English-based AgriBot [15] indicated that it had an accuracy of 86% in predicting the correct response to English natural language questions. These findings imply that, apart from improving usability for Tanzanian rural farmers by using Swahili, our deep learning model is also sufficiently effective, because its accuracy is at least comparable to that of English-based chatbots for agricultural information, such as AgriBot [15], that are currently being used by farmers in different parts of the world.
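The 96.43% figure for the Swahili model is a plain ratio of correct predictions to test queries. The raw count of correct predictions is not reported; on a 420-query test set, 405 correct predictions is the count that rounds to 96.43%, so the sketch below uses it as an inferred value. The label lists are synthetic and serve only to exercise the function.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    # Synthetic labels: 405 matches and 15 mismatches out of 420 (inferred count).
    actual = ["MG"] * 420
    predicted = ["MG"] * 405 + ["DW"] * 15
    print(round(100 * accuracy(predicted, actual), 2))
```

One more or one fewer correct prediction would give 96.67% or 96.19%, which is why 405 is the only count consistent with the reported value.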

4. Discussion

4.1. Cost Savings of Crop Protection Information Access

The findings show that the use of SMS in our proposed crop protection information system can dramatically reduce the cost of accessing crop protection information compared with the use of the USSD system. These cost savings can lead to increased profit for smallholder farming activities or increased investment in farming activities and improved livelihoods.

4.2. Improved Usability and Flexibility of Accessing Crop Protection Information

The use of SMS in our proposed system allows farmers to access crop protection information without the timeouts and menu navigation difficulties that are common in the USSD system. This shows that SMS can give farmers a better usability experience than USSD can. Furthermore, the proposed system helps farmers access crop protection information in a more flexible way, by keyword SMS query, natural language SMS query (for those who cannot remember the keyword query format, such as older farmers), or Web system query, and hence make critical farming decisions on time to reduce crop loss.

4.3. Impact of Deep Learning Models on Processing Swahili Natural Language Queries

The findings show that the LSTM-based deep learning model in our proposed system processes Swahili natural language queries for crop protection information effectively and can predict, with high accuracy, the crop protection information requested by farmers in a Swahili natural language query that it has never seen before. This suggests that LSTM-based deep learning models are well suited to processing Swahili natural language sentences that request information in the context of crop protection. These findings help to fill a wide gap in knowledge about how effectively deep learning models process Swahili natural language queries for crop protection information.

4.4. Study Limitations

Because of a combination of factors, such as the inability to travel to Tanzania during the current pandemic and a limited budget, the field validation activity involved only three stakeholders (the first author and the two research assistants), which is a limitation of this study. In future work, to improve the quality of the field survey, we plan to involve more stakeholders in the field validation activity.

5. Conclusions

In this study, we developed a crop protection information system that uses SMS to overcome the challenges of the existing Tigo Kilimo USSD system in providing rural farmers in Tanzania with crop protection information, and that does so at a lower cost than the USSD system. Furthermore, the LSTM-based deep learning model in our system can understand and process, with high accuracy, Swahili natural language queries for crop protection information that it has never seen before.
As a policy recommendation, we advise the Government of Tanzania to attract (for instance, through tax incentives) investors who are willing to implement and deliver information systems that use SMS on low-end mobile phones to provide rural farmers in Tanzania with access to crop protection information. Furthermore, the dataset of 2100 Swahili natural language queries that we collected and prepared in this study can help other researchers in the area of Swahili natural language processing.

Supplementary Materials

The following is available online at https://www.mdpi.com/article/10.3390/agronomy11122411/s1: Data S1, Dataset of 2100 sample Swahili natural language queries for training, validation, and test purposes, with their corresponding labels, for the deep learning model.

Author Contributions

Conceptualization, methodology, software, validation, formal analysis, investigation, visualization, writing—original draft preparation, I.G.T.; resources, data curation, writing—review and editing, I.G.T., K.A., H.Y., T.K., and N.O.; supervision, project administration, funding acquisition, K.A., H.Y., T.K., and N.O. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by JSPS KAKENHI Grant Numbers JP18K11268 and JP21K11849.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the article and Supplementary Materials.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. URT [United Republic of Tanzania]: National Agriculture Policy. Available online: http://extwprlegs1.fao.org/docs/pdf/tan141074.pdf (accessed on 10 October 2021).
  2. The World Bank: Rural Population. Available online: https://data.worldbank.org/indicator/SP.RUR.TOTL.ZS?locations=TZ (accessed on 10 October 2021).
  3. URT [United Republic of Tanzania]: Household Budget Survey (HBS), 2011/12—Key Findings Report. Available online: https://www.nbs.go.tz/index.php/en/census-surveys/poverty-indicators-statistics/household-budget-survey-hbs/148-household-budget-survey-hbs-2011-12-key-findings-report (accessed on 10 October 2021).
  4. Misaki, E.; Apiola, M.; Gaiani, S. Technology for small scale farmers in Tanzania: A design science research approach. Electron. J. Inf. Syst. Dev. Ctries. (EJISDC) 2016, 74, 1–15.
  5. Mattee, A.Z. Reforming Tanzania’s Agricultural Extension System: The Challenges Ahead. J. Afr. Study Monogr. 1994, 15, 177–188.
  6. GSMA: Case Study Tigo Kilimo, Tanzania. Available online: https://www.gsma.com/mobilefordevelopment/wp-content/uploads/2015/02/GSMA_Case_Tigo_FinalProof02.pdf (accessed on 10 October 2021).
  7. GSMA: Tigo Baseline Report Executive Summary. Available online: https://www.gsma.com/mobilefordevelopment/wp-content/uploads/2014/03/TIGO-Baseline-Report-final.pdf (accessed on 10 October 2021).
  8. GSMA: Tigo Kilimo Impact Evaluation. Available online: https://www.gsma.com/mobilefordevelopment/wp-content/uploads/2015/09/GSMA_Tigo_Kilimo_IE.pdf (accessed on 10 October 2021).
  9. Tende, I.G.; Kubota, S.; Yamaba, H.; Aburada, K.; Okazaki, N. Evaluation of farmers market information system to connect with some social stakeholders. J. Inf. Process. 2018, 26, 247–256.
  10. Mutuku, L.; Kirui, K.; Kamau, M. ShambaConnect: Case Study on the Hybrid Design of an Application for Kenyan Extension Officers. In Proceedings of the 13th Participatory Design Conference: Short Papers, Industry Cases, Workshop Descriptions, Doctoral Consortium Papers, and Keynote Abstracts, Windhoek, Namibia, 6–10 October 2014; Association for Computing Machinery: New York, NY, USA, 2014.
  11. Ninsiima, D. “Buuza Omulimisa” (ask the extension officer): Text messaging for low literate farming communities in rural Uganda. In Proceedings of the Seventh International Conference on Information and Communication Technologies and Development (ICTD ’15), Singapore, 15–18 May 2015; Association for Computing Machinery: New York, NY, USA, 2015.
  12. Iraba, M.L.; Venter, I.M. Empowerment of Rural Farmers through Information Sharing Using Inexpensive Technologies. In Proceedings of the South African Institute of Computer Scientists and Information Technologists Conference on Knowledge, Innovation and Leadership in a Diverse, Multidisciplinary Environment, Cape Town, South Africa, 3–5 October 2011; Association for Computing Machinery: New York, NY, USA, 2011.
  13. Jain, M.; Pratyush, K.; Bhansali, I.; Liao, Q.V.; Truong, K.; Patel, S. FarmChat: A Conversational Agent to Answer Farmer Queries. J. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2018, 2, 170.
  14. Mostaço, G.; Campos, L.; Souza, Í.; Cugnasca, C. AgronomoBot: A Smart Answering Chatbot Applied to Agricultural Sensor Networks. In Proceedings of the 14th International Conference on Precision Agriculture, Montreal, QC, Canada, 24–27 June 2018; International Society of Precision Agriculture: Monticello, IL, USA, 2018.
  15. Naman, J.; Pranjali, J.; Pratik, K.; Jayakrishna, S.; Soham, P.; Jayesh, C.; Mayank, S. Agribot: Agriculture-specific Question Answer System. IndiaRxiv 2019. Preprint.
  16. The United Republic of Tanzania, Ministry of Agriculture. Available online: https://www.kilimo.go.tz/index.php/en/resources/view/pesticides-registered-in-tanzania (accessed on 10 October 2021).
  17. Tropical Pesticides Research Institute. Available online: https://www.tpri.go.tz/2019-publications (accessed on 10 October 2021).
  18. Tanzania Bureau of Standards. Available online: https://www.tbs.go.tz/pages/banned-products (accessed on 10 October 2021).
  19. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  20. Wang, J.-H.; Liu, T.-W.; Luo, X. Combining Post Sentiments and User Participation for Extracting Public Stances from Twitter. Appl. Sci. 2020, 10, 8035.
  21. Al-Laith, A.; Shahbaz, M.; Alaskar, H.F.; Rehmat, A. AraSenCorpus: A Semi-Supervised Approach for Sentiment Annotation of a Large Arabic Text Corpus. Appl. Sci. 2021, 11, 2434.
  22. Yasar, H.; Kilimci, Z.H. US Dollar/Turkish Lira Exchange Rate Forecasting Model Based on Deep Learning Methodologies and Time Series Analysis. Symmetry 2021, 12, 1553.
  23. Goodfellow, I.; Bengio, Y.; Courville, A. Softmax Units for Multinoulli Output Distributions. In Deep Learning; MIT Press: Cambridge, MA, USA, 2016; pp. 180–184.
  24. Dauphin, Y.N.; de Vries, H.; Bengio, Y. Equilibrated adaptive learning rates for non-convex optimization. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1 (NIPS’15), Montreal, QC, Canada, 7–12 December 2015; MIT Press: Cambridge, MA, USA, 2015; pp. 1504–1512.
  25. Refaeilzadeh, P.; Tang, L.; Liu, H. Cross-Validation. In Encyclopedia of Database Systems; Liu, L., Özsu, M.T., Eds.; Springer: Boston, MA, USA, 2016.
  26. OZEKI NG SMS Gateway. Available online: https://ozekisms.com/ (accessed on 10 October 2021).
  27. OZEKI NG Android SMS. Available online: https://ozeki.hu/p_1617-ozeki-android-smpp-sms-gateway.html (accessed on 10 October 2021).
  28. Uwamahoro, F.; Berlin, A.; Bylund, H.; Bucagu, C.; Yuen, J. Management strategies for banana Xanthomonas wilt in Rwanda include mixing indigenous and improved cultivars. Agron. Sustain. Dev. 2019, 39, 22.
  29. Shimwela, M.; Ploetz, R.; Beed, F.; Jones, J.; Blackburn, J.; Mkulila, S.; Van Bruggen, A. Banana xanthomonas wilt continues to spread in Tanzania despite an intensive symptomatic plant removal campaign: An impending socio-economic and ecological disaster. Food Secur. 2016, 8, 939–951.
  30. Vodacom Tanzania. Available online: https://vodacom.co.tz/voice-sms (accessed on 10 October 2021).
Figure 1. Use case diagram showing user functions.
Figure 2. Sequence diagram of requesting crop protection information “knowledge” by SMS.
Figure 3. Deep learning model for processing Swahili natural language queries.
Figure 4. System design based on 3-tier architecture.
Figure 5. Algorithm for processing SMS queries.
Figure 6. Farmer requesting and receiving banana crop protection information (symptoms and control measures of banana Xanthomonas wilt) through SMS: (a) Keyword SMS query. (b) Natural language SMS query.
Figure 7. Banana crop protection information (symptoms and control measures of banana Xanthomonas wilt) with elaboration picture (visual symptoms) in the Web system.
Figure 8. Average validation accuracy in 4-fold cross-validation experiment.
Figure 9. Final training accuracy of the deep learning model.
Table 1. Timetable of the fieldwork.

| No | Dates | Activity |
|---|---|---|
| 1 | 5 July 2021 to 11 July 2021 | Preparation: The first author video called research assistants to discuss study budget, deliverables, and sampling. Afterwards, 100 banana farmers were selected by research assistants. |
| 2 | 12 July 2021 to 4 September 2021 | Data collection: Two research assistants visited farmers in their households to collect data using the questionnaire guides prepared in Swahili language. Each research assistant surveyed 50 farmers. |
| 3 | 5 September 2021 to 12 September 2021 | Data entry: The two research assistants entered the collected data in the questionnaires into Microsoft Word files (electronic format). The two research assistants emailed the Word files and scanned copies of the questionnaires to the first author. |
| 4 | 13 September 2021 to 19 September 2021 | Data processing: The first author compared raw data in the questionnaires with electronic data in the Word files to check accuracy and consistency. |
Table 2. Association between preferred way of requesting information via SMS and age of farmers.

| Age of farmers | Natural Language | Short Format | Fisher’s Exact Test |
|---|---|---|---|
| Above 40 years | 66 | 9 | p < 0.0001 |
| 40 years or below | 5 | 20 | |
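The reported p-value can be checked with a small two-sided Fisher exact test written from the hypergeometric distribution. The sketch takes the counts in Table 2 as 66 and 9 for farmers above 40 years and 5 and 20 for farmers aged 40 years or below, which is the reading that sums to the 100 surveyed farmers. The function name is hypothetical, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention that the paper does not specify.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at most as probable as the observed one.
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    p_value = fisher_exact_two_sided(66, 9, 5, 20)
    print(p_value < 0.0001)
```

Under this reading the computed p-value falls well below 0.0001, in line with the table.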
Table 3. Sample Swahili natural language queries collected from farmers.

| Swahili Natural Language Query | Label |
|---|---|
| Tafiti zilizofanywa kuhusu magonjwa ya ndizi ni zipi? | MG |
| Magonjwa ya mlipuko yanapaswa kudhibitiwa kwa njia gani? | MG |
| Magonjwa gani ya mlipuko yameathiri sana zao la ndizi? | MG |
| Je, ni aina gani ya ndizi zinaathirika sana magonjwa ya mlipuko? | MG |
| Ni kwa kiasi gani magonjwa ya mlipuko yanadhuru kilimo cha ndizi? | MG |
| Dawa za magonjwa ya ndizi zimefanyiwa utafiti? | DW |
| Dawa za tunda la ndizi zinawezaje kumdhuru mlaji? | DW |
| Je kuna dawa za zao la ndizi sio nzuri kwa mlaji? | DW |
| Je, dawa katika zao la ndizi zina nguvu sawa? | DW |
| Dawa za ndizi zenye madhara kidogo kwa binadamu ni zipi? | DW |
| Udhibiti wa mbolea feki za zao la ndizi ukoje? | MB |
| Madhara yanayosababishwa na mbolea feki ni yapi? | MB |
| Nitazijuaje mbolea zilizokatazwa kuwekwa kwenye migomba? | MB |
| Mbolea iliyokatazwa kutumika kwa kilimo cha ndizi inatambulikaje? | MB |
| Mbolea gani zimethibitishwa hazina madhara kwa mlaji wa ndizi? | MB |
Table 4. Corresponding English translations of Swahili natural language query samples in Table 3.

| English Translation of Swahili Natural Language Query | Label |
|---|---|
| What studies have been conducted on banana diseases? | MG |
| How are banana outbreak diseases controlled? | MG |
| Which banana outbreak diseases have major impacts? | MG |
| Which banana cultivars are more impacted by outbreak diseases? | MG |
| What extent of impact do outbreak diseases have on bananas? | MG |
| Have banana pesticides been researched? | DW |
| How can banana pesticides harm end consumers? | DW |
| Are any banana pesticides harmful to end consumers? | DW |
| Are banana pesticides equally effective? | DW |
| Which banana pesticides are less harmful to end consumers? | DW |
| How are fake banana fertilizers controlled? | MB |
| What are the effects of fake fertilizers? | MB |
| How can I identify banned banana fertilizers? | MB |
| How is banned banana fertilizer identified? | MB |
| Which banana fertilizers are certified as harmless to end consumers? | MB |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Tende, I.G.; Aburada, K.; Yamaba, H.; Katayama, T.; Okazaki, N. Proposal for a Crop Protection Information System for Rural Farmers in Tanzania. Agronomy 2021, 11, 2411. https://doi.org/10.3390/agronomy11122411
