Validation of ARIMA Model on Production of Papaya in India

The present study was carried out to validate the Auto Regressive Integrated Moving Average (ARIMA) model on the production of papaya in India. We used secondary data for the period from 1950-51 to 2017-18. The ARIMA (p, d, q) models were fitted for the period from 1950-51 to 2009-10, and the remaining period, 2010-11 to 2017-18, was used for validation of the ARIMA model. The validity of the models was tested using standard statistical criteria (i.e., R², RMSE and MAPE). Among all the ARIMA (p, d, q) models, ARIMA (3, 1, 4) was found to be the best fitted model for validation and for forecasting future values. The forecasted values of papaya production from 2010-11 to 2017-18 were close to the actual values.


INTRODUCTION
Papaya (Carica papaya L.) belongs to the Caricaceae family and is the most economically important species in the family. Papaya is the third most cultivated tropical crop worldwide. Brazil and India are the largest producers of papaya, although Mexico is the main exporter. Among common fruits, papaya is ranked first on nutritional scores for the percentage of vitamin A, vitamin C, potassium, folate, niacin, thiamine, riboflavin, iron, calcium and fiber. Moreover, the fruits, stems, leaves and roots of papaya are used in a wide range of medical applications and in papain production. In India, the total cropped area under papaya is 138 thousand ha, with a production of 5989 thousand MT, which is the highest in the world. The average productivity of papaya in the country is 43.3 MT per ha. India contributes about 42.6% of the world's papaya production. Only 0.08% of the total domestic production is exported and the rest (99.92%) is consumed within the country. Papaya is mostly cultivated in the states of Andhra Pradesh, Karnataka, Gujarat, Orissa, West Bengal, Assam, Kerala, Madhya Pradesh and Maharashtra. Andhra Pradesh is the leading producer of papaya, with a production of 1687.82 thousand MT. Three notable varieties are Pusa Delicious, Pusa Dwarf and Pusa Nanha.
Mahesh and Jain (2013) estimated the area, production and productivity of papaya in Raipur district and Chhattisgarh state using compound growth rates and found that the area and production of papaya were increasing tremendously, while productivity showed fluctuations during their study period. Debnath et al. (2013) forecasted the cultivated area and production of cotton in India using the Autoregressive Integrated Moving Average (ARIMA) model with data from 1950-51 to 2010-11. They reported that ARIMA (0, 1, 0), ARIMA (1, 1, 4) and ARIMA (0, 1, 1) were the best fitted models for forecasting cotton area, production and yield in India, respectively. Prabakaran et al. (2014) analyzed pulses area and production in India for the period from 1950-51 to 2011-12 using ARIMA models and found that the ARIMA (1, 1, 0) and ARIMA (2, 1, 1) models were the best fitted for forecasting pulses area and production in India.
Ramana Murthy et al. (2018) sought an appropriate ARIMA model for forecasting sunflower production in India and found that ARIMA (4, 1, 4) was the best fitted model.
The objective of the present study is to fit and validate different ARIMA (p, d, q) models for the time series data on Production of Papaya in India.

MATERIALS AND METHODS
The data on the production of papaya for the fitting period of 60 years (1950-51 to 2009-10) and for the validation period from 2010-11 to 2017-18 were collected from indiastat.com. Various ARIMA (p, d, q) models were fitted to the production of papaya in India using SPSS version 20.
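The fitting and validation periods described above can be sketched as follows. This is a minimal illustration, assuming the series is indexed by agricultural-year labels such as `1950-51`; the helper name `season_labels` is hypothetical, and no production figures are included because the study's data are not reproduced here.

```python
def season_labels(first_start_year, last_start_year):
    """Return labels like '1950-51' for each starting year in the range (inclusive)."""
    return [f"{year}-{(year + 1) % 100:02d}"
            for year in range(first_start_year, last_start_year + 1)]

labels = season_labels(1950, 2017)   # 1950-51 ... 2017-18
fit_labels = labels[:60]             # fitting period (60 seasons)
validation_labels = labels[60:]      # hold-out period used only for validation

print(len(labels), fit_labels[0], fit_labels[-1])
print(len(validation_labels), validation_labels[0], validation_labels[-1])
```

The split is chronological rather than random, so the validation seasons always follow the fitting seasons, as a forecasting exercise requires.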

Auto Regressive Integrated Moving Average (ARIMA)
The ARIMA methodology is also known as the Box-Jenkins methodology (Box and Jenkins 1976). The Box-Jenkins procedure is concerned with fitting a mixed ARIMA model to a given set of data. The main objective in fitting an ARIMA model is to identify the stochastic process of the time series and predict future values accurately. These methods have also been useful in many situations that involve building models for discrete time series and dynamic systems. The optimal forecasts of future values of a time series are determined by the stochastic model for that series. A stochastic process is either stationary or non-stationary. Most time series are non-stationary, whereas ARIMA models refer only to a stationary time series; the first stage of the Box-Jenkins procedure is therefore to reduce a non-stationary series to a stationary one by differencing.
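The differencing step can be illustrated with a short sketch. The function name `difference` and the toy series are assumptions made for illustration; they are not the study's data or the SPSS procedure.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass replaces y_t with y_t - y_(t-1)."""
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out

# Toy series with a quadratic trend (illustrative values only).
toy = [t * t for t in range(6)]      # [0, 1, 4, 9, 16, 25]

print(difference(toy, d=1))          # linear trend remains
print(difference(toy, d=2))          # constant: the trend has been removed
```

Each pass shortens the series by one observation, so a model with d = 1 fitted to 60 seasons works with 59 differenced values.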
The ARIMA (p, d, q) process is given by
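For reference, the standard Box-Jenkins form of an ARIMA (p, d, q) process is sketched here. The notation is an assumption, since the symbols used by the authors are not shown: $y_t$ denotes production in season $t$, $B$ the backshift operator ($B y_t = y_{t-1}$), and $\varepsilon_t$ a white-noise error term.

```latex
\phi(B)\,(1 - B)^{d}\, y_t = \theta(B)\,\varepsilon_t,
\qquad
\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^{p},
\qquad
\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^{q}
```

Under this convention, an ARIMA (3, 1, 4) model has three autoregressive coefficients, one round of differencing and four moving-average coefficients.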

RESULTS AND DISCUSSION
In the present study, papaya production data for the 60-year period from 1950-51 to 2009-10 were used for model fitting.

Model Identification
Among the several models studied, goodness of fit was examined by the highest R² value, lowest RMSE (Residual Mean Square
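The three criteria named in this study (R², RMSE and MAPE) can be computed as in the sketch below. It uses the common textbook definitions, which are an assumption here: SPSS may define R² slightly differently, and the toy numbers are illustrative, not papaya production figures.

```python
import math

def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Toy values for illustration only.
toy_actual = [100.0, 200.0, 300.0, 400.0]
toy_fitted = [110.0, 190.0, 310.0, 390.0]

print(round(r_squared(toy_actual, toy_fitted), 3))
print(round(rmse(toy_actual, toy_fitted), 3))
print(round(mape(toy_actual, toy_fitted), 3))
```

A candidate model is preferred when it combines a high R² with low RMSE and MAPE; computing the same measures on the hold-out seasons gives the validation check.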

Model Estimation and Verification
The parameters of the model were estimated by using SPSS 20 package. The ARIMA (

CONCLUSION
The study revealed that, among all ARIMA (p, d, q) models, ARIMA (3, 1, 4) was the best fitted model for forecasting future values of papaya production in India. Based on the p-value of the Ljung-Box Q statistic, the ARIMA (3, 1, 4) model was adequate. The forecasted values for the validation period were close to the actual values. Time series analysis and forecasting has been an active research area over the last few decades. The accuracy of time series forecasting is fundamental to many decision processes, hence the continuing research into improving the effectiveness of forecasting models. With the efforts of Box and Jenkins (1970), ARIMA (p, d, q) models have become one of the most popular methods in forecasting research and practice.
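The Ljung-Box Q statistic referred to above can be sketched as follows, using its usual definition Q = n(n + 2) Σ r_k² / (n − k) over lags k = 1, ..., h, where r_k is the lag-k autocorrelation of the residuals. The function names and the toy residuals are assumptions for illustration; the study's own residuals are not reproduced here.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    numer = sum(centered[t] * centered[t - lag] for t in range(lag, n))
    return numer / denom

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )

# Toy residuals with strong negative lag-1 autocorrelation (illustrative only).
toy_residuals = [1, -1, 1, -1, 1, -1, 1, -1]
q_stat = ljung_box_q(toy_residuals, max_lag=1)
print(q_stat)
```

Under the hypothesis of no residual autocorrelation, Q is compared with a chi-square distribution; for residuals of a fitted ARIMA (p, d, q) model the degrees of freedom are commonly taken as the number of lags minus p + q. A large p-value is read as support for the adequacy of the model.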