Application of ARIMA models in forecasting livestock products consumption in Tanzania

Poverty is a major problem in Tanzania, which depends on agriculture as its main economic activity. Different stakeholders have been involved in boosting agricultural productivity, especially in semiarid regions, where the main focus is on drought-tolerant crops such as sorghum and millet. If this support is not linked with market opportunities, commodity prices may be depressed, discouraging farmers. This paper assesses the prospects for increased utilization of animal feed as a market opportunity for farmers by forecasting the consumption of livestock products such as eggs, milk, chicken and cow meat. Autoregressive integrated moving average (ARIMA) models were used for forecasting, with data from FAOSTAT. The results show that consumption of all livestock products will increase, and hence so will the expected demand for animal feed. This paper calls for more research on factors that may affect the consumption of livestock products, such as population growth and changes in consumption behavior toward livestock products.
Subjects: Agricultural Economics; Agriculture and Food; Statistics for Business, Finance & Economics; Production Research & Economics

Joseph Frank Mgaya

ABOUT THE AUTHOR
Joseph Frank Mgaya is an assistant lecturer in the Department of Accounting and Finance at the University of Dodoma. For the past nine years, he has taught various accounting and finance courses in the department. His research interests are mainly in rural development, especially rural finance, agricultural improvement and agribusiness. Recently, he has gained an interest in econometric models, such as forecasting models, which are crucial in policy planning and development. This paper reports an opportunity in the feed industry that farmers in semiarid regions can use by growing drought-resistant crops such as sorghum and millet. As maize is preferred to sorghum and millet, linking farmers to the feed industry is important. Replacing maize with sorghum and millet in feed will benefit society, as it will eliminate competition between animals and humans in the consumption of maize.

PUBLIC INTEREST STATEMENT
Poverty is a major problem in Tanzania, where agriculture is the main economic activity for more than 75% of the population. Different stakeholders have been involved in boosting agricultural production, especially in regions with less rain. Their main focus is on crops that need less rainfall, such as sorghum and millet. If this support is successful, production will increase; however, if this increase in production is not linked with the market, prices may drop and discourage farmers. This paper assesses whether the use of livestock products is likely to increase in the future, as a market opportunity for farmers in regions with less rainfall. The results show that consumption of all livestock products will increase in the next five years. This means that there is an opportunity for farmers in regions with less rainfall to cultivate crops such as sorghum and millet and sell them as animal feed.

Introduction
Poverty is a major problem facing many African countries, especially those south of the Sahara. Tanzania is one of these countries, with a population of about 45 million people (Tanzania, 2014). The main economic activity is agriculture, which employs about 75% of the total population. Even though agriculture is the main activity, the sector is dominated by smallholder farmers who each cultivate an average of 0.9-3 ha using very low technology. It is estimated that 70% of the crop area is cultivated by hand hoe, 20% by ox plough and 10% by tractor (Chauvin, Mulangu, & Porto, 2012; Group, 2011; Majid, 2008). Because of poor technology and dependence on rainfall, the agricultural sector has performed poorly and hence deepens poverty among those who depend on it.
Both the government of Tanzania and other stakeholders have been involved in different projects to promote the generation and adoption of technologies capable of boosting agricultural productivity. In semiarid regions, initiatives have mainly focused on drought-tolerant crops such as sorghum and millet rather than tropical crops such as maize, which requires a more even distribution of rainfall (Blum & Sullivan, 1986). This support from government and donors for crops like sorghum and millet is important in increasing agricultural productivity in semiarid regions. But these projects could depress commodity prices and farmers' incomes if they are not linked to market opportunities for farmers, especially given the presence of preferred substitute crops such as maize (Amani, 2005).
The feed industry is a potential market in which sorghum and millet can substitute for maize (Rohrbach and Kiriwaggulu, 2001). According to Abdulkadir, Na-Allah and Bala (2013) and Issa, Jarial, Brah, Harouna and Soumana (2015), sorghum can fully or partially replace maize in animal feed, especially in the poultry industry. Because maize is preferred to sorghum and millet owing to its availability and price stability, linking farmers to the feed industry is important. Replacing maize with sorghum and millet in feed will be advantageous, as it will eliminate competition between animals and humans in the consumption of maize. This paper estimates the prospects for increased utilization in the feed concentrate market by forecasting the production of livestock products such as eggs, milk, chicken and cow meat.

Methodology
Autoregressive integrated moving average (ARIMA) models were used to forecast the production of livestock products. These models were chosen because they are univariate: each forecast is built only from past observations of the series itself. Scholars in different disciplines, such as Jose and Sojan (2013), Yeboah, Ohene and Wereko (2012), Wang and Yang (2017), Manoj and Madhu (2014), Mandal (2005) and Raymond (1997), have used these models for forecasting. The main assumption of these models is that a time series contains past patterns that continue into the future (Ramasubramanian, 2001; Heng, Zhang, & Yang, 2017). The models capture these patterns and use them to forecast expected future values.

Data collection
Data for all forecasted livestock products were collected from the Food and Agriculture Organization database (FAOSTAT, 2014). The data range from 1961 to 2013. According to Box and Jenkins, the pioneers of time series modeling, at least 50 observations are necessary to perform time series analysis.

Model description
An ARIMA model is divided into three components, depending on the type of data. The first component is the autoregressive (AR) time series model, which uses past observations of the dependent variable (i.e. the variable of interest) in the forecast of future observations. The first-order autoregressive model, AR(1), is represented by the Independent Pricing and Regulatory Tribunal of NSW as follows:
$$y_t = a\,y_{t-1} + \varepsilon_t$$
where $y_t$ is the dependent variable, $a$ is a parameter, $y_{t-1}$ is the lagged dependent variable and $\varepsilon_t$ is the random or white noise term, which represents a shock that cannot be explained.
The second component is the moving average (MA) model, which uses past observations of the white noise process (i.e. past forecast errors) in the forecast of future observations of the dependent variable. The first-order MA model, MA(1), is represented as follows:
$$y_t = \varepsilon_t + b\,\varepsilon_{t-1}$$
where $b$ is the parameter and $\varepsilon_t$ and $\varepsilon_{t-1}$ are the forecast error and the lagged forecast error, respectively.
The combination of the AR and MA models gives the ARMA model, which is stationary. If the data used are nonstationary, a third component is used to achieve stationarity by differencing the original series (the integrated (I) component), which according to Rohrbach and Kiriwaggulu (2001) and Nau (2018) is represented as
$$y'_t = y_t - y_{t-1}$$
where $y'_t$ is the differenced series and $y_t$ and $y_{t-1}$ are the original series and the lagged original series, respectively (stationary data are defined below).
The combination of the three first-order models gives models that can be used to estimate the future consumption of livestock products.
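The three first-order building blocks described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study (which relied on SPSS): the function names, the plus-sign convention for the MA(1) term, the unit-variance Gaussian shocks and the parameter values are all assumptions made for the example.

```python
import random


def simulate_ar1(a, n, seed=0):
    """Simulate y_t = a * y_{t-1} + e_t, starting from y_0 = 0 (assumed start value)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(a * y[-1] + rng.gauss(0.0, 1.0))
    return y


def simulate_ma1(b, n, seed=0):
    """Simulate y_t = e_t + b * e_{t-1} (plus-sign convention, an assumption)."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [shocks[t] + b * shocks[t - 1] for t in range(1, n + 1)]


def difference(y):
    """First difference y'_t = y_t - y_{t-1}; the result is one observation shorter."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]


if __name__ == "__main__":
    trending = [10.0, 12.0, 15.0, 19.0]      # hypothetical nonstationary series
    print(difference(trending))              # [2.0, 3.0, 4.0]
    print(len(simulate_ar1(0.7, 50)), len(simulate_ma1(0.4, 50)))
```

Differencing a series of n observations leaves n − 1 values, which is worth keeping in mind when counting the observations available for model fitting.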

Stages in ARIMA model building
The Box-Jenkins methodology for analyzing and modeling time series (Box, Jenkins, & Reinsel, 1994) is characterized by three steps: model identification, parameter estimation and model validation.

Model identification
In building time series models, the data used are supposed to be stationary. If nonstationary data are used in a model, the results may indicate a relationship that is misleading (Baumohl & Lyocsa, 2009). So before identifying the model, the time series data have to be tested for stationarity. Stationary data are those whose statistical properties do not change over time (Studenmund, 2016). More formally, a time series is stationary if it has a constant mean and variance and an autocovariance that does not depend on time (Ramasubramanian, 2001; Studenmund, 2016). If any of these conditions is not met, the data are declared nonstationary. The autocorrelation function (ACF) is applied to the data to detect this problem. If the ACF plot is positive and shows a very slow linear decay pattern, then the data are nonstationary (Nau, 2018). Nonstationarity can be corrected through appropriate differencing of the data if it is caused by the mean, or through transformation if it is caused by the variance (Nau, 2018; Ramasubramanian, 2001). The next step is to find initial values for the autoregressive and moving average orders (p and q). In this step, the ACF and the partial ACF (PACF) are the fundamental analytical tools. To calculate the autocorrelation at lag k, we compute the correlation between $Y_t$ and $Y_{t-k}$ over the n − k pairs in the data set:
$$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \mu)(Y_{t-k} - \mu)}{\sum_{t=1}^{n} (Y_t - \mu)^2}$$
where $Y_t$ is the original series, $Y_{t-k}$ is a lagged version of the original series and $\mu$ is the mean of the data (Studenmund, 2016).
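The PACF is defined as the linear correlation between $Y_t$ and $Y_{t-k}$, controlling for possible effects of linear relationships among values at intermediate lags. The first-order partial autocorrelation is equal to the first-order autocorrelation, but the second-order partial autocorrelation can be calculated as follows:
$$r_{22} = \frac{r_2 - r_1^2}{1 - r_1^2}$$
where $r_1$ and $r_2$ are the autocorrelations at lags 1 and 2. The PACF indicates the order of the AR component (p) and the ACF indicates the order of the MA component (q).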
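As a rough illustration of how the sample ACF and PACF described above can be computed, the sketch below uses only the Python standard library. The function names and the example series are hypothetical, and the PACF is obtained with the Durbin-Levinson recursion, which is one common way to compute it and not necessarily the routine SPSS uses.

```python
def acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag around the overall mean."""
    n = len(y)
    mu = sum(y) / n
    denom = sum((v - mu) * (v - mu) for v in y)
    return [
        sum((y[t] - mu) * (y[t - k] - mu) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(y, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(y, max_lag)
    partials = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        partials.append(phi_kk)
    return partials


if __name__ == "__main__":
    series = [2.0, 4.0, 3.0, 7.0, 6.0, 9.0, 8.0, 12.0]  # hypothetical data
    print(acf(series, 3))
    print(pacf(series, 3))
```

At lag 1 the partial autocorrelation equals the ordinary autocorrelation, and at lag 2 the recursion reduces to (r2 − r1²)/(1 − r1²), in line with the description above.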

Parameter estimation
After identifying suitable order for ARIMA(p,d,q), we attempt to find precise estimates of parameters of the model using least squares as described by Box and Jenkins. The parameters are obtained by maximum likelihood which is asymptotically correct for time series. Estimators are usually sufficient, efficient and consistent for Gaussian distributions and are asymptotically normal and efficient for several non-Gaussian distribution families (Mandal, 2005;Nyoni & Bonga, 2019). In this study, the parameters are estimated using SPSS.
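To make the idea of estimation concrete, here is a small sketch of two closed-form estimates, assuming Gaussian errors: the least squares estimate of the AR(1) parameter (without intercept) and the drift of an ARIMA(0,1,0) model with a constant, which is simply the mean of the first differences. The function names and data are hypothetical, and this is not the SPSS procedure used in the study.

```python
def fit_ar1_least_squares(y):
    """Least squares estimate of a in y_t = a * y_{t-1} + e_t (no intercept)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


def estimate_drift(y):
    """Mean of the first differences, i.e. (y_last - y_first) / (n - 1)."""
    return (y[-1] - y[0]) / (len(y) - 1)


if __name__ == "__main__":
    print(fit_ar1_least_squares([1.0, 2.0, 4.0, 8.0]))  # 2.0
    print(estimate_drift([1.0, 3.0, 5.0, 7.0]))         # 2.0
```

Models with MA terms have no such closed form, which is why statistical packages estimate them iteratively.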

Model validation
Models chosen in the previous stage are validated using two methods: the Bayesian information criterion (BIC) and ACF plots of the residuals.
2.3.3.1. BIC. The BIC is an index used in Bayesian statistics to determine which of two or more models fits best. It is also known as the Schwarz-Bayesian criterion or the Schwarz information criterion. BIC is defined as
$$\mathrm{BIC} = -2\ln L(\hat{\theta}) + k\ln N$$
where $k$ is the number of parameters that the model estimates, $N$ is the number of observations, $\theta$ is the set of all parameters and $L(\hat{\theta})$ is the likelihood of the tested model given the data, evaluated at the maximum likelihood values $\hat{\theta}$ of $\theta$ (Neath & Cavanaugh, 2012).
When using BIC, the model with the lowest BIC value is considered the best fit. BIC is closely related to the Akaike information criterion.
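The model comparison can be sketched as follows, using the standard definition BIC = −2 ln L + k ln N. The log-likelihood values and model labels below are invented purely for illustration; they are not results from this study, and SPSS reports a normalized variant whose scale differs.

```python
import math


def bic(log_likelihood, k, n):
    """Standard BIC: -2 * maximized log-likelihood + k * ln(n)."""
    return -2.0 * log_likelihood + k * math.log(n)


def best_by_bic(candidates, n):
    """Return the label of the candidate with the lowest BIC.

    candidates maps a label to (maximized log-likelihood, number of parameters).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n))


if __name__ == "__main__":
    # Hypothetical log-likelihoods for two candidate models fitted to 52 points.
    candidates = {"model A": (-310.0, 1), "model B": (-309.5, 3)}
    for name, (loglik, k) in candidates.items():
        print(name, bic(loglik, k, 52))
    print(best_by_bic(candidates, 52))
```

In this made-up case the simpler model wins because its small loss in likelihood is outweighed by the penalty on the two extra parameters.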
2.3.3.2. Plot of ACF residuals. Another method is to plot the ACF of the residuals to examine the goodness of fit. If most of the sample autocorrelation coefficients of the residuals are within the limits of ±1.96/√N, where N is the number of observations on which the model is built, then the model is a good fit (Ramasubramanian, 2001; Wang-Chuan et al., 2017). For this study, the number of observations is 52; hence, the limit is ±0.2718.
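The limit quoted above follows directly from the formula, as the short check below shows. The helper that counts how many residual autocorrelations fall inside the band is a hypothetical convenience, and the residual values passed to it are made up.

```python
import math


def acf_limit(n, z=1.96):
    """Approximate 95% band for residual autocorrelations: z / sqrt(n)."""
    return z / math.sqrt(n)


def fraction_within_limit(residual_acf, n):
    """Share of residual autocorrelation coefficients inside +/- the limit."""
    limit = acf_limit(n)
    inside = sum(1 for r in residual_acf if abs(r) <= limit)
    return inside / len(residual_acf)


if __name__ == "__main__":
    print(round(acf_limit(52), 4))                        # 0.2718
    print(fraction_within_limit([0.10, -0.20, 0.30], 52))
```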

Model building
As stated above, three procedures were followed to build ARIMA models for consumption of animal products.

Model identification
In identifying the model, the ACF is applied to the egg consumption data to check whether the data are stationary. Figure 1 shows a very slow linear decay pattern, which is a sign of a nonstationary time series. This problem can be corrected by first-order differencing. After differencing, both the ACF (Figure 2) and PACF (Figure 3) plots show a significant spike only at lag 1, meaning that higher-order autocorrelations are explained by the lag 1 autocorrelation. This implies that there is no need to add AR (p) or MA (q) terms.

Parameter estimation and validation
After trial and error, four models are chosen and tested to find the one with the lowest normalized BIC. These models are ARIMA(1,1,1), ARIMA(0,1,1), ARIMA(1,1,0) and ARIMA(0,1,0). SPSS is used to estimate the parameters of each model using the ARIMA tab in the Expert Modeler. The residual ACF plots for all the models are within the limit of ±0.2718 (see Figures 4-7), but ARIMA(0,1,0) has the lowest normalized BIC and is chosen as the better model to forecast the consumption of eggs (see Table 1).
The ARIMA(0,1,0) model with a constant is represented as
$$y_t = y_{t-1} + c + \varepsilon_t$$
where $y_t$ is the current production, $y_{t-1}$ is the lagged value, $c$ is the constant and $\varepsilon_t$ is the error term or the current shock.
From Table 3, the model is presented as
$$y_t = y_{t-1} + 463.654 + \varepsilon_t$$
Model statistics and parameters for ARIMA(0,1,0) are shown in Tables 2 and 3. The forecast data for egg consumption are presented in Table 4.
Figure 3. PACF plot after first-order differencing of egg consumption data.
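Because future shocks have an expected value of zero, point forecasts from an ARIMA(0,1,0) model with a constant grow by that constant each period. The sketch below illustrates this with the constant 463.654 reported above; the starting value of 1000.0 is a hypothetical placeholder, not the last observed egg consumption figure, and the function name is an assumption.

```python
def forecast_random_walk_with_drift(last_value, drift, steps):
    """h-step point forecasts: last_value + h * drift for h = 1..steps."""
    return [last_value + h * drift for h in range(1, steps + 1)]


if __name__ == "__main__":
    # 463.654 is the fitted constant; 1000.0 is a made-up last observation.
    for h, value in enumerate(forecast_random_walk_with_drift(1000.0, 463.654, 5), 1):
        print(h, round(value, 3))
```

The forecasts therefore lie on a straight line through the last observation, with slope equal to the fitted constant.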

Model identification
The procedure for building the cattle meat consumption model is the same as for the egg model. The ACF is applied to the cattle meat consumption data, and again a very slow linear decay pattern is observed (see Figure 8). This is a sign of nonstationarity, which can be corrected by first-order differencing. After differencing, both the ACF (Figure 9) and PACF (Figure 10) plots show a significant spike only at lag 1, as in the egg consumption data. This implies that there is no need to add AR (p) or MA (q) terms.

Parameter estimation and validation
Four models are chosen and tested to find a model with a good fit. These models are ARIMA(1,1,1), ARIMA(0,1,1), ARIMA(1,1,0) and ARIMA(0,1,0). The residual ACF plots of these models are shown in Figures 11-14. All models are within the limit, which indicates a good fit. The SPSS results presented in Table 5 favor ARIMA(0,1,0), which has the lowest normalized BIC and is chosen as the better model to forecast the consumption of cattle meat.
From Table 7, the model is presented as
$$y_t = y_{t-1} + 4222.712 + \varepsilon_t$$
Model statistics and parameters for ARIMA(0,1,0) are shown in Tables 6 and 7. Forecast data for cattle meat consumption are presented in Table 8.

Model identification
To identify the model, the first step is to plot the ACF of the cow milk consumption data. The plot shows a very slow linear decay pattern, which is a sign of a nonstationary time series (Figure 15). This problem can be corrected by first-order differencing. Figure 16 shows that the autocorrelations are significant for a large number of lags. This is due to the propagation of the autocorrelation at lag 1, as confirmed by Figure 17, which shows a significant spike only at lag 1, meaning that the higher-order autocorrelation is explained by the lag 1 autocorrelation. This implies that the autocorrelation can be explained more easily by adding AR (p) terms than MA (q) terms.

Parameter estimation and validation
Four models are chosen and tested to find a model with a good fit. These models are ARIMA(1,1,0), ARIMA(2,1,0), ARIMA(3,1,0) and ARIMA(3,1,1). The residual ACF plots in Figures 18-21 show that the correlation coefficients of all four models are within the limit. The SPSS results in Table 9, however, favor ARIMA(3,1,0), which has the lowest normalized BIC.
ARIMA(3,1,0) is represented as follows:
$$y'_t = a_1 y'_{t-1} + a_2 y'_{t-2} + a_3 y'_{t-3} + \varepsilon_t$$
where $y'_t$ is the dependent variable; $a_1$, $a_2$ and $a_3$ are the parameters; $y'_{t-1}$, $y'_{t-2}$ and $y'_{t-3}$ are the lagged differenced dependent variables and $\varepsilon_t$ is the random error or white noise term (forecast error).
Model statistics and parameters for this model are shown in Tables 10 and 11. Forecast data for cow milk consumption are presented in Table 12.
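Forecasting from an ARIMA(p,1,0) model such as ARIMA(3,1,0) amounts to predicting the next difference from the previous p differences and adding it back to the last level. The sketch below shows that recursion without a constant term; the function name, the history and the coefficients are all hypothetical and are not the values estimated in Tables 10 and 11.

```python
def forecast_arima_p10(history, coeffs, steps):
    """Point forecasts for ARIMA(p,1,0) without a constant.

    history: observed levels; coeffs: (a_1, ..., a_p) applied to lagged differences.
    """
    diffs = [history[t] - history[t - 1] for t in range(1, len(history))]
    p = len(coeffs)
    if len(diffs) < p:
        raise ValueError("need at least p + 1 observations")
    level = history[-1]
    forecasts = []
    for _ in range(steps):
        # Future shocks are replaced by their expected value of zero.
        next_diff = sum(coeffs[i] * diffs[-1 - i] for i in range(p))
        diffs.append(next_diff)
        level += next_diff
        forecasts.append(level)
    return forecasts


if __name__ == "__main__":
    # Hypothetical levels and coefficients, for illustration only.
    print(forecast_arima_p10([1.0, 2.0, 3.0, 4.0], (0.5, 0.0, 0.0), 2))  # [4.5, 4.75]
```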

Model identification
The ACF is plotted from the chicken meat consumption data. The plot again shows signs of a nonstationary time series (Figure 22). This problem can be corrected by first-order differencing. Figures 23 and 24 show a single spike at lag 1, which is a sign of an MA (q) term. Second-order differencing can also be applied, but with no constant (Nau, 2018).

Parameter estimation and validation
After trial and error, four models are chosen for testing: ARIMA(0,1,1), ARIMA*(0,2,1), ARIMA(0,2,1) and the Brown model. The Brown model is tested because it is equivalent to ARIMA(0,2,1) and because, if there is an element of linear trend in the data, smoothing models may forecast better (Alwashali, Fares, & Mohamed, 2015; Nau, 2018). The ACFs of the four models are plotted and the results are shown in Figures 25-28. The results show that all ARIMA models have some coefficients beyond the limit of ±0.2718; only the Brown model has all coefficients within the limit. Table 13 shows the results of the four models, in which the Brown model has the lowest normalized BIC.
Figure 15. Autocorrelation plot of cow milk consumption data used to test for stationarity.
Figure 16. ACF plot after first-order differencing of cow milk consumption data.
Figure 17. PACF plot after first-order differencing of cow milk consumption data.
Figure 18. Residual plots for ACF and PACF after estimating ARIMA(1,1,0) for cow milk consumption.
Figure 19. Residual plots for ACF and PACF after estimating ARIMA(3,1,0) for cow milk consumption.
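For readers unfamiliar with the Brown model, the sketch below implements the textbook form of Brown's linear (double) exponential smoothing, in which a simple exponential smoother is applied twice and the two smoothed values are combined into a level and a trend. It is an assumption that this matches the SPSS implementation in every detail; the initialization at the first observation, the function name and the sample data are likewise choices made for the example.

```python
def brown_forecast(y, alpha, steps):
    """Brown's linear exponential smoothing with a single weight alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    s1 = s2 = y[0]  # assumed initialization: both smoothers start at y[0]
    for value in y[1:]:
        s1 = alpha * value + (1.0 - alpha) * s1  # first smoothing pass
        s2 = alpha * s1 + (1.0 - alpha) * s2     # second pass, applied to s1
    level = 2.0 * s1 - s2
    trend = alpha / (1.0 - alpha) * (s1 - s2)
    return [level + m * trend for m in range(1, steps + 1)]


if __name__ == "__main__":
    # Hypothetical upward-trending series.
    print(brown_forecast([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 0.5, 3))
```

Unlike simple exponential smoothing, whose forecasts are flat, this form extrapolates the most recent local trend, which is why it can suit series with a linear trend.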
The model is represented as follows:
$$\hat{y}_{t+1} = \sum_{k=0}^{t-1} \alpha (1-\alpha)^{k}\, y_{t-k}$$
where $\hat{y}_{t+1}$ is the forecast value (a weighted moving average of all past observations), $\alpha$ is the series weight for each observation and $y_{t-k}$ are the lagged observations (Ostertagová & Osterta, 2011). Model statistics and parameters for this model are shown in Tables 14 and 15. Forecast data for chicken meat consumption are presented in Table 16.
Overall results from all four estimated models show that production of all livestock products will increase (Table 17). Cow milk is going to be consumed most in the future, followed by chicken and cattle meat. This is expected to increase the demand for animal feed and is an opportunity for sorghum and millet farmers to capture new markets in animal feed, provided they can produce enough to keep up with the increase in demand.

Conclusion
The animal feed industry is one of the potential market opportunities for sorghum and millet. As population and income grow, the demand for livestock products (such as meat, milk and eggs) is expected to grow. As a consequence, the demand for animal feed is also expected to increase.
At the moment, efforts should be made to encourage the use of sorghum and millet in animal feed, which will help to reduce competition between humans and livestock over maize. As long as there is a market for sorghum and millet in the feed industry, smallholder farmers in semiarid regions will be encouraged to abandon traditional crops that are rain dependent. This will improve their livelihoods and reduce poverty in these areas.
The forecasts in this study do not consider some other factors that may influence the consumption of livestock products. For instance, the population growth rate has been very high in recent years, which will also have a positive effect on the consumption of livestock products. Another factor that may have an impact is a change in eating behavior. In recent years, people have learnt much about healthy eating, which may affect the consumption of livestock products that are argued to be unhealthy, especially red meat. Hence, more research is needed to understand by how much these two factors may affect the consumption of livestock products, so as to measure the real opportunity for sorghum and millet farmers to capture a market in the feed industry.

Funding
The author received no direct funding for this research.