Statistical analysis and forecasting of cotton yield dynamics in Kashkadarya region of Republic of Uzbekistan

In practically every discipline there are phenomena that are important to study because of how they develop and change over time. One may attempt to control a process, forecast the future from knowledge of the past, or characterize the distinctive features of a series from a finite amount of information. Techniques for handling time series draw heavily on those developed in mathematical statistics for distribution series, and statistics today offers time series methods ranging from the most basic to the most complex. This article presents a statistical analysis of a time series, the average cotton yield in the Kashkadarya region of the Republic of Uzbekistan, using data from the Central Statistical Office of Uzbekistan for 2001 to 2020. The study constructed point and interval estimates of the average cotton yield with a 95% guarantee, identified the types of trend, and forecast future yields for the region. The Durbin-Watson criterion showed that the average cotton yield is autocorrelated, that is, the yield in the current year depends on yields in past years. The methods used in this study can be applied in further research by students and scientists.


Introduction
In almost every field there are phenomena that are important to study in their development and change over time. One can, for example, strive to predict the future on the basis of knowledge of the past, to control a process, to describe the characteristic features of a series on the basis of a limited amount of information. When processing time series, the methods are largely based on the methods developed by mathematical statistics for distribution series. To date, statistics has a variety of methods for analyzing time series from the most elementary to the most complex [1][2][3][4][5].
The statistical analysis and forecasting of cotton yield dynamics involves studying the pattern of change in the yield of cotton over time using statistical methods. This type of analysis is commonly performed using time series analysis, which involves analyzing data collected over a period of time [7][8][9][10][11]. Time series analysis can help identify trends, seasonal patterns, and other patterns in the data, which can be used to make predictions about future yield. In the context of cotton yield dynamics, statistical analysis may involve collecting data on the average yield of cotton in a particular region or country over a period of time [8][9][10][11][12][13][14][15]. This data can then be analyzed using various statistical methods, including regression analysis, ARIMA modeling, and exponential smoothing. These methods can help identify trends and patterns in the data and make predictions about future yield [7][8][9][10][14][15][16][17].
In addition to predicting future yield, statistical analysis of cotton yield dynamics can also be used to identify factors that affect yield. For example, researchers may investigate the impact of weather patterns, irrigation methods, or other agricultural practices on cotton yield. By identifying factors that affect yield, researchers can develop strategies to improve yield and reduce losses [11][12][13][14]. Overall, the statistical analysis and forecasting of cotton yield dynamics is an important tool for agricultural researchers and policymakers. By identifying trends, making predictions, and identifying factors that affect yield, this type of analysis can help improve cotton production and ensure food security for communities that depend on cotton as a source of income and sustenance [17][18][19].
There are three main tasks in the study of time series. The first is to describe the change in the corresponding indicator over time and to identify certain properties of the series under study [7][8][9][10][11]. A variety of methods are used for this: calculating a general indicator of the change in levels over time and the average growth rate; applying smoothing filters that reduce fluctuations in the levels and make development trends clearer; selecting curves that characterize the trend; identifying seasonal, other periodic, and random fluctuations; and measuring the dependence between the members of the series (autocorrelation) [11][12][13][14][15]. The second task is to explain the mechanism by which the levels of the series change; regression analysis is usually used to solve it. The third task is statistical forecasting, which often draws on both the description of the change in the time series and the explanation of the mechanism that forms it, and which in most cases comes down to extrapolating the detected development trends [16][17][18][19].
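Two of the descriptive tools listed above, the average growth rate and a smoothing filter, can be sketched as follows. This is a minimal illustration, not the article's own procedure: the function names are hypothetical, the average growth rate is taken as the geometric mean of the chain growth rates, the filter is a simple moving average, and the sample values are invented.

```python
from typing import List, Sequence


def average_growth_rate(levels: Sequence[float]) -> float:
    """Geometric mean growth rate: (y_n / y_1) ** (1 / (n - 1))."""
    if len(levels) < 2 or levels[0] <= 0 or levels[-1] <= 0:
        raise ValueError("need at least two levels with positive end points")
    return (levels[-1] / levels[0]) ** (1.0 / (len(levels) - 1))


def moving_average(levels: Sequence[float], window: int = 3) -> List[float]:
    """Simple moving average; returns len(levels) - window + 1 values."""
    if window < 1 or window > len(levels):
        raise ValueError("window must be between 1 and len(levels)")
    return [
        sum(levels[i:i + window]) / window
        for i in range(len(levels) - window + 1)
    ]


# Hypothetical yields in q/ha, NOT the article's data.
yields = [20.0, 22.0, 21.0, 24.0, 26.0]
print(average_growth_rate(yields))
print(moving_average(yields, window=3))
```

A wider window smooths more strongly but shortens the smoothed series, which matters for a series of only a few dozen annual observations.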

Materials and methods
The tasks mentioned above were addressed as follows: 1) Studying the yield of agricultural crops as a discrete dynamic series and forecasting yield from experimental data play an important role in determining the economic efficiency of farms and dekhkan farms; 2) In this work, cotton yield in the Kashkadarya region of the Republic of Uzbekistan over the observation period 1991-2020 was processed and analyzed as a discrete time series; 3) Using methods of statistical analysis of time series, point and interval estimates of the average cotton yield were constructed, the explicit types of trend were determined, the yield was forecast for subsequent years, and various statistical hypotheses were tested [1][2][3][4][5][6][7].
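The point and interval estimates of the mean mentioned in item 3 can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical, the sample values are invented, and the article does not say which critical value it used. By default the sketch uses a normal quantile; a Student quantile can be passed instead through the `critical` argument (for 29 degrees of freedom and a 95% level it is about 2.045).

```python
import math
from statistics import NormalDist, mean, stdev
from typing import Optional, Sequence, Tuple


def mean_confidence_interval(
    sample: Sequence[float],
    confidence: float = 0.95,
    critical: Optional[float] = None,
) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) for the mean."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    center = mean(sample)
    std_error = stdev(sample) / math.sqrt(n)  # stdev uses the n - 1 divisor
    if critical is None:
        # Assumption: normal approximation when no quantile is supplied.
        critical = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = critical * std_error
    return center, center - half_width, center + half_width


# Hypothetical yields in q/ha, NOT the article's data.
yields = [23.1, 25.4, 24.8, 26.9, 25.0, 27.2]
print(mean_confidence_interval(yields))
```

The interval is only a rough guide when the observations are autocorrelated, because the standard error formula assumes independent observations.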
In general, the time series $\{Y_t,\ t \in T\}$ consists of four components: trend; fluctuations relative to the trend; seasonality effect; random component.
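One common way to write these four components is the additive form below. This is an assumption made for illustration: the article does not state whether an additive or a multiplicative model is intended, and the component symbols $T_t$, $C_t$, $S_t$ and $\varepsilon_t$ are introduced here.

```latex
Y_t = T_t + C_t + S_t + \varepsilon_t ,
\qquad
\begin{aligned}
T_t &: \text{trend}, \\
C_t &: \text{fluctuations relative to the trend}, \\
S_t &: \text{seasonality effect}, \\
\varepsilon_t &: \text{random component}.
\end{aligned}
```

For annual data such as yearly yields, a within-year seasonal term cannot be observed, so in practice the analysis concentrates on the trend and the random component.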

Results and discussion
This study examines cotton yields in the Kashkadarya region for the period 1991 to 2020. The collected data are presented in a table and were used to analyze $\bar{Y}$, the average yield of cotton in the region. Four graphical representations of the data were created: a scatter plot, a pie chart, a histogram, and an area chart.
Each of these graphs displays the data in a different way, which allows a more comprehensive analysis of the cotton yield dynamics in the region over the given period (Fig. 1). The geometric picture of the observed data in the coordinate system gives grounds, to a first approximation, for the hypothesis that the trend part of the process (its general direction of development) is linear, $Y(t) = a_1 t + a_0$, where the unknown parameters $a_0$ and $a_1$ are determined by the least squares method from the experimental data by solving the system of normal equations (1) with the calculations in Table 1, which gives the trend (2). The main hypothesis $H_0: a_1 = 0$ was rejected and the alternative hypothesis $H_1: a_1 \neq 0$ was accepted at the significance level $\alpha = 0.05$.
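The least squares step can be sketched as follows. This is a minimal illustration, not the article's calculation: it writes the linear trend as Y(t) = a1*t + a0 (the coefficient names a0 and a1 are assumed here), solves the standard normal equations for a straight line, and computes the usual t statistic for the slope. The article does not say which test statistic it used, the function names are hypothetical, and the data are invented.

```python
import math
from typing import Sequence, Tuple


def fit_linear_trend(t: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Solve the normal equations of the line y = a0 + a1 * t:

        a0 * n      + a1 * sum(t)    = sum(y)
        a0 * sum(t) + a1 * sum(t^2)  = sum(t * y)

    Returns (a0, a1).
    """
    n = len(t)
    if n != len(y) or n < 3:
        raise ValueError("t and y must have equal length of at least 3")
    st, sy = sum(t), sum(y)
    stt = sum(ti * ti for ti in t)
    sty = sum(ti * yi for ti, yi in zip(t, y))
    det = n * stt - st * st
    if det == 0:
        raise ValueError("all t values are identical")
    a1 = (n * sty - st * sy) / det
    a0 = (sy - a1 * st) / n
    return a0, a1


def slope_t_statistic(t: Sequence[float], y: Sequence[float]) -> float:
    """t statistic for H0: a1 = 0, with n - 2 degrees of freedom."""
    a0, a1 = fit_linear_trend(t, y)
    n = len(t)
    t_mean = sum(t) / n
    sse = sum((yi - (a0 + a1 * ti)) ** 2 for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    std_error = math.sqrt(sse / (n - 2) / sxx)
    if std_error == 0:
        raise ValueError("perfect fit: the t statistic is undefined")
    return a1 / std_error


# Hypothetical yields in q/ha for years indexed 1..6, NOT the article's data.
years = [1, 2, 3, 4, 5, 6]
yields = [22.0, 23.5, 23.0, 25.1, 26.0, 26.4]
print(fit_linear_trend(years, yields))
print(slope_t_statistic(years, yields))
```

H0 is rejected at level 0.05 when the absolute value of the statistic exceeds the two-sided Student critical value for n - 2 degrees of freedom.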
In many observational problems the sample observations are statistically independent, but in time series they are usually dependent, and the nature of this dependence can be determined by the position of the observations in the sequence. Autocorrelation is the correlation between successive and preceding members of a time series. The presence of autocorrelation in the average cotton yields in the region was checked, and the following relation was obtained: $Y_t = \rho Y_{t-1} + \varepsilon_t$. For further research, it was necessary to calculate the finite differences of the observed data (Table 2).
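Finite differences and a lag autocorrelation coefficient can be sketched as below. This is an illustration only: the function names are hypothetical, the data are invented, and the autocorrelation estimator shown (lagged cross products of deviations from the overall mean, divided by the total sum of squared deviations) is one common choice that may differ from the formula in the literature the article cites.

```python
from typing import List, Sequence


def finite_differences(y: Sequence[float], order: int = 1) -> List[float]:
    """Apply the first difference y[i + 1] - y[i] `order` times."""
    if order < 0:
        raise ValueError("order must be non-negative")
    result = list(y)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result


def autocorrelation(y: Sequence[float], lag: int) -> float:
    """Sample autocorrelation coefficient at the given positive lag."""
    n = len(y)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(y)")
    m = sum(y) / n
    denom = sum((v - m) ** 2 for v in y)
    if denom == 0:
        raise ValueError("series is constant")
    num = sum((y[i] - m) * (y[i + lag] - m) for i in range(n - lag))
    return num / denom


# Hypothetical yields in q/ha, NOT the article's data.
yields = [22.0, 23.5, 23.0, 25.1, 26.0, 26.4, 27.9]
print(finite_differences(yields, order=1))
print([autocorrelation(yields, k) for k in range(1, 4)])
```

If the series follows a linear trend, its first differences fluctuate around a constant, which is why differencing is a standard way to remove such a trend.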
The finite differences of the observed data are given in Table 2. The coefficients of variation of the differences were then calculated and found to be approximately equal, $V_1 \approx V_2 \approx V_3$. Therefore, first-order finite differences eliminate the linear trend (Table 3). Using Table 3 and the formulas from the literature [1][2][3][4][5], the values of the autocorrelation coefficients $r_\tau$, $\tau = 1, 2, 3, 4, 5$, were determined (where $\tau$ is the time shift, i.e. the time interval by which one phenomenon lags behind another associated with it). The difference of these values from zero gives reason to believe that there is significant autocorrelation in the cotton yield. The hypothesis that an autocorrelation dependence exists in the cotton yield was then checked using the Durbin-Watson criterion. The Durbin-Watson criterion also confirms, with a 95% guarantee, that the average cotton yield in the region has the autocorrelation dependence $Y_t = \rho Y_{t-1} + \varepsilon_t$. The calculations were carried out at the accepted significance level $\alpha = 0.05$ (Table 4).
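The Durbin-Watson statistic itself is simple to compute from a residual series, as the sketch below shows. The function name is hypothetical and the residuals are invented; the article's residuals and tabulated critical bounds are not reproduced here.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """d = sum((e[t] - e[t-1])^2) / sum(e[t]^2); d lies between 0 and 4."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    denom = sum(e * e for e in residuals)
    if denom == 0:
        raise ValueError("all residuals are zero")
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    return num / denom


# Hypothetical residuals from a fitted trend, NOT the article's data.
residuals = [0.4, 0.6, 0.3, -0.2, -0.5, -0.4, 0.1, 0.3]
print(durbin_watson(residuals))
```

Values near 2 suggest no first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation. The decision is made by comparing d with the tabulated lower and upper bounds for the given sample size and number of regressors.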

Conclusions
Based on the above statistical analysis of the dynamics of cotton yield in the Kashkadarya region of the Republic of Uzbekistan as a discrete time series, with reliability 0.95 (Table 4), the following conclusions can be drawn:
1. Point and interval statistical estimates of $\bar{Y}$, the average cotton yield in the Kashkadarya region, were constructed. In particular, with a 95% guarantee, the average cotton yield $\bar{Y}$ in the Kashkadarya region lies between 24.37 and 26.17 q/ha.
2. The explicit type of the trend was determined and its linearity was established: $Y(t) = 0.54t + 25.27$.
3. Using the Durbin-Watson criterion, it was found that the average cotton yield in the region has the autocorrelation dependence $Y_t = \rho Y_{t-1} + \varepsilon_t$, where $\rho = \operatorname{Cov}(Y_t, Y_{t+1}) = M[(Y_t - \bar{Y})(Y_{t+1} - \bar{Y})]$.
Overall, the analysis shows that the dynamics of the average cotton yield in the Kashkadarya region form a non-stationary time series.
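The reported interval and trend can be checked with a few lines of arithmetic. This sketch uses only the numbers stated in the conclusions: the interval bounds 24.37 and 26.17 q/ha and the trend coefficients 0.54 and 25.27. The function name `trend_value` is hypothetical, and the time origin of t is not stated in the article, so t is left as an abstract time index.

```python
# Bounds of the 95% interval reported in the conclusions (q/ha).
lower, upper = 24.37, 26.17

midpoint = (lower + upper) / 2.0     # point estimate implied by the interval
half_width = (upper - lower) / 2.0   # margin of error implied by the interval

print(round(midpoint, 2), round(half_width, 2))


def trend_value(t: float, slope: float = 0.54, intercept: float = 25.27) -> float:
    """Value of the reported linear trend at time index t.

    Assumption: t is measured in the article's own (unstated) time units.
    """
    return slope * t + intercept


print(trend_value(0.0))
```

The midpoint of the interval, about 25.27 q/ha, coincides with the intercept of the reported trend. For a least squares line this happens when the time index is centered so that its values sum to zero. That is one possible reading of the result, not something the article states.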