SeedCalc, a new automated R software tool for germination and seedling length data processing

Abstract: The need to optimize seed quality assessment with more accessible and modern computational resources has led to the emergence of new tools. In this paper, we introduce SeedCalc, a new R package developed to process germination and seedling length data. The functions in SeedCalc allow fast and efficient data processing, make the generated variables more reliable, and facilitate statistical analysis, since the processed data already have the structure required for analysis in R. SeedCalc is available free of charge at https://CRAN.R-project.org/package=SeedCalc.


Introduction
Electronic spreadsheets are widely used to tabulate data, perform a wide range of experimental calculations, and process information. Nevertheless, errors in formulas or in the handling of spreadsheets can significantly compromise the information generated (Powell et al., 2009). Thus, it is important to develop tools that minimize these risks and increase the reliability of results.
Tests for evaluation of seed quality are routinely used in experimentation. One of the most prominent of these routine tests is the germination test, nearly always associated with some vigor test, such as a seedling growth test (Nakagawa et al., 1999). In the germination test, daily counting of the number of seeds germinated is common, and these data are frequently used for calculation of the Germination Speed Index (Maguire, 1962) and Germination Speed (Edmond and Drapala, 1958). This same procedure is applied to the seedling emergence data. Nevertheless, various other indexes can be calculated based on these data, which may be useful for making inferences regarding different aspects of seed germination response, such as uniformity, coefficient of variation, synchrony, among others.
For seedling growth data, shoot, root, and total length are the variables most often used when image analysis software, which automatically calculates additional indexes from these measurements, is not available. Several indexes can be calculated from these data, but doing so by hand can be laborious when a large number of lots or treatments is involved, as in phenotyping studies (Joosen et al., 2010), in which many seed lots or seeds from different cultivars are analyzed.
Among the tools that can serve these purposes is SeedCalc, a new R package. R (R Core Team, 2019) is a powerful environment for statistical data analysis that has grown in importance in seed research. Many packages are available for this platform, which broadens its applicability to a wide variety of data types. SeedCalc was developed from the need to optimize the calculation of seed quality indicators. The package is available free of charge through CRAN (https://CRAN.R-project.org/package=SeedCalc), and it can be used on various platforms, such as Windows, macOS, and Linux.
For calculations, SeedCalc uses data from daily counting of germination/emergence and seedling length measurements to automatically generate a series of variables related to seed physiological quality. It includes calculations of germination/emergence percentage, time required for germination or emergence (T10, T50, T90, and mean germination time), speed (germination speed index, mean germination rate), variability or heterogeneity (coefficient of variation of germination time, variance of germination), uncertainty, and synchrony. In addition, through seedling measurements, some indexes can be generated, such as growth, uniformity, and vigor indexes.
Thus, considering the importance of developing systems that allow calculation of seed quality indicators in a simple and automatic way, the aim of this study was to present the SeedCalc package and its applicability to analysis of data obtained from germination and seedling growth tests.

Materials and Methods
All the functions were written in the R programming language and therefore run in the R environment. R can be installed on Windows, Linux, or Mac systems, so the package can be freely used by the scientific community regardless of the operating system.
The SeedCalc package can be installed by entering the following command in R:
> install.packages("SeedCalc")
The following command loads the package:
> library(SeedCalc)
The functions included in the package and the equations they use, which were taken from the seed science literature, are described below.
GermCalc: applies all the functions related to data of seed germination and seedling emergence (Table 1).
PlantCalc: applies all the functions related to seedling analysis (Table 2).
To use these functions, the original data can be saved in text files or electronic spreadsheets, but the files must meet some requirements. For the GermCalc function, the first column of the file must contain the time (in any unit: days, hours, etc.), and the remaining columns, each identified by treatment/lot, must contain the cumulative number of seeds germinated (see Figure 1). For the PlantCalc function, the file must contain four columns in the following order: identification of the lot/treatment, seedling (identified from 1 to n, with n being the total number of seedlings), shoot length, and root length. The column titles, contained in the first line of the file, can have any name, as long as the columns follow this order (see Figure 2A).
For generation of the corrected vigor index (Medeiros and Pereira, 2018), an additional file is necessary, composed of two columns: the first identifies the treatments/lots and the second, the germination percentages (see Figure 2B). The identifications of the treatments/lots must be identical in both files.
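As a hedged illustration of the file layouts described above, the following R sketch builds the three inputs as in-memory data frames. All column names, lot names, and values are hypothetical and were chosen only to show the column order; they are not data from this study. Numbering the seedlings within each lot is also an assumption.

```r
# Hypothetical germination counts: the first column is time, and the remaining
# columns hold the cumulative number of germinated seeds per lot.
germ_counts <- data.frame(
  Day  = 1:5,
  LotA = c(0, 10, 30, 45, 48),
  LotB = c(0,  5, 20, 35, 40)
)

# Hypothetical seedling measurements, in the required column order:
# lot, seedling number, shoot length, root length (illustrative values).
# Seedlings are numbered within each lot here, which is an assumption.
seedlings <- data.frame(
  Lot      = rep(c("LotA", "LotB"), each = 3),
  Seedling = rep(1:3, times = 2),
  Shoot    = c(3.1, 2.8, 3.4, 2.0, 1.5, 2.6),
  Root     = c(9.5, 8.7, 10.2, 6.1, 4.0, 7.3)
)

# Hypothetical germination percentages for the corrected vigor index.
# Lot identifiers must be identical to those in the seedling data.
germ_pct <- data.frame(
  Lot         = c("LotA", "LotB"),
  Germination = c(96, 80)
)

# Basic structural checks before passing the data on.
stopifnot(all(diff(germ_counts$LotA) >= 0))   # cumulative counts never decrease
stopifnot(setequal(unique(seedlings$Lot), germ_pct$Lot))
```

Data stored in text files with the same layout could be read with read.table(..., header = TRUE) to obtain equivalent data frames.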
As an example of practical application of the SeedCalc package, real data were used from an experiment conducted with five commercial lots of soybean seeds. The seeds were tested for germination in accordance with the Rules for Seed Testing (Brasil, 2009). Four replications of 50 seeds each were placed on paper toweling (Germitest®) moistened with distilled water in an amount equal to 2.5 times the weight of the paper. The paper was then rolled, and the rolls were kept in a BOD incubator at 25 °C under a regime of 8 hours of light and 16 hours of darkness. The number of normal seedlings was counted daily. For the seedling growth test, the seeds were placed in a linear arrangement on the upper third of the paper toweling (Germitest®) and were maintained under the same conditions described for the germination test. Three days after the beginning of the test, shoot and root length were measured using a millimeter ruler. The data obtained in the germination and seedling growth tests were processed with the SeedCalc package.

Table 1. Variables calculated by the GermCalc function from seed germination and seedling emergence data.
FGP: Final germination percentage. $FGP = (n/N) \times 100$, where $n$ is the number of seeds germinated and $N$ is the total number of seeds. Reference: ISTA (2015).
GI: Germination speed index. $GI = \sum_{i=1}^{k} n_i / t_i$, where $n_i$ is the number of seeds germinated on each day of daily counting up to the last count, and $t_i$ is the number of days after the beginning of the test in each count. Reference: Maguire (1962).
MGT: Mean germination time. $\bar{t} = \sum_{i=1}^{k} n_i t_i / \sum_{i=1}^{k} n_i$, where $n_i$ is the number of seeds germinated per day (not the accumulated number, but the number corresponding to the i-th observation), and $t_i$ is the time from the beginning of the germination test to the i-th observation.
VarGer: Variance of germination time. $S_t^2 = \sum_{i=1}^{k} n_i (t_i - \bar{t})^2 / (\sum_{i=1}^{k} n_i - 1)$, where $\bar{t}$ is the mean germination time, $t_i$ is the time between the beginning of the experiment and the i-th observation (day or hour), $n_i$ is the number of seeds germinated in time i, and $k$ is the last count of the germination test. Reference: Labouriau (1983).
CVt: Coefficient of variation of germination time. $CV_t = (S_t / \bar{t}) \times 100$, where $S_t$ is the standard deviation of the germination time and $\bar{t}$ is the mean germination time. Reference: Carvalho et al. (2005).
Sinc: Germination synchrony. $Z = \sum_{i=1}^{k} C_{n_i,2} / N$, with $C_{n_i,2} = n_i (n_i - 1)/2$ and $N = \sum n_i (\sum n_i - 1)/2$, where $C_{n_i,2}$ is the combination of the seeds germinated in time i, two by two, and $n_i$ is the number of seeds germinated in time i. Reference: Primack (1980).
Unc: Uncertainty. $U = -\sum_{i=1}^{k} f_i \log_2 f_i$, with $f_i = n_i / \sum_{i=1}^{k} n_i$, where $f_i$ is the relative frequency of germination and $n_i$ is the number of seeds germinated on day i. Reference: Labouriau and Valadares (1976).
CVG: Germination speed coefficient. $CVG = (\sum f_i / \sum f_i x_i) \times 100$, where $f_i$ is the number of newly germinated seeds on day i and $x_i$ is the number of days from sowing. Reference: Nichols and [...]
UnifG: Uniformity of germination. $UnifG = T_{90} - T_{10}$, where $T_{90}$ is the time required for germination of 90% of the seeds and $T_{10}$ is the time required for germination of 10% of the seeds. Reference: Demilly et al.

Table 2. Variables calculated by the PlantCalc function from seedling measurements.
Mean shoot length: $\sum_{i=1}^{n} HL_i / n$, where $HL_i$ is the length of the shoot of each seedling and $n$ is the total number of seedlings evaluated. Reference: Nakagawa et al. (1999).
mean_raiz: Mean root length. $\sum_{i=1}^{n} RL_i / n$, where $RL_i$ is the length of the root of each seedling and $n$ is the total number of seedlings evaluated. Reference: Nakagawa et al. (1999).
mean_total: Mean total length. $\text{Mean total} = \sum_{i=1}^{n} SL_i / n$, where $SL_i$ is the total length of each seedling and $n$ is the total number of seedlings evaluated. Reference: Nakagawa et al. (1999).
mean_razao: Mean of the root/shoot ratio. $\sum_{i=1}^{n} RRA_i / n$, where $RRA_i$ is the ratio between the root and the shoot of each seedling and $n$ is the total number of seedlings evaluated.

Unif_1: Uniformity index. In the equation, $X_i$ is the length of the seedling analyzed, $\bar{X}$ is the mean length of seedlings of the seed lot analyzed, $n$ is the total number of seedlings evaluated, $n_{dead}$ is the number of ungerminated or dead seeds present, and $n_{total}$ is the total number of seedlings. Reference: Christiansen (1942), adapted by Castan et al. (2018).
Unif_2: [...]
Growth index: $\text{Growth} = [\text{mean}(h) \times w_h] + [\text{mean}(r) \times w_r]$, where mean(h) and mean(r) are the arithmetic means of shoot length and root length, respectively, and $w_h$ and $w_r$ are adjustable weights for shoot and root, with reference values of 10 and 90, respectively.
Vigor index: $\text{Vigor} = (\text{Growth} \times w_g) + (\text{Uniformity} \times w_u)$, where Growth is the growth index, Uniformity is the uniformity index chosen by the user, and $w_g$ and $w_u$ are adjustable weights for growth and uniformity, with standard values of 70 and 30, respectively.
Corrected vigor index: in the equation, $G$ is the percentage of germination of the seed lot. Reference: Medeiros and Pereira (2018).

Figure 1. Presentation of the germination count data organized for processing by SeedCalc.

Results and Discussion
The collection of functions available in the SeedCalc package implements various methods to describe the duration of the germination process in terms of germination indexes, as well as indexes related to seedling development. Thus, the data obtained in the germination and seedling growth tests were used to exemplify the application of the SeedCalc package. The use of the functions contained in the package and the interpretation of their outputs are better presented in the form of an example of application with real data.
As described, SeedCalc has two main functions: GermCalc and PlantCalc. The GermCalc function analyzes the daily count data and takes the "NSeeds" argument, which is the number of seeds used in each replication of the germination test. The commands used to calculate the germination indexes from the data presented in Figure 1 are:
> dados_ger <- read.table('dados_ger.txt', header = TRUE)
> GermCalc(dados_ger, NSeeds = 50)
Through this function, the variables presented in Figure 3 were generated.
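To make the definitions behind these outputs concrete, the sketch below computes several of the germination indexes by hand in base R from one column of cumulative counts. It illustrates the published formulas and is not the internal code of GermCalc; the helper name germ_indexes and the example counts are hypothetical, and the germination speed coefficient is assumed to take the usual form 100 × Σf / Σ(f·x).

```r
# Hand calculation of some germination indexes from cumulative counts.
# `germ_indexes` is a hypothetical helper, not a SeedCalc function.
germ_indexes <- function(time, cumulative, n_seeds) {
  n     <- diff(c(0, cumulative))      # seeds newly germinated at each count
  total <- sum(n)                      # total number of germinated seeds
  fgp   <- total / n_seeds * 100       # final germination percentage
  gi    <- sum(n / time)               # germination speed index (Maguire, 1962)
  mgt   <- sum(n * time) / total       # mean germination time
  var_t <- sum(n * (time - mgt)^2) / (total - 1)  # variance of germination time
  cvt   <- sqrt(var_t) / mgt * 100     # coefficient of variation of germination time
  f     <- n[n > 0] / total            # relative frequencies (zero counts dropped)
  unc   <- -sum(f * log2(f))           # uncertainty
  sinc  <- sum(n * (n - 1) / 2) / (total * (total - 1) / 2)  # synchrony
  cvg   <- total / sum(n * time) * 100 # germination speed coefficient (assumed form)
  c(FGP = fgp, GI = gi, MGT = mgt, VarGer = var_t,
    CVt = cvt, Unc = unc, Sinc = sinc, CVG = cvg)
}

# Illustrative data: 50 seeds, counted daily for five days.
res <- germ_indexes(time = 1:5, cumulative = c(0, 10, 30, 45, 48), n_seeds = 50)
print(round(res, 3))
```

Times must be greater than zero because the germination speed index divides by them. Comparing such a hand calculation with the package output on a small data set is one way to check that the input file is organized as expected.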
The PlantCalc function analyzes the seedling length data and can optionally receive the "Ger" argument, which refers to the mean germination recorded for the lots/treatments and is required to generate the corrected vigor index. The expression used to carry out the analysis, using the data presented in Figure 2, follows:

> PlantCalc
The variables presented in Figure 4 were generated through these commands. These variables are normally generated automatically by specific software for seedling image analysis, such as Vigor-S® (Castan et al., 2018), SVIS® (Sako et al., 2001), Groundeye® (previously referred to as SAS) (Pinto et al., 2015), and SAPL® (Medeiros and Pereira, 2018). However, the few systems developed so far are limited to the analysis of a restricted number of species. Making these quality indexes available independently of such systems would therefore facilitate their use for these and other species.
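The growth and vigor formulas given in Materials and Methods can be sketched in base R as follows. This is a minimal illustration and not the PlantCalc implementation: the helper names, the example measurements, and the uniformity value are hypothetical, and dividing the weights by 100 (treating 10/90 and 70/30 as percentages) is an assumption, since the scaling of the weights is not specified in the text.

```r
# Hypothetical helpers illustrating the growth and vigor formulas.
growth_index <- function(shoot, root, wh = 10, wr = 90) {
  # Weights are treated as percentages (assumption).
  mean(shoot) * wh / 100 + mean(root) * wr / 100
}

vigor_index <- function(growth, uniformity, wg = 70, wu = 30) {
  # Weights are treated as percentages (assumption).
  growth * wg / 100 + uniformity * wu / 100
}

# Illustrative seedling lengths for one lot.
shoot <- c(3.1, 2.8, 3.4)
root  <- c(9.5, 8.7, 10.2)

growth <- growth_index(shoot, root)
# The uniformity value is a placeholder; its formula is not reproduced here.
vigor <- vigor_index(growth, uniformity = 8.0)
print(c(Growth = growth, Vigor = vigor))
```

Keeping the weights as function arguments mirrors the description of them as adjustable, so that the relative emphasis on root versus shoot, or on growth versus uniformity, can be changed without rewriting the calculation.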
With SeedCalc, the indexes are implemented automatically, and can be used to generate more detailed information regarding the vigor of seed lots of any species, regardless of the way the data were acquired and without the need to use specific seedling analysis systems. That way, this information becomes more accessible to the scientific community, and has a direct impact on the seed sector. To illustrate, in Figure 5, soybean seedlings at three days after the beginning of the germination test can be seen from two samples, with their respective quality indexes. Lot B exhibits seedlings of shorter length and an irregular pattern, which is reflected in lower growth, uniformity, and vigor indexes compared to Lot A.
SeedCalc is an innovative and efficient tool for calculating indexes of seed germination and seedling performance. The functions developed allow fast and efficient data processing, making the generated variables more reliable and facilitating statistical analysis, since the processed data have a structure suitable for analysis in R (R Core Team, 2019). Because these functions run in R, they can be used freely by the scientific community.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Conclusions
The SeedCalc package is a free tool that generates indexes from daily count data of germination/emergence tests and from seedling growth tests. It is a powerful tool for research and can help establish new computational approaches in the seed technology sector.