Characterization dataset of oil palm empty fruit bunch (OPEFB) fibers – Natural reinforcement/filler for materials development

Natural fibers used as reinforcements or fillers for materials development greatly affect the properties and performance of end-use applications. As a consequence of conditioning processes such as grinding and sieving, fiber length varies significantly. It is thus necessary to describe the length as a statistical distribution rather than as a single mean value. This approach requires measuring the length of a large number of fibers; however, collecting a very high number of data points demands both long time frames and a significant amount of work. To address these issues, this article details a facile methodology to measure the length of a large number of natural fibers of oil palm empty fruit bunch (OPEFB), together with a statistical analysis to verify the correspondence between theoretical distributions and experimental data. Moreover, further information on the spectrophotometric, physico-chemical, mechanical, thermal, and morphological characteristics of OPEFB fibers from oil palm cultivation in Ecuador is presented. The data will contribute to a comprehensive and rigorous description of the overall effects of natural fiber length on material properties.


© 2022 Escuela Politécnica Nacional. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)

Value of the Data
• The testing method for fiber length can be used by researchers to measure the dimensions of a large number of fibers. Statistical analysis of the OPEFB fiber length data led to models that can be applied to other natural fibers.
• Fiber length distribution after the milling process is a particularly important parameter in natural fiber reinforced composite materials, in adsorbents for batch/continuous adsorption processes, and in briquettes for power generation.
• The characterization data provide valuable information on the physico-chemical, morphological, and dimensional properties of OPEFB fibers, a biowaste widely generated worldwide.

Data Description
The non-homogeneous nature of OPEFB fibers has a paramount influence on the properties and performance of end-use applications [1, 2]. In this regard, fiber length varies markedly, not only in OPEFB fibers but also in a great number of natural fibers, which magnifies and diversifies the aforementioned influence [3, 4]. In particular, conditioning processes such as grinding considerably increase the dispersion of fiber length. In the case of natural fiber sieving, retention is not solely determined by sieve aperture size. The heterogeneous dimensions of the fibers, surface defects, static electricity, and other factors, combined with random movement during the process, lead to a very complex classification mechanism. In this context, it is necessary to characterize the length as a range or distribution rather than as a single mean value. However, obtaining numerous fiber length measurements in a short time frame is difficult because of the amount of work required.
On the other hand, it is also important to complement the available information on the main properties of OPEFB fibers from different geographical zones worldwide. Since the physical, chemical, and morphological properties of oil palm crops may vary with edaphoclimatic conditions, further characterization is essential to describe and contrast the effects of using their wastes in end-use applications. In this article, raw OPEFB fibers from oil palm cultivation in Ecuador were evaluated to obtain reference data on their length distribution and functional groups, as well as on their chemical, mechanical, thermal, and morphological properties. The heterogeneity of non-sieved OPEFB fibers, as well as of OPEFB fibers retained on meshes No. 20, 30, 40, and 50, can be seen in Fig. 1, while the fiber length data are available at the Mendeley Data repository (Link: https://data.mendeley.com/datasets/4y32ks2t62/2) [5]. Additionally, summary statistics of fiber length for each group studied are shown in Table 1.
A theoretical distribution was determined for each data set in order to characterize the fiber length, as can be seen in Fig. 1. The non-sieved fiber length data fitted an exponential distribution (Eq. (1)):

$$f(l) = \frac{1}{\theta} \exp\left(-\frac{l - l_0}{\theta}\right), \quad l \geq l_0 \tag{1}$$

where $l$ is the OPEFB fiber length, and $l_0$ and $\theta$ are the location and scale parameters of the exponential distribution, respectively.
Meanwhile, the length data of OPEFB fibers retained on meshes No. 20, 30, 40, and 50 fitted three-parameter gamma distributions (Eq. (2)):

$$f(l) = \frac{(l - l_0)^{k-1}}{\Gamma(k)\,\theta^{k}} \exp\left(-\frac{l - l_0}{\theta}\right), \quad l \geq l_0 \tag{2}$$

where $l$ is the OPEFB fiber length; $l_0$, $\theta$, and $k$ are the location, scale, and shape parameters of the gamma distribution, respectively; and $\Gamma(k)$ is the gamma function (Eq. (3)):

$$\Gamma(k) = \int_0^{\infty} t^{k-1} e^{-t}\,dt \tag{3}$$

The parameters of the gamma distribution are related to the mean $\mu$ and variance $\sigma^2$ by Eqs. (4) and (5), respectively:

$$\mu = l_0 + k\theta \tag{4}$$

$$\sigma^2 = k\theta^2 \tag{5}$$
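As a quick numerical check of the moment relations for a three-parameter gamma distribution (mean equal to location plus shape times scale, variance equal to shape times scale squared), the following minimal sketch compares the closed-form values with those returned by SciPy. The parameter values are hypothetical and are not taken from Table 2; the last lines also illustrate that a shape parameter of 1 recovers the two-parameter exponential density.

```python
import numpy as np
from scipy import stats

# Hypothetical parameter values, chosen only for illustration (not from Table 2).
l0, theta, k = 0.5, 1.2, 2.5  # location (mm), scale (mm), shape (dimensionless)

# Three-parameter gamma distribution as parameterized by SciPy.
dist = stats.gamma(a=k, loc=l0, scale=theta)

# Closed-form mean and variance of the three-parameter gamma distribution.
mean_formula = l0 + k * theta
var_formula = k * theta**2

# Moments computed by SciPy, for comparison with the closed-form values.
mean_scipy = dist.mean()
var_scipy = dist.var()
print(mean_formula, mean_scipy)
print(var_formula, var_scipy)

# With shape k = 1 the gamma density coincides with the exponential density.
x = np.linspace(l0, l0 + 10.0, 50)
pdf_gamma_k1 = stats.gamma(a=1.0, loc=l0, scale=theta).pdf(x)
pdf_expon = stats.expon(loc=l0, scale=theta).pdf(x)
```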
In addition, the parameters of the theoretical distributions fitted to each OPEFB fiber length group, their standard errors from the maximum likelihood estimation (MLE), and the p-values of the Kolmogorov-Smirnov (KS) test are presented in Table 2.
On the other hand, the physico-chemical and mechanical properties are illustrated in Fig. 2a and 2b, respectively. Lignocellulosic materials (lignin, cellulose, and hemicellulose) were the main components of the fibers, with the cellulose content predominating. Furthermore, Table 3 shows the data from Fourier transform infrared (FTIR) spectroscopy, listing the wavenumbers of the most important bands and the functional groups with which they were associated. These functional groups were present in cellulose, hemicellulose, and lignin.
Moreover, the data from thermogravimetric analysis (TGA) in terms of the thermal stability of OPEFB fibers are shown in Fig. 3, where four stages of weight loss were observed in the following approximate temperature ranges: 25-150 °C, 150-250 °C, 250-350 °C, and 410-600 °C.
Lastly, scanning electron microscopy (SEM) micrographs of OPEFB fibers at 425×, 450×, 475×, 635×, and 2500× magnification are displayed in Fig. 4. The fibers exhibited a porous surface and the presence of elementary fibers and microfibers. Additionally, a variation in diameter along the fiber length was observed.

Experimental Design, Materials, and Methods
OPEFB waste was collected from palm oil extraction industries located in the northwest of Ecuador. The waste was dried at room temperature, ground in a blade mill (SHINI model SG-2348E), and then classified using a sieve shaker (HUMBOLT model H-4325) and a set of ASTM laboratory analytical sieves of five aperture sizes (mesh No. 16, 20, 30, 40, and 50), as can be seen in Fig. 5. Thereafter, the fiber length groups studied, which were dried separately at 103 °C for 3 h, consisted of non-sieved OPEFB fibers and OPEFB fibers retained on meshes No. 20, 30, 40, and 50. Fibers retained on mesh No. 16 were excluded because their size limits were unknown.

OPEFB Fiber Length Measurement
OPEFB fiber length was determined using a methodology that allowed more data to be collected in less time. Initially, about 40 photographs of OPEFB fibers were taken for each of the length groups mentioned above. The fiber length was then measured by processing the photographs with the ImageJ® software. Images were taken keeping focal length (5.58 mm), aperture (f/1.8), sensitivity (ISO 100), and resolution (4608 × 3456 pixels) constant. About 53,000 measurements were made for non-sieved OPEFB fibers, and about 21,500, 17,800, 23,000, and 33,000 for OPEFB fibers retained on meshes No. 20, 30, 40, and 50, respectively. The measurement scale was established with a standard precision graduated ruler. The methodology for conditioning OPEFB waste and measuring the fiber length is depicted in Fig. 5.
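The role of the graduated ruler in setting the measurement scale can be illustrated with a minimal sketch of a pixel-to-millimeter conversion. The function name `pixels_to_mm` and all numeric values are hypothetical; the sketch only mirrors the general idea of calibrating an image against a known reference length and is not the authors' ImageJ workflow.

```python
import numpy as np

def pixels_to_mm(length_px, ruler_px, ruler_mm):
    """Convert lengths in pixels to millimeters using a reference ruler span.

    ruler_px: number of pixels spanned by a known ruler distance in the photograph.
    ruler_mm: the same distance in millimeters, read from the ruler graduations.
    """
    mm_per_px = ruler_mm / ruler_px  # calibration factor for this photograph
    return np.asarray(length_px, dtype=float) * mm_per_px

# Hypothetical example: 10 mm of ruler spans 400 pixels in the photograph.
lengths_mm = pixels_to_mm([120, 260, 85], ruler_px=400.0, ruler_mm=10.0)
print(lengths_mm)
```

Because focal length and resolution were held constant across photographs, a single calibration factor per imaging setup would be a reasonable assumption in such a sketch.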

Statistical Analysis
Statistical analysis of the distribution fit of the length data was performed assuming that, during grinding of the OPEFB waste, the fibers were randomly cut into smaller ones and that the number of resulting fibers decreased exponentially with their length, similar to what has been reported for other mechanical processes involving natural fibers [8][9][10]. In addition, it was assumed that a particular OPEFB fiber was cut at random positions along its length by the blade mill, with $m$ being the average number of cuts per unit length and $ml$ the average number of cuts in a fiber of length $l$. Accordingly, the number of cuts $n$ in a particular fiber of length $l$ was assumed to follow a Poisson distribution (Eq. (6)):

$$P(n) = \frac{(ml)^{n} e^{-ml}}{n!} \tag{6}$$

The probability that a fiber segment of length $l$ was not cut into smaller fibers by the blade mill (i.e., $n = 0$) was therefore:

$$P(0) = e^{-ml} \tag{7}$$

Since this probability equals the probability of obtaining a fiber longer than $l$ after the grinding process, the cumulative distribution function of the fiber length was:

$$F(l) = 1 - e^{-ml} \tag{8}$$

The distribution of the fiber length was then obtained by differentiating Eq. (8):

$$f(l) = \frac{dF(l)}{dl} = m\,e^{-ml} \tag{9}$$

Given that $m$ is the average number of cuts per unit length, the average fiber length $\theta$ after the grinding process was:

$$\theta = \frac{1}{m} \tag{10}$$

Consequently, the distribution of the fiber length became:

$$f(l) = \frac{1}{\theta} \exp\left(-\frac{l}{\theta}\right) \tag{11}$$

Finally, a minimum fiber length $l_0$ resulting from the grinding process was considered because of blade mill features such as the screen diameter. Therefore, the distribution of the fiber length (Eq. (11)) was converted into Eq. (1), where the variable $(l - l_0)$ followed an exponential distribution with mean $\theta$.
Likewise, since the exponential distribution is a special case of the gamma distribution with shape parameter $k = 1$, it was assumed that the length data of the sieved OPEFB fibers followed, in general, three-parameter gamma distributions, as detailed in Eq. (2).
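The random-cut argument can be checked numerically with a small simulation. The sketch below is only an illustration under stated assumptions: one long hypothetical fiber, a cut rate `m` chosen arbitrarily, and cut positions placed uniformly at random (which yields a Poisson number of cuts in any sub-length). It compares the mean segment length with 1/m and the empirical survival fraction with exp(-m l).

```python
import numpy as np

rng = np.random.default_rng(0)

m = 2.0                # assumed average number of cuts per unit length
total_len = 50_000.0   # length of one long hypothetical fiber (same units)

# The total number of cuts is Poisson with mean m * total_len; given that
# number, the cut positions are uniformly distributed along the fiber.
n_cuts = rng.poisson(m * total_len)
cuts = np.sort(rng.uniform(0.0, total_len, size=n_cuts))

# Lengths of the segments between consecutive cuts (end pieces are ignored).
segments = np.diff(cuts)

# The mean segment length should be close to theta = 1 / m.
theta_expected = 1.0 / m
theta_observed = segments.mean()

# The fraction of segments longer than l should be close to exp(-m * l).
l_test = 0.5
surv_empirical = (segments > l_test).mean()
surv_theory = np.exp(-m * l_test)

print(theta_expected, theta_observed)
print(surv_theory, surv_empirical)
```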
All the distribution parameters, both for the exponential and for the gammas, were determined using MLE.
On the other hand, in the case of the two-parameter exponential distribution, the standard errors $\sigma_{l_0}$ and $\sigma_{\theta}$ were calculated from the variances $\mathrm{Var}(l_0)$ and $\mathrm{Var}(\theta)$ [11], which are presented in Eqs. (12) and (13).
According to these expressions, the standard error $\sigma_{l_0}$ was inversely proportional to the number of observations, and the standard error $\sigma_{\theta}$ was approximately inversely proportional to the square root of the number of observations.
For the three-parameter gamma distribution, the standard errors $\sigma_{l_0}$, $\sigma_{k}$, and $\sigma_{\theta}$ were calculated using the Fisher information matrix $I(k, \theta)$ presented in Eq. (14), where the variance of each parameter corresponded to the respective diagonal entry of the inverse matrix $I^{-1}$ [12]. Because the inverse matrix was proportional to $1/n$, the standard errors (i.e., the square roots of the variances) of the parameters were proportional to $1/\sqrt{n}$; e.g., doubling the number of observations reduced the error by a factor of $\sqrt{2}$.
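The different scaling of the two standard errors with the number of observations can be illustrated by simulation. The sketch below is not the authors' calculation: it assumes hypothetical true parameters, uses the standard MLEs of the two-parameter exponential distribution (the sample minimum for the location, and the sample mean minus the minimum for the scale), and estimates the spread of each estimator empirically at two sample sizes.

```python
import numpy as np

rng = np.random.default_rng(1)
l0_true, theta_true = 0.5, 1.0  # hypothetical location and scale

def mle_spread(n, reps=2000):
    """Empirical standard deviation of the MLEs over `reps` samples of size n."""
    samples = l0_true + rng.exponential(theta_true, size=(reps, n))
    l0_hat = samples.min(axis=1)                 # MLE of the location
    theta_hat = samples.mean(axis=1) - l0_hat    # MLE of the scale
    return l0_hat.std(ddof=1), theta_hat.std(ddof=1)

se_l0_small, se_theta_small = mle_spread(200)
se_l0_large, se_theta_large = mle_spread(800)

# Quadrupling n should shrink the location error about 4-fold (1/n scaling)
# and the scale error about 2-fold (1/sqrt(n) scaling).
ratio_l0 = se_l0_small / se_l0_large
ratio_theta = se_theta_small / se_theta_large
print(ratio_l0, ratio_theta)
```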
All the standard errors of the distribution parameters, both for the exponential and for the gammas, were determined with the Python library SciPy® using the scipy.stats.expon and scipy.stats.gamma methods.
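A minimal sketch of how such fits can be obtained with `scipy.stats.expon.fit` and `scipy.stats.gamma.fit` is given below. The data are synthetic stand-ins generated with hypothetical parameters, not the published length measurements, and the variable names are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Synthetic stand-ins for fiber length data in mm (hypothetical parameters).
non_sieved = 0.3 + rng.exponential(1.5, size=5000)
sieved = 0.8 + rng.gamma(shape=3.0, scale=0.6, size=5000)

# Two-parameter exponential: fit returns (location, scale).
loc_e, scale_e = stats.expon.fit(non_sieved)

# Three-parameter gamma: fit returns (shape, location, scale).
k_g, loc_g, scale_g = stats.gamma.fit(sieved)

print("exponential:", loc_e, scale_e)
print("gamma:", k_g, loc_g, scale_g)
```

In this sketch the fitted exponential location coincides with the smallest observation, which is consistent with interpreting the location parameter as a minimum fiber length.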
Moreover, the proposed theoretical distributions were compared against a wide range of other candidate distributions using the residual sum of squares (RSS) with the Python package distfit®; among these, the exponential and gamma distributions gave the best fit to the OPEFB fiber length data.
To determine whether the fiber length data differed from the theoretical distributions obtained, the KS test was carried out at a significance level of 0.01. Nevertheless, since the original KS test is no longer valid when the distribution parameters are estimated directly from the data [13][14][15], a parametric bootstrap procedure was carried out to approximate the null distribution [16, 17]. Because the non-sieved fibers were a mixture of long and short fibers, operator errors during length measurement increased, as did, to a much lesser extent, the differences between the observations and the proposed theoretical distribution. However, since no other theoretical distribution validated by the KS test was found, it was assumed that no distribution modeled the length data of non-sieved fibers better than the exponential distribution. For the sieved fibers retained on mesh No. 40 (for which the p-value was lower than the assumed significance level and the best fit found was a generalized gamma distribution), it was considered appropriate to keep the three-parameter gamma distribution in order to have a general model [18].
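The idea of ranking candidate distributions by RSS can be sketched directly with SciPy. This is an illustrative approximation and does not call distfit: it assumes that the RSS is taken between a histogram density of the data and each fitted probability density evaluated at the bin centers, uses only three candidates, and runs on synthetic data with hypothetical parameters.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Synthetic stand-in for sieved fiber lengths in mm (hypothetical parameters).
lengths = 0.8 + rng.gamma(shape=3.0, scale=0.6, size=5000)

# Empirical density on a fixed histogram grid.
density, edges = np.histogram(lengths, bins=50, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# A small, arbitrary set of candidate distributions.
candidates = {"norm": stats.norm, "expon": stats.expon, "gamma": stats.gamma}

rss = {}
for name, dist in candidates.items():
    params = dist.fit(lengths)           # maximum likelihood fit
    pdf = dist.pdf(centers, *params)     # fitted density at the bin centers
    rss[name] = float(np.sum((density - pdf) ** 2))

# The candidate with the smallest RSS is taken as the best fit.
best = min(rss, key=rss.get)
print(rss, best)
```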
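A parametric bootstrap of the KS statistic can be sketched as follows for the two-parameter exponential case. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `ks_bootstrap_pvalue`, the number of bootstrap replicates, and the synthetic data are all hypothetical. The key point is that the parameters are re-estimated on every simulated sample, so the reference distribution of the statistic accounts for the estimation step.

```python
import numpy as np
from scipy import stats

def ks_bootstrap_pvalue(data, n_boot=500, seed=0):
    """KS statistic and parametric-bootstrap p-value for a fitted exponential."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.size

    # Fit the model to the observed data and compute the observed statistic.
    loc, scale = stats.expon.fit(data)
    d_obs = stats.kstest(data, "expon", args=(loc, scale)).statistic

    # Simulate from the fitted model, refit, and recompute the statistic.
    d_boot = np.empty(n_boot)
    for b in range(n_boot):
        sim = loc + rng.exponential(scale, size=n)
        loc_b, scale_b = stats.expon.fit(sim)
        d_boot[b] = stats.kstest(sim, "expon", args=(loc_b, scale_b)).statistic

    # Share of bootstrap statistics at least as large as the observed one.
    p_value = (1 + np.sum(d_boot >= d_obs)) / (n_boot + 1)
    return d_obs, p_value

rng = np.random.default_rng(4)
good = 0.3 + rng.exponential(1.5, size=500)   # data that follow the model
bad = rng.uniform(0.0, 1.0, size=500)         # data that do not

d_good, p_good = ks_bootstrap_pvalue(good)
d_bad, p_bad = ks_bootstrap_pvalue(bad)
print(p_good, p_bad)
```

The same scheme would apply to the three-parameter gamma distribution by replacing the fit and simulation steps, at a higher computational cost because the gamma fit is numerical.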

Complementary Characterization
The chemical composition of OPEFB fibers was determined in order to quantify the content of lignin (ASTM D1106), hemicellulose/cellulose (ASTM D1109), and extractives (ASTM D1107 and ASTM D1110). In addition, moisture (ASTM D4442) and ash (ASTM D1102) contents were determined. The mechanical properties (ASTM D2256) of OPEFB fibers were measured with a universal testing machine (INSTRON model 5544). A load cell of 100 N and a speed of 30 mm/min were used during the test. The mean values of Young's modulus, elongation at break, and toughness were determined using 30 specimens.
Infrared characteristics were obtained by FTIR spectroscopy in attenuated total reflection mode using a spectrometer (JASCO model FT/IR-C800). Twenty scans were completed in the range between 4000 and 600 cm⁻¹ with a resolution of 4 cm⁻¹. Furthermore, TGA was performed using a thermobalance (SHIMADZU model TGA-50). OPEFB fibers were heated at 10 °C/min under a nitrogen flow of 50 mL/min. The mass loss (%) of OPEFB fibers was analyzed between 20 and 600 °C.
SEM was performed on an electron microscope (FEI Phenom model FP3950/00). OPEFB fibers were fixed on metallic supports with carbon surfaces and then coated with Au/Pd in an argon atmosphere for 135 s under a current of 18 mA using a Sputter Coater SC7620 (Quorum). The distance between the samples and the coating source was 35 mm. Afterwards, the samples were placed in the SEM chamber, and micrographs were taken at different magnifications.

Ethics Statements
The authors hereby consciously assure that, for the manuscript "Characterization dataset of oil palm empty fruit bunch (OPEFB) fibers – Natural reinforcement/filler for materials development", the following is fulfilled:
• This material is the authors' own original work, which has not been previously published elsewhere.
• The paper reflects the authors' own research and analysis in a truthful and complete manner.
• The paper properly credits the meaningful contributions of co-authors and co-researchers.
• The data are appropriately placed in the context of prior and existing research.
• All sources used are properly disclosed (correct citation). Literal copying of text must be indicated as such by using quotation marks and giving proper reference.
• All authors have been personally and actively involved in substantial work leading to the paper, and will take public responsibility for its content.
We agree with the above statements and declare that this submission follows the policies of Data in Brief as outlined in the Guide for Authors and in the Ethical Statement.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.