Transcriptome Analysis of Jojoba (Simmondsia chinensis) during Seed Development and Liquid Wax Ester Biosynthesis

Jojoba is one of the main two known plant source of natural liquid wax ester for use in various applications, including cosmetics, pharmaceuticals, and biofuel. Due to the lack of transcriptomic and genomic data on lipid biosynthesis and accumulation, molecular marker breeding has been used to improve jojoba oil production and quality. In the current study, the transcriptome of developing jojoba seeds was investigated using the Illunina NovaSeq 6000 system, 100 × 106 paired end reads, an average length of 100 bp, and a sequence depth of 12 Gb per sample. A total of 176,106 unigenes were detected with an average contig length of 201 bp. Gene Ontology (GO) showed that the detected unigenes were distributed in the three GO groups biological processes (BP, 5.53%), cellular component (CC, 6.06%), and molecular functions (MF, 5.88%) and distributed in 67 functional groups. The lipid biosynthesis pathway was established based on the expression of lipid biosynthesis genes, fatty acid (FA) biosynthesis, FA desaturation, FA elongation, fatty alcohol biosynthesis, triacylglycerol (TAG) biosynthesis, phospholipid metabolism, wax ester biosynthesis, and lipid transfer and storage genes. The detection of these categories of genes confirms the presence of an efficient lipid biosynthesis and accumulation system in developing jojoba seeds. The results of this study will significantly enhance the current understanding of wax ester biology in jojoba seeds and open new routes for the improvement of jojoba oil production and quality through biotechnology applications.


Background
Global environmental conditions have been negatively impacted by the increasing consumption rate of fossil fuel. This issue has become an industrial concern at both local and international levels. One way to decrease the consumption of fossil fuels is to develop and use alternative biofuel sources, such as bioethanol and biodiesel. Jojoba (Simmondsia chinensis) is a tree that originated in Mexico (Sonora Desert) and USA (California) and can be used for the production of economic biofuel. Jojoba is currently being cultivated in many parts of the world, including Mexico, Australia, India, Saudi Arabia, Tunisia, and Egypt. Jojoba produces about 65% oil in its seeds with specific characteristics known as liquid wax. The oil from a jojoba seed is different from other plant oil (triglycerides) in that it is mainly composed of liquid wax esters (long chain fatty acid esterified to a long chain fatty alcohol) [1][2][3]. The United States is the main producer of jojoba oil [4]. Previously, the main source of wax ester was the heads of whales' sperms, but whale hunting has been banned since 1986 [5].
understanding of the molecular biology of the jojoba seed and how this molecular data can be used to improve jojoba oil production through biotechnology applications.

Results
To investigate the global gene expression during jojoba seed development, seed samples were collected during seed development and used for de novo transcriptome analysis. As this is the first transcriptome study of jojoba, all samples were assembled to reconstruct sequences of the transcriptome without a reference genome. The assembled contigs were merged in a set of non-redundant unique transcripts and then clustered into unigenes with a 200 bp minimum length cut-off. The clustered unigenes were annotated by blasting them against various public databases using BLASTN and BLASTX. The databases included KEGG, NT, Pfam, GO, NR, UniProt, and EggNOG. Annotation included the prediction of ORFs to identify protein-coding sequences in the unigenes. Additionally, the unigenes were used for read alignment and the abundance was estimated as read count and FPKM.

De Novo Assembly of the Jojoba Transcriptome
For the initial assembled contigs, 176,106 genes and 233,291 transcripts were detected with an average contig length of 710 bp and 165,664,392 total assembled bases. For the longest contigs, 176,106 genes and 176,106 transcripts were detected with an average contig length of 589 bp and 103,727,640 bases assembled. For unigene contigs, 167,684 genes and 167,684 transcripts were detected with an average contig length of 600 bp and 100,770,670 total bases assembled. The GC content was approximately 40% in all types of contigs (Supplementary Materials Table S1).
After trimming, the total number of read bases ranged from 13 × 10 9 to 14 × 10 9 bases, the total reads ranged from 137.5 × 10 6 to × 143. 8 10 6 bases, and the GC content was about 50%. The trimmed reads were clustered into transcript unigenes and merged into one file to construct the combined de novo transcriptome reference. In the combined merge, 176,106 unigenes and 176,106 transcripts were detected with 17,658 bp maximum contig length, 201 bp minimum contig length, 348.0 bp median contig length, and 589.01 bp average contig length (Supplementary Materials Table S2).

Annotation on GO Database
For a functional prediction of the unigenes, they were annotated against the GO database to classify the annotated unigenes using BLASTX of DIAMOND with an E-value cut-off of 1.0 × 10 −5 . The detected unigenes were categorised in three main groups according to their contribution to the plant's functions: biological process (BP), cellular component (CC), and molecular function (MF). The GO annotation indicated that about 7.47% of the total 167,684 unigenes had GO assignments in the assembled reference transcriptome. The annotated unigenes were distributed among the three main categories as 5.53% BP, 6.06% CC, and 5.88% MF, and the majority of the unigenes (82.53%) were not annotated to any functional group (Figure 1). The 5.53% BP, 6.06% CC, and 5.88% MF ( Figure 1) were distributed among 33 BP terms, 20 CC terms, and 14 MF terms consecutively, giving a total of 67 functional groups (Figure 2).

EggNOG Analysis
To identify orthologous proteins for the annotated unigenes in eukaryotic orthologous groups (KOG), clusters of orthologous groups (COGs), and non-supervised orthologous groups (NOGs), BLASTX of DIAMOND was used with an E-value cut-off of 1.0 × 10 −5 on the EggNOG database. The annotated unigenes were mapped to the corresponding orthologous groups in the EggNOG database. The EggNOG database focuses on the predicted function of unigenes based on their orthologs in other species. The annotated unigenes were separated into three main groups-information storage and processing, cellular processes and signalling, and metabolism-which are distributed in 25 functional groups in the EggNOG database ( Figure 6).      The most predominant functional group in the EggNOG database was the replication, recombination, and repair group (7802 unigenes, 21.48%). Other major groups included transcription (1462, 4.03%), signal transduction mechanisms (1542, 4.25%), posttranslational modification, protein turnover, and chaperones (1612, 4.44%), carbohydrate transport and metabolism (1031, 2.84%), intracellular trafficking, secretion, and vesicular transport (866, 2.38%), translation, ribosomal structure and biogenesis (797, 2.19%), amino acid transport and metabolism (778, 2.14%), energy production and conversion (504, 1.39%), secondary metabolites biosynthesis, transport and catabolism (446, 1.23%), and inorganic ion transport and metabolism (407, 1.12%). Lipid transport and metabolism, which is the most related group to this study, represented 390 unigenes (1.07%). About 16,866 (46.44%) unigenes were not annotated in the EggNOG database ( Figure 6, Table 1). Gene expression level was estimated across the samples as the unigene read count and FPKM. EggNOG annotation showed that 1.07% (390) of the annotated unigenes was related to lipid transport and metabolism. In jojoba, lipid metabolism generally includes de novo fatty acid (FA) biosynthesis, FA elongation, FA desaturation, long chain fatty alcohol (LCFA) biosynthesis, triacylglycerol (TAG) metabolism, and wax ester biosynthesis. In this study, all key enzymes and proteins involved in these main lipid biosynthesis processes were represented (Table 2, Figure 7).

Fatty Acid Biosynthesis
Eight different genes for the de novo biosynthesis of fatty acids were detected in our transcriptome, represented by various numbers of unigenes with an overall FPKM range of 0. 35 (Table 2). Long-chain acyl-CoA synthetase (LACS) transfers the acyl group from acyl-ACP at the end of their biosynthesis in the chloroplast into CoA to produce acyl-CoA and then transports them to the ER for chain elongation. Eleven unigenes were detected for LACS in the jojoba transcriptome with an FPKM of 3.21 to 95.33, indicating a high level of expression (Table 2, Figure 7).

Fatty Acid Desaturation
Several fatty acid desaturases were detected in this study. Some of them are located in the endoplasmic reticulum (ER), including delta (12)

Fatty Alcohol Synthesis and Oxidation
LCFA biosynthesis is a critical step for wax ester biosynthesis. Ten unigenes for alcohol-forming fatty acyl-CoA reductase (FAR) with a FPKM range of 0.56 to 1613.05 were detected in this study. It is noteworthy that FAR had one of the highest FPKM values among the lipid metabolism enzymes in this study. It catalyses the conversion of a long-chain acyl-CoA into long-chain alcohol, which is the second component of wax ester precursors. This enzyme had previously been isolated from jojoba [33]. LCFAs are oxidized into fatty acyl-CoA by the long chain fatty alcohol oxidase (FAO). Three unigenes were detected for FAO with the highest FPKM value of 39.02. FAR and FAO keep the balance between VLCFAs and LCFA to maintain wax ester biosynthesis during seed development.  Table 2). This group includes the highest expressed genes in the jojoba transcriptome. For example, nsLTPs are the highest with an FPKM of 9748.18, LTP is next with 5214.36, and OL follows with an FPKM value of 3779.71 (Table 2).

Discussion
Jojoba is the main plant source for natural liquid wax ester. The available genomic information about jojoba is very scarce in public databases, especially the transcriptome of developing seeds, which is the most important stage of wax ester biosynthesis. Data obtained from transcriptome analysis of jojoba developing seeds were employed to establish the lipid biosynthetic pathway during seed development with an emphasis on the VLCFAs and LCFAs (the main components of wax ester) in jojoba seeds. The focus of this study was the detection of lipid biosynthesis genes and the interconnection between the functions of this gene network in the coordination of lipid biosynthesis.
It has been reported previously that the main source for liquid wax ester was from the heads of sperm whales, where it was believed to regulate their buoyancy [15], and from the oil of jojoba seeds [5,34,35]. Since whale hunting was banned, the only two known sources for liquid wax ester are jojoba oil and palm wax, or carnauba from Copernicia prunifera [6]. This makes this study the first de novo transcriptome analysis of liquid wax ester-producing seeds.

Lipid Biosynthesis Gene Expression Profiling
The genes involved in lipid transport and metabolism were the focus of this study because they relate directly or indirectly to wax ester biosynthesis in jojoba seeds. Gene expression was estimated as the read count of unigenes or FPKM. About 390 (1.07%) unigenes were related to lipid transport and metabolism in this study. This was lower than the detected unigenes for lipid transport and metabolism (889, 3.8%) in the transcriptome of the developing seeds of pecan trees [21]. These discrepancies in transcriptome sequencing statistics indicate the high complexity, the diversity, and the plant-dependence of sequencing data. The lipid transport and metabolism genes were distributed among de novo FA biosynthesis, FA desaturation, FA elongation, fatty alcohol biosynthesis, TGA metabolism, phospholipid metabolism, wax ester biosynthesis, and lipid transfer and storage proteins/enzymes.
De novo synthesis of fatty acids (C:16-C:18) occurs in the plastids, which include palmitic, stearic and oleic acids [36]. Very long-chain fatty acids (VLCFAs) (C > 20) are synthesised in the endoplasmic reticulum (ER) by chain elongation of C:16-CoA and C:18-CoA that were initially synthesised in the plastids [37]. Phospholipids, including phosphatidylcholine (PC), phosphatidylethanolamine (PE), phosphatidylserine (PS), and phosphatidylinositol (PI), are synthesised in the ER. In plants, TAGs are synthesised in the ER [38] and are stored in oil bodies as an energy source for seed germination [39].

Fatty Acid Biosynthesis
Unigenes for the main eight enzymes involved in FA biosynthesis were detected in our study. MCAL was detected with an FPKM of 138.28. It is interesting that this enzyme was not detected in similar studies of transcriptome analysis during oil seed development. It provides an additional route for the production of malonyl-CoA, which is the main precursor for FA biosynthesis. The other seven genes for FA biosynthesis were detected in this study with different expression levels and number of unigenes, but the highest expression level among them ranged from an FPKM of 21.47 to 264.77.
The rate-limiting enzyme of FA biosynthesis is ACCase. The multisubunit form of ACCase was detected in our transcriptome. Three subunits, BC, CT-α, BCCP, were highly expressed during seed development with FPKM values of 195.99, 88.75, and 229.71, respectively, whereas the fourth subunit CT-β showed lower expression with an FPKM value of 4.95 (Table 2). A low expression level of CT-β was previously reported during the seed development of Perilla frutescens [27]. The expression of the same four subunits was reported in similar studies on Arbidopsis [40] and B. napus [41]. Fourteen unigenes were detected for ACCase and its subunits in the transcriptome of Millettia pinnata developing seeds [24]. Two types of ACCase were reported in C. merolae algae; the multifunctional type is encoded by nuclear genes and is located in the cytosol, whereas the multisubunit type is encoded by nuclear and plastid genes and is located in the plastid [36]. The upregulation of ACCase definitely enhances the composition and content of FAs in seeds [27,42], thus a high level of ACCase expression during jojoba seed development would enhance the accumulation of more substrate for de novo FA biosynthesis.
The enzymes MCAAT, KASI, KAR, HAD, and EAR use malonyl-CoA through a series of condensation reactions to form acyl-ACP (C:16, C:18). This group of enzymes was highly expressed with FPKM values of 152.71, 264.77, 198.16, 254.27, and 174.01, respectively ( Table 2). The high expression level of this group is consistent with previously reported data [27,43]. In the Millettia pinnata transcriptome, MCAAT, KAR, and EAR had two unigenes, whereas KASI and HAD had one detected unigene [24]. Fatty acid biosynthesis genes were also detected in the Camelina sativa seed transcriptome [22].
Both FATA and FATB (one and three unigenes, respectively) were detected in the Camellia oleifera seed transcriptome [25]. The absence of FATA in our study suggests a high production of saturated FAs in jojoba plastid with the major desaturation of FAs occurring in the ER.
LACS transfers the acyl group to CoA and transports the produced acyl-CoA into the ER for chain elongation and the generation of VLCFAs. LACS were highly expressed in developing jojoba seeds, indicated by the detection of 11 unigenes with an FPKM range from 3.21 to 95.33 (Table 2). In the B. napus transcriptome, ten unigenes were detected, three of which were upregulated and six were downregulated [43]. A family of six LACS genes was detected in the Millettia pinnata transcriptome with different levels of expression [24]. Also, LACS kept a high level of expression during the seed development of Pongamia pinnata [20], Perilla frutescence [27], and Camelina sativa [22].

Fatty Acid Desaturation
Various FA desaturases were detected in the developing jojoba seeds in both the plastid and ER. The ER-located enzymes included FAD2, FAD6, SLD, STE, and DES1 (SLD) with FPKM values of 69.07, 19.88, 33.31, 50.69, and 8.87, respectively ( Table 2). The plastid-located desaturases included FAD4, FAD5, FAD6, and SAD6, with FPKM values of 2.05, 33.65, 8.42, and 1369.99, respectively. SAD6 converts stearoyl-ACP to oleoyl-ACP by the introduction of a cis double bond between carbons 9 and 10 of the acyl chain. Most of the plastid desaturases had a fairly low expression, except for SAD6, which had one of the highest expression levels in this study, indicating that it is the predominant active desaturase in developing jojoba seeds. ER-located enzymes had a fairly high expression, suggesting high desaturation activity in developing jojoba seeds. In the developing seeds of B. napus, FAD2, FAD3, FAD5, FAD6, FAD7, and FAD8 were detected at an upregulated level, whereas no SADs were detected [43]. FAD2 and FAD3 represented high expression level in the transcriptome of the developing seeds of the tree peony species P. lutea and P. rockii [26]. One unigene for the chloroplast FAD6 and FAD7/8 was detected with high expression in the transcriptome of the developing seeds of Perilla frutescens [27].

Fatty Acid Elongation
VLCFAs are synthesized by the repeated addition of 2C by a fatty acid elongase (FAE) complex. Three subunits of elongase KCS (FPKM, 865.4), KCR (FPKM, 402.24), and ECR (FPKM, 76.33) were highly expressed in developing jojoba seeds. On the other hand, the subunit HCD had low expression levels with an FPKM of 0.27 (Table 2). A high expression of elongase subunits indicates a high FA elongation activity, which is essential for the biosynthesis of VLCFAs and LCFAs (the two precursors for wax ester biosynthesis). In Arabidopsis, the KCS9 homologue was involved in the biosynthesis of C24 from the C22 fatty acid. The KCS9 gene's knockout mutations showed an accumulation of C20 and C22. Additionally, seed-specific FAE was involved in the synthesis of erucic acid in B. napus [44]. In rice, WSL4, a homologue of Arabidopsis thaliana (KCS6/CER6) was involved in the biosynthesis of VLCFAs, especially the conversion of C20 to C24 FAs. This was confirmed by the expression of WSL4 and the production of C24 FA in yeast. Wax biosynthesis was highly reduced in WSL4 mutants [45].

LCFA Biosynthesis
LCFA biosynthesis is the second most important step for wax ester biosynthesis, with VLCFA biosynthesis taking first place. LCFAs are produced in plants through acyl reduction and the decarboxylation pathways with an intermediate step of aldehyde production [46]. LCFA biosynthesis is carried out by the FAR gene, which converts VLCFA to LCFA without releasing aldehyde, as documented previously in the acyl reduction pathway [46]. Ten unigenes were detected in the jojoba transcriptome for FAR with FPKM values ranging from 0.56 to 1613.05. This enzyme was originally isolated from developing jojoba embryos and was able to produce fatty alcohol in E. coli. It also produced fatty alcohol in B. napus, directing some of its seed triglycerides to wax ester biosynthesis [33]. The WSD1 enzyme was also detected in our transcriptome. It has wax ester synthase/diacylglycerol acyltransferase dual function. This was confirmed using wsd1 mutants that had reduced stem wax. The in vitro recombinant of the enzyme showed that WSD1 had ad 10 times wax synthase as its diacylglycerol acyltransferase activity and revealed the accumulation of wax ester in Saccharomyces cerevisiae [5]. LCFAs are converted to VLCFAs by oxidation through the activity of FAO, which was expressed in the jojoba transcriptome with an FPKM value of 39.02. FAR and FAO maintain an optimum concentration of VLCFAs and LCFAs for wax ester biosynthesis in developing jojoba seeds.

Triglyceride (TGA) Metabolism
The main five enzymes for TGA biosynthesis, G3PDH, GPAT, LPAAT, PAP, and DGAT, were represented in the jojoba transcriptome with FPKM values of 60.72, 13.59, 125.06, 24.62, and 16.53, respectively ( Table 2). In B. napus seeds, various numbers of unigenes were detected for this group of enzymes. For example, 12, 11, 6, 2, and 1 unigenes were detected for G3PDH, GPAT, LPAAT, PAP, and DGAT, respectively [43]. A group of 7 unigenes were detected for GPAT, 8 unigenes for LPAAT, 4 unigenes for PAP, and 10 unigenes for DGAT in the Millettia pinnata transcriptome [24]. A high expression of TAG biosynthesis enzymes (GPAT, LPAT, PAP, DGAT, and PDAT) was positively correlated with oil accumulation during the seed development of Pongamia pinnata [20]. GPDH, LPAAT, and PAP maintained a high expression during seed development of the peony tree species P. lutea and P. rockii, whereas DGAT was higher in the developing seeds of P. lutea [26]. Unigene for GPAT, LPAT, PAP, and DGAT were detected in the Perilla frutescens seed transcriptome at expression levels consistent with the expression levels of FA biosynthesis genes [27]. In the Camellia oleifera seed transcriptome, six unigenes for PDAT (three PAP, three DGAT, one GPAT, and one LPAAT) were detected at a high level of expression during seed development [25]. They were also detected in the transcriptome of the developing seed of Camelina sativa [22].
TAGs are degraded by three lipases, including MAGL, DAGL, and TAGL (SDP). They were represented in the jojoba transcriptome with FPKM values of 15.34, 10.15, and 32.09, respectively ( Table 2). Twelve unigenes were detected for TAGL and three were detected for MAGL in the B. napus transcriptome; these are essential for lipid degradation [43]. SDP1 (TAGL) was detected in the transcriptome of Camelina sativa developing seeds [22]. SDP regulates the TAG quantity and the TAG fatty acid composition of seed oil during seed development. Manipulation of SDP provides a potential approach for enhancing the quantity and quality of jojoba seed oil.

Phospholipid Metabolism
In our transcriptome, CDIPT, PSS, PES, PDCT, and CTP (CDS) were highly represented, with FPKM values of 24.10, 7.99, 6.6, 35.98, 11.93, and 20.11, respectively (Table 2). CDIPT catalyses the formation of PI from CDP-DAG. In a previous study, CDIPT was cloned from Arabidopsis thaliana and expressed in E. coli to produce PI from CDP-DAG and myo-inositol substrates; this reaction is reversible, indicating CDIPT must act to create balance and maintain the optimal concentration of myo-inositol in the cell [47]. Also, PSS was isolated from Arabidopsis and produced in E. coli confirming that it is essential for PS biosynthesis [48]. PES from Arabidopsis has been isolated and characterised. It was expressed in E. coli and proved that E. coli accumulated PE, but to our knowledge, it has not been detected in any other oil seed transcriptome [49]. PDCT unigenes were detected in the B. napus transcriptome [43]. PDCT represented high expression in P. lutea and P. rockii, but the level was higher in P. lutea [26]. PDCT expression showed a bell-shaped pattern during seed development in the Perilla frutescens seed transcriptome [27]. It was also detected in the transcriptome of the developing seed of Camelina sativa [22]. PDCT maintains the balance of free fatty acids and TAG by regulating the interconversion between PC and diacylglycerol (DAG) during the biosynthesis of TAGs. CTP (CDS) is encoded by five genes in Arabidopsis. CDS1, CDS2, and CDS3 are located in the ER membrane and have a similar function. Double mutants of CDS1 and CDS2 showed an accumulation of PA and a drastic decrease in PI level, leading to defects in cell division and expansion [50].
Phospholipases also maintain the balance between free fatty acids and TAGs. We detected six types of phospholipases-PLA1, PLA2, PLC, PLD, PLSGR2, and BDG3-with FPKM values of 3.8, 159.08, 35.9, 127.73, 13.24, and 42.11, respectively (Table 2). PLC and PLD showed a similar high expression in both tree peony species, P. lutea and P. rockii, during seed development and oil accumulation. In the same study, PLA2 was highly expressed during seed development in P. rockii, but, interestingly, its transcripts were not detected in the developing seeds of P. lutea [26]. PLA2 was detected in castor endosperm and may contribute to acyl editing [41]. In the Camellia oleifera seed transcriptome, 11 unigenes for PLD and eight for PLC were detected with high expression during seed development [25]. PLSGR2 is a phospholipase-like protein detected in Arabidopsis that is involved in the early development of gravitropic response and functions in a way similar to PLA-type proteins [51]. Putative BDG3 lysophospholipase has been detected in Zea mays (maize) [52].

Wax Ester Biosynthesis
Two wax ester synthases were detected in our transcriptome: WS and WSD1 with FPKM values of 101.99 and 32.29, respectively ( Table 2). Although WS was first isolated from jojoba, it was not functionally expressed in other organisms, including E. coli and S. cerevisiae [5,53]. WSD1 is a bifunctional wax ester in Arabidopsis thaliana. It works as a WS/diacylglycerol acyltransferase (DAGT) in the ER membrane [5]. WSD1 was detected in the transcriptome of Camelina sativa developing seeds [22]. The presence of both WS and WSD1 in the transcriptome of the developing jojoba seeds indicates that during seed development, a high metabolic rate is directed toward biosynthesis of wax ester or DAG. WSD1 could also play a major role in maintaining the balance between wax ester and TAGs in jojoba seeds.

Lipid Transfer and Storage Proteins/Enzymes
The five detected lipid transfer and storage proteins included ACP, ACBP, nsLTPs, DIR1 (LTP), and OL. They were highly represented in the jojoba transcriptome with FPKM values of 1036.72, 244. 44, 9748.18, 5214.36, and 3779.71, respectively (Table 2). Interestingly, nsLTPs (FPKM 9748.18), LTP (FPKM 5214.36), and OL (FPKM 3779.71) had the highest expression levels in this study. ACP binds and carries the growing acyl chain during fatty acid biosynthesis. ACBP is a 10 kDa protein that strongly binds medium and long chain acyl-CoA esters and contributes to their intracellular transport. Fifteen unigenes encoding ACBP were detected in the transcriptome of the developing seeds of B. napus, of which eight were downregulated in seeds compared to leaves [42,43]. nsLTPs are uniquely detected in land plants. They are small proteins (6.5-10 kDa) that have the ability to bind various hydrophobic molecules including fatty acids, phospholipids, and fatty acyl-CoA. They play an important role in the intracellular movement of membrane lipids. In addition, they contribute to various biological processes in plants, including cuticle and suberin biosynthesis, plant growth and development, fertilization, seed development and germination, and the response to biotic and abiotic stresses [54][55][56][57]. They also work as antimicrobials against pathogens (fungal, bacterial, or viral) by perturbing the integrity and function of the pathogen's biological membranes [58,59]. LTPs are highly conserved, small (7-9 kDa) proteins detected only in higher plants [60,61]. They greatly contribute to the transfer of lipids, including phospholipids and fatty acids, through membranes [62]. They are divided into two subfamilies based on their size: LTP1 (9 kDa) and LTP2 (7 kDa) [63]. Different LTPs bind different lipid molecules, including fatty acids (C10-C18), acyl-CoAs, phospholipids, sterols, and various drugs [61]. Oleosins are small proteins that bind triglycerides and wax esters and facilitate their storage in the oil bodies. They have a hydrophilic oil body-binding domain between two amphipathic domains that contribute to oil body stabilization [43]. The B. napus transcriptome had 16 unigenes for oleosins. In addition, three homologues of the Arabidopsis oleosin (ole1, ole2, ole4) were detected in the transcriptome of Camelina sativa developing seeds. Ole1 had five-fold of the level of ole2 and ole4 expression [22].

Materials and Methods
Jojoba (Simmondsia chinensis, Link) seeds were collected at different stages of seed development from jojoba plants growing under normal Saudi Arabian conditions in Taif, Saudi Arabia. Seeds were collected at seven different developmental stages (1-7) based on their size (starting from 0.1 g as an early stage to 0.8 g as a late developmental stage). The seeds in each stage were then lyophilized at −58 • C for 48 h and then ground to a fine powder. Equal amounts of each stage of lyophilized seeds were pooled and used for RNA isolation.

Purification of mRNA
The total RNA was isolated from the developing seed powder mixture using RNeasy plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. RNA samples with a RNA integrity number (RIN) of more than 7 were used for subsequent analysis. PolyA mRNA was purified using oligo-dT linked to magnetic beads for two cycles of purification. During the second elution step, polyA mRNA was fragmented and primed with a random hexamer for cDNA synthesis.

Synthesis of cDNA Libraries
cDNA libraries were constructed using TruSeq Stranded mRNA LT Sample Prep Kit following the manufacturer's instructions (Illumina, California, United States). The fragmented, primed mRNA was reverse transcribed into first strand cDNA. Actinomycin D was included during first-strand synthesis to improve strand specificity by allowing RNA-dependent synthesis and to prevent DNA-dependent synthesis. The cDNA was purified using AMPure XP beads along with a magnetic stand. The blunt 3 end of the cDNA was adenylated to prevent ligation to each other and enhance adapter ligation. Adapters (which have an extra T at the 3 end) were ligated to both ends of the cDNA to enable them to hybridize on a flow cell. Ligated fragments were cleaned using AMPure XP beads. Indexed fragments were enriched by PCR amplification with adapter-specific primers and AMPure XP beads.
The quality of the cDNA libraries was checked using the DNA 1000 chip on the Agilent 2100 Bioanalyzer (Santa Clara, California, United States). The size and purity of the sample was expected to produce a band at about 260 bp. The libraries were quantified using qPCR according to the Illumina Sequencing Library qPCR Quantification Guide. The DNA libraries were normalized to 10 nM and pooled in equal volumes by the addition of 10 µL of each normalized library.

Sequencing and Sequence Analysis
The pooled libraries were sequenced on the NovaSeq 6000 System using NovaSeq 6000 S4 Reagent Kit (Illumina, California, United States) to generate 100 × 10 6 paired end reads of 100 bp with a sequence depth of 12Gb per sample. The obtained raw FastQ sequences were checked for various quality parameters, including the overall read's quality, the total bases, the total reads, GC%, and Q30 (% of bases with a quality Phred score over 30). Raw FastQ files were trimmed using the Trimmomatic program [64] to remove the adapter sequence, bases with a quality less than 3 from the read ends, the bases of reads that do not qualify for window size 4 and mean quality 15, reads with a length less than 36 bp, low-quality reads, contaminant DNA, and PCR duplicates. The trimmed reads of all samples were merged into one FastQ file to construct a transcriptome reference. Merged data were assembled using the Trinity program [65] and used for de novo transcriptome assembly. Transcripts above 36 bp were assembled into contigs (combining read sequences of a certain length of overlap to form longer fragments without N gaps). The longest contigs were clustered into non-redundant transcripts (unigenes) using the CD-HIT-EST program [66,67]. The unigenes obtained were used for the subsequent annotation, the open reading frames (ORF) prediction, and the expression analysis.

ORF Prediction and Abundance Estimation
ORFs with at least 100 amino acids were extracted and predicted for unigenes using TransDecoder (https://github.com/TransDecoder/TransDecoder) to identify the candidate coding regions within the transcript sequence. The trimmed reads for each sample were aligned to the assembled reference using the program Bowtie. For the differentially expressed gene analysis, the abundance of unigenes across the samples was estimated as a read count to indicate an expression figure using the RSEM algorithm [68]. The expression level was calculated as read count and as fragments per kilobase of transcript per million mapped reads (FPKM).

Functional Annotation
The unigenes were searched for in the Kyoto Encyclopedia of Genes and Genomes (KEGG), NCBI Nucleotide (NT), Pfam, Gene Ontology (GO), NCBI non-redundant Protein (NR), UniProt and Evolutionary genealogy of genes: Non-supervised Orthologous Groups (EggNOG) using BLASTN of NCBI BLAST [69] and BLASTX of the software DIAMOND [70] with an E-value default cut-off of 1.0 × 10 −5 .

Conclusions
The de novo transcriptome analysis revealed three specific features of Jojoba transcriptome during seed development: the detection of malonyl-CoA ligase (MCAL) with high FPKM value of 138.28, the detection of both WS and WSD1, and the detection of FATB with the absence of FATA. These enzymes substantiate general lipid biosynthesis, specifically wax ester biosynthesis. Also, high expression of FA biosynthesis (ACP, ACBP), lipid transfer (nsLTPs, LTPs, and lipid storage proteins in oil bodies (OL) was used to establish the wax ester synthesis pathway in developing Jojoba seeds. The results of this study will pave the way for innovative biotechnology approaches using these enzymes to improve jojoba wax ester production.