CYP4F2 is a human-specific determinant of circulating N-acyl amino acid levels

N-acyl amino acids are a large family of circulating lipid metabolites that modulate energy expenditure and fat mass in rodents. However, little is known about the regulation and potential cardiometabolic functions of N-acyl amino acids in humans. Here, we analyze the cardiometabolic phenotype associations and genomic associations of four plasma N-acyl amino acids (N-oleoyl-leucine, N-oleoyl-phenylalanine, N-oleoyl-serine, and N-oleoyl-glycine) in 2351 individuals from the Jackson Heart Study. We find that plasma levels of specific N-acyl amino acids are associated with cardiometabolic disease endpoints independent of free amino acid plasma levels and in patterns according to the amino acid head group. By integrating whole genome sequencing data with N-acyl amino acid levels, we identify that the genetic determinants of N-acyl amino acid levels also cluster according to the amino acid head group. Furthermore, we identify the CYP4F2 locus as a genetic determinant of plasma N-oleoyl-leucine and N-oleoyl-phenylalanine levels in human plasma. In experimental studies, we demonstrate that CYP4F2-mediated hydroxylation of N-oleoyl-leucine and N-oleoyl-phenylalanine results in metabolic diversification and production of many previously unknown lipid metabolites with varying characteristics of the fatty acid tail group, including several that structurally resemble fatty acid hydroxy fatty acids. These studies provide a structural framework for understanding the regulation and disease associations of N-acyl amino acids in humans and identify that the diversity of this lipid signaling family can be significantly expanded through CYP4F-mediated ω-hydroxylation.

The N-acyl amino acids are a large and structurally diverse family of circulating lipid signaling molecules. These lipid metabolites are unusual peptide conjugates of fatty acids and amino acids. In mice, levels of specific circulating N-acyl amino acids are tightly controlled by the action of two enzymes, PM20D1 and fatty acid amide hydrolase (FAAH), which catalyze bidirectional N-acyl amino acid synthesis from and hydrolysis to free fatty acids and free amino acids (1)(2)(3)(4). Pharmacological, genetic, and mechanistic studies in rodents suggest that certain members of the N-acyl amino acids stimulate mitochondrial respiration and whole-body energy expenditure (1). Other complementary studies have also established roles for N-acyl amino acids in glucose homeostasis (2), adipogenesis (5), vascular tone (6,7), and bone homeostasis (8,9). Importantly, the functional consequence and enzymatic regulation of each N-acyl amino acid are highly dependent on the structural properties of the fatty acid tail group and amino acid head group. For example, even subtle modification of the lipid chain or amino acid moieties can affect N-acyl-amino acid substrate specificity for PM20D1 and ligand potency in mitochondrial respiration (1).
Despite the considerable body of rodent literature on N-acyl amino acids, our knowledge of the regulation and clinical associations of these lipid metabolites in humans is still lacking. Two small studies have previously investigated the role of PM20D1 in the regulation of N-acyl amino acids in human plasma. In one report, plasma N-oleoyl-leucine and N-oleoyl-phenylalanine levels were found to be associated with circulating PM20D1 protein levels and correlated positively with adiposity and several parameters of glucose homeostasis (10). However, a second study, conducted in a very small cohort, failed to identify any association between serum levels of two N-acyl amino acids, N-oleoyl-leucine and N-oleoyl-glutamine, and genetic variants in the PM20D1 locus (11). The conflicting human data, together with the narrow scope of phenotypes and the limited number of individuals examined in these two studies, motivate additional investigation of the clinical relevance and genetic underpinnings of this class of molecules.
Our group recently measured circulating levels of four N-acyl amino acids (N-oleoyl-glycine, N-oleoyl-serine, N-oleoyl-leucine, and N-oleoyl-phenylalanine) in plasma samples from 2351 participants of the Jackson Heart Study (JHS) using liquid chromatography-mass spectrometry (LC-MS) (12). In two broad metabolome-wide association studies in this population, we identified that levels of N-oleoyl-serine and N-oleoyl-leucine were among the top metabolites in human plasma associated with future risk of coronary heart disease (13) and heart failure (12), respectively. These observations extend prior studies in murine models and suggest that additional investigations of N-acyl amino acid biology may provide important new insights into human metabolism and cardiometabolic disease.
Here, we provide a detailed analysis of the role of the oleic fatty acid tail group and amino acid head group in driving associations with cardiometabolic disease. We use genome-wide association studies (GWAS) to distinguish between different molecular species of N-acyl amino acids according to the amino acid head group. Further, this analysis identifies a previously unknown enzymatic regulator of N-oleoyl-leucine and N-oleoyl-phenylalanine, CYP4F2, in humans. Finally, we use untargeted metabolomics in live cells to map the fate of N-oleoyl-leucine and N-oleoyl-phenylalanine downstream of CYP4F2 hydroxylation and identify numerous previously unknown lipid metabolites with varying characteristics of the fatty acid tail group, including several that structurally resemble fatty acid hydroxy fatty acids (FAHFAs), a recently discovered family of bioactive lipids that signal through specific G protein-coupled receptors to improve glucose-insulin homeostasis and block inflammatory cytokine production (14). Our data integrating mass spectrometry, human phenotyping, genetics, and model systems provide new insights into this emerging class of molecules.

Results
N-acyl amino acids are associated with cardiometabolic disease independent of free amino acid plasma levels and according to the amino acid head group
We recently measured circulating levels of N-oleoyl-glycine, N-oleoyl-serine, N-oleoyl-leucine, and N-oleoyl-phenylalanine in plasma samples from 2351 participants of the JHS using LC-MS, as described (12). The baseline characteristics of the JHS study participants are detailed in Table S1. Age- and sex-adjusted analyses identified N-oleoyl-serine and N-oleoyl-leucine as top metabolites associated with the future risk of coronary heart disease (13) and heart failure (12), respectively. However, prior work did not examine whether the associations of the intact N-acyl oleic acid conjugate of each amino acid were independent of the corresponding free amino acid (Fig. 1A). Using age-, sex-, and batch-adjusted Cox regression models, we determined that the association between N-oleoyl-serine and the future risk of coronary heart disease remained unaffected when further adjusted for free serine (hazard ratio [HR] 0.81 per 1 SD increment in transformed and normalized metabolite level; 95% CI, 0.69-0.94; p-value = 7.3 × 10⁻³; median follow-up 11.6 years). Similarly, the association between N-oleoyl-leucine and future heart failure remained highly significant when further adjusted for free leucine (HR 0.78; 95% CI, 0.66-0.91; p-value = 2.1 × 10⁻³; median follow-up 12.6 years). These observations may point towards additional biological mechanisms conferred by the oleic acid tail group of N-acyl-serine and N-acyl-leucine.
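The hazard ratios, confidence intervals, and p-values quoted above are linked through the Wald statistic on the log-hazard scale. The following is a minimal sketch of that relationship, not the authors' analysis code; the function name `wald_p_from_hr` is hypothetical, and the calculation assumes a symmetric normal-approximation CI on log(HR).

```python
import math
from statistics import NormalDist


def wald_p_from_hr(hr, ci_low, ci_high, level=0.95):
    """Recover the Wald z-statistic and two-sided p-value from an HR and its CI.

    Assumes the CI was built as exp(log(HR) +/- z_crit * SE), which is the
    usual construction for Cox models but is an assumption here.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)  # SE of log(HR)
    z = math.log(hr) / se
    p = 2 * (1 - nd.cdf(abs(z)))
    return z, p


# HR and 95% CI values as reported in the text above.
reported = [
    ("N-oleoyl-serine, coronary heart disease", 0.81, 0.69, 0.94),
    ("N-oleoyl-leucine, heart failure", 0.78, 0.66, 0.91),
]
for label, hr, lo, hi in reported:
    z, p = wald_p_from_hr(hr, lo, hi)
    print(f"{label}: z = {z:.2f}, p = {p:.1e}")
```

Because the published HRs and CI bounds are rounded to two decimals, the back-calculated p-values should be close to, but not identical with, the reported ones.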

Genetic loci associated with levels of N-acyl amino acids in human plasma
In order to determine if the genetic determinants of N-acyl amino acid plasma levels also differ according to amino acid head group, we leveraged available whole genome sequencing data (NHGRI-EBI GWAS Catalog study GCP000239 (15)) in the JHS to compare genome-wide associations between oleic acid N-acyl conjugates of serine, glycine, leucine, and phenylalanine, as described (16). As shown in Figure 2A, we identified a strong association between the FAAH genetic locus and plasma levels of N-oleoyl-glycine (sentinel variant rs324420, p-value = 5.7 × 10⁻⁶⁴, beta 0.54, n = 2463) and N-oleoyl-serine (sentinel variant rs324420, p-value = 3.6 × 10⁻³², beta 0.38, n = 2101). These findings are consistent with studies in cell- and murine-based systems that have identified FAAH as an intracellular N-acyl amino acid synthase/hydrolase. Interestingly, the sentinel variant for both metabolites was a common missense variant (chr1:46405089C > A; Pro129Thr; observed MAF = 0.21 in JHS and global MAF 0.21 as reported in the NCBI Allele Frequency Aggregator) that has previously been shown to reduce FAAH stability and enzymatic activity in cell-based model systems and has been associated with overweight and obesity in human studies (17,18).
Taken together, these human genetic data point to specific pathways that regulate circulating N-acyl amino acid levels according to the amino acid head group. In particular, the association of the CYP4F2 locus with circulating N-oleoyl-leucine and N-oleoyl-phenylalanine was unexpected, and the role of CYP4F2 in N-acyl amino acid metabolism has not been explored.
Figure 1. N-acyl amino acids are associated with cardiometabolic disease independent of free amino acid plasma levels and according to the amino acid head group. A, Age- and sex-adjusted associations between plasma levels of N-oleoyl-serine and future risk of coronary heart disease (13) (top) and N-oleoyl-leucine and future risk of heart failure (12) (bottom) in JHS were unaffected by further adjustment for circulating levels of the corresponding free amino acid using Cox regression models. The relationship between plasma levels of each N-acyl amino acid (log transformed and standardized) and cardiometabolic traits (standardized) was analyzed using age- and sex-adjusted mixed linear regression models for continuous variables measured in the JHS (B) and age- and sex-adjusted logistic regression models for categorical variables in the JHS (C). Estimated β coefficients are represented by a color scale from red to blue (red represents positive and blue represents negative association). The area of each node corresponds to −log10(p-values). Bonferroni-adjusted p-value = 6.0 × 10⁻⁴ (0.05/21 tested traits/4 analytes).
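The Bonferroni threshold given in the Figure 1 legend (0.05 divided across 21 traits and 4 analytes) can be reproduced with a few lines of arithmetic. This is an illustrative sketch only; the helper names are hypothetical, and the example p-values passed to `is_significant` are made up for demonstration.

```python
def bonferroni_threshold(alpha, n_traits, n_analytes):
    """Per-test significance threshold when every trait is tested against every analyte."""
    return alpha / (n_traits * n_analytes)


def is_significant(p_value, threshold):
    """True if a single association passes the multiple-testing threshold."""
    return p_value < threshold


# Values from the Figure 1 legend: alpha = 0.05, 21 traits, 4 N-acyl amino acids.
threshold = bonferroni_threshold(0.05, 21, 4)
print(f"{threshold:.1e}")  # prints 6.0e-04

# Hypothetical p-values, for illustration only.
print(is_significant(1e-5, threshold), is_significant(1e-3, threshold))
```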

CYP4F2-mediated omega-hydroxylation of N-oleoyl-leucine and N-oleoyl-phenylalanine
Canonically, CYP4F2 catalyzes the omega (i.e., terminal) hydroxylation of specific lipophilic substrates, including fatty acids (e.g., arachidonic acid) (19) and vitamins (e.g., vitamins E and K) (20,21). The identification of CYP4F2 as a potential enzymatic regulator of N-oleoyl-leucine/phenylalanine was surprising as this enzyme had not been previously implicated in the metabolism of fatty acid-amino acid conjugates. Nevertheless, we reasoned that CYP4F2 might also catalyze the omega hydroxylation of N-oleoyl-leucine and N-oleoyl-phenylalanine (Fig. 3A). To directly test this hypothesis in vitro, we first generated lysates overexpressing CYP4F2 by transfection of plasmids expressing human CYP4F2 into HEK293T cells (Fig. 3B, left). Transfected whole-cell lysates were then assayed for the ability to catalyze N-oleoyl-leucine and N-oleoyl-phenylalanine hydroxylation in the presence of NADPH. As shown in Figure 3B (right), CYP4F2-transfected cell lysates exhibited robust hydroxylation activity on both N-acyl amino acids, leading to the production of two previously unknown lipid species in vitro, hydroxy-oleoyl-leucine and hydroxy-oleoyl-phenylalanine. The absolute rates of CYP4F2-mediated hydroxylation of these N-acyl amino acids were similar to that of a canonical CYP4F2 substrate, arachidonic acid (Fig. S1A). Additionally, we observed increasing formation rates of hydroxy-oleoyl-leucine and hydroxy-oleoyl-phenylalanine across a range of N-oleoyl-leucine/phenylalanine substrate concentrations (Fig. 3C). Notably, this low-to-mid μM range of N-oleoyl-leucine/phenylalanine concentrations is consistent with prior studies that detected functional effects of these N-acyl amino acids on energy homeostasis at similar plasma levels in murine models and treatment doses in cell culture (1). The use of CYP4F2-transfected lysates precluded calculation of the catalytic parameters, selectivity, or specificity of this reaction.
These data demonstrate that CYP4F2 catalyzes the hydroxylation of N-oleoyl-leucine and N-oleoyl-phenylalanine in vitro and provide a biochemical explanation for the observed association between the CYP4F2 gene locus and circulating N-oleoyl-leucine and N-oleoyl-phenylalanine levels in humans.
The sentinel variant in the CYP4F2 locus that was associated with N-oleoyl-leucine and N-oleoyl-phenylalanine plasma levels in our human studies was a chr19:15879621C > T missense variant that results in a Val433Met substitution (Fig. 2B). To examine the consequences of the specific CYP4F2 (V433M) variant on N-oleoyl-leucine and N-oleoyl-phenylalanine hydroxylation activity, we performed similar in vitro hydroxylation activity assays with CYP4F2 (V433V) or CYP4F2 (V433M) enzymes. Equivalent transfection for both alleles was confirmed by equivalent mRNA levels (Fig. 3D). Under these conditions, both the CYP4F2 protein levels and the N-oleoyl-leucine, N-oleoyl-phenylalanine, and arachidonic acid hydroxylation activities of CYP4F2 (V433M) were lower than those of CYP4F2 (V433V) (Figs. 3D and S1B). To determine the relative activity of CYP4F2 (V433V) or CYP4F2 (V433M) under conditions of equal protein loading, we normalized levels of CYP4F2 (V433M) and CYP4F2 (V433V) protein as assessed by Western blotting (Fig. 3E). CYP4F2 (V433M) exhibited reduced hydroxylation of N-oleoyl-leucine, N-oleoyl-phenylalanine, and arachidonic acid compared to CYP4F2 (V433V) (Figs. 3E and S1C). These data demonstrate that CYP4F2 (V433M) has both lower catalytic efficiency and reduced protein levels relative to CYP4F2 (V433V).

N-oleoyl-leucine and N-oleoyl-phenylalanine are competitive inhibitors of CYP4F2
The ability of CYP4F2 to hydroxylate multiple endogenous lipid substrates raises the possibility that active site competition might modulate the relative flux of each substrate through CYP4F2 (Fig. 4A). To mechanistically examine whether such substrate competition might occur, we used our in vitro CYP4F2 enzyme activity assay to directly measure competition of N-oleoyl-leucine or N-oleoyl-phenylalanine with the canonical CYP4F2 substrate arachidonic acid. At a 1:1 molar ratio, N-oleoyl-leucine efficiently inhibited CYP4F2-mediated conversion of arachidonic acid to 20-HETE (86% suppression, Fig. 4B). Similar results were observed with N-oleoyl-phenylalanine (76% suppression, Fig. 4B). A dose-response curve demonstrated that either N-oleoyl-leucine or N-oleoyl-phenylalanine could inhibit CYP4F2-mediated arachidonic acid hydroxylation even at substoichiometric levels (10 mol%, Fig. 4B). To examine the potential generality of N-oleoyl-leucine and N-oleoyl-phenylalanine inhibition of CYP4F2, we performed similar in vitro competition experiments with two additional CYP4F2 substrates, docosanoic acid and 8-HETE. As shown in Figure 4, C and D, both N-oleoyl-leucine and N-oleoyl-phenylalanine competed with CYP4F2-mediated hydroxylation of both 8-HETE and docosanoic acid but with differences in potency. For instance, even at substoichiometric levels (10 mol%), both N-oleoyl-leucine and N-oleoyl-phenylalanine strongly suppressed CYP4F2-mediated docosanoic acid hydroxylation (87% inhibition by N-oleoyl-leucine and 81% inhibition by N-oleoyl-phenylalanine, Fig. 4C). In contrast, superstoichiometric levels (10-fold excess) of either N-acyl amino acid were required to observe similar levels of competition of 8-HETE hydroxylation (94% inhibition by N-oleoyl-leucine and 84% inhibition by N-oleoyl-phenylalanine, Fig. 4D).
Conversely, we investigated if the canonical substrates arachidonic acid, docosanoic acid, and 8-HETE might also compete with CYP4F2-catalyzed N-acyl amino acid hydroxylation. While arachidonic acid was able to compete with N-oleoyl-leucine and N-oleoyl-phenylalanine hydroxylation by CYP4F2 (Fig. 4E), neither docosanoic acid nor 8-HETE exhibited strong inhibition of either N-oleoyl-leucine or N-oleoyl-phenylalanine hydroxylation, even at superstoichiometric levels (10:1 competitor:substrate, Fig. 4, F and G). We therefore conclude that N-oleoyl-leucine, N-oleoyl-phenylalanine, and other canonical substrates engage in bidirectional competition to regulate hydroxylation flux at the CYP4F2 enzyme node. The different dose responses observed for competition between each substrate pair potentially reflect differences in effective concentrations and/or substrate affinities at the CYP4F2 active site.

Metabolic diversification of N-acyl amino acids downstream of CYP4F2
CYP4F2-catalyzed substrate hydroxylation can result in either metabolic diversification (e.g., arachidonic acid to 20-HETE) or oxidative degradation (e.g., vitamin E). To determine which of these metabolic outcomes was the relevant pathway for the N-acyl amino acids, we used untargeted lipidomics in live cells to map the fate of N-oleoyl-leucine and N-oleoyl-phenylalanine downstream of CYP4F2. For these studies, we initially focused on N-oleoyl-leucine. First, CYP4F2- or mock-transfected HEK293T cells were treated with N-oleoyl-leucine for 4 h. Total intracellular lipids were extracted by the Folch method and differential peaks were identified using XCMS (22) (Fig. 5A). Two statistically significant peaks of high fold change (p < 0.05, >20-fold) were enriched in cells overexpressing CYP4F2. As expected, the top-scoring peak of m/z = 410.3271 matched ω-hydroxy-oleoyl-leucine (expected m/z = 410.3276) (Fig. 5B). The second peak of mass m/z = 674.5718 was also highly enriched in CYP4F2-transfected cells, but its chemical structure remained initially unknown. The mass difference observed between the two peaks (264.2442) as well as the later retention time of the second peak was consistent with an additional lipidation by oleate (+C18H32O, expected +264.2453). We therefore hypothesized that this second peak represented an unusual and previously unknown very long chain fatty acid, oleic acid-hydroxy-oleoyl-leucine (OAHOL). We used chemical synthesis to generate authentic standards for both hydroxy-oleoyl-leucine and OAHOL (see Experimental Procedures). As shown in Figure 5, C and D, fragmentation of both synthetic hydroxy-oleoyl-leucine and the endogenous m/z = 410.33 peak gave an identical daughter ion at m/z = 130.087, which matched the mass of the leucine anion. In addition to demonstrating the existence of these metabolites in cell lysates, we confirmed the presence of endogenous hydroxy-oleoyl-leucine in human plasma as well (Fig. 5C, lower panel).
Similarly, fragmentation of both synthetic OAHOL and the endogenous m/z = 674.57 mass gave daughter ions at m/z = 130.087 (leucine anion) and m/z = 281.249 (oleate). We therefore conclude that m/z = 410.33 and m/z = 674.57 correspond to hydroxy-oleoyl-leucine and OAHOL, respectively.
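The expected m/z values and the oleoyl mass shift quoted above follow from monoisotopic masses. The sketch below recomputes them for singly charged [M-H]⁻ ions. The elemental formulas are inferred here from the metabolite names (for example, C24H45NO4 for hydroxy-oleoyl-leucine) and are assumptions rather than values stated in the text; the helper names are likewise hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858


def mono_mass(formula):
    """Monoisotopic mass of a neutral species given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def mz_deprotonated(formula):
    """m/z of the singly charged [M-H]- ion (loss of one proton)."""
    return mono_mass(formula) - MONO["H"] + ELECTRON


def ppm_error(observed, expected):
    """Mass error of an observed peak relative to the expected m/z, in ppm."""
    return (observed - expected) / expected * 1e6


# Formulas inferred from the metabolite names (assumptions, see lead-in).
HYDROXY_OLEOYL_LEUCINE = {"C": 24, "H": 45, "N": 1, "O": 4}
OLEOYL_ADDITION = {"C": 18, "H": 32, "O": 1}  # net gain on esterification with oleate
OAHOL = {"C": 42, "H": 77, "N": 1, "O": 5}
LEUCINE = {"C": 6, "H": 13, "N": 1, "O": 2}
OLEIC_ACID = {"C": 18, "H": 34, "O": 2}

# Observed peaks as reported in the text.
observed = {"hydroxy-oleoyl-leucine": 410.3271, "OAHOL": 674.5718}
expected = {
    "hydroxy-oleoyl-leucine": mz_deprotonated(HYDROXY_OLEOYL_LEUCINE),
    "OAHOL": mz_deprotonated(OAHOL),
}
for name in observed:
    err = ppm_error(observed[name], expected[name])
    print(f"{name}: expected {expected[name]:.4f}, error {err:.1f} ppm")

print(f"oleoyl shift: {mono_mass(OLEOYL_ADDITION):.4f}")
print(f"leucine anion: {mz_deprotonated(LEUCINE):.3f}")
print(f"oleate: {mz_deprotonated(OLEIC_ACID):.3f}")
```

Under these assumed formulas, both parent ions fall within a few ppm of the calculated values, and the calculated fragment masses match the daughter ions reported for the leucine anion and oleate.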
OAHOL is a novel lipid metabolite that is not found in any public databases (23,24). This lipid metabolite is presumably formed by enzymatic acylation of hydroxy-oleoyl-leucine with oleoyl-CoAs. Based on this metabolic pathway hypothesis, we manually examined our dataset for additional fatty acyl-derivatives of ω-hydroxy-oleoyl-leucine. Beyond OAHOL, we also identified PAHOL (palmitic acid-hydroxy-oleoyl-leucine) and SAHOL (stearic acid-hydroxy-oleoyl-leucine) dramatically elevated in CYP4F2-transfected but not mock-transfected cells (Figs. 5E and S3, A and B). To determine if these very long-chain lipid derivatives could also be formed downstream of CYP4F2-mediated hydroxylation of N-oleoyl-phenylalanine, we performed similar untargeted experiments in live CYP4F2-transfected cells with N-oleoyl-phenylalanine as a substrate. As shown in Figure 5F, addition of N-oleoyl-phenylalanine also resulted in the downstream production of ω-hydroxy-oleoyl-phenylalanine and oleoyl- and palmitoyl-conjugates of hydroxy-oleoyl-phenylalanine (oleic acid-hydroxy-oleoyl-phenylalanine and palmitic acid-hydroxy-oleoyl-phenylalanine, respectively). For N-oleoyl-phenylalanine, the stearoyl acylation product was not detected. Taken together, we conclude that CYP4F2-catalyzed hydroxylation of N-acyl amino acids leads to the metabolic diversification of a wide array of previously unknown lipids, including several very long chain lipids that structurally resemble previously reported anti-diabetic FAHFAs (14) (Fig. S3C).

Discussion
By integrating genetic and clinical data with circulating levels of plasma N-acyl amino acids from a large human cohort, our study provides several insights of potential importance into the regulation of N-acyl amino acids in humans. First, individual plasma N-acyl amino acid levels are associated with cardiometabolic disease endpoints independent of free amino acid plasma levels and in patterns according to the amino acid head group. Second, the underlying genetic architecture and biological pathways that may regulate plasma levels of N-acyl amino acids also differ according to the amino acid head group. Finally, CYP4F2 functions as a human-specific enzyme node that catalyzes the metabolic diversification of N-acyl amino acids into a much larger family of lipids with varying characteristics of the fatty acid tail group, including several that structurally resemble FAHFAs.
Our large-scale human study revealed novel associations between the N-acyl amino acids and specific cardiometabolic diseases in participants of the JHS. Notably, we identified that the associations between fasting baseline levels of N-oleoyl-serine and future coronary heart disease, as well as N-oleoyl-leucine and future heart failure, were independent of the plasma levels of free serine and leucine, respectively. This suggests that these associations are driven by the intact N-acyl oleic acid conjugate of each amino acid, rather than by levels of the free amino acids themselves, and that N-acyl amino acids may affect cardiometabolic outcomes through biological mechanisms distinct from those mediated by circulating levels of the corresponding free amino acid.
Interestingly, when analyzing associations of N-acyl amino acids with prevalent cardiometabolic disease at baseline, we detected a bifurcation in associations according to the amino acid head group, with N-oleoyl-glycine and serine associated with cardiometabolically "advantageous" traits (such as prevalent diabetes), and N-oleoyl-leucine and phenylalanine associated with "disadvantageous" cardiometabolic traits (such as obesity). This division mirrors positive associations between branched chain (e.g. leucine) and aromatic (e.g. phenylalanine) free amino acids with obesity and insulin resistance, and inverse associations between glycine and serine with impaired glucose tolerance and risk of diabetes (25)(26)(27)(28)(29)(30)(31)(32)(33). The mechanisms linking these free amino acid species to cardiometabolic traits and risk of disease remain to be fully elucidated but may involve various opposing effects on regulatory pathways upstream of pancreatic islet β-cell insulin secretion (25)(34)(35)(36)(37)(38). Whether the N-oleoyl species of these amino acids, as well as the CYP4F2-mediated derivatives of these metabolites, act through the same or different pathways adds a potential additional layer of complexity to this regulatory balance.
Our studies of human N-acyl amino acid metabolism highlight the utility of integrating metabolomic profiling data with genetic data for pathway discovery. In mice, FAAH and PM20D1 are two major enzymes that catalyze the bidirectional N-acyl amino acid synthesis and hydrolysis from free amino acids and free fatty acids. Human-based studies have established FAAH as the primary degradative enzyme of the structurally-related N-acyl ethanolamines (39). Therefore, the association between the FAAH locus and plasma levels of N-oleoyl-glycine and N-oleoyl-serine in participants of the JHS was expected. However, the absence of an association of N-acyl amino acids with the human PM20D1 locus, as well as the identification of the CYP4F2 gene as a novel and human-specific association, were both unexpected. We validated the novel genetic association between N-oleoyl-leucine and N-oleoyl-phenylalanine with the CYP4F2 locus by demonstrating that these two N-acyl amino acids are both bona fide substrates and inhibitors of the human CYP4F2 enzyme in vitro and in cell-based models. The top association for both N-oleoyl-leucine and N-oleoyl-phenylalanine was with a missense variant (chr19:15879621C > T; Val433Met) that has previously been shown to result in reduced CYP4F2 protein levels and enzymatic activity in cell-based model systems, but that has not previously been tied to N-acyl amino acid metabolism (20,40,41). This variant is common with a global MAF of 0.29 and a subgroup MAF of 0.09 in African Americans, as reported in the NCBI Allele Frequency Aggregator.
While it is possible that our study did not detect an association between the PM20D1 locus and N-acyl amino acids due to cohort-specific characteristics or sample size, we note that none of the measured N-acyl amino acids demonstrated even a modest association (nominal p-value ≤ 0.001) with a genetic variant between the PM20D1 transcriptional start and end sites. This finding will require validation in additional cohorts but may suggest that human PM20D1 controls local, and possibly tissue-specific, extracellular paracrine pools of N-acyl amino acids that do not interact with the levels found in the circulation.
It is notable that we detected very strong associations between the CYP4F2 locus on chromosome 19 and N-oleoyl-leucine and N-oleoyl-phenylalanine with no appreciable association signals between this locus and either N-oleoyl-glycine or N-oleoyl-serine (p > 0.001 for all variants between the CYP4F2 transcriptional start and end sites). Importantly, the biochemical basis for this genetic specificity remains the subject of future study. We have previously found that subcellular location, and even organ-specific substrate accessibility, can contribute to the regulation of specific N-acyl amino acids by FAAH and PM20D1 (4). Future detailed enzymology studies of CYP4F2 may uncover additional fundamental biochemical characteristics of this enzyme that underlie these specific genetic associations.
Our data suggest that CYP4F2 is a key enzymatic node at the center of a complex lipid network that connects classical bioactive lipids, such as arachidonate and 20-HETE, with more novel lipid species, including N-acyl amino acids, hydroxylated N-acyl amino acids, and very long chain fatty acyl derivatives of N-acyl amino acids that structurally resemble FAHFAs. Notably, we detect that examples of these, including hydroxylated oleoyl-leucine, are present in human plasma.
Currently, the precise biochemical and functional relationship between all of these lipid species, especially in a complex tissue environment such as a human liver, remains unknown. For instance, it may be possible that hydroxy N-acyl amino acids exhibit similar vasoactive effects as 20-HETE, and fatty acylated hydroxy-N-acyl amino acid derivatives might also function similarly to that of the anti-diabetic FAHFAs.
Alternatively, N-acyl amino acids may indirectly modulate the levels of other CYP4F2-regulated lipids via substrate competition. An important area of future work will be to understand the relative fluxes of each lipid class through CYP4F2 as well as the potential functional roles of these new N-acyl amino acid derivatives.
The integration of human functional genomics with untargeted lipidomics studies of cultured cells further allowed for the discovery of a novel class of very long-chain lipids. These compounds structurally resemble FAHFAs, although instead of containing a characteristic branched hydroxyl linkage, they contain a terminal ester linkage between a fatty acid and a hydroxy N-acyl amino acid. A similar terminal linkage has been described in O-acyl-ω-hydroxy fatty acids (ωOAHFAs); however, these unusual species have only recently been detected in human skin (42) and meibum (43), and not to our knowledge in human plasma. Further, we are not aware of a previous report of fatty acyl derivatives of hydroxy N-acyl amino acids. FAHFAs are a diverse family of bioactive lipids that have recently been shown to signal through specific G protein-coupled receptors and other pathways to improve glucose-insulin homeostasis and block inflammatory cytokine production (14). Our data provide new insights into this emerging class of molecules and raise the intriguing possibility that ω-hydroxy-oleoyl amino acid species may also serve as signaling effectors in human metabolism.

Experimental procedures
Human cohort
JHS is a prospective population-based observational study designed to investigate risk factors for cardiovascular disease (CVD) in Black individuals, as previously described (44). From 2000 to 2004, 5306 Black individuals from the Jackson, Mississippi tri-county area (Hinds, Rankin, and Madison counties) were recruited for a baseline examination. Of the original cohort, 2351 individuals had metabolomic profiling of N-acyl amino acids performed from baseline samples and were included in the analyses. Details regarding the collection and calculation of clinical data included in Table S1 and Figure 1 have been previously described (13). All clinical data were collected during the baseline exam (Exam 1), except visceral adipose tissue (Exam 2), subcutaneous adipose tissue (Exam 2), coronary artery calcium score (Exam 2), and abdominal aortoiliac calcium score (Exam 2).

Study approval
The Institutional Review Boards of Beth Israel Deaconess Medical Center and the University of Mississippi Medical Center approved the human study protocols, and all participants provided written informed consent. All procedures involving human participants were in accordance with the ethical standards of the 1964 Helsinki Declaration and its later amendments.

N-acyl amino acid metabolite profiling in human plasma
Methods for metabolomics profiling in human plasma have been described (12). In brief, to measure N-oleoyl-leucine, N-oleoyl-phenylalanine, N-oleoyl-glycine, and N-oleoyl-serine, chromatography was performed using an Agilent 1290 Infinity LC system equipped with a Waters XBridge Amide column, coupled to an Agilent 6490 triple quadrupole mass spectrometer. To measure endogenous hydroxy-oleoyl-leucine, chromatography was performed using a Waters UPLC BEH Amide (1.7 μm, 1.0 × 150 mm) column. Metabolite transitions were assayed using dynamic multiple reaction monitoring. LC-MS data were analyzed with Agilent MassHunter QQQ Quantitative Analysis software. Isotope-labeled internal standards were monitored in each sample to ensure proper MS sensitivity for quality control. Pooled plasma samples were interspersed at intervals of ten participant samples to enable correction of drift in instrument sensitivity over time and to scale data between batches. Data were scaled linearly to the nearest pooled plasma sample in the queue.
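
The nearest-pooled-sample scaling described above can be illustrated with a minimal sketch. The source does not specify the reference value that pooled injections are scaled to; the median of all pooled injections is used here as an assumption, and the function name and queue layout are hypothetical.

```python
from statistics import median

def nearest_pool_scale(queue):
    """Scale each participant sample by its nearest pooled plasma injection.

    queue: list of (kind, peak_area) tuples in injection order, where kind
    is "pool" or "sample". Returns corrected peak areas for the samples only.
    """
    pool_positions = [i for i, (kind, _) in enumerate(queue) if kind == "pool"]
    if not pool_positions:
        raise ValueError("queue contains no pooled plasma injections")
    # Assumption: the reference level is the median pooled peak area.
    reference = median(queue[i][1] for i in pool_positions)
    corrected = []
    for i, (kind, area) in enumerate(queue):
        if kind != "sample":
            continue
        # Nearest pool by injection index; ties resolve to the earlier pool.
        nearest = min(pool_positions, key=lambda p: abs(p - i))
        corrected.append(area * reference / queue[nearest][1])
    return corrected

# Synthetic example: instrument response doubles between the two pools.
example_queue = [("pool", 100.0), ("sample", 50.0), ("sample", 60.0), ("pool", 200.0)]
corrected_areas = nearest_pool_scale(example_queue)
```

In this toy queue, the first sample is scaled by the first pool and the second sample by the second pool, so a sample measured during a period of higher instrument response is scaled down accordingly.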

Genotyping
Whole genome sequencing (WGS) in JHS has been described (45). Briefly, participant samples underwent >30× WGS through the Trans-Omics for Precision Medicine project at the Northwest Genome Center at the University of Washington and joint genotype calling with participants in Freeze 6; genotype calling was performed by the Informatics Resource Center at the University of Michigan.

Whole genome association studies
Summary statistics from whole genome association studies of metabolite levels in plasma samples of participants of the JHS are available on the NHGRI-EBI GWAS Catalogue (Accession GCP000239) (15) and GWAS methods have been previously described (16). Briefly, metabolite LC-MS peak areas were log-transformed and scaled to a mean of zero and standard deviation of 1 and subsequently residualized on age, sex, batch, and principal components (PCs) of ancestry 1 to 10, as determined by the GENetic EStimation and Inference in Structured samples (GENESIS) (46), and subsequently inverse normalized. The association between these values and genetic variants was tested using linear mixed effects models adjusted for age, sex, the genetic relationship matrix, and PCs 1 to 10 using the fastGWA model implemented in the GCTA software package (47). Variants with a minor allele count less than five in JHS were excluded from the analysis.
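
The phenotype preparation steps above (log-transform, standardize, residualize, inverse normalize) can be sketched as follows. This is an illustration only, not the authors' pipeline: the data are synthetic, only two hypothetical covariates (age and sex) stand in for the full covariate set, the population standard deviation is used for scaling, and the rank offset of 0.5 in the inverse normal transform is an assumption because the source does not state one.

```python
import math
from statistics import NormalDist, fmean, pstdev

def standardize(values):
    """Scale values to mean 0 and standard deviation 1."""
    mu, sd = fmean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def residualize(y, covariates):
    """Ordinary least squares residuals of y on covariates plus an intercept."""
    X = [[1.0] + list(row) for row in covariates]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]

def inverse_normal(values, offset=0.5):
    """Rank-based inverse normal transform (ties are not averaged here)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        out[i] = NormalDist().inv_cdf((rank - offset) / n)
    return out

# Synthetic peak areas and hypothetical (age, sex) covariates.
peak_areas = [1200.0, 3400.0, 560.0, 8900.0, 2300.0, 4100.0]
covariates = [(45, 0), (52, 1), (60, 0), (38, 1), (70, 1), (55, 0)]

scaled = standardize([math.log(v) for v in peak_areas])
residuals = residualize(scaled, covariates)
phenotype = inverse_normal(residuals)
```

The resulting `phenotype` values are what would be passed to the association model; the mixed-model step itself (fastGWA in GCTA) is not reproduced here.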

Cell line cultures
All cell lines were grown at 37 °C with 5% CO2. HEK293T cells were obtained from ATCC and grown in Dulbecco's modified Eagle's medium with 10% fetal bovine serum and 1% penicillin/streptomycin (pen/strep).

Untargeted measurements of metabolites by LC-MS
Untargeted metabolomics measurements were performed on an Agilent 6520 Quadrupole Time-of-Flight (Q-TOF) LC/MS. Mass spectrometry analysis was performed using electrospray ionization (ESI) in negative mode. The dual ESI source parameters were set as follows: the gas temperature was set at 250 °C with a drying gas flow of 12 l/min and the nebulizer pressure at 20 psi. The capillary voltage was set to 3500 V and the fragmentor voltage was set to 100 V. Separation of metabolites was conducted on a Luna 5 μm C5 100 Å, LC Column 100 × 4.6 mm (Phenomenex 00D-4043-E0) with normal phase chromatography. Mobile phases were as follows: Buffer A, 95:5 water/methanol; Buffer B, 60:35:5 isopropanol/methanol/water with 0.1% ammonium hydroxide in both Buffer A and B for negative ionization mode. For 10 min runs, the LC gradient started at 95% A with a flow rate of 0.6 ml/min from 0 to 1 min. The gradient was then increased linearly to 95% B at a flow rate of 0.6 ml/min from 1 to 8 min. From 8 to 10 min, the gradient was maintained at 95% A at a flow rate of 0.6 ml/min. For 30 min runs, the LC gradient started at 95% A with a flow rate of 0.6 ml/min from 0 to 3 min. The gradient was then increased linearly to 95% B at a flow rate of 0.6 ml/min from 3 to 25 min. From 25 to 30 min, the gradient was maintained at 95% A at a flow rate of 0.6 ml/min.

Targeted metabolomics in cell culture samples
Targeted measurements were performed on an Agilent 6470 Triple Quadrupole (QQQ) LC/MS. Mass spectrometry analysis was performed using ESI in negative mode. The AJS ESI source parameters were set as follows: the gas temperature was set at 250 °C with a gas flow of 12 l/min and the nebulizer pressure at 25 psi. The sheath gas temperature was set to 300 °C with the sheath gas flow set at 12 l/min. The capillary voltage was set to 3500 V. Separation of metabolites was performed as described above in the untargeted metabolomics section. Multiple reaction monitoring (MRM) was performed for the indicated metabolites with the listed dwell times, fragmentor voltage, collision energies, cell accelerator voltages, and polarities.

Synthesis of ω-hydroxy-oleoyl-leucine
1-Ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC-HCl, 1.05 eq) was added to a cold mixture (ice bath) of ω-hydroxy-oleic acid (A2B Chem BG26794, 1 eq) and 1-hydroxybenzotriazole monohydrate (HOBt·H2O, 2 eq) in N,N-dimethylformamide (DMF). Stirring was continued for 10 min before a mixture of L-leucine ethyl ester hydrochloride (5 eq) and N,N-diisopropylethylamine (DIPEA, 3 eq) in DMF was added. The ice bath was removed, and the reaction mixture was stirred for 16 h at room temperature. After removal of the solvent, the residue was taken up in ethyl acetate (EtOAc); washed with KHSO4, brine, NaHCO3, and brine; and then concentrated under reduced pressure. The crude ester was dissolved in tetrahydrofuran, and 2 N LiOH was added. The resulting mixture was stirred at room temperature for 4 h. The reaction was acidified to pH 2 to 3 by the addition of 1 M HCl and then extracted with DCM and concentrated. The resulting crude product was used directly for LC-MS/MS analysis.

Synthesis of oleic acid-hydroxy-oleoyl-leucine
To a solution of oleic acid in dichloromethane (DCM) were added oxalyl chloride (one drop) and DMF (one drop) at 0 °C. The mixture was stirred at room temperature for 2 h. The mixture was concentrated, dissolved in DCM, and then added to a suspension of ω-hydroxy-oleoyl-leucine and DIPEA (one drop) in DCM. The reaction mixture was stirred at room temperature for 16 h and then acidified to pH 4 with 1 M HCl. The resulting mixture was extracted with DCM and washed with brine, and the solvent was then removed under reduced pressure. The crude product was analyzed directly by LC-MS/MS.

In vitro CYP4F2 assays
HEK293T cells were co-transfected with CYP4F2 WT/V433M, POR, and CYB5R1. CYP4F2-transfected and mock-transfected cells were harvested in PBS, lysed by sonication, and centrifuged (10 min at 15,000 rpm) to remove debris. In vitro enzymatic reactions were conducted in 96-well plates and initiated with 1 mM NADPH. The final reaction conditions were 100 μM substrate (N-oleoyl-leucine, Cayman Chemical 20064; N-oleoyl-phenylalanine, Cayman Chemical 28921; or arachidonic acid, Cayman Chemical 90010) and 50 μg protein in 50 μl PBS. After 1 h at 37 °C, reactions were transferred to glass vials, quenched with 150 μl 2:1 v/v chloroform:methanol, and vortexed. Reaction vials were centrifuged (10 min at 1000 rpm) and the organic layer was transferred to a mass spec vial and analyzed by LC-MS as described above. For competition assays, cell lysates were incubated for 5 min with competitor (arachidonic acid; docosanoic acid, Cayman Chemical 9000338; or 8-HETE, Cayman Chemical 34340) at 37 °C before the addition of other substrates and initiation with NADPH.

Kinetic enzymatic assays
CYP4F2 lysates were harvested from transiently transfected HEK293T cells as described above. In vitro enzymatic reactions were conducted in 96-well plates and initiated with 1 mM NADPH. The final reaction conditions were 1 μM, 4 μM, 20 μM, or 100 μM substrate and 0.46 μg CYP4F2 enzyme in 50 μl PBS. After 10 min at 37 °C, reactions were transferred to glass vials, quenched with 150 μl 2:1 v/v chloroform:methanol, and vortexed. Reaction vials were centrifuged (10 min at 1000 rpm) and the organic layer was transferred to a mass spec vial and analyzed by LC-MS as described above.
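
The source does not state how kinetic parameters were derived from these four substrate concentrations. As one possible illustration, assuming Michaelis-Menten behavior, the sketch below estimates Km and Vmax by Hanes-Woolf linearization. The rates are synthetic values generated from arbitrary parameters, not measurements from this study, and the function name is hypothetical.

```python
def fit_michaelis_menten(substrate_um, rates):
    """Estimate (Km, Vmax) by Hanes-Woolf linearization.

    Rearranging v = Vmax * S / (Km + S) gives S / v = S / Vmax + Km / Vmax,
    so a straight-line fit of S / v against S has slope 1 / Vmax and
    intercept Km / Vmax.
    """
    xs = list(substrate_um)
    ys = [s / v for s, v in zip(substrate_um, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax

# Substrate concentrations from the assay description (in μM).
concentrations = [1.0, 4.0, 20.0, 100.0]

# Synthetic, noise-free rates from arbitrary illustrative parameters.
TRUE_KM, TRUE_VMAX = 10.0, 2.0
synthetic_rates = [TRUE_VMAX * s / (TRUE_KM + s) for s in concentrations]

km_est, vmax_est = fit_michaelis_menten(concentrations, synthetic_rates)
```

With real, noisy data, direct nonlinear regression of the Michaelis-Menten equation is generally preferred over linearized fits, because linearization distorts the error structure.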

Live cell tracing experiments
Transiently transfected (CYP4F2 or mock) HEK293T cells were washed twice with PBS and then harvested by scraping. Cells were spun down (5 min at 1000 rpm), resuspended in serum-free media, and then aliquoted into a 96-well plate (1.2 million cells per well). The final reaction conditions were 10 μM N-oleoyl-leucine or N-oleoyl-phenylalanine. Reactions were initiated with 1 mM NADPH. After 4 h at 37 °C, reactions were transferred to glass vials, quenched with 2:1 v/v chloroform:methanol, and vortexed. After centrifuging vials (10 min at 1000 rpm), the organic layers were analyzed by LC-MS or LC-MS/MS as described above. LC-MS data were uploaded to Scripps XCMS Online to identify significantly changed metabolites.

Western blot analysis
Cells were collected and lysed by sonication in PBS. Cell lysates were centrifuged at 4 °C for 10 min at 15,000 rpm to remove residual cell debris. Protein concentrations of the supernatant were normalized using the Pierce BCA protein assay kit and combined with 4× NuPAGE LDS Sample Buffer with 10 mM DTT. Samples were then boiled for 10 min at 95 °C. Prepared samples were run on a NuPAGE 4 to 12% Bis-Tris gel and then transferred to nitrocellulose membranes. Blots were blocked for 30 min at room temperature in the Odyssey blocking buffer. Primary antibodies (mouse anti-FLAG and rabbit anti-β-actin) were added to Odyssey blocking buffer at a ratio of 1:1000. Blots were incubated in the indicated primary antibodies overnight while shaking at 4 °C. The following day, blots were washed three times with PBS-T for 10 min each before staining with the secondary antibody for 1 h at room temperature. The secondary antibodies used were goat anti-rabbit and goat anti-mouse antibodies diluted in blocking buffer to a ratio of 1:10,000. Following secondary antibody staining, the blot was washed three times with PBS-T before being imaged with the Odyssey CLx Imaging System.

Statistics
All data are expressed as mean ± SEM unless otherwise specified. Student's t-test was used for pairwise comparisons. Unless otherwise specified, statistical significance was set as p < 0.05.
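
The summary statistics named above can be sketched in a few lines. The source does not say whether the t-test was paired, one- or two-sided, or assumed equal variances; this sketch assumes an unpaired test with pooled (equal) variance. The input values are synthetic and the function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance

def mean_sem(values):
    """Return (mean, standard error of the mean) using the sample SD."""
    return fmean(values), sqrt(variance(values) / len(values))

def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in both groups.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (fmean(a) - fmean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Synthetic values for two hypothetical groups.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]

mean_a, sem_a = mean_sem(group_a)
t_stat, dof = student_t(group_a, group_b)
```

Converting the t statistic to a p-value requires the t-distribution CDF, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) would normally be used for that step.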

Data availability
Summary statistics from metabolomics studies in plasma samples of participants of the JHS that were analyzed in this study have previously been uploaded to the JHS database of Genotypes and Phenotypes (dbGaP) repository and are available upon request from the respective study cohorts, which can be facilitated by the corresponding author. Summary statistics from whole genome association studies of metabolite levels in plasma samples of participants of the JHS have previously been made publicly available on the NHGRI-EBI GWAS Catalogue (Accession GCP000239) (15). All other results, analytic methods, and details of study materials are available within the manuscript. Noncommercial study materials will be made available to other researchers for the purposes of reproducing the results or replication of the procedure, as respective IRB and Material Transfer Agreements permit.

Supporting information
This article contains supporting information.