On inferred real-world fuel consumption of past decade plug-in hybrid electric vehicles in the US

Plug-in hybrid electric vehicles (PHEVs) have powertrain architectures that seek to combine the best features of two well-known powertrains: the environmental and other benefits of electric driving of battery electric vehicles; and the fuel efficiency and, due to widely available fueling infrastructure and quick refueling times, limitless practical range of hybrid electric vehicles (HEVs). Different regulatory organizations around the world have different standard testing procedures, and accordingly, different predictions for the degree of efficacy of PHEVs at reducing greenhouse gas (GHG) emissions. There is something of a consensus that PHEVs have the capacity for significant GHG reduction compared to conventional internal combustion engine vehicles, yet some recent studies have claimed the real-world fuel consumption of PHEVs to be more than twice their standard ratings. A key factor in the efficacy of GHG reduction via PHEVs is the fraction of miles traveled in electric mode, also known as the utility factor (UF). In this work, we reinvestigate the data sources cited in previous studies for PHEVs in the US to infer the real-world UF and fuel consumption via the same estimation approaches as a previous study. We then compare with the UF from the SAE J2841 standard and fuel consumption ratings from the US Environmental Protection Agency. While noting that it is difficult if not impossible to discern the exact reason for observed deviations given the available information in the cited data sources, we find the real-world fuel consumption of PHEVs in the US to range from 62% better to 21% worse than their standard ratings in the US and, generally, to be significantly better than that of a comparable HEV. Contrasted with reported results for other parts of the world, the results are viewed as a testimony to the importance of proper procedures for evaluation of PHEVs to reflect their correct environmental benefit value.


Introduction
PHEVs are part of a broader category of powertrains referred to as EDVs, which according to the US-DOE [1] also encompasses non-plug-in HEVs and BEVs. And though not mentioned in [1], it is understood that hydrogen FCEVs [2] and PFCEVs [3] also belong to the broader category of EDVs. All EDVs have the common trait of being more energy-efficient than conventional ICE vehicles owing to a number of reasons, the most prominent of which include the higher efficiency of electric motors, capability to run motors as a generator during deceleration, and the ability to buffer energy in the battery.
In HEVs, where the powertrain still includes an ICE and the source of energy is fuel, having motor(s) and battery in the powertrain allows the engine to operate close to its peak efficiency most of the time or be turned off completely [4]. The energy efficiency gains in HEVs compared to conventional ICEs may vary depending on the driving conditions and other vehicle design attributes, but generally fall between 20% and 35% [5][6][7][8]. When translating such fuel savings into proportional reductions in GHG emissions (measured in an equivalent amount of carbon dioxide, CO2), a reduction of 20%-35% compared to conventional ICEs falls short of the GHG reduction goals for the 2030-2050 timeframe in many parts of the world [9, 10] except when the fuels are also low in carbon content, such as biofuels [6,8].
In other types of EDVs such as BEVs and FCEVs, there are no GHG emissions from the tailpipe of the vehicle, but when performing W2Ws analysis [11], the GHG emissions for generating electricity and/or hydrogen are not negligible at present day [12][13][14][15][16]. In fact, according to information from the US-EIA [15], it is estimated that the 2020 US electric grid average carbon intensity, measured in grams of CO2 per kilowatt-hour (g-CO2/kWh), ranged from as low as 82 g-CO2/kWh to as high as 870 g-CO2/kWh [16]. In some sense, utilizing energy that is carbon-free at the vehicle tailpipe moves the GHG emissions problem from the vehicle to the grid or utility provider. And while there are goals to reach zero-carbon utilities by 2050, the pathways to achieving such goals are not without challenges [17,18]. Furthermore, mass-market large-scale adoption of light-duty BEVs and FCEVs also faces several challenges. Without incentives or subsidies, the purchase cost of BEVs and FCEVs remains higher than that of conventional ICEs of comparable size and utility [19][20][21][22][23][24], which, in turn, presents a challenge for gaining mass-market acceptability because many vehicle buyers are more sensitive to acquisition cost than running cost [25,26], and quite reliant on subsidies and/or tax credits [27][28][29]. In addition, even setting acquisition cost aside, issues such as perceptions of range, charging time and charging infrastructure continue to contribute to reluctance to adopt BEVs in the short term [30][31][32]. Similarly, other issues besides acquisition cost exist for FCEVs, including infrastructure availability, perceptions of safety, and price and stability of the hydrogen supply [33][34][35][36].
In simplified terms, PHEVs are similar to HEVs except that they are equipped with larger batteries and on-board chargers that allow charging the battery from grid electricity [1,37]. Conceptually speaking, PHEVs combine the best features of both HEVs and BEVs in terms of moderate acquisition cost, low fuel consumption, and less reliance on electric charging infrastructure [38][39][40], but perhaps most notably, is the capability to electrify a significant portion of the VMT [41,42] without range anxiety since the PHEV automatically switches to fuel (also known as hybrid mode) whenever the battery SoC reaches a lower bound. While PHEVs may be regarded by some as transition vehicles whose primary purpose is to accelerate full electrification [43,44], it stands to good reason that PHEVs are capable of large GHG reductions in both the near and long term [45][46][47][48].
However, much of the estimated efficacy at GHG reduction depends on the fraction of VMT driven in electric mode, which is known as the UF of PHEVs [41]. A recent study [49] highlighted some of the differences in real-world observed UF compared to standard expectation, and laid out the main categories of reasons for such differences, including differences in (a) charging behavior, (b) driving patterns and (c) vehicle efficiency compared to the estimates by regulatory standard. A meta-dataset analysis [50] showed the real-world fuel consumption of PHEVs in various parts of the world to be between two and four times that of the type-approval ratings in the older European standard (utilizing the test procedure of NEDC [51]). A follow-up study [52] claimed the fuel consumption of PHEVs in the US to be twice that of their US-EPA ratings. It is fairly well known that the NEDC test procedure frequently under-estimates observed fuel consumption even among conventional ICE vehicles [51]. In contrast, US-EPA has historically had fuel economy ratings that reasonably reflect how the vehicles perform (on average) in the US [53]. Thus, when the fuel consumption of PHEVs in the US appeared to be twice their NEDC ratings in [50] and then twice their US-EPA ratings in [52], this seemed worthy of further study.
In this paper, we re-analyze all the data sources for PHEVs in the US that were cited in [52] while utilizing the same approach for estimation of fuel consumption as it was explained in [52]. We then compare the estimated UF and fuel consumption for each PHEV variant (make/model/model-year) with its appropriate reference value from SAE J2841 and US-EPA fuel economy ratings. The obtained results are notably different than the results shown in [52], with all PHEV variants appearing to have fuel consumption between 62% less (better) and 21% more (worse) than their equivalent US-EPA ratings. Though the cited datasets are rather outdated, our results highlight the importance of appropriate evaluation standards; while the old European standard may have been inaccurate by a large margin in its evaluation of PHEVs in the previous decade, US-EPA was mostly right on the mark. The rest of the paper is organized as follows: section 2 presents the analysis approach, while section 3 showcases the obtained results followed by a discussion in section 4, and finally the paper concludes.

Estimation of PHEVs real-world performance
The reference value for the average fuel consumption of a PHEV [54] may be calculated as:

$$\phi_r = \mathrm{UF}_r \, \phi_{r,CD} + (1 - \mathrm{UF}_r) \, \phi_{r,CS} \tag{1}$$

where $\phi_r$ is the reference average vehicle fuel consumption, which has units of fuel volume per travel distance (gal mile⁻¹ or l km⁻¹). $\phi_{r,CS}$ is the reference fuel consumption when the PHEV is driving in charge sustaining mode (battery reached lower SoC limit), while $\phi_{r,CD}$ accounts for the reference value of fuel consumption while the PHEV is in charge depletion mode (mostly electric driving). $\mathrm{UF}_r$ is the reference value for the UF (fraction of VMT in electric mode) of the PHEV. With nearly all present-day PHEVs sold in the US having negligible fuel consumption in charge depletion, equation (1) reduces to:

$$\phi_r = (1 - \mathrm{UF}_r) \, \phi_{r,CS} \tag{2}$$

From equation (2), the ratio between the actual and reference values of fuel consumption may be calculated as:

$$\gamma = \frac{\phi_a}{\phi_r} = \frac{(1 - \mathrm{UF}_a) \, \phi_{a,CS}}{(1 - \mathrm{UF}_r) \, \phi_{r,CS}} \tag{3}$$

where $\gamma$ is the fuel consumption ratio, $\phi_a$ and $\phi_{a,CS}$ are respectively the average actual (real-world) fuel consumption and the actual fuel consumption while in charge sustaining mode, while $\mathrm{UF}_a$ is the actual (real-world) UF.
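As a minimal sketch of how equations (1)-(3) could be evaluated, the snippet below assumes the UF-weighted form of equation (1) described in the text. The function names and the example vehicle (40 mpg rating, UF values of 0.60 and 0.65) are hypothetical and are not taken from any of the analyzed datasets.

```python
def reference_fuel_consumption(uf_ref, fc_cs_ref, fc_cd_ref=0.0):
    """Equation (1); with fc_cd_ref = 0 it reduces to equation (2).

    Fuel consumption values are in volume per distance (e.g. gal/mile).
    """
    return uf_ref * fc_cd_ref + (1.0 - uf_ref) * fc_cs_ref


def fuel_consumption_ratio(uf_act, fc_cs_act, uf_ref, fc_cs_ref):
    """Equation (3): actual over reference average fuel consumption."""
    fc_act = (1.0 - uf_act) * fc_cs_act
    fc_ref = reference_fuel_consumption(uf_ref, fc_cs_ref)
    return fc_act / fc_ref


# Hypothetical PHEV: rated 40 mpg in charge sustaining mode with a
# reference UF of 0.60; observed 38 mpg with a real-world UF of 0.65.
gamma = fuel_consumption_ratio(
    uf_act=0.65, fc_cs_act=1.0 / 38.0,
    uf_ref=0.60, fc_cs_ref=1.0 / 40.0,
)
print(f"gamma = {gamma:.3f}")
```

In this made-up case the higher real-world UF more than offsets the slightly worse charge sustaining fuel economy, so the ratio comes out below 1 (less fuel than the reference).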
To analyze the sensitivity of the fuel consumption ratio to deviations between the real-world and reference performance of the PHEV, we take the total differential of $\gamma$ (equation (4)), then divide by $\gamma$ in order to obtain the relative difference in fuel consumption ratio, as shown in equation (5):

$$d\gamma = \frac{\partial \gamma}{\partial \mathrm{UF}_a} \, \delta \mathrm{UF}_a + \frac{\partial \gamma}{\partial \phi_{a,CS}} \, \delta \phi_{a,CS} = \frac{-\phi_{a,CS} \, \delta \mathrm{UF}_a + (1 - \mathrm{UF}_a) \, \delta \phi_{a,CS}}{(1 - \mathrm{UF}_r) \, \phi_{r,CS}} \tag{4}$$

$$\Gamma = \frac{d\gamma}{\gamma} = -\frac{\delta \mathrm{UF}_a}{1 - \mathrm{UF}_a} + \frac{\delta \phi_{a,CS}}{\phi_{a,CS}} \tag{5}$$

where $\Gamma$ is the relative difference in fuel consumption ratio, while $\delta \mathrm{UF}_a$ and $\delta \phi_{a,CS}$ are the deviations between actual and reference values for UF and fuel consumption while in charge sustaining mode, respectively. The negative sign on the first term in equation (5) is in agreement with the general understanding that when the real-world UF is less than its reference value (i.e. $\delta \mathrm{UF}_a$ is negative), this contributes to real-world fuel consumption being higher than its reference value. Furthermore, the presence of the term $(1 - \mathrm{UF}_a)$ in the denominator of the first term in equation (5) reveals the high sensitivity of fuel consumption to even small mismatches between real-world UF and its reference value: for typical PHEVs whose UF is between 0.5 and 0.8, a mismatch in UF is magnified by a factor of 2-5, which, in turn, highlights the importance of having the reference value of UF be properly reflective of the real world. At present, reference values for UF are mostly based on models and assumptions for how PHEVs would be driven in the real world rather than on real driving data of PHEVs. For example, in the older European standard that was based on the NEDC drive cycle, the UF [55] is calculated as a function of the electric driving range $d$ (in km) per equation (6):

$$\mathrm{UF}_{NEDC} = \frac{d}{d + 25\,\mathrm{km}} \tag{6}$$

which reveals the underlying modeling assumption to be that a PHEV is driven 25 km on average in charge sustaining mode (consuming fuel) before it is fully charged again. On the other hand, the SAE J2841 standard [41] makes an underlying assumption that PHEVs are charged to full before each driving day but not charged during the day. SAE J2841 then estimates the reference value for UF as a function of the electric driving range of the PHEV and the statistical distribution of US daily driving distance (from travel surveys of real-world vehicles, though most of which are not PHEVs).
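The sensitivity argument can be checked numerically with a short sketch. It assumes the first-order form of equation (5) described above (a UF term with 1 − UF in the denominator plus a charge sustaining term) and, for equation (6), the form range / (range + 25 km) implied by the stated 25 km assumption. All function names and example numbers are illustrative.

```python
def relative_ratio_change(uf_act, d_uf, fc_cs_act, d_fc_cs):
    """Equation (5): first-order relative change in the fuel consumption ratio."""
    return -d_uf / (1.0 - uf_act) + d_fc_cs / fc_cs_act


def uf_magnification(uf_act):
    """Factor by which a UF mismatch is magnified in equation (5)."""
    return 1.0 / (1.0 - uf_act)


def nedc_utility_factor(range_km, cs_distance_km=25.0):
    """NEDC-style UF, assuming the form range / (range + 25 km)."""
    return range_km / (range_km + cs_distance_km)


# Magnification at the two ends of the typical UF interval quoted in the text.
for uf in (0.5, 0.8):
    print(f"UF = {uf}: mismatch magnified by about {uf_magnification(uf):.1f}x")

# A UF shortfall of 0.05 at UF = 0.8, with charge sustaining consumption on target,
# raises the fuel consumption ratio by roughly a quarter.
print(relative_ratio_change(uf_act=0.8, d_uf=-0.05, fc_cs_act=0.025, d_fc_cs=0.0))

# A hypothetical PHEV with 50 km of electric range under the NEDC-style curve.
print(nedc_utility_factor(50.0))
```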
The SAE J2841 standard also distinguishes between two types of UFs estimated via travel survey datasets: (a) MDIUF, and (b) FUF. In MDIUF, a UF value is calculated for each vehicle sample in the travel survey dataset (based on the recorded travel distance on multiple days for each vehicle and assuming the vehicle would have been a PHEV of a certain electric driving range), then the UF value is averaged across all vehicle samples [41]. FUF, on the other hand, is calculated as the ratio of total electric miles by all vehicles (if all vehicle samples in the travel survey had been a PHEV of a certain electric driving range) to the total miles by all vehicles [41]. The difference is that MDIUF represents the UF that a typical random PHEV in the fleet is expected to achieve, while FUF is representative of the fraction of VMT electrified in the entire vehicle fleet. It is also notable that in most cases, a PHEV with a given electric driving range will have a smaller value for its FUF than for its MDIUF. This is mainly because when calculating MDIUF, all vehicle samples are equally weighted, which is not necessarily indicative of the fraction of electrified VMT by the fleet. Stated another way, when calculating FUF, a PHEV with above-average driving distance effectively has its UF weighted by its total driving distance, and thus has a larger impact on the FUF.
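The distinction between the two averages can be illustrated with a small sketch. It is a simplified reading of the description above (one full charge before each driving day, no charging during the day), not the SAE J2841 procedure itself, and the three-vehicle "survey" is invented for illustration.

```python
def electric_miles(daily_miles, electric_range):
    """Electric miles for one vehicle, assuming a full charge before each driving day."""
    return sum(min(d, electric_range) for d in daily_miles)


def mdiuf(fleet, electric_range):
    """Average of per-vehicle UF values; every vehicle counts equally."""
    per_vehicle = [electric_miles(days, electric_range) / sum(days) for days in fleet]
    return sum(per_vehicle) / len(per_vehicle)


def fuf(fleet, electric_range):
    """Total electric miles divided by total miles across the whole fleet."""
    total_electric = sum(electric_miles(days, electric_range) for days in fleet)
    total_miles = sum(sum(days) for days in fleet)
    return total_electric / total_miles


# Hypothetical travel-survey sample: daily miles recorded for three vehicles.
fleet = [
    [10.0, 20.0, 15.0],    # short-distance driver
    [30.0, 45.0, 25.0],    # moderate driver
    [120.0, 90.0, 150.0],  # long-distance driver
]
range_miles = 40.0
print(f"MDIUF = {mdiuf(fleet, range_miles):.3f}")
print(f"FUF   = {fuf(fleet, range_miles):.3f}")
```

Because the long-distance driver contributes most of the fleet miles but only one of three equally weighted per-vehicle values, the FUF in this example comes out lower than the MDIUF, in line with the behavior described in the text.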
Curves for reference UF as a function of the PHEV electric driving range per the NEDC standard and the SAE J2841 MDIUF and FUF are shown in figure 1. It becomes evident how differences in the underlying assumptions for the reference UF curves can lead to the same PHEV being expected to achieve different performance. Thus, when comparing real-world performance to reference values, it is important to select a reference whose assumptions are appropriate for the analyzed real-world PHEV data.
Mismatch in the underlying assumptions about driving distance for the UF curves is but one category of reasons why there can be differences between real-world and reference UF values, as discussed in [49]. To illustrate other mismatch categories, a sketch (with exaggerated scale) is shown in figure 2. If one were to start with a known value for the reference electric driving range of a PHEV (shown at point A in figure 2), one could project it vertically to the reference UF curve (point B in figure 2) and read the corresponding reference UF value (UF r ) at point C in figure 2. If one had access to real-world data that allows computing the actual real-world UF of a PHEV (UF a ), which would be point D in figure 2, then the mismatch in UF (δUF a ) would be the vertical distance between points C and D in figure 2.
While several reasons can contribute to such mismatch, they mostly fall into one of three categories (labeled via yellow circles in figure 2). Category #1 encompasses mismatches in assumptions about daily VMT profile. For example, the UF curves of SAE J2841 (shown in figure 1) were originally formulated [41] based on the US nation-wide daily miles traveled profile from the 2001 NHTS (NHTS-2001) [56], in which the vehicles were mostly conventional ICEs. It is plausible that the sample of PHEVs in a dataset (as will be discussed in section 2.2) were driving longer daily distances, which would result in the 'true' UF curve (sketched as a dashed line UF curve in figure 2) being lower than the reference one. It is also plausible that the sample of PHEVs in a dataset were driving shorter daily distances, which would result in the true UF curve being higher than the reference one. To compensate for the category #1 mismatch, one ought to be examining point F (which has smaller UF value due to the PHEVs being driven longer daily distances in this illustration) rather than point B in figure 2.
Category #2 of reasons for UF mismatch (labeled via the second yellow circle in figure 2) accounts for the difference between the actual real-world electric driving range ($d_a$) and the reference one ($d_r$). Mismatch in electric driving range (sketched as shorter range in figure 2, but can be in either direction) can be due to several reasons, including extreme hot or cold driving conditions, aggressive/mild accelerations, high/low travel speed, heavy/light passenger and cargo load, or the testing standard itself (in which the reference electric driving range gets certified) being less representative of the real world. To compensate for the category #2 mismatch, one ought to be examining point G rather than point F in figure 2. Lastly, category #3 includes other reasons for UF mismatch, the most prominent of which is the frequency of charging the PHEVs compared to the reference assumption. In the case of an overnight-only charging assumption (per SAE J2841 [41]), if a PHEV charges less/more frequently, it will have its point H below/above point G in figure 2.
While it is important to keep in mind the plausible reasons for mismatch in UF (and by extension, fuel consumption), oftentimes the level of detail in real-world data of the PHEVs, especially in publicly accessible datasets, is insufficient to disentangle the root-cause reasons of the mismatch between real-world performance and the reference values. And even when utilizing such datasets to conduct 'high-level'-type analysis (such as between points C and D in figure 2), it is important to pay close attention to the actual information in a dataset and what performance indicators could be inferred. Towards this, the datasets cited for US PHEVs in [52] are reviewed in section 2.2.

Analyzed datasets
The study in [52] cited several datasets as sources for real-world data of US PHEVs, which are briefly recapped in this work in table 1. It becomes apparent from examining the cited datasets, however, that different types of vehicle data were collected by different entities, via different methods and for widely varying vehicle sample sizes and monitoring periods, resulting not only in variations in data quality, but also in different types of inferable information from each dataset. For example, all the information available in appendix G of the 2017 CARB report [57] (which gets several different mentions in [52] per the sources that [57] had quoted) consists of bulk numbers (from combining all vehicle samples) for electric and total VMT, which (per the definitions in [41]) would be analogous to computing FUF. In this paper, we distinguish between directly inferable results and indirectly inferable estimates that involve another layer of assumptions. Taking [57] as a data source example, only FUF can be directly inferred, and an assumption about the charge sustaining fuel economy (in equation (3)) is necessary if one wishes to estimate fuel consumption. In section 3.1 of this paper, only the directly inferable results are presented, while section 3.2 considers both directly and indirectly inferable estimates.
While this paper examines all the cited data sources in [52], this does not necessarily imply the exact same data is being utilized. However, when this work utilizes slightly different data from the same sources, it is either to: (a) enact some data cleaning measures that ought to have been done by [52], or (b) utilize the more recent version of the data. We note that these differences in data utilization should have no significant impact on the results inferable from the data. A brief summary of such differences and the reasoning behind them is provided in table 2.
For brevity, a short name is assigned to each PHEV variant that identifies its make/model/model-year range, as listed in table 3. Table 3 also lists the reference value ratings for US-EPA electric driving range and combined-cycle fuel economy of each PHEV variant, per the publicly available data in [58]. Though not relevant to the results in section 3, table 3 also lists, whenever available, the old European standard ratings (based on NEDC test procedures) for the PHEV variants, per the publicly available data in [59], since these come up in the discussion in section 4.

Individual datasets
This section showcases the directly inferable results for each dataset separately. The first dataset considered is Voltstats.net [60], which, as previously discussed in section 2.2, provides vehicle-level summaries of electric and total miles traveled, as well as net fuel consumption. Thus the data permits estimation not only of averages, but also of statistical distributions. One convenient way of showcasing a summary of a statistical distribution is box-plots, such as the ones shown in figure 3 for MDIUF and FUF of the three variants of Chevrolet Volt in Voltstats.net. In a box-plot, the 25th and 75th percentiles of the statistical distribution are represented by the bottom and top of the box, respectively. The middle line represents the median value, extension lines represent the 5th and 95th percentiles, while the non-outlier average is represented by a diamond marker. The layout of figure 3 arranges the horizontal axis location of the box plot for the three variants of Chevrolet Volt to align with their electric driving range so that the standard UF curves from SAE J2841 [41] can be overlaid on the plot. The number of vehicle samples (n) for each PHEV variant is also listed in figure 3 in brackets below the variant short name. To maintain consistency with the [...] figure 3 include the apparent wide range between the 5th and 95th percentiles of vehicle owners (representing the vast spectrum of driving patterns and vehicle usage conditions), as well as the fact that both Volt35 and Volt38 variants have (on average) exceeded their expectation in terms of SAE J2841 MDIUF and FUF, while the Volt53 variant seems to be (on average) spot-on with its expected SAE J2841 MDIUF and FUF.

Table 1. US PHEV datasets analyzed in [52], with details of available information and how it is analyzed in the current paper.

Dataset # | As cited in [52] | Notes about data source
… | Smart et al [64] | Peer-reviewed paper [64]. We utilize the same data as [52] for this.
8 | Raghavan and Tal [63] | Peer-reviewed paper [63]. We utilize the same data as [52] for this.
Further analysis of Voltstats.net data towards estimation of fuel consumption focuses on FUF (with vehicle samples weighted by annualized VMT) since this is the better indicator of fleet-wide GHG emissions reduction performance. First, the statistical distribution of fuel economy in charge sustaining mode is shown as box-plots in figure 4, with US-EPA combined cycle reference values marked via a circular symbol. Lastly, box-plots for the statistical distribution of the fuel consumption ratio (γ in equation (3), with the reference gal mi⁻¹ value via FUF in table 3) are plotted in figure 5. A value of 100% on the vertical axis in figure 5 corresponds to exact attainment of the reference value for net fuel consumption by the PHEV, with values less/more than 100% implying less/more net fuel consumption than the reference. Though all three variants appear to have slightly underperformed their EPA ratings for charge sustaining fuel economy (figure 4) and despite wide variation between different vehicle owners, the average net fuel consumption for Volt35 and Volt38 in figure 5 appears to be ∼12% better than the reference, while Volt53 is only ∼3% worse than the reference.
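One possible way to turn vehicle-level totals into the quantities discussed above is sketched below. It assumes negligible fuel use in charge depletion mode (so all fuel is attributed to non-electric miles) and a simple VMT-weighted mean of per-vehicle ratios; the record layout, field names and numbers are hypothetical and do not reproduce the Voltstats.net data or the exact processing used for figures 4 and 5.

```python
def vehicle_metrics(electric_miles, total_miles, gallons):
    """Per-vehicle UF, charge sustaining mpg and net gal/mile from logged totals."""
    uf = electric_miles / total_miles
    cs_mpg = (total_miles - electric_miles) / gallons  # assumes no fuel use in CD mode
    net_gal_per_mile = gallons / total_miles
    return uf, cs_mpg, net_gal_per_mile


def vmt_weighted_ratio(vehicles, ref_gal_per_mile):
    """VMT-weighted mean of the per-vehicle fuel consumption ratio (gamma)."""
    weighted_sum = 0.0
    total_vmt = 0.0
    for v in vehicles:
        _, _, gal_per_mile = vehicle_metrics(
            v["electric_miles"], v["total_miles"], v["gallons"])
        gamma = gal_per_mile / ref_gal_per_mile
        weighted_sum += v["annual_vmt"] * gamma
        total_vmt += v["annual_vmt"]
    return weighted_sum / total_vmt


# Hypothetical reference: FUF of 0.60 and a 40 mpg charge sustaining rating.
ref_gal_per_mile = (1.0 - 0.60) / 40.0

vehicles = [
    {"electric_miles": 7000.0, "total_miles": 10000.0, "gallons": 80.0,
     "annual_vmt": 10000.0},
    {"electric_miles": 8000.0, "total_miles": 20000.0, "gallons": 300.0,
     "annual_vmt": 20000.0},
]
print(f"fleet gamma = {vmt_weighted_ratio(vehicles, ref_gal_per_mile):.3f}")
```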
The next dataset considered is MyMPG [61], which is a continuously updating source; thus it should be noted that the data pull for this paper (conducted in August 2021) could yield slightly different data than what was analyzed in other studies. Supplementary data of this paper shows the vehicle-level data for 29 PHEV variants, believed to represent all model-year 2011-2019 PHEVs in the US. Due to data quality concerns associated with self-reported values, however, variants with fewer than 30 vehicle samples were excluded. The median gal mi⁻¹ value for each of the six retained PHEV variants was compared to the MDIUF reference fuel consumption (table 3) to generate figure 6 for the fuel consumption ratio. Reference fuel consumption via MDIUF was utilized because MyMPG does not provide mileage data, so the vehicle samples are essentially equally weighted. In figure 6, all PHEVs exceeded expectation.
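The filtering and median comparison described above could be sketched as follows. The data layout (a dictionary of per-vehicle gal/mile values keyed by variant name) and the variant names are assumptions for illustration; only the 30-sample threshold and the use of the median come from the text.

```python
from statistics import median

MIN_SAMPLES = 30  # variants with fewer self-reported vehicles are excluded


def median_ratio_by_variant(samples, reference):
    """Median fuel consumption ratio per variant.

    samples:   {variant: [gal/mile for each self-reported vehicle]}
    reference: {variant: reference gal/mile (here assumed to be the MDIUF-based value)}
    """
    ratios = {}
    for variant, values in samples.items():
        if len(values) < MIN_SAMPLES:
            continue  # too few samples to trust self-reported data
        ratios[variant] = median(values) / reference[variant]
    return ratios


# Hypothetical data: one variant with 31 samples, one with only 5.
samples = {
    "VariantA": [0.008 + 0.0001 * i for i in range(31)],
    "VariantB": [0.020] * 5,
}
reference = {"VariantA": 0.010, "VariantB": 0.018}

for variant, ratio in median_ratio_by_variant(samples, reference).items():
    print(f"{variant}: {ratio:.0%} of reference fuel consumption")
```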
The next datasets considered are appendix G of the 2017 CARB report [57] and Smart et al [64]. In both of these sources, the presented data (electric and total VMT, electric and total annualized VMT and/or total electric driving ratio) only permits direct estimation of FUF, which is shown (with an overlay of the SAE J2841 FUF curve [41]) in figure 7. One observation from figure 7 is that shorter range PHEVs appear to under-perform in their real-world FUF compared to the reference curve, while longer range PHEVs appear to out-perform their reference FUF. There is no information in these datasets about actual fuel consumption, however, but indirect estimates (via the assumption that the PHEVs exactly attain their EPA charge sustaining fuel economy) can be made, which will be considered in section 3.2.
Lastly, findings from the eVMT dataset [62] as summarized in Raghavan and Tal [63] are considered. The data provided in [63] permits analysis of both MDIUF and FUF (as shown in figure 8), as well as the real-world fuel consumption ratio (as shown in figure 9). The fuel consumption ratio in this dataset (figure 9) shows a worst case (Volt53) of ∼21% more real-world fuel consumption than the reference.

Normalized results from all datasets
Though the datasets analyzed in section 3.1 vary in the type of available real-world PHEV data (and by extension, the directly inferable results), data collection method and quality, vehicle monitoring duration and number of vehicle samples, there is perceived value in comparing all of them together. When a source dataset does not have both real-world UF and fuel consumption (all datasets except Voltstats.net and Raghavan and Tal [63] have this issue), we follow the same approach as [52], which is to assume that the PHEV exactly attains its EPA rated charge sustaining fuel economy and use equation (7), adopted from [52], to estimate an (indirectly) inferred $\phi_a$ or $\mathrm{UF}_a$ when only one of them is known:

$$\phi_a = (1 - \mathrm{UF}_a) \, \phi_{r,CS} \tag{7}$$

The inferred (indirectly for some datasets) real-world UF is shown in figure 10 and the fuel consumption ratio is shown in figure 11, with details of the indirectly inferred results in those plots provided in supplementary data. For quick reference, a summary of the deviation from reference fuel consumption (from figures 5, 6, 9 and 11) is provided in table 4. To gauge how much fuel is saved by PHEVs compared to an equivalent HEV, figure 11 also shows the fuel consumption ratio corresponding to a hypothetical 'no charging' scenario.
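The indirect inference step can be sketched in both directions. The sketch assumes, as stated in the text, that the PHEV exactly attains its EPA rated charge sustaining fuel economy and that fuel use in charge depletion mode is negligible; the function names and the 40 mpg example are hypothetical.

```python
def infer_fuel_consumption(uf_act, fc_cs_ref):
    """Inferred real-world gal/mile when only the real-world UF is known."""
    return (1.0 - uf_act) * fc_cs_ref


def infer_utility_factor(fc_act, fc_cs_ref):
    """Inferred real-world UF when only the real-world gal/mile is known."""
    return 1.0 - fc_act / fc_cs_ref


# Hypothetical PHEV rated at 40 mpg in charge sustaining mode.
fc_cs_ref = 1.0 / 40.0

fc_inferred = infer_fuel_consumption(uf_act=0.70, fc_cs_ref=fc_cs_ref)
uf_inferred = infer_utility_factor(fc_act=fc_inferred, fc_cs_ref=fc_cs_ref)
print(f"inferred fuel consumption: {fc_inferred:.4f} gal/mile")
print(f"round-trip UF: {uf_inferred:.2f}")
```

Since both estimates rest on the rated charge sustaining fuel economy, any real-world shortfall in that rating is absorbed into the inferred quantity, which is why such estimates are treated as indirect.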

Discussion
Since this paper reanalyzed the real-world driving datasets and reports about PHEVs in the US cited in [52], it is perhaps important to reemphasize some of the limitations of these data sources. Much of the actual vehicle data collection in these sources (all the data in [57,64], and a major portion of [60][61][62][63]) happened 5-10 years earlier than the writing of this manuscript. Many of the PHEV models considered are no longer available as new vehicles for sale, and perhaps the bigger concern is the type of owners (likely early adopters) who had the PHEVs during the data collection. However, with datasets of vehicle real-world driving being few and far between, aside from the datasets considered in [52] and re-analyzed in this paper, the authors are not aware of other publicly accessible datasets that may be used to infer real-world performance of US PHEVs except the California Vehicle Survey [66]. Noting that the data in [66] included the vehicle make and model-year, but redacted the vehicle model, it was perceived that while potentially useful, analysis of data from [66] did not fit the theme of the current paper. Instead this paper focuses on the same datasets cited in [52], with the note that all obtained results are only representative of previous-decade PHEVs in the US. Even when keeping the discussion focused on the specific era, it also ought to be noted that none of the datasets is perfectly representative of all PHEVs in the US. Voltstats.net, for example, which is the best among the considered datasets in terms of data reliability, detail, logging period and number of vehicle samples, only has data for three PHEV variants of Chevrolet Volt. Raghavan and Tal [63] has similar quality in terms of data reliability, detail and logging period, and has the advantage of including multiple make-model PHEVs, but has the disadvantage of a far smaller vehicle sample size per PHEV variant.
Data from MyMPG is lacking in detail and sample size and may have quality issues due to being self-reported, while data from appendix G in the CARB 2017 report and Smart et al [64] is lacking in detail and is more outdated than the other datasets. However, when considering the results generated in section 3.2 of this paper as a reproduction of the analysis in [52], one may readily observe from figure 10 that real-world MDIUF and FUF of PHEVs in the US appear to have closely followed their respective expected values per the SAE J2841 standard, with the shorter range PHEVs somewhat under-performing and the longer range PHEVs somewhat over-performing.
When considering the real-world fuel consumption ratio in figure 11, the best performing observation is for the i3REX in appendix G of CARB 2017 (62% less fuel consumption than the reference value), while the worst performing observation is for the Volt53 (21% more fuel consumption than the reference value), with all other PHEV variants across all datasets being somewhere between those limits. As such, the claim in [52], based on analysis of the cited datasets, that the real-world fuel consumption of US PHEVs is more than twice their US-EPA ratings is unlikely to be true, and may have been due to incorrect reference values of the fuel consumption. In fact, comparing the real-world performance of US PHEVs (blue bars in figure 11) to a hypothetical 'No Charging' scenario (gray bars in figure 11), where PHEVs would essentially operate as equivalent HEVs with their US-EPA charge sustaining fuel economy ratings, one may readily observe that aside from very short electric driving range PHEVs (which by definition ought to be close to an equivalent HEV), PHEVs are performing much closer to their reference rating (a fuel consumption ratio of 100%) than to an equivalent HEV. This result is, in the authors' opinion, a testimony to the real-world performance of PHEVs in the US (for the respective era and datasets) being in good agreement with their US reference standard ratings.
For reference standard ratings of PHEVs to be reasonably representative of their real-world performance, not only must the assumptions for charging frequency in the reference UF curves be in agreement with how PHEV owners use their vehicles, but also the reference electric driving range and fuel economy need to reasonably match the vehicle real-world performance. When examining PHEV variants in table 3 that had standard ratings under both US-EPA and the old European standard (based on NEDC), one can observe that the same PHEV variants had an expectation of 36%-56% longer electric drive range and 32%-55% better charge sustaining fuel economy in NEDC. While over-estimation of vehicle efficiency is a known issue for NEDC [51], it is compounded in the case of PHEVs (per equations (3)-(5)). Setting aside differences in daily VMT between the US and Europe, if one were to consider the Volt38 variant (whose real-world performance exceeded expectation by US standards in every dataset in figure 11), the fuel consumption ratio of Volt38 when referenced to its NEDC reference values in table 3 would be between 166% and 251%. In other words, if evaluated via NEDC reference values, the real-world fuel consumption of Volt38 in the US would be between 66% and 151% worse than the reference. This is perceived to be an issue with the old European standard rather than with how the PHEVs are used by their owners. Unfortunately, recent revisions of the European standard [67] (based on the Worldwide harmonized Light vehicle Test Procedure, WLTP) did little to correct the reference values for fuel consumption in European PHEVs, suggesting a 1:1 conversion from the NEDC ratings to WLTP [58]. However, further revisions of the standard seem to be under way, as was reported in [65].
The real-world fuel consumption of US PHEVs (from all datasets combined) is examined further in figure 12. This is done by considering the real-world fuel consumption ratio (γ, shown as blue bars in figure 11) in comparison to the one from the 'no-charge' scenario, which represents the expected performance of an equivalent HEV (γ HEV-Eq. , shown as gray bars in figure 11). The ratio (γ/γ HEV-Eq. ) is then plotted in figure 12 (as blue dots) versus the electric drive range per US-EPA ratings. Figure 12 shows the very short range (11-13 mile) PHEVs, most of which are no longer available on the market in the US, achieving a 10%-25% reduction in fuel consumption compared to an equivalent HEV, while PHEVs with a 20 mile range seem able to achieve a 30%-45% reduction and PHEVs with a range of 35 miles or more appear capable of achieving more than a 60% reduction in fuel consumption compared to an equivalent HEV. Considering that HEVs are already capable of a 20%-35% reduction in fuel consumption compared to conventional ICE vehicles [5][6][7][8], PHEVs with a 35 mile electric drive range or more could be roughly estimated as capable of reducing fuel consumption by 68%-74% compared to equivalent conventional ICE vehicles, thereby representing a quick and readily deployable effective solution for fuel consumption reduction. When considering the bigger picture of LCA of GHG emissions, one must also take into account the equivalent emissions for grid electricity generation as well as for manufacturing the vehicles and batteries. LCA, however, is beyond the scope of the current paper.
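The rough 68%-74% estimate follows from compounding the two reductions multiplicatively, as the short sketch below shows. It takes 60% as the PHEV-versus-HEV reduction (the lower bound quoted above for ranges of 35 miles or more) and the 20%-35% HEV-versus-ICE interval from the text; the function name is illustrative.

```python
def combined_reduction(hev_vs_ice, phev_vs_hev):
    """Fuel consumption reduction of a PHEV relative to a conventional ICE vehicle.

    The remaining fuel fractions multiply: (1 - hev_vs_ice) * (1 - phev_vs_hev).
    """
    return 1.0 - (1.0 - hev_vs_ice) * (1.0 - phev_vs_hev)


low = combined_reduction(hev_vs_ice=0.20, phev_vs_hev=0.60)
high = combined_reduction(hev_vs_ice=0.35, phev_vs_hev=0.60)
print(f"{low:.0%} to {high:.0%}")  # prints "68% to 74%"
```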

Conclusion
This paper analyzed several datasets and sources for real-world performance of PHEVs in the US. Most of the real-world vehicle performance data was collected between 2012 and 2017 and thus the inferred results are only representative of PHEVs in the US of the previous decade. The observed real-world UF for all considered PHEV variants seemed to be in good agreement with the respective expected UF values from the SAE J2841 standard, with the longer-range PHEVs performing better than the standard, while shorter-range ones somewhat under-performed. Real-world fuel consumption of the PHEVs ranged from 62% less (better) to 21% more (worse) than the expected reference values based on US-EPA electric driving range and fuel economy ratings, which is perceived as testimony to good agreement between reference ratings and the real-world performance of PHEVs in the US, unlike reference ratings for PHEVs in some other parts of the world. Comparing the real-world fuel consumption of PHEVs in the US to that of estimated equivalent HEVs showed PHEVs to be a feasible and effective solution for significant reductions in fuel consumption.

Data availability statement
All data that support the findings of this study are included within the article and supplementary files, or downloadable from public sources that have been referenced.