A method for generating complete EV charging datasets and analysis of residential charging behaviour in a large Norwegian case study

Electric vehicles (EVs) are part of the solution to achieve global carbon emissions reduction targets, and the number of EVs is increasing worldwide. Increased demand for EV charging can challenge the grid capacity of power distribution systems. Smart charging is therefore becoming an increasingly important topic, and availability of high-grade EV charging data is needed for analysing and modelling of EV charging and related energy flexibility. This study provides a set of methodologies for transforming real-world and commonly available EV charging data into easy-to-use EV charging datasets necessary for conducting a range of different EV studies. More than 35,000 residential charging sessions are analysed. The datasets include realistic predictions of battery capacities, charging power, and plug-in State-of-Charge (SoC) for each of the EVs, along with plug-in/plug-out times, and energy charged. Finally, we analyse how residential charging behaviour is affected by EV battery capacity and charging power. The results show a considerable potential for shifting residential EV charging in time, especially from afternoon/evenings to night-time. Such shifting of charging loads can reduce the grid burden resulting from residential EV charging. The potential for a single EV user to shift EV charging in time increases with higher EV charging power, more frequent connections, and longer connection times. The proposed methods provide the basis for assessing current and future EV charging behaviour, data-driven energy flexibility characterization, analysis, and modelling of EV charging loads and EV integration into power grids.


Background

Electric Vehicles as important players in providing flexibility in the future energy market
Electric vehicles (EVs) are part of the solution to achieve carbon emissions reduction targets set under the Paris Climate Change Agreement [1]. This has led to policy support for EVs in several countries and a substantial increase in EV sales in, e.g., China, Europe, and the United States. In 2022, the number of EV models available on the market had increased to around 500 [2]. On a global level, the market share of EVs was 14% in 2022, with Norway being the leading country with an 88% market share [2]. The EV market in Norway has passed the early adopter stage, and EVs are becoming the dominant car choice of the population. Charging at home and at the workplace dominates, with charge points (CPs) generally being below 22 kW [2].
Even though EVs are expected to account for a minor share of global electricity consumption also in the future, the EV fleet can challenge the grid capacity of power distribution systems [2,3]. Furthermore, EV charging is expected to have a high impact on residential and commercial energy load curves [3]. Smart charging is therefore becoming an increasingly important topic [4], and because EV charging loads can be shifted in time, smart charging can provide energy flexibility [5]. The energy flexibility of a building can be defined as "the ability to manage its demand and generation according to local climate conditions, user needs and grid requirements" [6]. Flexible energy use is becoming increasingly important in the energy system, since a growing share of the energy supply is variable and non-flexible renewable energy generation.
Fischer et al. [7] analysed electric load profiles for household appliances, electric heating systems, and EVs in the residential sector, and found that EVs have the highest flexibility potential among all the energy uses. For large-scale utilization of energy flexibility in buildings, new solutions need to be developed, addressing technological, social, commercial, and regulatory aspects [8]. Li et al. [8] emphasized the advantages of utilizing flexibility sources from a cluster of buildings to increase the impact and role of energy flexibility. For EV charging, Charge Point Operators (CPOs) can have a role as aggregators of energy flexibility by facilitating the shifting of charging loads. In Norway, for example, several CPOs (such as the companies Tibber, Current, Kople, and ZapTec) provide CP management services for EV fleets, where, e.g., aggregated charging loads in parking facilities are kept below the available distribution capacity, or charging loads are shifted to the hours of the day when the energy spot prices are the lowest. In the future, such CP management systems may provide opportunities for residential and commercial users to engage in the flexibility market. However, more knowledge about charging habits is necessary to optimally utilize the EV charging flexibility, such as the number of cars that are connected to the CPs, daily charging demand, plug-in state of charge (SoC) of the EV batteries, and CP plug-out times.
The objective of this study is to provide realistic and high-grade EV charging data, and analyses of related EV charging behaviour, based on real-world data from more than 35,000 charging sessions in Norway. The results are useful as input for a range of energy studies, e.g., for load forecasting, for assessment of energy flexibility potentials in neighbourhoods, and for sizing of grid connection capacities.

The need and availability of high-quality datasets for optimizing the flexibility potential of EVs
For analysing energy flexibility in terms of load shifting and load shaving, data with hourly resolution is usually used [9]. If there is a need for a faster response to flexibility requests, such as in frequency regulation, sub-hourly resolution is normally needed. Salim et al. [10] emphasized the importance of publicly available input data for modelling of occupant behaviour and energy in buildings at urban scale. Amayri et al. [11] concluded that there is a need for more publicly available datasets on EV charging in residential buildings to improve load forecasting and flexibility forecasting. Calearo et al. [12] stated that published EV data is limited, and described how five parameters ideally should be available for EV studies in the smart grid context: 1) battery capacity, 2) charging power, 3) plug-in SoC, 4) plug-in/plug-out time, and 5) charged energy. They refer to such data as ideal, "because it is the highest level of data availability one can have when conducting EV studies in the power system context". To sum up, there is a lack of complete datasets related to charging sessions and CPs.
Because Norway has been a frontrunner in EV use, EV charging reports from CPOs are becoming commonly available, including for CPs in private residential and commercial parking spaces. Such CPO reports include data for parameters 4) and 5) mentioned above, i.e., plug-in/plug-out times and charged energy for the charging sessions [13,14]. However, the CPO reports do not include parameters 1), 2), and 3), i.e., battery capacity and charging power for each EV, or plug-in SoC for charging sessions. Our work provides a set of methodologies which complements the data in the CPO reports, providing a complete ideal dataset for EV charging based on an empirical residential case study in Norway.

State of the art: Prediction of EV charging power, battery capacity, and plug-in SoC, and their impact on residential charging behaviour
To complement the data in the CPO reports, values are needed for the parameters 1) battery capacity, 2) charging power, and 3) plug-in SoC. Sections 1.2.1 and 1.2.2 introduce these parameters, the availability of real-world data, and how the parameters typically are predicted in literature. Section 1.2.3 describes literature focussing on how EV charging behaviour is related to battery capacity and SoC values.

EV charging parameters
The energy demand during a charging session depends on the battery SoC at plug-in time, the final SoC, the battery capacity, and the charging efficiency. The time needed for charging depends on the charging power, which can be limited by the CP or the EV characteristics. The actual charging power is the lower of the AC power available at the location (Fig. 1, marked A) and the onboard charger capacity in the EV (Fig. 1, marked B).
When the connection time is longer than the charging time, there is a period of non-charging idle time which reflects the flexibility potential for the charging session. The energy which could potentially have been charged during the idle time is called idle energy capacity. The idle energy capacity depends on the battery's SoC, the maximal charging power, and the availability of connected EVs [15]. When the charging loads for several EVs are aggregated, several studies have found that both the charging power peaks and the flexibility potential increase with higher charging power [13,15].
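The relationship between charging power, charging time, idle time, and idle energy capacity can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `session_flexibility` and its parameters are hypothetical, constant charging power is assumed, and the idle energy capacity is simplified to power multiplied by idle time, ignoring the remaining headroom in the battery.

```python
def session_flexibility(energy_kwh, connection_h, cp_power_kw, onboard_kw):
    """Return (power_kw, charging_h, idle_h, idle_energy_kwh) for one session.

    Illustrative assumptions: constant charging power, charging starts at
    plug-in, and battery headroom does not limit the idle energy capacity.
    """
    # Actual charging power: the lower of the CP limit (A) and the
    # onboard charger limit (B).
    power_kw = min(cp_power_kw, onboard_kw)
    # Time needed to deliver the metered energy, capped by the connection time.
    charging_h = min(energy_kwh / power_kw, connection_h)
    # Non-charging idle time while the EV remains connected.
    idle_h = connection_h - charging_h
    # Energy that could have been charged during the idle time.
    idle_energy_kwh = power_kw * idle_h
    return power_kw, charging_h, idle_h, idle_energy_kwh


# Hypothetical session: 14.8 kWh charged during a 12 h connection,
# with a 7.4 kW CP and an 11 kW onboard charger.
print(session_flexibility(14.8, 12.0, 7.4, 11.0))
```

In this hypothetical session the CP is the limiting factor, so most of the connection time is idle and available for shifting the load.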

Prediction of EV charging power, battery capacity, and plug-in SoC
Fig. 2 shows an illustration of charging characteristics for EVs on the market, with onboard charger capacities on the horizontal axis and net battery capacities on the vertical axis, based on [13,16,17]. In a garage, different EVs typically have a mix of different charging power levels and battery capacities. Due to lack of data availability, EV characteristics such as charging power and battery capacity for each EV are often determined based on assumptions. Table 1 shows examples of AC charging power assumptions from literature. In references [13,15,18-23], different scenarios with 'low' or 'high' charging powers were presented, where all the EVs have the same charging power level. In [24-26], a mix of charging power capacities was assumed for EV fleets in Germany, New Zealand and Norway, based on onboard charger capacities of typical EV models. As shown in Table 1, most studies assume charging power levels based either on typical power levels available at CPs (A in Fig. 1) or on typical onboard charger capacities (B in Fig. 1). In [27], EV charging measurements from a shopping centre were analysed, and both the CP power and a mix of onboard charger capacities were addressed. The studied CPs had a power level of 22 kW, which means that most of the EVs were limited by their onboard charger capacities. The study found that most charging sessions had a peak charging power of around 4 kW, while some clusters had 7.5 kW and 11 kW. Simolin et al. [27] stated that there is a need for more studies taking into account the combination of CP power levels and onboard charger capacities, including various power levels and charging sites, such as homes and workplaces. In our study, realistic data-based charging power predictions are provided per EV user, taking both the CP power and the onboard charger capacity into account.
In general, real-world data that describe EV battery capacity and charging session SoC values are seldom available for AC charging, except in EV trials such as [29,30], where trial participants contribute data. However, EV trial data are often limited in size, geography and time, and, as stated by [20], usually represent a particular set of technologies and/or individuals. Accordingly, there is a need for more EV data from everyday users of CPs.

EV charging behaviour related to battery capacity and SoC values
Vermeulen et al. [31] acknowledged that there is limited research on the influence of battery capacity on EV charging behaviour. In [31], the research group predicted the battery capacities for EVs using public CPs in the Netherlands, based on information about energy charged from CPO reports. They assumed that all users would have at least one charging session where they charged their battery from 0% to 100%. Since several EVs could use the same user ID, for example if the user had a hired car or guests were using the chargers, the 98th percentile of the charging sessions was used. The EVs were split into plug-in hybrid EVs (PHEVs), and two groups of battery EVs (BEVs): low BEVs (up to 33 kWh battery capacity) and high BEVs (above 33 kWh). Because only a few charging sessions involve charging the battery from 0% to 100%, the method underestimated the battery capacities, especially for BEVs with large batteries. Another study [32] also used CPO reports from public CPs in the Netherlands to predict battery capacities and start SoC of EVs. The researchers divided the users into 9 clusters and retrieved mean predicted battery capacities between 12.7 and 24.6 kWh. The study found a weak relationship between user types and battery capacities, and recommended further research to explore behavioural changes over time, with various EV types. Wolbertus et al. [33,34] presented charging data from a public dataset as well as private CPs in the Netherlands. The study predicted the users' battery sizes according to the maximum energy charged per user. The charging power was predicted to be 3.7 kW (for single-phase CPs) or 11 kW (for three-phase CPs). In [33], the researchers found that users with battery sizes above 70 kWh charged 2.8 times a week, with about 25 kWh energy per charging session. Users with smaller battery capacities between 16 and 30 kWh charged 4 times a week, with about 10 kWh energy per charging session.
The optimal SoC range for operating EV batteries is commonly suggested to be 20-80% [35]. SoC values from AC charging are not usually available for CPOs, since most EVs do not yet support communication standards such as ISO 15118 [36]. In a 6-month field trial with 40 EVs, [29,37] found that most EV users were comfortable with utilizing approximately 80% of their available battery capacity. SoC values at plug-in and plug-out times were presented by [38], which analysed SoC data from trials with in total 29,262 charging sessions. Most of the EVs were part of company car fleets. Schäuble et al. [38] found that most of the charging sessions had a high start SoC (median value of nearly 70%) and an end SoC of more than 90%, which resulted in small SoC differences (less than 10% for 37% of the charging operations). Siddique et al. [39] calculated start SoC for 117,339 EV charging sessions in the US, based on information about energy charged per session and battery capacities. They found that the location category "single family residential" (about 17.7% of the sessions), which represented households with a dedicated CP, had a higher start SoC compared to other locations. This indicated that users with access to dedicated CPs typically will charge their EVs more regularly than users using public or shared CPs. Our findings in [13] supported this assumption, comparing charging behaviour of residential users with shared and private CPs. Residents with CPs on their private parking space had on average about 4.4 charging sessions per week, while residents using shared CPs charged about 1.2 times per week. The study also found that residents with a private CP had longer average connection times (12.8 h) than users with a shared CP (6.5 h).

Fig. 2. Onboard charger capacities and net battery capacities for EVs, based on market data from [16] and [17] (updated from [13]).

Novelty and scientific contribution
The literature review showed that there is a general lack of EV charging data needed for data-driven analyses and modelling of EV charging and flexibility. Due to lack of data availability, EV characteristics are often based on rough assumptions. This paper presents a methodological framework that can be used to provide complete EV charging data. The methodological framework fulfils the requirements for an 'ideal' dataset for EV studies as specified by Calearo et al. [12]. The simple and practical set of methodologies that is proposed was developed by combining and further developing existing methodologies from literature, and was based on a large empirical dataset obtained from a Norwegian case study with more than 35,000 charging sessions. Realistic predictions of battery capacities, charging power, and plug-in SoC for each EV and charging session were added to datasets with plug-in/plug-out times and energy charged, to provide complete EV datasets. Table 2 shows an overview of all the input and output data presented in this article. The needed input data are commonly available in CPO reports, which lays the ground for a wide-scale use of the methodology, covering different user groups, building categories, and geographic locations. The CPO reports represent data from everyday users of CPs.
Charging habits are related to EV characteristics such as battery capacity and charging power, which in turn affect the load profiles and flexibility potential of EV charging. Table 3 includes a comparison of our study with the literature described in Section 1.2, focussing on EV charging behaviour related to EV charging power, battery capacity, and SoC values. The added value of our study is that the charging behaviour analyses were based on a large number of residential charging sessions in a mature EV market, where most of the EV users have private CPs. Charging behaviour was analysed for users with different battery sizes and charging power. Since the EV charging dataset was comprehensive, the analyses covered a wide range of EV charging behaviour, such as the energy charged, the idle times and related idle energy capacities, start SoC of the charging sessions, and frequency and timing of EV charging sessions. Such information can be employed as valuable input in various energy-related research studies, such as load forecasting, or when evaluating energy flexibility potentials within neighbourhoods.
The main contributions of our study are:
1. A set of methodologies for transforming readily available real-world EV charging data into high-quality and user-friendly EV charging datasets. These datasets are essential for conducting a range of different EV studies, including load forecasting and energy flexibility assessments. The methodologies are developed by combining and further developing existing approaches found in the literature, and are based on data from a case study that includes more than 35,000 charging sessions from residential buildings in Norway.
2. A statistical analysis of the EV charging dataset that presents how residential charging behaviour and load shifting potential are affected by EV battery capacity and charging power. Behaviour data, such as energy charged, start SoC values, idle energy capacity, and frequency of charging, are presented for users with small and large EVs. The analysis is based on the large case study dataset from point 1, where CPs are mainly located on private residential parking spaces.

Structure of the article
The remainder of this paper is structured as follows: Section 2 introduces the input data used in the analysis, while Section 3 describes the methodology. Section 4 presents the results and a discussion of the findings, including predictions of EV charging powers, battery capacities, hourly charging loads and SoC, and idle energy capacities. Section 4.3 provides an analysis of how EV battery and charging power capacities may affect charging habits. Finally, the conclusion of the paper is drawn in Section 5.

Input data for the analysis
The main data source for our analysis was CPO reports from apartment buildings in 12 locations in Norway. In total, 267 user IDs and more than 35,000 charging sessions were analysed (after cleaning). The CPO reports include information on plug-in time, plug-out time, and energy charged for each charging session, in addition to identifiers for user ID and location. Data availability for each location is listed in Table 4, including the number of user IDs and charging sessions before and after cleaning the data. Most of the CPs were located at private parking spaces for the residents, and some locations also have shared CPs available for all residents. The CP ownership was not separated in this work, since it is not known for all the locations. The data and analysis for location TRO_R are further described in [13,14].
Most of the locations had an increasing number of user IDs during the period of data collection. The collection period for the locations varied, spanning from February 2018 to August 2021. During this period, Norway was affected by COVID-19, mainly from March 2020. COVID-19 rules and recommendations that were introduced from this time might have affected the charging habits for the locations with data also from this period (BAR_2, KRO and OSL_T), e.g., due to increased use of home office and changes in travel activities. Consequently, only the period before March 2020 was included for prediction of hourly energy, idle energy capacity, and SoC (Sections 3.4 and 4.2), as well as for the comparison of charging habits (Sections 3.5 and 4.3). However, the whole data period was included for the prediction of EV charging power and net battery capacity (Sections 3.1, 3.2, and 4.1), since these are technical parameters related to the EV, i.e., not affected by user behaviour. Most of the input data were from before the COVID-19 period, and more than 80% of all the charging sessions were completed before March 2020.
The LV distribution system in most of the locations is a 230 V IT system, which is typically the case for residential customers in Norway. For most EV users, the CP charging power (A in Fig. 1) is limited to a maximum of 7.4 kW (32 A). The users have the possibility to manually activate IT 3-phase charging on their CP, which provides up to 11 kW, but only some EV models support this. For the location OSL_T, a charging power of 7.4 kW is available at private parking spaces, while 22 kW is available at 16 shared CPs. Since 11 kW charging power is the limitation in most of our locations, this is the focus of the study.
Data cleaning removed 76 user IDs (22%) and about 3000 charging sessions (7.6%) from the original dataset, including the following:
• Sessions with no energy charged (≤0.5 kWh) (n = 2289) (assumed faulty sessions).
• Sessions with too high energy charged (>150 kWh) (n = 2) (assumed faulty sessions, since the maximum battery capacity for EVs on the market is 100 kWh).
• Sessions with connection time of less than 2 min (n = 131) (assumed faulty sessions).
• Sessions with connection time of more than 5 days (n = 155) (such sessions affect average connection and idle time).
• A preliminary value for average power was calculated as the energy charged divided by the connection time. For sessions with an average power higher than the available charging power (≥11.5 kW), plug-out times were removed (set to NA), since this indicated that the value was incorrect (n = 40). With removed plug-out times, the sessions were excluded from most of the analysis in this work.
• Plug-out times were also removed for OSL_T for average power ≥11.5 kW (n = 22 sessions), even though 22 kW is available at their shared CPs. Two user IDs were removed (including all their 338 sessions), since they had more than one session (n = 12 + 6) with average power higher than 11.5 kW. It was therefore assumed that they normally used the shared CPs, charging with higher charging power, and that removing plug-out times for some sessions could provide misleading results.
• User IDs with fewer than 10 charging sessions were removed (n = 268 sessions, 72 user IDs).
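The cleaning rules listed above can be sketched as a simple filter. The following Python example is an illustration only (the study itself used R): the function name `clean_sessions`, the dictionary keys, and the order in which the rules are applied are assumptions, and the location-specific rule that removed two user IDs at OSL_T is not reproduced.

```python
from collections import Counter
from datetime import datetime, timedelta


def clean_sessions(sessions, min_sessions_per_user=10):
    """Apply the session-level cleaning rules to a list of session dicts.

    Each session is assumed to be a dict with the hypothetical keys
    'user_id', 'plug_in', 'plug_out' (datetime) and 'energy_kwh'.
    """
    kept = []
    for session in sessions:
        duration = session["plug_out"] - session["plug_in"]
        # Drop sessions with no energy (<= 0.5 kWh) or implausible energy (> 150 kWh).
        if session["energy_kwh"] <= 0.5 or session["energy_kwh"] > 150:
            continue
        # Drop sessions shorter than 2 minutes or longer than 5 days.
        if duration < timedelta(minutes=2) or duration > timedelta(days=5):
            continue
        cleaned = dict(session)
        # Average power above the available charging power indicates a wrong
        # plug-out time: keep the session but blank the plug-out time (NA).
        avg_power_kw = cleaned["energy_kwh"] / (duration.total_seconds() / 3600)
        if avg_power_kw >= 11.5:
            cleaned["plug_out"] = None
        kept.append(cleaned)
    # Finally, remove user IDs with fewer than the minimum number of sessions.
    counts = Counter(s["user_id"] for s in kept)
    return [s for s in kept if counts[s["user_id"]] >= min_sessions_per_user]
```

Sessions whose plug-out time is set to `None` remain in the output, so that their energy values can still be used where connection time is not needed.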
Finally, corrections for time zones/daylight saving time (DST) were performed, before adding calendar data such as weekdays.
Fig. 3 shows plug-in times, plug-out times, connection times, and energy charged in the EV data locations. For all locations, the share of plug-in times increases towards the evening, with a peak in the afternoon (for most locations 16:00-17:00) when the working day typically ends in Norway. For the plug-out times, there is a peak in the morning (for most locations 7:00-8:00), corresponding to the start of a typical working day. OSL_1 stands out with an additional plug-in peak at 09:00 and plug-out peak at 14:00, but most of these charging sessions (plug-in: 91%, plug-out: 70%) are related to a single user. Both the morning and afternoon peaks are higher for some of the locations (BAR and KRO), which may be explained by a higher share of commuters in these areas. For the whole data period (n = 35,377 sessions), the connection time is on average 12.7 h per session, and 90% of the charging sessions last for less than 22.1 h. Average energy charged per charging session is 12.7 kWh. On average, each user starts 3.9 charging sessions per week.

Methodology
Based on the data available in the CPO reports, a set of simple-to-use methodologies is proposed for assessing complete and ideal EV datasets. The methodologies can be used for locations such as residential buildings and workplaces, where user IDs are unique. Flow charts for the methodologies are shown in Fig. 4, where values for EV charging power, EV battery capacity, and hourly battery SoC for charging sessions are assessed for each user ID individually. Firstly, charging power and battery capacity are predicted for each user ID, as described in Sections 3.1 and 3.2. The charging power and battery capacity predictions are then used to develop hourly SoC predictions, as described in Section 3.4. Section 3.5 describes a methodology for analysing how residential charging behaviour is affected by EV battery capacity and charging power. To do this, the 224 residential EV users were divided into four user groups, according to their battery capacity and charging power, and their charging habits were compared. All data analyses and predictions have been performed using the statistical computing environment R [41].

The EV charging power prediction method
The aim of the charging power prediction method is to provide predictions for the EV charging power per user ID. The EV charging power can be limited by the onboard charger in the EV or by the available AC power at the location. For each user ID, the predicted charging power value will be the lower of the two limitations.
When predicting the EV charging power, it is assumed that all user IDs have at least one charging session where they plug out the charger while the battery is still charging. If this is the case, the available connection time ($t_{\text{connection}}$) equals the charging time ($t_{\text{charging}}$), and it is possible to calculate the average charging power ($P_{\text{charging}}$) using Eqs. 1-3.

CP connection time for an EV session: $t_{\text{connection}} = t_{\text{plug-out}} - t_{\text{plug-in}}$ (1)

When plug-out during charging: $t_{\text{charging}} = t_{\text{connection}}$ (2)

Average EV charging power: $P_{\text{charging}} = E_{\text{charged}} / t_{\text{charging}}$ (3)

To identify the interrupted charging sessions with plug-out during charging, the highest charging power values are of interest. For each user ID, the maximum value for $P_{\text{charging}}$ is therefore selected as the preliminary EV charging power prediction ($P_{\text{preliminary}}$). Our method is similar to the method proposed by [42], but while [42] used the predicted values to sort the users into two charging power levels, our method predicts all the charging power levels individually, reflecting the fact that actual charging power varies with the EV. When realistic charging power values are predicted per user, the results for hourly charging loads and idle energy capacities also become closer to reality. In addition, a second step is included to filter out errors and outliers, as explained in the following. Since charging power is predicted per user ID, the values are not necessarily connected to a specific EV. Some users may drive several EVs, or they may invite others/guests to use their user IDs. To improve the predictions, the $P_{\text{preliminary}}$ value for each user ID is therefore compared to typical levels for onboard charger capacities for EVs on the market (ref. Fig. 2). The charger capacities are grouped into three levels: Level 1 (mainly PHEVs and earlier models of BEVs): <4 kW; Level 2 (mainly BEVs with onboard charger capacities around 7 kW): 4 to <8 kW; Level 3 (mainly larger/newer BEVs with onboard charger capacity from 11 kW and above): 8-11.5 kW. For each user ID, the $P_{\text{preliminary}}$ value is categorised into one of the three charger capacity categories. If the same user has at least two charging sessions with $P_{\text{charging}}$ in the same charger capacity category, the $P_{\text{preliminary}}$ value is considered to be the final charging power prediction ($P_{\text{user}}$). Otherwise, the charging session with the preliminary prediction is filtered as an outlier, and a new $P_{\text{preliminary}}$ value is calculated. The recalculation is repeated until all user IDs have a final charging power prediction. For the 267 user IDs that we analysed, $P_{\text{user}}$ was predicted directly for 93% (249) of the user IDs ($P_{\text{user}} = P_{\text{preliminary}}$, no filtering needed). $P_{\text{user}}$ was predicted for 6% (17) of the user IDs after filtering one outlier, and for 1 user ID after filtering two outliers. Finally, user IDs with $P_{\text{user}}$ values of less than 2 kW were removed (6 user IDs), since no EVs with less than 3 kW charging power were identified on the market, and it was thus assumed that the predictions were too low. Due to the assumption that all user IDs have at least one interrupted charging session, the $P_{\text{user}}$ prediction for user IDs with no interrupted charging sessions will be too low. The assumptions and justifications applied in the method are summarized in Table 5.
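The iterative prediction and outlier filtering described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R implementation: the function names `charger_level` and `predict_user_power` are hypothetical, the input is assumed to be the per-session average power values (energy charged divided by connection time) for one user ID, and the final removal of users below 2 kW is left to the caller.

```python
def charger_level(power_kw):
    """Map a charging power to the three charger capacity levels in the text."""
    if power_kw < 4:
        return 1  # Level 1: < 4 kW
    if power_kw < 8:
        return 2  # Level 2: 4 to < 8 kW
    return 3      # Level 3: 8-11.5 kW


def predict_user_power(session_powers_kw):
    """Return (predicted power in kW, number of outliers filtered) for one user ID.

    The highest session power is the preliminary prediction. It is accepted
    when at least two sessions (including itself) fall in its level; otherwise
    that session is treated as an outlier and the next highest value is tried.
    """
    powers = sorted(session_powers_kw, reverse=True)
    outliers = 0
    while powers:
        preliminary = powers[0]
        level = charger_level(preliminary)
        sessions_in_level = sum(1 for p in powers if charger_level(p) == level)
        if sessions_in_level >= 2:
            return preliminary, outliers
        powers.pop(0)  # filter the unsupported maximum as an outlier
        outliers += 1
    return None, outliers  # no level is supported by two sessions


# Hypothetical user: one guest session at 10.9 kW, otherwise about 3.5 kW.
print(predict_user_power([3.2, 3.4, 10.9, 3.5]))
```

For the hypothetical user, the single 10.9 kW session is filtered as an outlier and the prediction falls back to the highest Level 1 value.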

The net EV battery capacity prediction method
The aim of the net EV battery capacity prediction method is to provide predictions for the net EV battery capacity per user ID. When doing this, it is assumed that all user IDs have at least one charging session where they charge their EV battery over a certain SoC range, from a defined minimum SoC level to a defined maximum SoC level. Initially, the charging session with the highest value for energy charged for each user is selected. No large values or outliers are filtered in this process, since it is expected that some users seldom charge the full SoC range of their EV batteries [29,37]. A filtering process may therefore remove valuable data. The selected maximum values are multiplied by an efficiency for one-way AC/DC conversion to calculate the approximate amount of energy stored in the battery ($E_{\text{battery}}$), as shown in Eq. 4. For EV charging, [46] and [47] have found energy losses between 12% and 40%. Marra et al. [19] considered 88% charging efficiency, based on empirical studies. Thus, in our work, a charging efficiency ($\eta$) of 88% is assumed.

Maximum energy stored in battery: $E_{\text{battery}} = \max(E_{\text{charged}}) \times \eta$ (4)

Battery capacity prediction: $E_{\text{user-battery}} = E_{\text{battery}} / \text{SoC range}$ (5)

The calculated values for maximum energy stored in the batteries (Eq. 4) are divided by an assumed SoC range for the charging session (Eq. 5) to get a prediction of the battery capacities. Two different SoC ranges are assumed for the EV users, dependent on the EV battery classification, i.e., small/medium (EV-SM) or large (EV-L). This is based on the hypothesis that users with large batteries are less likely to charge their EVs from nearly empty to completely full. As found by [29,37], most EV users prefer charging their EV batteries before reaching about 20% SoC. EV users with smaller batteries are expected to charge over a larger SoC range more frequently, based on the occasional need for longer driving ranges. The findings by [31] support this theory, as they found few EVs with large batteries when assuming that all users charged their battery from 0% to 100%. Two different levels are therefore set for the minimum SoC level: 10% for EV-SM and 20% for EV-L. The defined maximum SoC level is set to 100% for all EV users. This gives a SoC range of 90% for EV-SM and 80% for EV-L.
The threshold value that is used to categorize the EVs into two battery capacity groups is based on Eq. 4, which represents the net battery capacity within the defined SoC range only. When using different SoC ranges to predict the battery capacities (Eq. 5), a gap between the net battery capacity groups appears. The EV-SM group has battery capacities up to 31.1 kWh and the EV-L group has battery capacities above 35 kWh. The threshold value is chosen based on the net battery capacity of typical EVs on the market (ref. Fig. 2), and is comparable to the 33 kWh value used by [31]. There is no distinction between BEVs and PHEVs in our study, and PHEVs are included in the EV-SM group. Table 6 shows a summary of the assumptions and justifications for prediction of EV battery capacity.
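The two-step battery capacity prediction (Eq. 4 followed by Eq. 5) can be sketched in Python as below. This is an illustration under assumptions rather than the authors' code: the function name `predict_battery_capacity` is hypothetical, and the threshold of 28 kWh on the stored energy is not stated explicitly in the text but inferred from the reported group limits (31.1 kWh × 0.9 ≈ 35 kWh × 0.8 ≈ 28 kWh).

```python
ETA = 0.88             # one-way AC/DC charging efficiency assumed in the text
THRESHOLD_KWH = 28.0   # inferred threshold on stored energy (assumption, see lead-in)


def predict_battery_capacity(session_energies_kwh, eta=ETA, threshold_kwh=THRESHOLD_KWH):
    """Return (group, predicted net battery capacity in kWh) for one user ID."""
    # Eq. 4: energy stored in the battery during the largest charging session.
    e_battery = max(session_energies_kwh) * eta
    # Classify on the stored energy, then choose the assumed SoC range.
    if e_battery <= threshold_kwh:
        group, soc_range = "EV-SM", 0.90   # charged from 10% to 100%
    else:
        group, soc_range = "EV-L", 0.80    # charged from 20% to 100%
    # Eq. 5: scale the stored energy up to the full net battery capacity.
    return group, e_battery / soc_range


# Hypothetical users whose largest sessions were 25 kWh and 50 kWh.
print(predict_battery_capacity([8.0, 25.0, 12.5]))
print(predict_battery_capacity([30.0, 50.0]))
```

Because the two groups use different SoC ranges, no predicted capacity can fall between roughly 31.1 and 35 kWh, which reproduces the gap described in the text.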

Method for validation of the EV charging power and battery capacity predictions
To validate the suggested methods, the EV charging power and battery capacity predictions in Section 4.1 were compared to information from 15 users in locations TRO_R and BAR_2, including data on their nominal onboard charger capacities (kW AC) and net battery capacities (kWh). For the remaining locations, the CPO reports were anonymized, with no information about the users and their EVs. In addition, the results were compared to typical charging characteristics for models of EVs on the market [13,16,17], as shown in Fig. 2. The market data includes 102 models of BEVs and PHEVs described by [16] and [17]. Since some car manufacturers publish gross battery capacities only, the presented net battery capacity for these manufacturers is set equal to the capacity predicted by [16,17].

Hourly battery SoC prediction method
As a basis for predicting the hourly SoC values, the energy charged during charging sessions was distributed hourly on the timeline, using the methodology presented in [13]. For calculating the hourly charging loads (E load(i) ), the EV charging power predictions per user ID (P user ) are multiplied by the hourly charging time (Eq. 6). It is assumed that the charging starts immediately after plug-in, and that the charging power is fixed over the whole charging time.

Charging load for hour i: $E_{\text{load}(i)} = P_{\text{user}} \times t_{\text{charging}(i)}$ (6)
For every charging session hour, the SoC difference for each EV is calculated as the hourly energy stored in the battery (energy load multiplied by efficiency) divided by the battery capacity prediction for each user (E user-battery ) (Eq. 7). Assuming a final SoC value, the SoC value for every hour can be calculated, starting with the last hour of every session and proceeding backwards in time, hour by hour, until the first session hour. We assumed that all uninterrupted charging sessions continued charging until the battery was nearly full, i.e., with final SoCs of 80%, 95% or 100%. This assumption is justified by [38], which found median values above 95% for final SoCs. No final SoC values were assumed for charging sessions where the predicted non-charging idle time was less than an hour, since these charging sessions may have been stopped by the user.
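The hourly load distribution (Eq. 6) and the backward SoC calculation (Eq. 7) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the function names are hypothetical, plug-in is assumed to fall on a full hour, and the example values are invented. As in the text, charging is assumed to start at plug-in with a fixed power.

```python
def hourly_charging_loads(energy_charged_kwh, p_user_kw):
    """Distribute a session's energy over consecutive hours from plug-in (Eq. 6).

    Simplification: plug-in is assumed to happen on a full hour.
    """
    if p_user_kw <= 0:
        raise ValueError("charging power must be positive")
    loads = []
    remaining = energy_charged_kwh
    while remaining > 1e-9:
        hours = min(1.0, remaining / p_user_kw)   # charging time within hour i
        loads.append(p_user_kw * hours)           # E_load(i) = P_user * t_charging(i)
        remaining -= loads[-1]
    return loads


def hourly_start_soc(loads_kwh, capacity_kwh, final_soc=0.95, eta=0.88):
    """Back-calculate the SoC at the start of each charging hour (Eq. 7).

    Starts from the assumed final SoC and steps backwards hour by hour.
    Values are not clipped, so an overly small capacity prediction can
    yield a negative start SoC.
    """
    soc = final_soc
    start_socs = []
    for load in reversed(loads_kwh):
        soc -= load * eta / capacity_kwh          # SoC difference for this hour
        start_socs.append(soc)
    return list(reversed(start_socs))


# Illustrative session: 9 kWh charged at 3.6 kW into a 24 kWh battery.
loads = hourly_charging_loads(9.0, 3.6)
socs = hourly_start_soc(loads, capacity_kwh=24.0, final_soc=0.95)
```

In this example the session charges for 2.5 h, and the first entry of `socs` is the predicted plug-in SoC for the assumed final SoC of 95%.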

Table 6
EV battery capacity prediction method: Assumptions and justifications.

| Assumptions | Justifications |
|---|---|
| All user IDs have at least one charging session where they charge their EV battery at a certain SoC range | For a large dataset it is necessary to apply this simplification, since the actual battery capacity is not known. A similar assumption is made by [31]. The assumption may result in too low or too high EV battery capacity predictions if the maximum SoC range is smaller or larger than assumed. The method is validated with EV data, as described in Sections 3.3 and 4.1. |
| Some users seldom charge the full SoC range of their EV batteries | Based on findings in [28,33]. No large values or outliers are therefore filtered in this process. The method may result in too high values if several EVs are connected to one user ID. |

| Assumptions | EV-SM | EV-L | Justifications |
|---|---|---|---|
| η | 0.88 | 0.88 | Based on [19], supported by [46,47]. |
| SoC range | 0.9 (10-100%) | 0.8 (20-100%) | Most EV users prefer charging their EV batteries before reaching about 20% SoC [29,37]. EV users with smaller batteries are expected to more frequently charge a larger SoC range. This is strengthened by the findings of [31]. To improve the results, two different SoC ranges are therefore assumed for EV-SM and EV-L. |
| max(E battery ) | < 28 kWh | > 28 kWh | Calculated using Eq. 4. The threshold value for EV-SM and EV-L is chosen based on the net battery capacity of typical EVs on the market (ref. Fig. 2), and is in the range of the value used by [31] of 33 kWh. |

Comparing charging habits for EV users with different battery and charging power capacities
In this work, we investigate how residential charging behaviour is affected by the EV battery capacity and charging power.To do this, energy charged, SoC-values, idle energy capacities, and time related data are compared for EV users with different battery and charging power capacities.Hourly values for idle energy capacities are predicted by using the methodology presented in [13], where hourly idle times for a session are multiplied by the assumed charging power for the user ID.The session idle time is calculated according to Eq. 8.The sum of the charging loads and idle energy capacities is referred to as the connected energy capacity (Eq.9).The hourly time series include all hours, also hours with no connected EVs.
Idle time per session: $t_{\text{idle}} = t_{\text{connection}} - t_{\text{charging}}$ (8)
Connected energy capacity for hour i: $E_{\text{connected}(i)} = E_{\text{load}(i)} + E_{\text{idle}(i)}$ (9)
The hourly energy loads and idle energy capacities in Section 4.3 are presented normalized per user. Each user ID is classified as 'active' after their first charging session and 'inactive' after their last charging session. For the first and last 30 days in the measurement period of a given location, users are classified as 'active' if they have at least one charging session during the 30-day period. This is done to prevent faulty classification of active users that are not charging frequently.
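The relationship between idle time, idle energy capacity, and connected energy capacity can be illustrated at session level with a short Python sketch. The function name and the example values are assumptions for illustration, and the quantities are aggregated over a whole session rather than computed per hour as in the paper.

```python
def session_capacities(t_connection_h, energy_charged_kwh, p_user_kw):
    """Session-level idle time and energy capacities.

    Assumes charging starts at plug-in with fixed power P_user, as in the text.
    """
    # Charging time cannot exceed the connection time.
    t_charging = min(energy_charged_kwh / p_user_kw, t_connection_h)
    t_idle = t_connection_h - t_charging           # Eq. 8
    e_idle = t_idle * p_user_kw                    # idle energy capacity
    e_connected = energy_charged_kwh + e_idle      # Eq. 9, summed over the session
    return {"t_idle_h": t_idle, "e_idle_kwh": e_idle, "e_connected_kwh": e_connected}


# Illustrative overnight session: connected 12 h, 10.8 kWh charged at 3.6 kW.
session = session_capacities(12.0, 10.8, 3.6)
```

For this example the EV charges for about 3 h and is idle for about 9 h, so the idle energy capacity is roughly three times the energy charged.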
The hypothesis that EV battery and charging power capacity influence the charging behaviour is tested by organising the user IDs into four user groups according to their predicted net battery capacity (below/above 33 kWh) and AC charging power (below/above 4 kW). Then, charging habits for the four user groups are presented and compared, with particular focus on the two main user groups SM_Low and L_High, which represent the largest differences in EV technology. For example, we want to test whether energy charged per session for SM_Low is significantly different from that for L_High. The comparison is done by using a two-sample Mann-Whitney U Test [48,49], performed with the wilcox.test function in R [50]. The Mann-Whitney test is rank-based and, unlike the two-sample t-test, does not rely on distribution assumptions. It tests the null hypothesis (H 0 ) that the two independent groups have the same distribution, against the alternative hypothesis (H 1 ) that the distribution of the first group differs from that of the second group. The result is evaluated as significant if the calculated p-value is less than or equal to 0.05. This suggests that the values for the two groups are different, and it is likely that an observation in one group is greater (or smaller) than an observation in the other group. Mean charging habit values are calculated for the two groups, such as average load profiles, charging energy and frequencies, charging duration, and start SoC values. Distributions are shown in graphs, with data for SM_Low, L_High, and all users. The case study values are compared with values from relevant studies in the literature.
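To make the rank-based comparison concrete, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test using the normal approximation with tie correction. It is an illustration only and is not the R routine used in the study: R's wilcox.test applies a continuity correction by default and may compute an exact p-value for small samples without ties, so its p-values can differ slightly. The function name and the sample values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, p-value). Not valid if all values are identical.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)

    # Assign midranks so that tied values share the average of their ranks.
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2          # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t**3 - t
        i = j

    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided
    return u1, p_value


# Invented energy-per-session values (kWh), for illustration only.
sm_low = [5.1, 7.3, 6.0, 8.2, 4.9]
l_high = [12.4, 15.0, 9.8, 20.1, 11.3]
u_stat, p = mann_whitney_u(sm_low, l_high)
significant = p <= 0.05
```

With real data, the two lists would hold one value per session (or per user) for the two groups being compared.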

EV charging power and net battery capacity prediction
Fig. 5 shows the EV charging powers predicted for all the EV users.The grey lines in the figure represent onboard charging capacity levels for EVs on the market (ref Fig. 2).Black stars represent charging power for 15 of the EVs for which information was received about the onboard charger capacities (manufacturer data from [16,17]) as well as the available AC power at the location.For these 15 EVs, the predicted charging power values are close to the real values (difference of up to 0.5 kW).
The user IDs are grouped into three charging power levels, where 46% of the user IDs are predicted to be within charging power level 1 (<4 kW), 38% are within level 2 (4 to <8 kW), and 16% are within level 3 (8-11.5 kW). The charging power levels per location are further described in Table 7. For most locations there are users within all three charging power levels. As stated above, the charging power is limited by the onboard charging capacity of the EV or the power available from the CP (typically 7.4 or 11.0 kW in Norway). For users with onboard charging capacities below 7.4 kW, the power is most likely limited by the onboard charging capacity of the EV. For two of the locations (BER and OSL_S), all the user IDs were predicted to be within power level 1. This is most probably due to local limitations on the charging power, which can be caused, for example, by limited grid connection power capacity of the building. For users with onboard charging capacities above 7.4 kW, the charging power is mostly limited by the CP. The exception is EVs that have activated three-phase charging, where the charging power may be up to 11.0 kW for some EV models.
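The limiting logic and the three power levels can be summarised in a small Python sketch. The function names are hypothetical and the example EVs are invented. The level boundaries follow the text, with the boundaries at exactly 4 kW and 8 kW assigned to the higher level as an assumption.

```python
def effective_charging_power(onboard_kw, cp_kw):
    """Charging power is limited by the lower of onboard charger and CP power."""
    return min(onboard_kw, cp_kw)


def power_level(p_kw):
    """Charging power level: 1 (<4 kW), 2 (4 to <8 kW), 3 (8 kW and above)."""
    if p_kw < 4:
        return 1
    if p_kw < 8:
        return 2
    return 3


# Invented examples: (onboard charger kW, CP power kW).
examples = [(3.6, 7.4), (11.0, 7.4), (11.0, 11.0)]
levels = [power_level(effective_charging_power(ob, cp)) for ob, cp in examples]
```

In the first example the onboard charger is the limit, in the second the CP is the limit, and in the third both allow 11.0 kW.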
Fig. 6 shows net EV battery capacity predictions for all the EV users.55% of the user IDs are predicted with EV-SM (below 33 kWh) and 45% with EV-L (above 33 kWh).Comparing the predicted battery capacities with known net battery capacities for the 15 EVs, it was found that the predicted values are close to the real values for the five users with EV-SM.The differences are up to 3.5 kWh, and all the predictions are lower than the real values.For the ten users with EV-L, the differences between the predictions and the real values are larger.One user was found to have charged 78.6 kWh, even though the net battery capacity was 52 kWh.Assuming that the values are correct, the charging losses must be larger than predicted (the EV in question used an external transformer during charging).For the remaining nine EVs, the differences were up to 15 kWh, some higher and some lower than the real values.These differences may be explained by a variance between the real values and the assumptions for charging efficiencies or SoC ranges in Table 6.However, even though there are some differences between the predictions and the real data, the methods provide a fairly accurate indication of the net battery capacities.
All the predictions of EV charging power and EV battery capacities are combined in Fig. 7, together with charging characteristics for EVs on the market (ref.Fig. 2).Four user groups are marked in the figure (SM_Low, SM_High, L_Low, L_High), forming a basis for the analysis in Section 4.3.Fig. 7 shows that the predictions provided by the methods in this paper are in the range of typical EVs on the market.Since the charging power method also takes the local power capacity into account, and not only the onboard charger capacity, EVs with onboard charger capacities above 11.0 kW are not represented in the results.
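The four user groups can be derived from the two predictions with a simple rule, sketched below in Python. The 33 kWh and 4 kW limits are taken from the text; the function name, the treatment of values exactly on a limit, and the example users are assumptions for illustration.

```python
from collections import Counter


def user_group(capacity_kwh, power_kw):
    """Combine battery size (SM/L) and charging power (Low/High) into a group label."""
    size = "SM" if capacity_kwh < 33 else "L"      # assumption: 33 kWh counts as L
    level = "Low" if power_kw < 4 else "High"      # assumption: 4 kW counts as High
    return f"{size}_{level}"


# Invented predictions: (net battery capacity kWh, charging power kW).
users = [(24.4, 3.6), (60.5, 7.4), (75.0, 3.5), (28.0, 7.2), (22.0, 3.4)]
group_counts = Counter(user_group(cap, p) for cap, p in users)
```

The resulting counts per group give the basis for comparing charging habits between groups such as SM_Low and L_High.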
32 users are grouped as L_Low, even though there are no such EVs identified on the market. This may be explained by local power limitations. Such possible power limitations were discussed earlier in Section 4.1 for two of the locations (BER, OSL_S). The possible power limitation is supported by Fig. 7, where 23 user predictions for the two locations are grouped as L_Low (red triangles in the figure). Local power limitations may also be the case for the remaining 9 users, or these users may not have disconnected their EVs during charging. There are also EV models which do not charge optimally on the IT grid (e.g., the Zoe transformer provides 3.6 kW [51]), which may explain some differences between the predicted low charging power and the actual onboard charging power of the EVs.

Hourly battery SoC predictions
Hourly battery SoC values are predicted for all the charging sessions.Table 8 shows an example of input data and predictions for one of the charging sessions (TRO_20989).In the example, an end SoC of 95% is assumed.
The distribution of start SoC values is shown in Fig. 8. The figure illustrates situations with assumed end SoCs of 80%, 95% and 100%. In our study, 59% of the residential EVs are plugged in when the SoC is above 50%, assuming an end SoC of 95%. The values are in the range of the findings by [52], which found that a high share (65%) of the EV drivers plug in their company cars when the SoC is above 50%. The distribution of start SoC values can also be compared with results presented by [39], which analysed SoC values for EV drivers with a dedicated home charger available. They found that the start SoC values for these drivers were distributed between approximately 20% and 90%, with a rise towards the higher start SoC values. A high start SoC provides opportunities for vehicle-to-grid (V2G) in the future, since the EVs can deliver energy to the building or grid during peak periods.
Fig. 9 shows median values for the predicted start SoCs and how the SoC values differ in the course of a day. Assuming an end SoC of 80-100%, the median start SoC values of the dataset are 42-62%, close to the mean values of 40-60%. The predictions can be compared to hourly median values from [38], where start SoC values were available from 9566 charging sessions. Due to the low number of observed sessions at night in [38], the values were aggregated from 11 pm to 6 am (n = 321). The study also found that the start SoC values differed over the day. This can also be found from the data in our study, but to a smaller degree than in [38]. This may have several explanations: the users may behave differently (private EVs in our study compared to mainly company EVs in [38]), or the predicted values may not be fully accurate. Still, the predicted start SoC values are in the same range as the measured values found in [38].
Daily average SoC values for connected EVs are shown in Fig. 10.

Comparing charging habits for EV users with different battery and charging power capacities
This section analyses how residential charging behaviour is affected by EV battery capacity and charging power. For the aggregated EV charging, Fig. 11 shows how the average load profiles have an increased energy use in the afternoons and evenings, with the highest load occurring between 16:00 and midnight. The average load is at its lowest during the night/early morning and during daytime. The average load profiles for the 12 locations in our study are similar to the profiles found in a previous analysis of the location TRO_R in [13], which were verified with hourly smart meter data. For charging sessions with idle times of less than 1 h, the charging loads are marked as non-flexible. Most of this non-flexible charging occurs during the afternoon/evening. Fig. 11 also shows average connected energy capacities, where the difference between energy charged and connected energy capacity is the average idle energy capacity. The connected energy capacities illustrate how the EVs typically stay plugged in during night-time. During workdays, the daily connected energy capacity is on average more than four times as high as the energy charged during the day. Even though this is based on average values, it reflects a considerable potential for shifting the EV charging in time, especially from afternoon/evenings to night-time. Other data-based EV studies confirm this flexibility potential in the residential sector, e.g., [13,53-56].
When analysing grid capacity, the maximum load profile may be more important than the average load profile. Load profiles per user are shown in Fig. 12, including the maximum energy charged, the 99th percentile, the 90th percentile, and the average and 25th percentiles of hourly energy charged. The maximum load profiles have two afternoon/evening peaks during workdays, around 17:00 and 21:00, and one afternoon peak on Saturdays. In Fig. 12, only periods with 30 or more users are included, since the aggregated peak power per user is reduced with an increasing number of users [13].
In the future, charging habits may change due to increasing battery capacities and available charging power, which in turn will affect the load profiles and flexibility potential of EV charging. In Fig. 13, daily average load profiles are shown for four user groups (SM_Low, SM_High, L_Low, L_High), based on net battery capacity (below/above 33 kWh) and charging power capacity (below/above 4 kW). In Table 9, the charging habits of the two user groups SM_Low and L_High are further described and compared. Mann-Whitney p-values are included in the table, to test if SM_Low and L_High are significantly different. SM_Low and L_High were chosen, since these are main user groups which also represent the largest differences in EV technology.
Fig. 13 shows that the average energy charged every day is about 1.5 times higher for the users with EV-L than for the users with EV-SM. The energy consumption of an EV is influenced by vehicle-, environment-, and driver-related factors [57], and in general, EVs with large batteries are heavier than EVs with small/medium sized batteries. Still, the results indicate that users with larger battery capacities also drive more than the users with smaller batteries. This corresponds to interview results found by [58], where drivers with a battery capacity of more than 55 kWh used their car more frequently for long trips, compared to drivers with smaller battery capacities. It is also in line with a questionnaire amongst Dutch EV drivers [59], where only 23% of the Tesla Model S drivers (L_High) answered that they regularly or often used other transportation than the EV due to long distances, while 95% of Nissan Leaf drivers (SM_Low) indicated the same. Another reason for the difference may be the charging location, since [39] found that owners of EVs with large battery capacities were more likely to charge at home. Fig. 13 shows that it is especially in the afternoons/evenings that the users with EV-L have a higher hourly energy demand than the users with EV-SM, while the daytime charging is low and more similar for all four user groups. For the two user groups with low charging power, SM_Low has finished charging around midnight, while L_Low requires more night hours to finalize the charging.
A histogram of energy charged per charging session is included in    9-D. Looking at all users together, charged energy is less than half of the net battery capacity for 65% of the charging sessions. For SM_Low, the average SoC difference is 42% while it is 34% for L_High. It should be noted that these average SoC difference values are affected by the choice of using different SoC ranges when predicting battery capacities for SM_Low and L_High. If battery capacities for L_High were predicted with the same SoC range as for SM_Low (90%), the average SoC differences become more similar: 41% for SM_Low and 37% for L_High. However, [39] also found a relationship between battery capacity and start SoC, where EVs with larger batteries are charged at a higher start SoC. Idle energy capacities represent the flexibility potential of EV charging, since the EV charging can be shifted in time. Comparing idle energy capacities for SM_Low and L_High in Table 9-E, the average daily idle energy capacity is 1.3 times higher for L_High. As shown in Fig. 13, the idle energy capacity is higher for SM_Low than for L_Low, which can be explained by the fact that the L_Low group has less idle time due to a shorter average connection time (SM_Low: 8.1 h, L_Low: 6.2 h), and that the L_Low group needs more time for charging a larger amount of energy. A similar relationship can be found between SM_High and L_High.
Users with small and large weekly charging demands are compared, corresponding to 25th and 75th percentiles of the demands.For users with lower weekly charging demand (25th percentile), the average values are 25 kWh of energy charged per week and 24 kWh of idle energy capacity per week.For users with higher weekly charging demand (75th percentile), the average values are 61 kWh of energy charged per week and 42 kWh of idle energy capacity per week.The data indicates that users with lower weekly charging demand have a longer idle time per charging session (average 14.3 h of idle time per session) compared to users with a higher weekly charging demand (average 7.9 h of idle time per session).
The charging sessions are distributed fairly evenly throughout the week, as shown in Table 9-K.Users with larger batteries charge less frequently than users with smaller batteries: 2.6 times per week for L_High, compared to 4.7 times per week for SM_Low.The results are supported by [58], where about 40% of the interviewed Norwegian Tesla drivers (L_High) charged their EVs less than 3 times per week, while 30% of other EV drivers stated the same.Similar results were found amongst Dutch EV drivers [59], where 62% of the Tesla Model S drivers (L_High) stated that they charged their EVs 3 times a week or less, while 80% of Nissan Leaf drivers (SM_Low) stated that they charged their EVs more than 3 times per week.Also the charging frequencies reported in [33] were similar, where EV drivers with large battery capacities (>70 kWh) in the Netherlands charged their EVs 2.8 times per week, and small battery capacities owners (16-30 kWh) charged 4 times per week.The times are not significantly different for the user groups SM_High and L_High, and there is a twin peak in the density distribution, where users either charge for a few hours during daytime or for a longer overnight period.Similar charging durations were found by [34], for EV drivers in the Netherlands.For both groups in our dataset, about 23% of the sessions last for less than 3 h, while about 45% of the sessions last for more than 12 h.This confirms that the main rationale behind the connection times is the daily habits of people and not their type of EV.The predicted charging times and idle times are quite similar for the two user groups.Even though energy charged per charging session is higher for L_High, this is compensated by the higher charging power.The average time between charging sessions is about 23 h for SM_Low and 51 h for L_High (after removing situations with more than 1 month between charging sessions).
The magnitude of the available charging power affects the car users, the electricity loads, and the flexibility potential. For the users, a higher charging power provides the possibility of charging the EV faster. However, this may come with a cost, e.g., related to higher power tariffs or a need to upgrade the electricity distribution and fuse sizes for the building. For many users, however, the charging time is normally not an issue even with lower charging powers, since their daily driving distances are limited. In this study, the energy charged was below 19 kWh for 80% of the EV sessions, which can be charged in about 5.5 h even with a low charging power of 3.6 kW. When the charging power is high, there is a risk that the peak power will also be higher, i.e., if the EV charging coincides with the peak domestic demand [20]. When comparing two charging power levels for apartment buildings in [13], the hourly aggregated charging peaks increased by a factor of 1.2, going from 3.6 to 7.2 kW charging power, assuming immediate charging after plug-in. Bollerslev et al. [23] found that the peak power demand was reduced by about 40% and 60%, respectively, when going from a charging power of 3.7 kW to 11 or 22 kW. However, a high charging power also provides a better opportunity for smart charging. In [13], the average idle energy capacity during weekdays was 2.3 times higher with a charging power of 7.2 kW compared to a charging power of 3.6 kW. Such smart charging strategies can save costs for the users and provide benefits for the electricity grid.

Conclusions
With an increasing number of EVs worldwide, there is a need for more data and research on EV characteristics and related EV charging behaviour.This paper proposes a set of methodologies for generating complete EV charging datasets, from data commonly available in CPO reports.The case study includes more than 35,000 charging sessions from 267 users in 12 residential locations in Norway.Residential charging behaviour is analysed, and it is described how these are affected by EV battery capacity and charging power.
• A set of simple methods is proposed for more accurate predictions of battery capacities, charging power, and plug-in SoC for all EVs and charging sessions. In our study, 46% of the users were found to have a charging power below 4 kW, while the remainder had a charging power between 4 and 11 kW. Also, we found that 55% of the users could be assumed to have battery capacities below 33 kWh, while 45% of the users had battery capacities between 33 and 100 kWh.
• Our work presents a statistical analysis of how residential charging behaviour is affected by EV battery capacity and charging power. On average, users in the residential case study charged around 6.2 kWh per day, having an average of 3.7 weekly charging sessions. The average energy charged every day was found to be 1.6 times higher for users with large batteries and high charging power (L_High) compared to users with small/medium batteries and low charging power (SM_Low).
• The results indicate that most EV users seldom utilize their full battery capacity, especially EV users with larger batteries. For 65% of the charging sessions, the charged energy was found to be less than half of the predicted net battery capacity.
• The daily load profiles suggest that there is a considerable potential for shifting residential EV charging in time, especially from afternoon/evenings to night-time. Such utilization of energy flexibility can reduce the grid burden of residential EV charging. While the average charging time was less than 3 h, the EVs were on average connected to the CPs for 12 h. Comparing SM_Low and L_High, the average daily idle energy capacity was 1.3 times higher for L_High.
For high idle energy capacities, it is advantageous with high charging power, frequent connections, and long connection times.If users start charging less frequently in the future, this will affect the idle times and most likely reduce the flexibility potential.Other publicly available charging infrastructure and end-user costs may also impact the residential charging behaviour.For example, the use of public fast charging or EV charging in workplaces may reduce the need for home charging.In a future perspective, the use of V2G may increase the flexibility potential of EV charging, since the EV batteries can deliver electricity to the building or grid during the idle periods.
To generate the EV charging dataset in this work, it was necessary to make some assumptions, e.g., for charging efficiency and for maximum SoC range charged per EV user. However, the results were compared to results from the literature, which reinforced the validity of our findings. Further studies could be extended with larger datasets, and also include commercial buildings. In 2022, there were about 600,000 EVs in Norway (8 million in Europe), and CPO reports are often available for energy management and invoice purposes. Having more such studies will make it possible to analyse how EV charging behaviour differs depending on building categories and user groups. EV charging related to office buildings will, for example, have different load profiles and flexibility potential compared to EV charging for company fleets such as healthcare services. If more real-world values for charging power, battery capacity, and session start SoC are made available, the validity of our results may be further increased.
The proposed set of methodologies aims to provide a complete EV dataset with EVs and charging sessions, where realistic predictions for battery capacities, charging power, and plug-in SoC are added to datasets with plug-in/plug-out times, and energy charged.Such datasets provide the basis for assessing current and future EV charging behaviour, data-driven energy flexibility characterization, and modelling of EV charging loads and EV integration with power grids.
Norway.from EV owners and housing cooperatives in Risvollan, Baerum and Tveita, Current Eco AS, Kople AS, ZapTec AS, NTE Marked AS, and Mer Norway AS are highly appreciated.The study is part of the Research Centre on Zero Emission Neighbourhoods in Smart Cities (FME ZEN, 257660).The authors gratefully acknowledge the support from the ZEN partners and the Research Council of Norway.

Fig. 1. Charging power is limited by available AC power in the CP (A) and EV onboard charger capacity (B).

Fig. 3. Plug-in times, plug-out times, connection times, and energy charged in the EV data locations.
Å.L.Sørensen et al.   to 3. The values for plug-out time (t plug-out ), plug-in time (t plug-in ), and energy charged (E charged ) are available from the CPO reports.

Fig. 4. Flow charts for predicting charging power, battery capacities, and hourly SoC, based on CPO reports.

Fig. 5. EV charging power predictions for all EV users. Grey lines: Nominal onboard charger capacity for EVs on the market. Black stars: Validation of charging power for 15 EVs.
Fig. 10 also includes the number of connected EVs and new connections each hour. The figure shows how the average SoC values for all the connected EVs are at their lowest in the afternoon, when most new EVs are being connected. During the night-time, the average SoC values approach the end SoC values, since most of the EVs have finished charging and there are few new connections.

Fig. 6. Net EV battery capacity predictions for all EV users. Grey lines: Max/min battery capacities for EV-SM and EV-L. Black stars: Validation of net battery capacity for 15 EVs.

Fig. 10. Upper figure: Daily average SoC values for connected EVs. Lower figure: Number of connected EVs (n connected hours = 373,989) and new connections each hour (n sessions = 28,681).

Fig. 13. Daily load profiles per user during workdays, for four different EV categories. The figure shows energy charged, non-flexible energy charged (idle time < 1 h), and connected energy capacity.

Table 1
Examples of AC charging power assumptions found in literature.

Table 2
Overview of input data and predicted output data in the article.

Table 3
Summary of literature describing EV charging behaviour related to battery capacity and SoC values.

Table 4
Residential locations with EV data analysed.

Table 5
EV charging power prediction method: Assumptions and justifications.
connected to one user ID.The step of filtering outliers is therefore included, but with the risk of filtering real charging power values.Despite the risks, this is a transparent simplification, which is assumed to give satisfactory charging power levels when aggregated.The method is validated with EV data, as described in Sections 3.3 and 4.1.

Table 7
Charging power levels in the 12 case locations: Share of users within each level.
B. Charging power (calculated according to the EV charging power prediction method, described in Section 3.1) Mann-Whitney p-value: Significant (< 2.2e-16)