Analysis of door openings of refrigerated display cabinets in an operational supermarket

This article presents a suite of data sets describing door openings of refrigerated display cabinets collected from an operational supermarket. Our goal is to provide a realistic and well-documented suite that will serve as a basis for consistent evaluation and study. Many applications ranging from modelling and optimising supermarket refrigeration systems to food safety and customer modelling depend on such data sets. We describe the data sets in the suite in detail along with the methodology used to collect them from an operational supermarket in Germany. We quantitatively analyse and characterise a total of 32,498 openings reported in the data sets. The properties that we study are opening speed, frequency, time, duration and opening angle with respect to a given weekday, time and type of refrigerator. We expect the current suite of data sets to attract interest and to become the core of a more extensive collection of data sets with time.


Introduction
The European Union (EU) is set to reduce greenhouse gas emissions by 80–95 % by 2050 [1]. Within this broader scope, there are milestones to reduce greenhouse gas emissions by 20 % by 2020, increase the share of renewable energy to at least 20 % and achieve an energy use reduction of 20 % [2].
By including a larger share of renewable energy sources such as solar, wind and wave energies, the electrical energy generation becomes less flexible as the sources are weather-dependent and out of human control. It is, therefore, necessary to introduce energy buffers to tailor the demand to the energy supply in grids with a large share of renewable sources [3].
In Ref. [4], the authors presented an approach to benefit from refrigeration systems in supermarkets as energy buffers for the grid. By utilising the thermal processes of the refrigeration systems, electrical energy could be stored in cold thermal energy storage for later use in the supermarket. A similar concept was later presented in Refs. [5,6], concluding that demand response by the refrigeration system in supermarkets is a possible option for demand-side management even if only utilising the thermal inertia of the food goods in the store.
Alternative options include load shifting in the refrigeration system of markets, although the thermal inertia that could be activated that way would be smaller than that of a dedicated thermal storage.
These projects and calculations, as well as related work from Refs. [7,8], all agree on the importance of forecasting the energy demand of the supermarket to enable full utilisation of the thermal capacity. To identify the variables for such forecasting, the author of Ref. [9] used a detailed conceptual systems approach to map the energy flows within a supermarket. In that dissertation, it was found that the customer interactions with the refrigerated display cabinets (RDCs) are the dominant variable affecting the hourly variation of the heat extraction rate, i.e. the variation depends on the occurrence and characteristics of the door openings and the related extraction of food.
It is therefore essential to quantify the general characteristics of door openings in a supermarket such as frequencies, speeds, durations and opening angles to be able to correctly forecast the energy signatures of supermarkets, potentially enabling them as thermal buffers, tailoring the refrigeration systems and improving the energy efficiency of the RDCs.
Understanding when and how doors on RDCs are opened is also important from a sales and marketing perspective, as door openings provide a basis for understanding customers' preferences and shopping behaviour.
Even though doors have been present on RDCs for many years and the relevance of door openings for the heat extraction rate of RDCs is generally accepted, only a few scientific references exist that document and quantify door openings.
A comprehensive study was presented in 2010, in which the authors of Ref. [10] compared open and doored vertical RDCs. To include the effect of door openings, they monitored 10 RDC doors in a supermarket using electrical switches that recorded the occurrence and duration of openings. The number of openings was found to be 6.2 per hour as a daily mean value. The presented histogram of opening duration was skewed, with a mean door opening duration of 31 s. The study found that 90 % of the openings were shorter than 60 s, and 30 % of the openings were in the interval between 4.5 and 6.5 s.
In the same period of time, Cecchinato et al. [11] presented a study on the energy performance of supermarket refrigeration systems where the daily variations in heat extraction rate by RDCs were included by increasing the nominal thermal load by 28.2% during store opening compared to closed hours.
Hill et al. [12] presented a study to investigate RDCs as heat sinks in a supermarket. Within the article, they acknowledge the importance of including door openings in the equations. The authors' team observed 20 openings per hour with an average opening duration of 10 s during what was considered normal operating conditions in the supermarket. These figures were later used as input for the model.
Glavan et al. [13] explicitly acknowledge the lack of data on door opening occurrence and use a 'to the best of our knowledge' approach where daily variations are modelled as discrete events based on personal experience to mimic reality.
While addressing the problem with appropriate methods, all studies above are lacking in spatial and temporal scope. The number of RDCs monitored should allow for statistically valid representativeness, and the timespan covered should allow fluctuations of shopping behaviour over the week to be taken into account. Furthermore, while some information on opening frequency and duration is available, information that is crucial for deducing the energy impact of a door opening, such as the opening speed and the opening angle, is completely unavailable.
As can be seen, there is a need for an improved data-driven understanding of the actual customer interactions with the doors of RDCs in operational supermarkets, and for a corresponding data set. Our contribution is to further investigate the operation of RDC doors by following a larger set of doors (85), i.e. all doors in the supermarket used as a case study, over a longer period of time. Going beyond the two previous studies, the door opening speeds and angles are included in the data set with a temporal resolution of seconds. In Table 1, the outcomes of the previous studies [10,12] are compared with those of this study.
As can be seen from the table, the data set generated by the authors is more comprehensive than that of any earlier study. The data covers a full month, which is shorter than the period covered in Ref. [10], but still sufficient to identify behavioural differences between weekdays and Saturdays (the market is closed on Sundays). Also, the data set covers the entire market, including both low temperature (LT) and medium temperature (MT) RDCs, for which significant differences were found in both the opening characteristics and the opening frequency. Another unique feature of the data set is that the actual door rotation is recorded, i.e. it is possible to identify the different parts of an opening and thereby characterise the openings. Our findings can be used to tune larger-scale energy system simulations with realistic data, but also in more detailed development work on future energy-efficient and demand-responsive supermarkets. The novel insights presented on the door opening characteristics will additionally allow RDC manufacturers to further optimise their RDCs for energy efficiency.

Method
To enable a highly detailed data collection, the measurement platform was designed in-house to allow for non-intrusive measurements. The rotational speed of the doors ($\dot{\omega}$) was directly recorded by a gyroscope, and the data was later post-processed to find the opening angle, duration and time of each opening. A total of 36.3 million data points were recorded, and 32,498 openings were later recognised by the post-processing algorithm. The hardware is described further in Section 2.1 and the post-processing in Sections 2.2-2.3.
The data collection was carried out between 30 November and 30 December 2017 in a supermarket located outside Hannover in Germany. The studied supermarket was built in 2012 and has a total floor area of 2000 m², of which 1300 m² is the sales area. The store has a separate bakery at the entrance and a refrigerated serve-over counter for delicacies at the rear of the store. In total, the supermarket contains 85 RDCs, of which 62 are medium temperature (MT) and 23 are low temperature (LT). Fig. 1 shows a floor plan, including the location of the studied RDCs and the individual door IDs used for identification in the data set.
The RDCs for medium temperature (MT) refrigerated goods are Monaxis cabinets (Carrier Corp., Farmington, USA) equipped with double-glazed doors (PAN-DUR GmbH, Boxberg, Germany). Those for low temperature (LT) refrigerated goods are Velando cabinets (Carrier Corp.) with triple-glazed doors, complemented by an active hot-air de-icing system to keep the windows clear. In Fig. 1, the door IDs 1-62 represent the MT RDCs and 65-87 the LT RDCs. Numbers 64 and 65 were excluded as these cabinets no longer existed at the time of the experiment.

Door blade tracking hardware
To be able to track the opening of the doors in time with high accuracy, the authors decided to use gyroscopes. In addition to collecting high-quality data, the design criteria for the system were that it had to (1) scale to cover the complete supermarket, (2) be relatively non-invasive (i.e. require no changes to the door mechanisms), (3) not attract customer attention, which might affect the results, (4) be extendable and repeatable for future studies and (5) run for an extended time to capture variations over weekdays and weekends.
To comply with (2), we decided in favour of a wireless, battery-driven solution. This, however, limits the scalability to the number of simultaneously connected units supported by the chosen technology (criterion 1), and also the experiment duration, as it becomes dependent on the battery capacity (criterion 5). Furthermore, the battery capacity affects the size and thereby the visibility (criterion 3). We found the SensorTag CC2650STK (Texas Instruments, Dallas, USA) platform to be a good compromise between the criteria mentioned above. The TI-SensorTag sensor platform is equipped with multiple sensors for movement, temperature, light and humidity. In this study, we exclusively used the nine-axis motion tracking MPU-9250 (gyroscope + accelerometer + magnetometer) to record the rotational speed, $\dot{\omega}(t)$, of the door openings. Given the small size (50 × 67 × 14 mm) of the TI-SensorTags, they could easily and discreetly be attached to the RDC doors in the supermarket. Also, the low price (USD 29) and their off-the-shelf availability would allow for extending and repeating the study. In Fig. 2, the TI-SensorTags can be seen together with the wireless gateway as installed in the supermarket.
In the supermarket, the SensorTags are connected to several gateways, BeagleBone Black rev B (Waveshare International Ltd., Shenzhen, China), that collect the data and add timestamps. The tracking system was designed in clusters for radio reachability reasons, where each BeagleBone communicates with eight SensorTags over Bluetooth Low Energy. Further details on the Bluetooth communication and data transfer are beyond the scope of this article.
The data was recorded with a sample rate of 4 Hz as a compromise between accuracy and battery running time. From iterative studies, it was concluded that increasing the sample rate did not affect the calculated opening angle as the time-averaging is made within the motion tracking sensor.

Defining the door opening
In this study, the rotational speed of the door, $\dot{\omega}$ [°/s], and the calculated rotational angle, $\omega$ [°], as shown in Fig. 3, are used to define an opening. Here a complete door operation is defined by three identifiable parts, namely the initial opening, the holding and the closing part, which can be seen as the intervals from $t_1$ to $t_2$, from $t_2$ to $t_3$ and from $t_3$ to $t_4$, respectively. Formulated as a function of rotational angle, a complete door opening is defined as an operation where the door is rotated from a closed position ($\omega = 0^\circ$) to beyond $\omega = \Omega$. The door blade is then kept at a position $\omega \geq \Omega$ for at least $t_{min}$ seconds but no longer than $t_{max}$ seconds before returning to its closed position ($\omega = 0^\circ$). As a consequence of sensor noise, several constants ($\Omega$, $t_{max}$, $t_{min}$) were used in our analysis. The threshold value $\Omega$ specifies the angle to which a door needs to be opened to be considered as a door opening in post-processing. The upper limit $t_{max}$ is likewise a consequence of the internal noise generated by the sensor: this noise causes a drift of the calculated rotational angle that creates the illusion of a door opening. The lower limit $t_{min}$ was introduced as there is a realistic limit to how fast a customer can open a door. This minimum time filters out 'false' openings caused by, for example, a trolley that accidentally hits the RDC door, where the associated shock results in data similar to the first part of a door opening.
Within the study, $\Omega$ was set to 15° and $t_{min}$ and $t_{max}$ to 1.5 s and 120 s, respectively. The thresholds were selected based on experience and then tuned by hand when post-processing random parts of the recorded data.
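The filtering role of the three constants can be illustrated with a minimal sketch. It assumes that a candidate window of computed angles (in degrees, sampled at 4 Hz) has already been cut out, and it approximates the holding time by counting samples at or above the threshold; the function name and this counting approach are illustrative assumptions, not the authors' implementation.

```python
OMEGA_THRESHOLD_DEG = 15.0  # threshold angle (from the text)
T_MIN_S = 1.5               # minimum time at or above the threshold (from the text)
T_MAX_S = 120.0             # maximum time at or above the threshold (from the text)


def is_valid_opening(angles_deg, sample_period_s=0.25):
    """Return True if a candidate window looks like a genuine door opening.

    angles_deg: computed door angles [deg] for one candidate window.
    The time spent at or above the threshold is approximated as
    (number of samples >= threshold) * sample period (an assumption).
    """
    samples_above = sum(1 for a in angles_deg if a >= OMEGA_THRESHOLD_DEG)
    time_above_s = samples_above * sample_period_s
    return T_MIN_S <= time_above_s <= T_MAX_S
```

With these settings, a short shock from a trolley (one sample above 15°) is rejected, as is a slow drift that stays above the threshold for more than 120 s.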

Analyses on door openings
In this section, we give an overview of our post-processing to identify the time, duration, and opening angle of the door openings recorded by the gyroscope.
The data from the gyroscope, i.e. the rotational speed of the door, is denoted $\dot{\omega}(t)$ [°/s]. Hence, $\omega(t)$ [°] is the computed door blade position, which is the time integral of $\dot{\omega}(t)$ as defined in Equation (1).
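Equation (1) is not reproduced in this text. From the description, it presumably takes the following form; the lower integration limit $t_1$ (start of the opening) and the rectangle-rule discretisation with $\Delta\tau = 0.25$ s (the 4 Hz sample period) are assumptions made here for illustration.

```latex
\omega(t) = \int_{t_1}^{t} \dot{\omega}(\tau)\,\mathrm{d}\tau
\;\approx\; \sum_{i\,:\,t_1 \le \tau_i \le t} \dot{\omega}(\tau_i)\,\Delta\tau,
\qquad \Delta\tau = 0.25\ \mathrm{s}
```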
In parallel with the following textual description, Fig. 3 can be used as a visual guide.
The algorithm identifies openings iteratively by finding times in the data series when $\omega(t) > 0$, i.e. the door is open, and then processes each opening (window of data points) in isolation. The door opening start time ($t_1$) is set to be the data point where $\omega(t) \approx 0$ and $\dot{\omega}(t) > 0$. The end of the door opening ($t_4$) is then set to be the data point where $\omega(t) = 0$ and $\dot{\omega}(t) \approx 0$, i.e. the door blade has returned to its initial position and is not moving. The time points $t_1$ and $t_4$ then define the endpoints of a window for further processing.
The algorithm processes the data points from the start of the window to identify the end of the initial opening ($t_2$) and the beginning of the closing ($t_3$). The end of the initial opening ($t_2$) is set to be the first point where $\omega(t) > \Omega$ and $\dot{\omega}(t) \approx 0$. This implies that the door is no longer moving, but is kept in an open position above the threshold value.
Next, the beginning of the closing ($t_3$) is found by processing the data series in reverse, i.e. from $t_4$ backwards. As $t_4$ is known, $t_3$ is set to be the first point where $\omega(t) > 0$ and $\dot{\omega}(t) \approx 0$. As a consequence of the diversity of customer interactions, noise and measurement errors, $t_2$ and $t_3$ could not be identified for all openings. The door openings without these intermediate points were visually inspected by the authors to verify their correctness and were then included only in the data set summarising opening frequencies and durations. They were not used for the more detailed analysis (notably Figs. 11 and 12). It was not possible to find $t_2$ and $t_3$ for 4,937 of the total 32,498 recorded openings.
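The steps above can be sketched for a single, pre-cut window of gyroscope samples. This is a simplified illustration rather than the authors' code: the tolerances `SPEED_EPS` and `ANGLE_EPS` used for "approximately zero", the rectangle-rule integration and the choice of $t_1$ as the last sample before the angle leaves zero are all assumptions.

```python
SAMPLE_PERIOD_S = 0.25      # 4 Hz sampling (from the text)
OMEGA_THRESHOLD_DEG = 15.0  # threshold angle (from the text)
SPEED_EPS = 2.0             # deg/s, hypothetical tolerance for "speed ~ 0"
ANGLE_EPS = 1.0             # deg, hypothetical tolerance for "angle ~ 0"


def integrate(speeds):
    """Rectangle-rule cumulative integral: speed [deg/s] -> angle [deg]."""
    angles, total = [], 0.0
    for s in speeds:
        total += s * SAMPLE_PERIOD_S
        angles.append(total)
    return angles


def segment_opening(speeds):
    """Return sample indices (t1, t2, t3, t4), or None if the door never opens.

    t2 and t3 are None when no holding phase can be identified.
    """
    angles = integrate(speeds)
    open_idx = [i for i, a in enumerate(angles) if a > ANGLE_EPS]
    if not open_idx:
        return None
    t1 = max(open_idx[0] - 1, 0)                 # last closed sample before opening
    t4 = min(open_idx[-1] + 1, len(angles) - 1)  # first closed sample after closing

    def at_rest(i):
        return abs(speeds[i]) <= SPEED_EPS

    # Forward search: first sample held still above the threshold angle.
    t2 = next((i for i in range(t1, t4 + 1)
               if angles[i] > OMEGA_THRESHOLD_DEG and at_rest(i)), None)
    # Reverse search from t4: first sample that is open and not moving.
    t3 = next((i for i in range(t4, t1 - 1, -1)
               if angles[i] > ANGLE_EPS and at_rest(i)), None)
    return t1, t2, t3, t4


# Synthetic example: open to 80 deg, hold for about 1 s, close again.
example = [0, 0, 120, 160, 40, 0, 0, 0, 0, -40, -160, -120, 0, 0]
print(segment_opening(example))  # (1, 5, 8, 11)
```

Multiplying index differences by the sample period gives durations, e.g. the total opening duration in the example is (11 - 1) × 0.25 s = 2.5 s.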

Accuracy of measurements
The accuracy of the calculated door blade position was estimated by manually measuring the opening angle of the RDC doors and comparing it to the value calculated by the post-processing algorithm.
The authors used an analogue goniometer on which the opening angle could be read with a precision of ±0.5°. No noticeable difference between observed and calculated values was found. Thus the accuracy of the calculated door opening angle is assumed to be within the ±0.5° range. Based on the observed door blade position accuracy, the minimum gyroscope accuracy can be estimated. From Equation (1) we expand to Equation (2), where $\tau$ is used to denote the internal sample times during an opening.
Assuming a door opening of 90° performed in 1.5 s, we would have 7 data points ($N = 6$) and $t_2 - t_1 = 1.5$ s. Knowing that the accumulated error is less than 0.5°, we can estimate the uncertainty of the gyroscope based on Equation (2). Hence, the gyroscope error over seven data points must be less than ±0.31 °/s. The gyroscopes report time with a temporal resolution of 1 ms. However, for individual door openings, the sample rate of the gyroscope limits the temporal accuracy when determining the duration, opening or closing times to ±0.25 s. The error within the accuracy range is assumed to be randomly distributed; thus, the mean value can be estimated with a higher precision.
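Since Equation (2) is not reproduced here, the stated bound can only be bracketed. The following rough check assumes a constant worst-case speed error $\varepsilon$ on every sample and a sample period of $\Delta\tau = 0.25$ s; both the symbol $\varepsilon$ and the two summation conventions shown are assumptions made for illustration.

```latex
|\Delta\omega| \le \sum_{i=0}^{N} \varepsilon\,\Delta\tau = (N+1)\,\varepsilon\,\Delta\tau \le 0.5^\circ
\;\Rightarrow\;
\varepsilon \le \frac{0.5^\circ}{7 \times 0.25\ \mathrm{s}} \approx 0.29\ ^\circ/\mathrm{s}
\qquad\text{or}\qquad
\varepsilon \le \frac{0.5^\circ}{t_2 - t_1} = \frac{0.5^\circ}{1.5\ \mathrm{s}} \approx 0.33\ ^\circ/\mathrm{s}
```

Both variants are of the same order as the ±0.31 °/s stated above; which convention underlies that figure cannot be determined from the text.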

Results
In this section, the results are presented in order of increasing level of detail. First, the occurrence of door openings for individual RDCs and the opening frequencies in general are presented. This is followed by a second analysis where the data is presented in histograms and box plots of opening duration, speed and angle. A total of 32,498 openings were recorded throughout the experiment period between 30 November and 30 December 2017. During this time, the store was closed, with no activity recorded, on 3, 10, 17 and 24–26 December.

Door opening occurrence
In Fig. 4, the spatial and temporal distribution of the door openings is visualised in a heat map. The areas of higher opening frequencies (yellow to red/black) can be seen to follow a periodic pattern in time, occurring around noon and in the afternoon. The spatial distribution of frequently opened RDCs is also visible as horizontal lines with more yellow and red, for example RDCs 58-62, which contain meat products that are bought more frequently. Since the store is closed on Sundays, these days are clearly visible as vertical white lines. The other white areas at the beginning and the end of the test period were a consequence of data loss due to a communication error in the data collection system.
With the high temporal resolution of the collected data, more detailed visualisations showing the actual occurrence and duration of door openings can be created, as exemplified in Fig. 5. Here the recorded data from 18:00 to 19:00 on 7 December is shown. The S-shaped darker area in the upper right corner visualises the stocking of goods: a staff member keeps each door open for a longer duration and progresses with restocking from RDC to RDC.
By reducing the spatial resolution from individual RDCs to groups of MT and LT RDCs, line graphs representing the different categories were created. In Fig. 6, the hourly mean opening frequency categorised by day and RDC temperature is shown. The analysis of the presented data showed no noticeable variations between different weekdays. As the results showed significant differences between weekdays and weekends, the authors included these as sub-categories in the graphs. Daily trends with intermediate peaks of similar magnitude around noon and in the afternoon can be seen during weekdays for MT and LT RDCs individually. During weekends, however, the MT RDCs have a significantly higher opening frequency around noon than during the afternoon, indicating that the shopping pattern differs depending on whether it is a weekday or not. The data also indicates significant differences in the magnitude of the hourly mean opening frequency between MT and LT RDCs: the MT RDCs mostly show opening frequencies of around or above three openings per hour, whereas the LT RDCs vary around one opening per hour, independently of whether it is a weekday or a weekend.
The hourly mean values give a general overview of customer interactions. To investigate individual variations in door opening frequency, the data set was statistically processed and box plots were generated in analogy with the data presented in Fig. 6. These box plots are found in Fig. 7. Here, the outliers are highlighted in red as they mark the highest recorded door opening frequency of any individual RDC, i.e. the design criteria for RDCs in this market. The span between the 25th and 75th percentiles, as visible in the box plots, indicates the number of door openings per hour that the majority of the RDCs are exposed to.
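As an illustration of this grouping, the sketch below computes an hourly mean opening frequency per RDC from a list of timestamped openings. The door-ID ranges and door counts are taken from the method description; the input format, the function names and the normalisation (openings per door, per hour, averaged over the days of each kind that appear in the data) are assumptions, since the exact definition used for Fig. 6 is not given here.

```python
from collections import defaultdict
from datetime import datetime

N_DOORS = {"MT": 62, "LT": 23}  # door counts from the text


def door_type(door_id):
    """MT doors carry IDs 1-62 and LT doors 65-87 (per the floor plan description)."""
    return "MT" if door_id <= 62 else "LT"


def hourly_mean_frequency(openings):
    """openings: iterable of (door_id, datetime) pairs.

    Returns {(rdc_type, day_kind, hour): mean openings per door per hour}.
    Days without any recorded opening are not counted (a simplification).
    """
    counts = defaultdict(int)
    days = defaultdict(set)
    for door_id, ts in openings:
        day_kind = "weekend" if ts.weekday() >= 5 else "weekday"
        counts[(door_type(door_id), day_kind, ts.hour)] += 1
        days[day_kind].add(ts.date())
    return {
        key: n / (N_DOORS[key[0]] * len(days[key[1]]))
        for key, n in counts.items()
    }
```

For example, if every MT door were opened exactly once between 12:00 and 13:00 on a single Monday, the entry for `("MT", "weekday", 12)` would be 1.0 openings per door per hour.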
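The box-plot statistics can be sketched as follows for a list of per-RDC opening frequencies within one hour. The quartile method (`inclusive`) and the 1.5 × IQR outlier rule are common conventions assumed here; the text does not state which conventions were used for Fig. 7.

```python
from statistics import quantiles


def box_stats(freqs_per_rdc):
    """Return (q1, median, q3, outliers) for openings-per-hour values of individual RDCs."""
    q1, med, q3 = quantiles(freqs_per_rdc, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed whisker limits
    outliers = [f for f in freqs_per_rdc if f < low or f > high]
    return q1, med, q3, outliers


print(box_stats([1, 2, 3, 4, 5, 20]))  # (2.25, 3.5, 4.75, [20])
```

In this reading, the box (q1 to q3) describes the load on the majority of RDCs, while the outliers correspond to the few heavily used cabinets.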
From the box-plot and the heat map, it can be concluded that the majority of the RDCs are opened at least once per hour during the opening hours. Only a few selected RDCs are exposed to substantially higher opening frequencies.

Door opening characteristics
In analogy with the analysis above, the door opening characteristics such as opening duration, speed, angle etc. have been categorised by MT and LT RDCs separately as they represent different types of RDCs.
The figures for door opening duration are based on all the 32,498 recorded openings where $t_1$ and $t_4$ were identified (see Fig. 3). The characterisation of the mean rotational speed and the opening angle is based on the 29,013 recorded door openings where $t_2$ and $t_3$ were also identified, i.e. 4,937 openings were excluded. In Fig. 8, the opening duration from $t_1$ to $t_4$ is shown for MT and LT RDCs as a histogram. It shows a skewed distribution with the peaks of the histograms for MT and LT RDCs at 5 and 6 s, respectively. The median opening duration for MT RDCs is 8.99 s, whereas that for LT RDCs is slightly lower at 8.26 s. The mean opening duration is affected by the service openings for stocking food, etc. and is 13.99 s for MT RDCs and 12.73 s for LT RDCs. However, longer openings are likely to have been excluded from the data set by the algorithm, as the noise of the gyroscope accumulates over time and causes severe errors for longer opening times.
In Fig. 9, two histograms representing the average opening angle, i.e. the mean value of the angle at which the door blade is kept between $t_2$ and $t_3$ (see Fig. 3), can be observed. From the figure, it can be seen that the MT RDCs' doors are likely to be opened significantly wider than those of LT RDCs, with a significant difference in both the opening angle distribution and the median. The distribution for MT RDCs is skewed with a peak at 78°, whereas that for LT RDCs shows a normal-like distribution with a peak and mean of similar magnitudes (approximately 49°).
In Fig. 10, the opening and closing times for MT and LT RDCs are shown in a histogram. These are defined as the time between $t_1$ and $t_2$ and between $t_3$ and $t_4$, respectively. A slight increase in closing time compared to opening time can be seen for the MT RDCs, while the opposite can be observed for the LT RDCs. However, the differences are smaller than the sampling interval of the gyroscopes and are therefore not to be considered significant. The mean opening time is 1.64 s for MT RDCs and 1.68 s for LT RDCs. The mean closing times are 1.81 s for MT RDCs and 1.67 s for LT RDCs.
The average rotational speed for the opening and closing is shown separately in Fig. 11. Here, a significant difference can be observed in the mean opening speed between the MT and LT RDCs' doors. Most likely, the lower speed of the LT RDCs' doors is a consequence of the heavier triple glazing used. The opening and closing is not a constant-speed rotation, but rather an accelerating rotation, as can be seen in Fig. 3, where a typical door opening is illustrated. The peak and valley, i.e. the maximum and minimum values of the recorded data over a door opening, represent the highest rotational speeds at which the door blade is moving. In Fig. 12

Discussion
For the RDC to keep the food at the required temperature, the heat extraction rate and the heat gains in the RDC must ultimately be balanced, i.e. heat gains and heat extraction must be equal.
As in any thermal process, the higher the supplied heating power the faster the temperature will increase. Hence, the higher the heat gains, the faster the food temperature within the RDC will rise.
Here, the heat gains through the envelope, by infiltration through gaps and by internal electrical components are relatively constant over the days as the indoor climate of modern supermarkets is often kept constant by HVAC systems and the weather impact is limited as supermarkets have few windows and controlled ventilation.
Variations in the frequency and characteristics of door openings are, therefore, the dominating source of variations in the heat extraction rate of the RDC, as they affect the amount of infiltrating air and thereby the heat gains.
Hence, an increase in heat gain does imply that the temperature increases faster and the refrigeration system must therefore cool the RDC more frequently during these times.
Extending this to a demand response perspective for the grid, this means that the possibilities to reduce the electrical power of the refrigeration system are limited to the time it takes for the temperature of the food goods to increase.
This means that the buffering capacity of the supermarket does vary with the heat gains during the day, and the variation in the buffering capacity is therefore strictly linked to the door openings too.
For the grid to fully benefit from the potential buffering capacity of the supermarkets, it is necessary to quantify the duration of time that they can be controlled, i.e. for how long the refrigeration system can be forced to be off or run on a limited capacity.
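As a rough illustration of this link, a lumped-capacity energy balance can be written down. The symbols $C$ (effective thermal capacity of the stored goods), $\Delta T_{allowed}$ (permitted temperature rise), $\dot{Q}_{gain}$ (total heat gain, including the contribution from door openings) and $\dot{Q}_{extract}$ (heat extraction rate) are assumptions introduced here, not quantities reported in this study.

```latex
C\,\frac{\mathrm{d}T}{\mathrm{d}t} = \dot{Q}_{gain}(t) - \dot{Q}_{extract}(t),
\qquad
t_{off} \approx \frac{C\,\Delta T_{allowed}}{\dot{Q}_{gain}}
\quad \text{for } \dot{Q}_{extract} = 0 \text{ and constant } \dot{Q}_{gain}
```

Under these simplifying assumptions, a period with more frequent or longer door openings raises $\dot{Q}_{gain}$ and shortens the time $t_{off}$ for which the refrigeration system could be switched off.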
To achieve this, further research focusing on forecasting and estimating the in-situ performance of RDCs by alternative measurements is necessary.
In the results of this study, it can be seen that the LT RDCs are opened more slowly, for a shorter duration and at a narrower angle than the MT RDCs. The slower opening and closing speeds are likely to be a consequence of the heavier doors, resulting in a greater inertia. The shorter opening duration and narrower opening angle might be a consequence of the thermal discomfort that an LT RDC causes the customer. By being fast and not opening the door fully, this discomfort could be minimised. Alternatively or additionally, this difference could also depend on the type of goods contained within the different RDCs. When selecting a product in the MT RDCs, the customer might check the best-before date and carefully inspect the fresh goods by touch, whereas in LT RDCs the best-before dates are longer and the inspection can only be visual as the products are frozen solid. There might also be other reasons or differences that this study did not capture.
By expanding this study to other supermarkets, knowledge on variations and representativeness could be gained. It is likely that there are local variations based on the type of RDCs and potentially also on the busyness of the store, which affects customer interactions. Another interesting aspect is seasonal variation, considering both annual climate variations and how growing seasons affect sales patterns and thereby determine which RDCs are visited most often.

Conclusion
Using a data-driven approach, this study has quantified and analysed the door opening frequency and characteristics within one operational supermarket in Germany. From the gathered data, it can be concluded that there are significant variations in door opening frequency both between weekdays and weekends and between the individual RDCs in the supermarket. The data set presented within this study contains 32,498 analysed door openings, which is significantly more comprehensive than previously published studies. However, it reflects a limited period of time in a single market only. Hence, to be able to make general statements on customer interactions with RDCs or challenge the standard design criteria of RDCs, this study should be repeated in other locations.
The observed variation in opening frequency between individual RDCs implies that the design criteria of the RDCs vary within the supermarket. Additionally, the collected data contains information about the characteristics of how the doors are opened, giving valuable insights for designing new RDCs that are adapted to customers and more energy efficient.