Developing home occupant archetypes: First results of mixed-methods study to understand occupant comfort behaviours and energy use in homes

Abstract To better understand home energy consumption, it is important to study the behaviours of occupants in their homes, especially in relation to their comfort needs. A mixed-methods study comprising a questionnaire, interviews, monitoring of indoor environmental parameters, and energy consumption readings was performed to group home occupants based on their behavioural patterns. A TwoStep cluster analysis of the data from 761 questionnaire respondents produced five clusters of home occupants. The clustering model comprised 28 variables, including constructs of emotions, comfort affordances, and locus of control. Then, in-depth semi-structured interviews, indoor environmental quality (IEQ) monitoring, and energy readings were carried out with 15 of the questionnaire respondents. The results of the field study were used to substantiate the findings of the questionnaire. The combination of the statistical clusters with the data from the field study resulted in five archetypes: five distinct types of home occupants, differing in their behavioural motivations towards achieving comfort, and their use of energy when doing so. This study shows that a mixed-methods approach is valuable for better understanding energy consumption and implementing archetype-customized lines of action to reduce energy use and maintain comfort.


Introduction
Understanding the behavioural patterns of occupants in their homes, where they spend over 60% of their time [1], seems to be essential to achieve reductions in energy consumption. This is because the actual energy consumption of dwellings is related not only to the building (technologies and performance), but also to the occupant (behaviours, lifestyle). These behavioural patterns need to be investigated from an occupant-centered perspective that covers comfort needs, satisfaction, perception, behaviour, physiology, and culture [2][3][4][5][6][7]. To ensure a reduction in energy consumption in the residential sector, both components (building and occupant) need to be treated as an interacting system. Currently, there is a lack of knowledge regarding occupants' behaviours in their homes, how they use energy, and what their psycho-behavioural motivators are when using energy. This could be because, traditionally in the indoor environmental quality (IEQ) field, these components have been researched independently from one another and unequally in terms of the number of studies.
As an example, between 1997 and 2015, only 13% of articles in energy research used qualitative methods. Contrarily, energy engineering (quantitative research) received 35 times more funding than behavioural and energy demand research (qualitative research) [2,8].
In the last decades, trends suggest that research on the human dimension of energy use has increased [9], but they also show that interdisciplinarity is still uncommon. An example of qualitative methods in energy research is an investigation about owners' reasons to undertake home improvements, finding that their motivations were linked to the meaning of homes as a place for comfort and family life rather than as one for future investments [10]. Similarly, user-centered methods were used to explore the behaviours and attitudes of owners towards home improvements, where five archetypes were developed based on interviews, claiming that the value of such an approach for tackling technical challenges is to enable the development of tailor-made strategies to suit each archetype to improve retrofit policies [11]. Another study integrated the influential factors in domestic energy-saving behaviours in France, by using a survey that combined data from building and user characteristics. It showed a way in which energy behaviours can be included in the design of energy policies to encourage energy savings [12]. Mixed methods were also used in a study aimed at understanding how occupants create and maintain thermal comfort at home: environmental variables were recorded, occupants answered a survey about how they had achieved comfort, and they were interviewed about why and when such thermal comfort actions were performed [13]. Previous studies have already demonstrated that different behavioural patterns among occupiers lead to energy consumption discrepancies. A study from 2018 used principal component analysis to identify the behavioural patterns of Greek home occupants based on a questionnaire assessing building characteristics, occupant behaviour, and socio-demographic variables, in which they found six patterns [14].
Similarly, in the same year, University of Cambridge researchers [15] used a questionnaire and factor analysis to find five profiles based on the occupants' use of space heating. A Dutch survey found four lighting behavioural profiles that vary in their impact on consumption, household, and building characteristics [16]. A different approach was used in an Italian study that employed simulation and prediction; the results proposed that occupant behaviours can be classified into three types of lifestyle that impact energy consumption in relation to thermal, ventilation, water, and lighting behaviours [17]. In Wales, a study segmented survey respondents based on their values, perceptions, and self-reported behaviours with regard to energy, and six occupant segments were identified [18]. Finally, a study in the Netherlands categorized home occupants based on heating behaviours and found five types of behavioural patterns [19]. Further studies performed between the 1980s and the early 2010s that intended to categorize types of occupants have generally used statistical approaches, such as principal component analysis, discriminant analysis, cluster analysis, correlation analysis, exploratory factor analysis, or factor analysis [20][21][22].
In addition, other studies have suggested that different types of occupants influence the energy use of their residences differently; therefore, there is a need to better understand these behavioural differences, in addition to taking into account age, lifestyle and number of occupants [23]. One objective of finding patterns is to have more accurate performance predictions [24][25][26]. This is supported by D'Oca, Fabi [27], who found that probabilistic profiles can help strengthen energy models. A reason for this is provided by a study suggesting that out of an average of 27 factors influencing space-heating behaviours, only a few tend to be considered in building performance simulations [28]. A similar conclusion was reached in a study researching adaptive occupant behaviours by sorting them into three categories: observation, modelling, and simulation. It was concluded that with the appropriate variables, effects of behaviour on energy performance can be reduced [29]. In low-energy houses, it was found that occupants tend to feel more aware of energy and water consumption, especially due to the feedback, and this awareness triggered behavioural changes [30]. Taking into account the aspects mentioned above, behaviours add considerable weight to the energy use and performance of buildings, and are estimated to affect residential energy use by a factor of 3 to 10 [31][32][33].
Consequently, the results of the current literature in the energy engineering and the IEQ fields suggest that there is still a clear need for 1) better understanding human behaviour in terms of energy use; 2) better interdisciplinary collaboration between the engineering and behavioural fields; and 3) better understanding the occupant component in the development and operation of buildings and their features.
This study goes beyond the statistical clustering of questionnaire respondents by incorporating qualitative data and building features data into the results. More specifically, this study is a development of the questionnaire performed by Ortiz and Bluyssen [34]. In that proof-of-concept, six archetypes were found by using a specialized questionnaire and the TwoStep cluster analysis. It was concluded that the use of the TwoStep technique is fitting for the variables used, as the questionnaire included categorical and continuous variables [35]. The authors suggested that substantiation of the archetypes was needed with the use of qualitative methods. Combining the results of the cluster analysis with those of qualitative data can strengthen the clusters into "archetypes" [36].
Therefore, the aim of the present study is to strengthen the statistical clusters, in order to formulate archetypes by substantiating the clusters with the mixed-methods data collected from the field study (interviews, IEQ factors, energy readings, and building features).

Study design
The study comprised two parts. Fig. 1 shows that in the first part of the study, a specialized questionnaire was administered to a sample of home occupants in the Netherlands and France. The second part was a field study in which qualitative data were collected by interviewing participants, and building data were gathered with a building characteristics checklist, by monitoring indoor environmental parameters (temperature, humidity, and CO2), and by taking energy readings.
The quantitative part involved a previously developed questionnaire [34], while the field study was divided into qualitative methods (interviews) and quantitative methods (IEQ monitoring, energy readings, checklist). The value of having a mixed-methods approach is that it provides a holistic perspective of the concept of comfort for each of the archetypes. Knowledge is gained not only about what are "comfortable" conditions for the participant (environmental monitoring), but also about the extra dimensions of comfort for the archetype, how they are achieved, and which actions or strategies are exercised to achieve them. Ethical approval from the Ethics Committee of the TU Delft was granted to distribute the questionnaire and to perform the field study.

Questionnaire (quantitative data)
Volunteers were drawn from four sources and were invited to take part in the questionnaire. The first and second sources included students from the Delft University of Technology in The Netherlands: 218 master students and 316 bachelor students respectively. The third source was from 1000 employees of the same university, and the fourth from 452 employees of Saint Gobain Recherche in France. The objective was to obtain a sample of a variety of young adulthood and middle adulthood participants that would be representative of diverse home and occupancy types (renters, owners, family homes, student homes, studios). The invitation process started by notifying the potential participant about the purpose of the study one week before they would receive an email with a link to the questionnaire. Participation was voluntary. Participants were given two weeks to fill it out. The first page of the questionnaire introduced the respondent to a consent form detailing time to fill it out (about 30 min), closing date, possibility of non-answers, and confidentiality and anonymity measures. Participants from the first and second sources received credit-points when answering the questionnaire. The administration of the questionnaire spanned from October 2016 to October 2017, depending on the source.
The questionnaire was created on the Qualtrics online platform and was developed based on a literature review and already-validated questionnaires that were adapted to the contexts of comfort-making behaviours in the home environment [2,34]. Comfort-making behaviours are described as behavioural expressions that the occupant exercises to achieve a state of physical, physiological, or psychological homeostasis; thus bringing one's current state into a neutral one.
Table 1 Definitions of behavioural constructs included in the questionnaire.
The constructs assessed in the questionnaire were based on and adapted from the Theory of Planned Behaviour [36]. These were locus of control (beliefs), emotions towards the home, attitudes towards energy, and comfort affordances (needs). Table 1 shows the definitions of each of the constructs.
A first version of the questionnaire was sent to a panel of reviewers for input on content validity, language use, and layout, and was pilot-tested with twenty individuals (excluded from the final sample) to point out typing or language errors, language clarity, contingency and skipped questions, and time to fill out. The questionnaire was revised accordingly. Simultaneously, Dutch and French translations were made and submitted to reviewers. The final instrument consisted of 65 questions assessing seven categories (demographic and building information, locus of control, emotions towards home environment, comfort affordances, attitudes towards energy, energy-consuming habits, and health and sick building syndrome) [34]. Answers to the questions were presented either dichotomously or on a 5-point Likert scale.

Field study (mixed data)
The field study involved qualitative and quantitative data collection. Participants were recruited by emailing the questionnaire respondents who had shown interest in a follow-up to the questionnaire. Of the 761 questionnaire respondents, 212 provided their email address. Invitation emails were sent to them to participate in the field study, and 15 people volunteered.

Qualitative field study: interviews
The qualitative part involved in-depth, semi-structured interviews that were conducted in June and July 2018. Interviews were recorded with a Tascam DR-05 V2 digital audio recorder with the consent of participants. The interviews had three parts: background of the participant, comfort perceptions, and energy consumption habits. Generally, fifteen questions were asked. The main topic was "comfort perceptions"; with a focus on actions performed to achieve comfort or on the building characteristics that allowed achieving comfort. Then those practices were related to the use of energy. During the interview, while a participant explained a practice, the place where the practice is done was shown to visualize their actions and experiences. The interviews of this study are a tool that elicits "technical and process knowledge": explicit knowledge that is readily expressed by participants through what they think and say about a certain topic or from frequently done and repeated patterns of actions and routines [41].

Quantitative field study: IEQ monitoring, building features, and energy readings
Measurements were taken of carbon dioxide (CO2), air temperature, and relative humidity (RH). Two types of devices were used: iButton® and HOBO® MX1102 data loggers. For every interviewee, three iButtons were placed in the three locations where the participant reported spending most of their time while at home, here referred to as "preferred locations". Measurements were taken for a week at a data acquisition interval of 5 min. The HOBO loggers recorded CO2, air temperature, and RH and were placed in the area where the person spent most of their time. The HOBOs measured for at least 24 h.
The actual energy use was determined by reading the gas and electricity meters on the day of the interview and a month later for a second reading. In case night fees were displayed, both readings were recorded. If the person had a smartphone energy monitoring app, they emailed the data to the researchers. When no energy meter was present due to the social housing company, energy bills were requested. If the person was living in a shared accommodation, an estimation was made by dividing the reading by the number of occupants. If the person only had the bills without a breakdown of the consumption, estimations were made based on the gas and electricity fees of their energy supplier.
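The estimation for shared accommodation can be illustrated with a minimal sketch. It assumes that monthly use is the difference between the two meter readings, that this difference is split equally among occupants, and that separately read day and night registers are simply summed; the function names and the readings are hypothetical and are not data from the study.

```python
def monthly_use_per_occupant(first_reading, second_reading, occupants=1):
    """Use between two meter readings, shared equally among occupants."""
    if occupants < 1:
        raise ValueError("occupants must be at least 1")
    if second_reading < first_reading:
        raise ValueError("second reading is lower than the first reading")
    return (second_reading - first_reading) / occupants


def combine_registers(day_use, night_use):
    """Total electricity use when day and night registers are read separately
    (summing the two registers is an assumption of this sketch)."""
    return day_use + night_use


# Hypothetical electricity readings in kWh for a home shared by three people.
day_use = monthly_use_per_occupant(12450.0, 12570.0, occupants=3)
night_use = monthly_use_per_occupant(8300.0, 8390.0, occupants=3)
total_use = combine_registers(day_use, night_use)
print(f"Estimated electricity use per occupant: {total_use:.1f} kWh")
```

An equal split ignores differences in individual habits within a household, which is one reason such values are best read as estimates.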
A checklist was filled out in every home, inventorying building characteristics that play a role in the energy consumption during winter and summer (type of home, orientation, construction year, number of rooms, energy label, heating system and terminal units, roof type, general winter temperature, heating season schedules, number of doors to the outside and type of door, percentage of glazing and type, number of windows usually open, solar shading, off-grid power generation, lighting type and appliance usage, and main ventilation strategy).

Questionnaire: clustering and model validation
Data from the four questionnaire sources were merged into a master dataset. TwoStep Cluster analysis was performed using SPSS 24.0. Advantages of the method are that data handling is minimal and that it allows analysing data pertaining to demographics, health, psychographics, and behaviours [34,35]. The procedure unfolds as follows: first, the analysis is run multiple times with different numbers of clusters, from 2 to 18; for each run, the ratios of between- and within-cluster variance of the variables are examined, where higher ratios imply better cluster separation. A 5-cluster model was chosen for further inspection as it showed the highest ratio. Next, the chosen model was validated. Validation is done to evaluate whether the final clusters are influenced by the method or the population chosen, and to protect against variables being randomly selected. The validation is a four-step process as proposed by Norusis, performed as follows: a) ensure that the silhouette measure of cohesion is above 0.0 (in this case 0.2); b) perform chi-square (χ²) tests and t-tests to ensure statistical significance of the behavioural constructs, running the tests and removing the behavioural constructs that are not consistent separators; c) remove variables with a prediction score lower than 0.02; and d) halve the sample randomly and apply the final model to each half, ensuring that the results are similar. After the four-step validation was successful, the initial 65 variables of the questionnaire pertaining to behavioural constructs were reduced to 28 variables making up the final model of five distinct occupant clusters.
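One possible reading of the "ratio of between- and within-cluster variance" for a single continuous variable is the one-way ANOVA F ratio. The sketch below assumes that reading; it is not the SPSS TwoStep procedure itself, and the function name and the example scores are hypothetical.

```python
from statistics import fmean


def variance_ratio(values, labels):
    """Between-cluster mean square divided by within-cluster mean square
    for one variable (the one-way ANOVA F ratio)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two clusters and more cases than clusters")
    grand_mean = fmean(values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        raise ValueError("within-cluster variance is zero")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical scores for six respondents under two candidate cluster solutions.
scores = [1, 2, 3, 7, 8, 9]
well_separated = variance_ratio(scores, [0, 0, 0, 1, 1, 1])
poorly_separated = variance_ratio(scores, [0, 1, 0, 1, 0, 1])
print(well_separated > poorly_separated)
```

Under this reading, a candidate solution whose variables show larger ratios separates the clusters more clearly, which matches the selection logic described above.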
Further chi-square (χ²) analyses were used to test distribution differences between clusters in personal and building variables (gender, age, country, educational level, building type, tenure type, type of cohabitants, number of cohabitants, tenure, time of residence, size in square meters, number of rooms, diseases in the last twelve months, and source of subject). Descriptive statistics of each cluster were also produced, as frequencies, percentages, maximums and minimums, means and standard deviations, in order to produce a more complete picture of the final archetypes.
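As an illustration of the chi-square test of distribution differences, the following sketch computes the Pearson chi-square statistic and its degrees of freedom for a cluster-by-category contingency table. The table values are hypothetical, and the p-value lookup is left out because it requires the chi-square distribution, which is available in statistical packages.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("rows and columns must have non-zero totals")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are two clusters, columns are renters and owners.
statistic, dof = chi_square([[10, 20], [20, 10]])
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
```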

Interviews: text mining
Interviews were analysed quantitatively by using a text mining method: sentiment analysis. Preparing the data for text mining first required transcribing the interviews. Then, a spreadsheet was created with one question per row and the transcription of each respondent per column. The spreadsheets were divided by cluster, to analyse the answers per cluster. Each cluster had an answer spreadsheet that was imported into SPSS Text Analytics for Surveys 4 for analysis.
Text mining is an analysis method that extracts meaningful information from large amounts of data from open-ended responses. It does so by identifying themes and analysing words in the texts to find patterns. Text mining analyses the answers by treating subjectivity and sentiment in a quantitative manner. Three outputs result from the analysis. First, the software's linguistic resources extract words and their synonyms that the engine considers important for the analysis; these words are referred to as 'concepts'. Second, during the extraction of concepts, the semantically similar concepts are grouped into 'types'. Third, 'concept patterns' are produced; these are the combination of a single concept with a type. Combining concepts with types is a way to understand the sentiment of the respondents towards a certain topic [42,43]. For details of output, refer to Appendix 1.

IEQ, building features, and energy readings: statistical analysis
Questionnaire, IEQ monitoring, and energy data were tested for normality with the Kolmogorov-Smirnov and the Shapiro-Wilk tests. Data from the iButtons and the HOBOs were downloaded as Excel files and imported into SPSS Statistics. Files from both sources were individually checked to ensure that no extraneous readings had occurred, e.g. from direct sunlight on sensors. The checklist data were transferred from the paper forms to SPSS. The results of the checklist presented here only deal with summer-related energy-consuming variables. Finally, the results of the field study were studied per cluster, and they were compared and related to the results of the TwoStep analysis.

General results
Of the 1986 invitations, 969 people responded to the questionnaire, of which 761 completed it, representing a response rate of 48.7% and a completion rate of 78.5%. Table 2 shows the distribution of the four sources of respondents.
The sample consisted of 52.6% men and 47.4% women. The most common level of education was a completed master's degree (38.2%), followed by completed primary or secondary school (30.0%). The main building type among the sample was the row house (29.3%), followed by apartments (24.8%) and semi-detached houses (16.6%). 50% of participants reported living with housemates and 23.4% with family members. 80% were renters; the sample therefore does not represent the tenure ratio of the Dutch housing stock, in which the share of rented dwellings is over 40% [44].
28% of respondents provided their email address and were invited to the field study. Of those 212 invitations, fifteen participated in the field study. The recruitment process for the field study required special selection as it was intended to have at least one representative of each cluster in the field study. For the descriptives of the statistics, refer to Appendix 2.
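The rates reported in this section follow from the counts given in the text. The short sketch below recomputes them; the variable names are hypothetical, and the last decimal of a recomputed value can differ from the reported one depending on rounding.

```python
# Invitations per source, as stated in the questionnaire section.
invited = {
    "master students": 218,
    "bachelor students": 316,
    "university employees": 1000,
    "Saint Gobain Recherche employees": 452,
}
total_invited = sum(invited.values())  # 1986 invitations in total

responded = 969      # people who responded to the questionnaire
completed = 761      # people who completed it
gave_address = 212   # respondents who provided an address for the field study

response_rate = responded / total_invited
completion_rate = completed / responded
address_share = gave_address / completed

print(f"Response rate:   {response_rate:.1%}")
print(f"Completion rate: {completion_rate:.1%}")
print(f"Address share:   {address_share:.1%}")
```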

Cluster results
The questionnaire data was tested for normality with the Kolmogorov-Smirnov and the Shapiro-Wilk tests, and no violations were found. Table 3 shows the five clusters identified by the TwoStep analysis and the 28 behaviour-related variables composing the model.
The final model comprised variables from three constructs: emotions towards home (negative and positive), comfort affordances, and locus of control (internal and external).

Interview text mining
The text mining analysis was performed per cluster and per question; however, as some of the questions belonged to the same subthemes, their results were merged into categories. The categories are "energy awareness and motivations of usage"; "general comfort and perfect home"; "sense of control"; and "affordances". Affordances are individually presented as freedom, temperature, smells, lights, acoustics, privacy, cleanliness, and security. Table 5 shows the percentage of positive sentiments per archetype and per question and the means for each category. Positive 'types' produced by the text mining are grouped together. From the table it can be seen that the Incautious Realists (Archetype 2) have the most positive opinions about energy awareness and usage, while the Positive Savers (Archetype 3) have the most negative ones. The Vulnerable Pessimists (Archetype 5) have equally positive and negative opinions about energy awareness and usage. For "general comfort and future home", the Restrained Conventionals, Sensitive Wasters, and Vulnerable Pessimists (Archetypes 1, 4, and 5) did not express negative opinions, while Archetypes 2 and 3 expressed only 33% and 25% negative opinions respectively, specifically in terms of "air", "ceiling lamps", and "freedom".
Looking at the means, the results imply that the Positive Savers (Archetype 3) expressed the most positive opinions for affordances, with 93%. The most negative opinions expressed for this topic came from the Sensitive Wasters (Archetype 4), with 49%. For "Psycho-behavioural", the Positive Savers (Archetype 3) expressed the most negative opinions, with 67%, while 78% of the opinions about "Psycho-behavioural" expressed by the Restrained Conventionals (Archetype 1) were positive. Across the full interview, all archetypes expressed between 63% and 65% positive opinions, except for the Incautious Realists (Archetype 2), for whom almost 82% of the opinions expressed in the entire interview were positive.
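The percentages and category means discussed here can be illustrated with a minimal sketch. It assumes that each extracted opinion is counted as either positive or negative and that a category mean is the unweighted mean of its per-question percentages; both are assumptions of this example, and the counts are hypothetical.

```python
def percent_positive(positive, negative):
    """Share of positive opinions among all counted opinions, in percent."""
    total = positive + negative
    if total == 0:
        raise ValueError("no opinions were counted")
    return 100.0 * positive / total


def category_mean(percentages):
    """Unweighted mean of per-question percentages within one category."""
    if not percentages:
        raise ValueError("category has no questions")
    return sum(percentages) / len(percentages)


# Hypothetical (positive, negative) counts for three questions in one category.
counts = [(3, 1), (4, 0), (2, 2)]
per_question = [percent_positive(p, n) for p, n in counts]
print(per_question, category_mean(per_question))
```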
The detailed results of the text mining analysis are presented in Appendix 1, following the output format of SPSS Text Analytics [42,43].

IEQ and energy readings
The field study data were also tested for normality with the Kolmogorov-Smirnov and the Shapiro-Wilk tests and, due to the small sample size, were found not to be normally distributed. Descriptive statistics were produced for the energy readings and IEQ monitoring data per archetype. Table 6 presents the electricity and gas readings during a month in the summer of 2018. The results suggest that there is a large variation in gas and electricity use. Because of the low number of participants (fifteen), a statistical comparison of means was not considered appropriate. It is worth mentioning that in the Netherlands, the average gas and electricity consumption per person per month is 54 m³ and 150 kWh respectively [45]. Treating the archetypes as case studies, they can be ranked from least wasting to most wasting as 3, 1, 5, 2, and 4.
Mann-Whitney and Kruskal-Wallis tests were performed to check whether statistically significant differences exist in measured temperatures between profiles. However, as mentioned above, due to the small number of participants, such analysis is inconclusive. Nevertheless, based on the means presented in Table 7, it can be suggested that the Restrained Conventionals (Archetype 1) have lower temperatures, while the Incautious Realists (Archetype 2) have the highest temperatures.

Table 4
Personal and building characteristics with statistically significant differences between clusters and their p-value per archetype.

Table 8 shows the results of the HOBOs as medians and quartiles of CO2 and RH taken during 24 h in the location where the participant spends most of their time. Statistical analyses were deemed unnecessary due to the small sample. However, it can be seen that the Positive Savers (Archetype 3) present the lowest concentrations of CO2 (447 ppm) while the Vulnerable Pessimists (Archetype 5) have the highest ones (746 ppm). Concerning RH, the Incautious Realists (Archetype 2) have the lowest measurements (53%) while the highest ones belong to the Restrained Conventionals (Archetype 1) with 59%. All CO2 and RH results are within the regular levels.
Table 9 shows the descriptive statistics of the building checklist. The groups seem to differ considerably in certain aspects, for example the number of showers taken per week and their duration, ranging from 5.5 to 9.3 showers a week and between 9.3 and 22.5 min per shower. More differences exist for behavioural aspects, such as the amount of time windows are open during the summer. None of the participants had air conditioning in their homes.
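Medians and quartiles such as those reported for the HOBO data can be obtained with a few lines of code. The sketch below uses Python's `statistics.quantiles` with the inclusive method, which may differ slightly from the quartile definition used by SPSS; the CO2 readings are hypothetical and are not data from the study.

```python
from statistics import quantiles


def median_and_quartiles(readings):
    """Return (first quartile, median, third quartile) of a series of readings."""
    q1, q2, q3 = quantiles(readings, n=4, method="inclusive")
    return q1, q2, q3


# Hypothetical CO2 readings in ppm from one logger.
co2_ppm = [480, 400, 520, 420, 450]
q1, median, q3 = median_and_quartiles(co2_ppm)
print(f"CO2 median {median:.0f} ppm (quartiles {q1:.0f} to {q3:.0f} ppm)")
```

Medians and quartiles are a reasonable summary here because the field data were not normally distributed and the sample was small.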

Final archetype descriptions
Based on the questionnaire results, the variables comprising the model, the text mining outcomes, and the energy readings, the following archetypes are presented and labelled as follows: Restrained Conventionals, Incautious Realists, Positive Savers, Sensitive Wasters, and Vulnerable Pessimists. The names of the archetypes are based on their most extreme features shown by the descriptives from the variables of the questionnaire and the energy readings. The labelling was done as follows: if an archetype has the highest or lowest score for a certain variable, the variable attribute is used to label them. If two archetypes have the same variable as their highest one, the archetype that had the highest score is labelled with the variable attribute, and the second highest variable is used for the other archetype. Fig. 2 shows the relative values per archetype.

Restrained Conventionals (archetype 1)
The Restrained Conventionals (RCs) are the largest archetype, representing 29.4% of the sample, and are the youngest group (mean age: 25.4 years). RCs generally reported higher-than-average negative emotions and low positive emotions, while having high external and low internal control. In interviews, RCs expressed positive opinions about energy motivations, comfort, and sense of control, but a general ambivalence about affordances. They are the second lowest energy consumers, as 50% of them reported using the dryer for 10-50% of their laundry, and the other half do not own one. They reported the second smallest weekly number of showers (8.3), but they spend the second longest time showering (15 min). They had the third highest concentrations of CO2, yet 100% claimed to open the windows "all day and all night" during the summer. It is worth mentioning that Interviewee 2 from this archetype did not occupy the house while the CO2 measurements were taken.

Incautious Realists (archetype 2)
The Incautious Realists (IRs) are the second largest cluster (22.3%) and have a mean age of 27.3 years (SD: 9.3). 66% of IRs live with housemates and only 10% live alone. This is the second largest renter group (85% renters). IRs have the highest rating of negative emotions, while having low positive emotions. They score lowest in internal locus of control and higher than average in external control. They expressed relatively positive opinions about their general affordances and psycho-behavioural topics. They are the second largest wasters according to the energy readings, which is in line with their having the longest showers (22.5 min). Yet they take the second smallest weekly number of showers (6.5). 50% dry their laundry in the dryer and 50% do not have one. They have the lowest concentrations of CO2, which relates to all of them having a permanent exhaust.

Positive Savers (archetype 3)
The Positive Savers (PSs) are the third largest cluster (18.0%) and the oldest (mean age: 33.9 years). 38.1% live with family members, and they have the second largest share of people living alone (19.0%). PSs show the second highest ratings in positive emotions and the lowest in negative emotions. They have the lowest scores in external control and the second highest scores in internal control. PSs expressed very positive opinions about affordances and slightly negative ones about comfort and energy. According to the energy readings, they are the biggest savers, supported by the fact that 50% of them do not own a dryer and the rest use it for 75% of their laundry. They report the smallest weekly number of showers (5.5) and the second shortest showers (10.0 min). They have the lowest CO2 concentrations, yet this is not reflected in the reported window-opening behaviours or exhaust features. This is also influenced by Interviewee 8, who spent the day and night away during the CO2 recordings.

Sensitive Wasters (archetype 4)
The Sensitive Wasters (SWs) are the smallest group (14.8%) and have the second oldest mean age of 32.8 years (SD: 12.5). 32% of SWs live alone, the highest share of all groups, while being the third largest home-owning cluster (22.8%). They scored the highest in positive emotions and the second lowest in negative emotions. They have the highest internal control scores and the second lowest external control scores. SWs expressed positive opinions about comfort and control of the environment but negative ones about energy awareness, while half of their opinions about affordances were positive. They are the highest consumers, reflected in the fact that some of them have more than one fridge, and 66.7% claim to dry 75%-100% of their laundry in the dryer. The second highest CO2 concentrations were registered for this group, in line with the report that 33.3% never open the windows during the summertime; however, 66.7% claim to have ventilation grilles constantly open.

Vulnerable Pessimists (archetype 5)
The Vulnerable Pessimists (VPs) are the second youngest group (mean age: 26.1 years; SD: 8.5). They represent the second largest group living with housemates (57.4%) and the largest group of renters (89.2%). They score lowest in positive emotions and second highest in negative emotions, while having the highest external control scores and the second lowest internal control scores. They expressed ambivalence about energy awareness, control of the environment, and affordances, but positive sentiments about general comfort. They are the third largest wasters according to the energy readings, and 50% dry 50%-75% of their laundry in the dryer. The highest CO2 concentrations were recorded for this group, which relates to their report of never opening grilles. However, 50% do open one window all day and all night in the summer, and 66.7% have a permanent extractor.

Discussion
In this study, which used qualitative and quantitative techniques, five occupant archetypes were produced based on the answers of 761 participants and fifteen interviewees. The basis of these archetypes was the set of responses to the specialized questionnaire on behavioural constructs, namely emotions, control, and needs, from which statistical clusters were produced using the strongest separating variables. In a previous study involving the same questionnaire but only 193 respondents, the TwoStep cluster analysis produced six clusters [34]. The model of that study differed not only in having one more cluster, but also in that its segmentation variables included attitudinal variables. In this study, attitude variables were not strong enough separators to be part of the model. Compared with the current model, in general, the last three archetypes remained the same, while Archetype 1 merged with Archetype 3. However, the previous model was not as reliable as the present one, because its low number of respondents (193) was less appropriate for the clustering technique.
The goal of archetypal data is to allow the customization of technologies that will improve the health and comfort of each archetype while reducing energy consumption. The archetypes are discussed below with emphasis on the differences between their energy use and energy attitudes, and on their stress-related factors (emotions and control). Understanding the archetypes through these lenses can give insights into what sort of interventions or lines of action could be implemented in their homes to help reduce their energy use and increase comfort. The Incautious Realists exemplify a group that should be treated with higher priority. This is because it is the second largest group, and its members report the lowest internal control, higher rates of negative emotions, higher wasting patterns, neglectfulness of comfort affordances, and the highest frequency of health issues. This is consistent with the results of studies that propose interactions between locus of control, stress levels, and levels of illness, specifically the links found between stress and the prevalence of cardiovascular disease, allergies, or healing time [46][47][48][49][50][51]. In addition, this group shows what is known as the attitude-behaviour gap, as its members express positive awareness about energy, yet they are relatively high wasters [52]. At the other end of the spectrum, the Sensitive Wasters represent the second healthiest group, with the highest internal control and positive emotions scores. However, their non-conserving actions are well aligned with their negative views towards energy, which is coupled with their need for comfort and affordances. This high consumption and need for comfort is reflected in studies showing that northern European societies are comfort-oriented energy cultures: they tend to choose to live a comfortable life regardless of the energy needed [53].
The Positive Savers have a conservative consumption accompanied by seemingly non-green awareness. The literature suggests that such incongruence tends to be the result of financial consciousness rather than energy conservation [54,55]. The Restrained Conventionals possess 'green' beliefs which are in line with their low-wasting energy readings. This attitude-behaviour congruency has been proposed to be characteristic of single-occupant homes [56,57]; however, this is not reflected in this archetype, as only 13% live alone. They present high negative emotions and low internal control, which may be an indicator of higher stress levels [58]. Finally, the Vulnerable Pessimists are similar to the previous archetype in that they also show an alignment between their energy awareness and their energy consumption, and they present risk factors for high stress and hence for poor health and general wellbeing. Such differences show, to a degree, how each archetype requires different lines of action to achieve comfort, health, and a reduction in energy expenditure. One example is to develop solutions that accommodate high external control (the belief that the person cannot change the environment), for instance through automation, while offering an indoor environment that ensures comfort and health at all times. Another example could be offering solutions that support high control of the environment while taking into account high sensitivity to affordances. This could be an interface that allows different aspects of the environment to be controlled, while also showing how the changes influence comfort. For the archetypes whose energy consumption seems higher than their green beliefs would suggest, interfaces showing costs and use could be useful. These interventions should operate in such a way that the behaviours specific to the archetypes do not bypass the energy efficiency of the technologies.
Such concepts need further research with mixed-methods studies and co-creation techniques.
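As an illustration only, the congruence between energy attitudes and energy consumption discussed above can be tabulated in a few lines of Python. The labels are a coarse hand-coding of the qualitative statements in this Discussion, the mapping from attitude to expected consumption is a simplifying assumption, and the names `ARCHETYPES`, `EXPECTED`, and `is_congruent` are hypothetical. None of this is part of the analysis performed in the study.

```python
# Hand-coded, coarse labels taken from the qualitative statements in the
# Discussion; they are an illustrative assumption, not measured values.
ARCHETYPES = {
    "Restrained Conventionals": ("positive", "low"),
    "Incautious Realists": ("positive", "high"),
    "Positive Savers": ("negative", "low"),
    "Sensitive Wasters": ("negative", "high"),
    "Vulnerable Pessimists": ("ambivalent", "medium"),
}

# Assumed mapping: the consumption level one would expect if behaviour
# simply followed the stated energy attitude.
EXPECTED = {"positive": "low", "negative": "high", "ambivalent": "medium"}


def is_congruent(attitude, consumption):
    """Return True when consumption matches what the attitude would predict."""
    return EXPECTED[attitude] == consumption


# Archetypes whose consumption does not follow their stated attitude.
incongruent = [name for name, (att, use) in ARCHETYPES.items()
               if not is_congruent(att, use)]

for name, (att, use) in ARCHETYPES.items():
    status = "congruent" if is_congruent(att, use) else "incongruent"
    print(f"{name}: attitude={att}, consumption={use} -> {status}")
```

Under this coding, the Incautious Realists and the Positive Savers are the two incongruent cases, which corresponds to the attitude-behaviour gap and to the financially motivated saving described above.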

Producing occupant archetypes based on behavioural constructs with mixed methods is valuable, as it enables a better understanding of the occupant dimension of energy use. Although the archetypes presented in this study are not yet complete, they can shed light on the occupants' mental models, especially in terms of their comfort behaviours.
In the interviews, technical and process knowledge data were collected. This is knowledge that is verbally transmitted and is easily retrieved because it is explicit. Different techniques exist to analyse interview data, mainly qualitative ones (e.g. content analysis, coding, and recursive analysis). In this study, a type of text mining was used: sentiment analysis. There are two reasons for using it: as a quantitative technique, it introduces objectivity to the outcome, and it is designed to find emotions expressed by participants, which is an objective of this study. Due to the small sample, the quantitative data of the field study (IEQ monitoring and energy readings) cannot be generalized as part of the archetypes and should rather be regarded as case studies. The small sample of the field study can nonetheless be valuable, as personal data are rarely utilized in the energy research field. Still, the current sample is not representative of the home occupants of the Netherlands, as a large part of it comprises university students and Dutch and French employees. This therefore needs to be considered as an influencing factor of the archetypes, since such a population may introduce bias to the outcomes.
The survey involved only self-reported data, while the interviews yielded technical and process knowledge data, which can also be biased. As shown in the description of the archetypes, the self-reported data from the survey and the process knowledge data from the interviews may appear incongruent or dissonant. This is to be expected, as in the interviews participants reflect on how and why they execute the comfort actions. While the possibility exists that what they say may differ from what they actually do, their verbalizations are valuable for understanding their 'process knowledge'. Nevertheless, gathering and combining qualitative and quantitative data serves not only to validate each against the other, but also to reduce potential bias.
Some observations of the human-building interactions are noteworthy. For the air temperature monitoring, no large variations were seen across the top three preferred locations, suggesting that the preference for a location is likely unrelated to temperature and related instead to other spatial attributes; thus, temperature and location behaviours appear to be unrelated. As far as the building checklist is concerned, it is interesting to note that archetypes tend to live in buildings with dissimilar characteristics, meaning that the archetypes may not relate to the buildings' features; in other words, it seems that the environment does not shape the archetype. Energy consumption varied greatly across and within archetypes. Such discrepancies cannot be generalized, and based on the information collected so far it is not possible to say whether they are the consequences of behavioural patterns or of the building characteristics. The sample was too small and the sampling period was too short; thus, further research is necessary for the energy use part of this study.

Conclusion
This study contributes to a better understanding of the motivations behind the comfort behaviours of occupants in their residences and reveals possible energy consumption discrepancies among occupants with different behavioural patterns. It suggests that home occupants recruited from different sources can be clustered, based on their answers to a questionnaire, into five distinct groups according to their psychological and behavioural models, which relate to locus of control, emotions towards their own home environment, and the importance they give to comfort affordances. The findings show that each of the archetypes has a distinct valence of opinions when asked about topics regarding energy use, energy awareness, general comfort, and an array of affordances, although what they express verbally is not always congruent with the general results of their self-reported answers. Although IEQ and energy readings were also taken, the sample proved too small to establish statistical relationships. Finally, a mixed methods approach seems promising for better understanding the individual needs of groups of people, and for achieving more energy savings and better comfort levels, as the method allows detailed and complete archetypes to be built. In practical terms, the archetypes can be used for improved and more accurate simulation and building prediction models. Additionally, archetypes can be used as part of the design process to develop potential tailor-made lines of action for each archetype: their particular characteristics need to be translated into design parameters, such as interfaces that can give the right feedback to the specific archetype. Architects, constructors, or housing associations can also use models pairing archetypes to specific building features that support the archetypes' mental models, so as to optimize energy consumption and comfort.

• The number next to each "Concept" indicates the frequency with which the concept was expressed in the answer. To shorten the table, only the top five concepts are shown.
• The < type > is a group of words mentioned by an interviewee generally representing an emotion. They can be positive, negative, positive feeling, negative feeling, etc. This group of words is produced by the software's built-in lexical resources.
• The "Sentiment" is presented in the form of "concept + < type > ". Therefore, it is the combination of a word mentioned by the participant and an emotion generally associated with it.
• The combination of "concept + < type > " (the Sentiment) gives insights into the most common way in which the participants feel about the concept in question. It gives an idea as to how the people representing the archetype feel about a certain concept.
• The left column indicates the question during the interview, in which the Concepts and Sentiments were expressed.
• The Concept and the Sentiment columns are to be read independently from each other.
• Example: for the question of "Light" for Archetype 1, 38 concepts were mentioned. "Dim" was the most common concept, mentioned 3 times. The concept of "activities" with positive connotations is the main sentiment about the question of light. This is interpreted as "For Archetype 1, the activities at home elicit positive emotions in relation to the lighting".
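
The reading rules in the notes above can be mimicked with a short Python sketch. It assumes that the text-mining software has already extracted, for one interview question and one archetype, a list of concepts and a list of (concept, type) sentiment pairs. The toy lists below are invented for illustration and are not the study's data, and the helper `top_items` is a hypothetical name, not a function of the software that was used.

```python
from collections import Counter

# Hypothetical extraction output for one question and one archetype.
# These values are illustrative only and are NOT taken from the study.
concepts = ["dim", "lamp", "dim", "activities", "dim", "window", "activities"]
sentiments = [
    ("activities", "positive"),
    ("activities", "positive"),
    ("lamp", "negative feeling"),
]


def top_items(items, n=5):
    """Return the n most frequent items together with their counts."""
    return Counter(items).most_common(n)


# The Concept and Sentiment columns are tallied independently of each other.
top_concepts = top_items(concepts)
top_sentiments = top_items(sentiments)

print("Concepts mentioned:", len(concepts))
for concept, count in top_concepts:
    print(f"Concept: {concept} ({count})")
for (concept, kind), count in top_sentiments:
    print(f"Sentiment: {concept} + {kind} ({count})")
```

With these toy lists, "dim" is the most frequent concept while "activities" combined with a positive type is the most frequent sentiment, which is the same kind of independent reading of the two columns as in the example for Archetype 1.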