Data on the interaction between thermal comfort and building control research

This dataset contains bibliographic information on thermal comfort and building control research. It also includes instructions for a data-driven literature survey method, so that readers can reproduce the survey on related bibliographic datasets. All relevant bibliographic records were downloaded using specific search terms. We describe the keyword co-occurrences for historical developments and recent trends, and the citation network that represents the interaction between thermal comfort and building control research. Results and discussion are presented in the research article entitled “Comprehensive analysis of the relationship between thermal comfort and building control research – A data-driven literature review” (Park and Nagy, 2018).



Data
The dataset contains 3 folders:
1) The first folder contains all the bibliographic information for thermal comfort and building control research: 5536 articles in total, published between 1970 and 2016. Table 1 summarizes general information about the publications for the two search periods. The bibliographic information is stored in multiple text files.
2) The second folder describes the co-occurrences among keywords. First, the keywords are extracted from the title and abstract text and filtered with pre-defined thesaurus words. Next, the keywords are clustered by research topic. Finally, the co-occurrences among keywords are normalized as distances between them. The files contain each keyword and its coordinates for the two periods. The corresponding visualizations for both periods can be found in the original research paper [1]. Tables 2 and 3 present the keyword analysis for historical developments and recent trends, respectively.
3) The third folder classifies the papers by research theme (i.e., thermal comfort, building control, or both) and tabulates their citation relationships in matrix form. Table 4 describes the citation relationships among the three themes (i.e., thermal comfort, building control, intersection). Note that only 3572 papers take part in these citation relationships. The citation relationships are also visualized in the original research paper [1].

Data collection
For the publication collection, we selected Thomson Reuters' Web of Science bibliographic database [2]. We used the following logical combinations of search terms to collect relevant publications. For thermal comfort research related to buildings, the search term was (thermal comfort) AND (building*). For building control research related to energy efficiency, the search term was (building* automation*) OR (building* energy management*) OR (building* control*) OR (HVAC control*), because building control research appears under several alternative terms. Using these search terms, we downloaded the publication information (i.e., title, abstract, author, citation, publication year) as a tab-delimited text file suitable for further processing. We then split the dataset into two parts by publication date. The first part contains all publications up to 2010 and allows us to study the historical developments. The second part contains the publications from 2011 to 2016 and is used to identify recent trends.
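The split by publication date can be sketched as follows. This is a minimal illustration, not the authors' script: the sample records are invented, and the two-letter column names (TI for title, AB for abstract, PY for publication year) are an assumption about the layout of the tab-delimited export.

```python
import csv
import io

# Hypothetical excerpt of a tab-delimited export. The column names
# (TI = title, AB = abstract, PY = publication year) are assumed.
SAMPLE_EXPORT = (
    "TI\tAB\tPY\n"
    "Adaptive thermal comfort in offices\tField study of occupants.\t2004\n"
    "Model predictive HVAC control\tEnergy-efficient building control.\t2013\n"
    "Comfort-driven building automation\tOccupant feedback in control loops.\t2016\n"
)


def split_by_period(export_text, last_historical_year=2010, last_recent_year=2016):
    """Split records into a historical set (up to 2010) and a recent set (2011-2016)."""
    historical, recent = [], []
    reader = csv.DictReader(io.StringIO(export_text), delimiter="\t")
    for record in reader:
        year_field = (record.get("PY") or "").strip()
        if not year_field.isdigit():
            continue  # skip records without a usable publication year
        year = int(year_field)
        if year <= last_historical_year:
            historical.append(record)
        elif year <= last_recent_year:
            recent.append(record)
    return historical, recent


historical, recent = split_by_period(SAMPLE_EXPORT)
print(len(historical), "historical record(s),", len(recent), "recent record(s)")
```

For a real export, the text would be read from the downloaded file instead of the in-memory sample, and the column names adjusted to match its header row.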

Keyword analysis
To select keywords for the scientific landscape, all words were extracted from the titles and abstracts of the publication collections and filtered for a minimum of 30 occurrences. From the filtered words, the most relevant keywords were extracted with the built-in text mining function of VOSviewer [3,4]. Subsequently, we eliminated unrelated words (i.e., regional words, organization names, generic terms) and merged repetitive words (i.e., singular and plural forms, abbreviations and full names) by applying the pre-defined thesaurus files. From the list of keywords, VOSviewer generated the co-occurrence table and clustered the keywords based on their co-occurrences. Two words are defined as co-occurring if they appear in the same document. In addition, the cluster names were labeled manually based on the observed keywords. The result is the scientific landscape of thermal comfort and building control research. In this landscape, the size of a circle represents the frequency of occurrence of the individual keyword, and its color represents the cluster type. Lastly, the distance between keywords represents their relative co-occurrence: two keywords that are close to each other co-occur more frequently, whereas a large distance between two keywords indicates that they rarely or never co-occur.
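The counting step behind the co-occurrence table can be illustrated with a small sketch. It is not VOSviewer's implementation: it splits text into single words rather than using VOSviewer's text mining, the documents and thesaurus entries are invented, and the function name `cooccurrence_table` is hypothetical. The example uses a threshold of 2 instead of 30 so that the toy data survives the filter.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_table(documents, thesaurus, min_occurrences):
    """Count document-level keyword occurrences and pairwise co-occurrences."""
    term_sets = []
    for text in documents:
        terms = set()
        for word in text.lower().split():
            word = word.strip(".,;:()")
            # The thesaurus merges variants; an empty string discards a term.
            mapped = thesaurus.get(word, word)
            if mapped:
                terms.add(mapped)
        term_sets.append(terms)

    # Each keyword counts once per document in which it appears.
    occurrences = Counter(term for terms in term_sets for term in terms)
    kept = {term for term, n in occurrences.items() if n >= min_occurrences}

    # Two keywords co-occur if they appear in the same document.
    pairs = Counter()
    for terms in term_sets:
        for a, b in combinations(sorted(terms & kept), 2):
            pairs[(a, b)] += 1
    return occurrences, pairs


documents = [
    "Comfort models for HVAC control",
    "HVAC controls and occupant comfort",
    "Occupant comfort in offices",
]
thesaurus = {"controls": "control", "for": "", "and": "", "in": ""}

occurrences, pairs = cooccurrence_table(documents, thesaurus, min_occurrences=2)
for pair, count in sorted(pairs.items()):
    print(pair, count)
```

The pair counts are only the raw input to the map; the normalization into distances and the clustering are performed by VOSviewer and are not reproduced here.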

Citation analysis
To identify the interaction between thermal comfort and building control research, we investigated the citations among all publications in the dataset. Analyzing the citation information quantifies the interaction between the two fields (i.e., the number of publications cited by others).
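A theme-level citation table of this kind can be sketched as below. The paper identifiers, theme assignments and citation pairs are invented for illustration, and the function name `theme_citation_matrix` is hypothetical; the sketch only shows how citing/cited pairs could be aggregated into a matrix over the three themes.

```python
from collections import Counter

# Row and column order of the resulting matrix.
THEMES = ("thermal comfort", "building control", "both")


def theme_citation_matrix(paper_theme, citations):
    """Aggregate (citing, cited) pairs into a theme-by-theme count matrix.

    Rows are the theme of the citing paper, columns the theme of the cited paper.
    """
    counts = Counter()
    for citing, cited in citations:
        # Ignore citations to or from papers outside the classified dataset.
        if citing in paper_theme and cited in paper_theme:
            counts[(paper_theme[citing], paper_theme[cited])] += 1
    return [[counts[(src, dst)] for dst in THEMES] for src in THEMES]


# Hypothetical classification and citation pairs.
paper_theme = {
    "P1": "thermal comfort",
    "P2": "building control",
    "P3": "both",
    "P4": "building control",
}
citations = [("P2", "P1"), ("P3", "P1"), ("P3", "P2"), ("P4", "P2"), ("P4", "P9")]

matrix = theme_citation_matrix(paper_theme, citations)
for theme, row in zip(THEMES, matrix):
    print(f"{theme:>16}: {row}")
```

Off-diagonal entries count citations that cross themes, which is the quantity of interest when assessing how the two fields interact.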

Transparency document. Supporting information
Transparency data associated with this article can be found in the online version at https://doi.org/10.1016/j.dib.2018.01.033.