SPACE EVIDENCE: AN ORAL INTERVIEW CORROBORATION METHOD RESEARCH BASED ON GENERATIVE SPACE ANALYSIS

ABSTRACT: The oral interview is an essential component of heritage study fieldwork. However, because individual memory is biased, statements about the built environment need to be verified quickly. This paper aims to provide an efficient method for verifying the authenticity of oral interviews by combining architectural heritage data with digital analysis techniques and using spatial analysis as evidence. The method consists of image pre-processing of existing architectural drawings, the creation of multiple plans and models through a matrix of parametric variation factors, and the generation of analysis images using daylight analysis algorithms. The Pix2Pix model is then introduced, and the plans and the analysis result images are labeled according to its principles. By learning from the paired labeled images, the computer can generate an analysis image for a given plan, enabling fast analysis of the built environment. The authenticity of the oral interview's content can then be determined quickly. This paper examines the industrial heritage of the Puqi Textile General Factory, which was built under the idea of rapid construction during the Third Front construction, using the factory's built environment and related oral content as the basis for analysis and verification. The findings did not support the poor working conditions described in the oral interviews. Therefore, the authenticity of the oral content is doubtful and requires further verification.


INTRODUCTION
In recent years, digital technologies such as AI (artificial intelligence), represented by deep learning, have been used in architectural research. For the analysis of the built environment and the design process, the development of deep learning in architecture research can be divided into two main parts, namely generative design research and built environment element optimization (Yuan et al., 2020). Based on a series of GAN (Generative Adversarial Network) algorithms, generative design research has made important progress in uncovering latent patterns in architectural cases and comparing design possibilities. GAN is a network structure based on game theory (Goodfellow et al., 2014). The network can model the distribution of spatial data and is widely used in different research fields. Because it can generate images, GAN has also been used in architectural study. Researchers used Pix2Pix to generate apartment floor plans and examined how the network works by labeling rooms with different colors (Huang et al., 2018). GAN was used to link boundary sketches with apartment floor plans, allowing designers to move quickly from sketch to floor plan (Zheng et al., 2020). Using a small dataset, Liu and Luo investigated the idea and method of campus layout generation and realized the aim of automatic campus layout generation from given boundaries and roads. Liu and Fang (Liu et al., 2021) used a GAN dataset to investigate the principles of image generation and the design aspects of traditional Chinese private gardens. The generative method re-creates the relationship between architectural design and images. It uses existing image data as a reference for future architectural designs and uses digital technology to automate a substantial amount of repetitive work.
In the study of architectural patterns, however, the potential of spatial verification remains unexplored. The innovation of this study is an oral interview corroboration method that uses deep learning to quickly generate analyses as spatial evidence for verifying oral interviews. Oral history is a multidisciplinary approach to researching architectural heritage (Zhang et al., 2021). It explores the interaction between humans and the built environment, as well as the construction process. Researchers need to conduct survey interviews with workers. The interviews are crucial for understanding the architectural production process and the motivations behind the results (Wu et al., 2020).
Oral research is an important factor supporting the value judgment of heritage (Tan and Qian, 2022). In terms of content, however, oral research suffers from individual recall bias (Xiong, 2016). Because traditional methods for verifying oral content are slow and inefficient, an efficient method for corroborating the authenticity of oral content must be developed and employed for speedy on-site verification of oral content. GAN and other forms of artificial intelligence have been used in the field of architecture. However, GAN has rarely been combined with traditional research methods to form a hybrid intelligence that improves the efficiency of heritage study and opens up new possibilities. The goal of this study is therefore to propose an oral corroboration approach based on generative spatial analysis, which employs digital technologies such as artificial intelligence and relies on "space evidence" as a clue.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume X-M-1-2023. 29th CIPA Symposium "Documenting, Understanding, Preserving Cultural Heritage: Humanities and Digital Technologies for Shaping the Future", 25-30 June 2023, Florence, Italy.

BACKGROUND
Puqi Textile General Factory is located in Chibi City, Hubei Province. It is a large combined textile enterprise that brings together specialized weaving, silk weaving, garment, printing and dyeing, thermal power, and other textile factories. Puqi Textile General Factory is an industrial park that was built with an emphasis on speed during China's Third Front construction. Its factories are spread out over a mountain valley and connected by a highway. Puqi Textile General Factory has a large amount of industrial heritage in the form of factory structures (Wang and Tan, 2022). Because the factories were built by the same firm on the principle of rapid construction, industrial heritage of the same kind in Puqi Textile General Factory shares a series of comparable features. The built environment of the factories can be used to analyze the link between former users and the built environment, to investigate the maintenance of collective memory and place, and to preserve the industrial historical region (Li et al., 2021).
Oral interviews are an important aspect of heritage study fieldwork. We conducted oral interviews with former employees of the Puqi Textile General Factory to learn more about how they used the factory's physical settings. During the interviews, the workers recalled the factory's poor working environment, including poor lighting during working hours. However, only a few interviewees mentioned the harsh working environment, and individual oral interviews lack sufficient evidence to judge the actual working environment. Individual memory bias necessitates a quick on-site verification of the built environment to confirm the authenticity of the oral recollections (Fig. 1). Therefore, an oral interview verification method based on generative spatial analysis is proposed, and 14 factories of Puqi Textile General Factory are taken as samples for this study.

Basic data:
The basic dataset for this investigation was a collection of technical drawings that recorded architectural information. Drawings of the park's industrial buildings were taken from the Puqi Textile General Factory's archives. The architectural drawings contain two types of data: planar information data and auxiliary information data. The planar information data comprise the factory's plan drawings and give us the raw data we need to extract features and extend our dataset. The auxiliary information data provide supplementary values such as window height and interior height.

Sample processing data:
One of the important datasets is the set of processed images with key elements labeled. The building elements to extract are identified from the physical factors that affect indoor daylight conditions. These elements are then labeled to produce the processed data, which provides the essential information for machine learning.

Daylight analysis data:
Another important dataset is the set of daylight analysis images. In Pix2Pix, each daylight analysis image is paired with its sample processing image. The building models are generated in Rhinoceros/Grasshopper based on the sample processing data, and the daylight analysis images are calculated from them.

Planar element extraction:
In the planar information of the drawings, the components that directly affect indoor daylight conditions fall into five categories: building X-axis single-axis span, building X-axis span number, building Y-axis span, skylight length, and window width. We create a data table of element dimensions by extracting the planar information data from the basic data. We then categorize each element's dimensional features and extract the following feature information (Table 1).
(4) The length of the skylight is 1.2 m or 4 m.

Auxiliary information extraction:
Planar features vary widely from factory to factory. The vertical information in the section and elevation drawings, on the other hand, is uniform. Comparing the factories' section and elevation drawings shows that the total building height, window height, and skylight height are roughly the same across buildings. We therefore extract the common height of each element from the auxiliary information and use it as the auxiliary data for model generation and calculation (Table 2).
(1) The height of the skylight is 1 m.
(2) The building space height is 5.8 m.
(3) The window height from the ground is 1.2 m.
(4) The window height is 2.4 m.

[Table 2: columns Feature, Height]

Feature elements matrix creation:
Multiple feature elements can be combined in many ways. Creating the feature element matrix and defining how the feature elements are combined are critical for enlarging the dataset and improving machine learning accuracy. In this work, the feature element matrix is created from the extracted planar feature elements (Fig. 2).
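The combination mechanism can be pictured as a Cartesian product over the values each feature may take. The following is a minimal sketch under that assumption, not the Grasshopper definition used in the study. Only the skylight lengths (1.2 m and 4 m) come from the text; every other value list, and the names FEATURE_VALUES and build_feature_matrix, are hypothetical placeholders.

```python
from itertools import product

# Candidate values per planar feature, in metres (span number is a count).
# Only skylight_length is taken from the paper; the rest are placeholders.
FEATURE_VALUES = {
    "x_single_span": [6.0, 9.0],     # hypothetical
    "x_span_number": [2, 3, 4],      # hypothetical
    "y_span": [12.0, 18.0],          # hypothetical
    "skylight_length": [1.2, 4.0],   # from the extracted drawings
    "window_width": [1.5, 3.0],      # hypothetical
}


def build_feature_matrix(feature_values):
    """Return every combination of feature values as a list of dicts."""
    names = list(feature_values)
    return [
        dict(zip(names, combo))
        for combo in product(*(feature_values[n] for n in names))
    ]


matrix = build_feature_matrix(FEATURE_VALUES)
print(len(matrix))  # 2 * 3 * 2 * 2 * 2 = 48 combinations for these placeholders
```

Each dictionary in the result would correspond to one virtual building case; the number of cases is simply the product of the number of values per feature.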

Daylight analysis
Honeybee in Rhinoceros/Grasshopper is the basis for the daylight analysis. First, the feature element matrix is used to construct various architectural models in Grasshopper. Second, the building models are converted into Honeybee models, which are connected to the Grasshopper components for daylight analysis and visualization. Honeybee uses Wuhan city's EPW data as the meteorological parameter, with the analysis target set to sunshine duration over the annual working hours (8:00 a.m. to 12:00 p.m. and 2:00 p.m. to 6:00 p.m.) and the Grid Size set to 800. The computation gives the average sunshine conditions over the annual working hours. The colors of the calculated images indicate the percentage of annual working hours that satisfy sunlight comfort, changing from blue to red as the proportion goes from high to low.
Finally, the Galapagos Evolutionary Solver is used to run through all potential combinations of element matrix settings. Following these procedures, the computer builds a large number of building models and calculates their daylight analysis results (Fig. 3).
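To make the "percentage of annual working hours" measure concrete, the sketch below selects the working hours (8:00 to 12:00 and 14:00 to 18:00) from an hourly annual series and computes the share that meets a threshold. This is plain Python for illustration, not Honeybee's API; the threshold, the toy series, and the function names are assumptions.

```python
# Hours of the day that start a working hour: 8:00-12:00 and 14:00-18:00.
WORK_HOURS = (8, 9, 10, 11, 14, 15, 16, 17)


def working_hour_indices(days=365):
    """Indices into an hourly annual series (index 0 = 00:00-01:00 on day 1)."""
    return [day * 24 + h for day in range(days) for h in WORK_HOURS]


def comfort_share(hourly_values, threshold):
    """Fraction of working hours whose value meets or exceeds the threshold."""
    idx = working_hour_indices(len(hourly_values) // 24)
    met = sum(1 for i in idx if hourly_values[i] >= threshold)
    return met / len(idx)


# Toy series (illustrative only): 500 between 09:00 and 16:00, 0 otherwise.
series = [500 if 9 <= (i % 24) < 16 else 0 for i in range(8760)]
print(len(working_hour_indices()))           # 2920 working hours per year
print(comfort_share(series, threshold=300))  # 0.625
```

In the toy series, five of the eight daily working hours meet the threshold, which gives the share of 0.625.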

Data Augmentation:
We increase the dataset's capacity by combining feature elements and image rotation based on planar information data. In comparison to image rotation, the approach of feature element combination efficiently increases the dataset's real capacity and improves accuracy.
(1) Combination of feature elements: The planar information dataset contains 14 real-world architectural cases. Based on these cases, we create a feature element matrix and combine the feature elements in the matrix to create virtual cases. This is the main strategy for increasing the dataset's capacity in this study. The virtual cases use the same feature elements as the real ones and differ only in how the elements are combined. In the end, the dataset yields 432 virtual cases.
(2) Image rotation: The machine will learn to associate the marked color planes with the daylight analysis images in this study. The dataset's 432 sets of paired images are rotated in four different ways to increase the machine's learning accuracy. Following filtering, we had a total of 1043 samples for the experiment.
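A minimal sketch of the rotation step, assuming the four rotations are 0, 90, 180, and 270 degrees and that both images of a pair are rotated together. Rotating all 432 pairs would give at most 432 × 4 = 1728 samples; the text does not state the filtering criterion that leads to 1043, so dropping rotations that reproduce an earlier pair is only one plausible assumption. The function names are hypothetical.

```python
def rotate90(grid):
    """Rotate a 2D grid (tuple of tuples) 90 degrees clockwise."""
    return tuple(zip(*grid[::-1]))


def augment_pair(plan, daylight):
    """Return the distinct rotations (0, 90, 180, 270 degrees) of a paired sample.

    Both images receive the same rotation so they stay aligned; a rotation
    that reproduces an earlier pair is dropped (assumed filtering rule).
    """
    seen, out = set(), []
    for _ in range(4):
        if (plan, daylight) not in seen:
            seen.add((plan, daylight))
            out.append((plan, daylight))
        plan, daylight = rotate90(plan), rotate90(daylight)
    return out


asymmetric = ((1, 2), (3, 4))
two_fold = ((1, 2), (2, 1))
print(len(augment_pair(asymmetric, asymmetric)))  # 4
print(len(augment_pair(two_fold, two_fold)))      # 2
```

Plans with rotational symmetry contribute fewer than four samples under this rule, which is one way a rotated dataset can end up smaller than four times the original.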

Sample Processing:
In Rhinoceros/Grasshopper, we can quickly generate a large number of labeled images using the Galapagos Evolutionary Solver. Based on the elements in the matrix, we label the positions of the walls, windows, skylights, and floors on the floor plan (Fig. 4). Rhinoceros/Grasshopper generates a large number of labeled planar images by modifying the parameters in the element matrix. The labeled planar images are exported and paired with the corresponding daylight analysis images. Each exported image is 1280*640 pixels in size. A total of 1043 paired images were obtained after processing the samples and augmenting the data.
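The pairing step can be sketched as a horizontal concatenation of two 640*640 images into one 1280*640 image. The sketch below uses plain nested lists instead of an image library; placing the labeled plan on the left and the daylight image on the right is an assumption, as is the function name.

```python
def pair_side_by_side(plan, daylight):
    """Concatenate two equally sized images (lists of pixel rows) horizontally."""
    same_height = len(plan) == len(daylight)
    same_width = all(len(a) == len(b) for a, b in zip(plan, daylight))
    if not (same_height and same_width):
        raise ValueError("images must have identical dimensions")
    return [list(a) + list(b) for a, b in zip(plan, daylight)]


SIZE = 640
plan = [[0] * SIZE for _ in range(SIZE)]        # placeholder labeled plan
daylight = [[255] * SIZE for _ in range(SIZE)]  # placeholder analysis image
paired = pair_side_by_side(plan, daylight)
print(len(paired[0]), len(paired))  # 1280 640
```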

Model Training:
A GAN consists of two parts, G (Generator) and D (Discriminator). G analyzes the image features of the input and produces output whose features fit the input distribution. The closer the generated image features are to the true features, the more realistic the generated image is. D takes the real daylight analysis image and the image generated by G as input and determines whether a given image is real or generated by G. Through the contest between G and D, the generated daylight analysis images become more realistic, while D becomes better at telling real images from generated ones. Eventually, G and D reach an equilibrium in this dynamic game.
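The game between G and D is commonly written as the minimax objective of Goodfellow et al. (2014). The notation below is the standard one rather than this paper's own: x is a real image drawn from the data distribution, z is the generator input, and D(·) is the probability that D assigns to an image being real.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

D is trained to increase V by scoring real images high and generated images low, while G is trained to decrease it. At the theoretical equilibrium, D can no longer distinguish the two and outputs 1/2 for every image.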
Pix2Pix is the machine learning algorithm used in this study (Fig. 5). Pix2Pix is mostly used in the image domain of machine learning and is based on CGAN (Conditional Generative Adversarial Nets). In Pix2Pix, G produces a generated result from a provided input image, and the input image and the generated result are then passed to D. By computing the paired features of the input image and the generated image, D establishes the connection between the input image and the output result. After the model has been trained, G automatically generates the paired output image for any input image. In this study, each image in the dataset is 1280*640 pixels and is made up of two 640*640 pixel images. To train the network, the labeled planar image is used as the input and the daylight analysis image as the output. The network was trained three times, for 50, 100, and 300 epochs. With an average epoch time of 120 seconds across the 1043 images, the three training runs took 1.6 hours, 3.3 hours, and 10 hours. After completing all the training, we use the test set to see how the three trained models actually perform. The results are assessed by comparing the output images of the three runs (Fig. 6), and the causes of the differences are discussed. Under the same test set, the output (b) with epoch=50 appears blurry because the generator has not been trained enough. With epoch=100, the output (c) is comparable to the daylight analysis. The model with epoch=300 has been trained the longest, but in terms of features its output (d) is identical to the output (c).
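The reported training times follow directly from the average epoch time. A small check, assuming the three runs correspond to 50, 100, and 300 epochs at 120 seconds per epoch (the function name is illustrative):

```python
SECONDS_PER_EPOCH = 120  # average epoch duration reported for the 1043 images


def training_hours(epochs, seconds_per_epoch=SECONDS_PER_EPOCH):
    """Total training time in hours for a given number of epochs."""
    return epochs * seconds_per_epoch / 3600


for epochs in (50, 100, 300):
    print(epochs, round(training_hours(epochs), 1))
# 50 1.7
# 100 3.3
# 300 10.0
```

The 50-epoch run works out to about 1.67 hours, which the text reports as 1.6 hours; the other two values match the reported 3.3 and 10 hours.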
The line graph of the loss function supports the same conclusion (Fig. 7). Around epoch=50, the L1 loss (G L1) begins to level off. After epoch=100, the L1 loss changes little during training. We can therefore be confident that relatively mature results are obtained by epoch=100. One reason for this behavior is that the dataset is sufficiently large. Another is that the daylight analysis results for the various cases do not differ significantly from one another. One of the next research directions is therefore to add data with larger differences in features to the dataset. The CPU used in this experiment is an i9 10900KF, and the graphics card is an RTX 3090. Under the same test conditions, we compared the time taken by the Honeybee model in Rhino/Grasshopper and by the Pix2Pix model to generate 10 results. With the Honeybee model, each resulting image takes 5 minutes on average, totaling about 1 hour. With the Pix2Pix model, it takes less than 5 seconds. The Pix2Pix model therefore greatly improves the speed of generating results.
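The L1 loss tracked in Fig. 7 is read here as the mean absolute pixel difference between a generated image and its real counterpart, which is how the generator's L1 term is usually defined in Pix2Pix. The sketch below illustrates that quantity on toy data; pixel values normalized to [0, 1], the sample values, and the function name are assumptions.

```python
def l1_loss(generated, target):
    """Mean absolute difference between two equally sized images (rows of pixels)."""
    total, count = 0.0, 0
    for row_g, row_t in zip(generated, target):
        for g, t in zip(row_g, row_t):
            total += abs(g - t)
            count += 1
    return total / count


# Toy 2x2 images with values in [0, 1] (illustrative only).
real_b = [[0.0, 0.5], [1.0, 0.5]]
fake_b = [[0.1, 0.5], [0.8, 0.5]]
print(l1_loss(fake_b, real_b))  # roughly 0.075
```

A loss curve that flattens means this average difference has stopped shrinking, which is consistent with the outputs at epoch=100 and epoch=300 looking alike.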

DISCUSSION
Overall, the results of this experiment were as expected. In terms of image attributes, the daylight analysis image generated by machine learning is quite close to the real value calculated by Honeybee. This indicates that the mapping from the labeled building plan image to the daylight analysis image has been learned. The image fake_B is generated by inputting the image real_A. Comparing 10 sets of real_B and fake_B images (Fig. 8) shows that the results in fake_B tend to match those in real_B, meaning the two are close in overall daylight analysis performance. However, they still differ slightly at the corners of the planar images. The real_B images show higher illumination and lighter color near the windows and lower illumination and darker color between the windows. The fake_B images show the same color tendency. Due to sunshine reflection, the corners of real_B have a yellow color. The corners of fake_B have a yellow tint as well, but the brightness is different. In terms of computational speed, the trained network model processes the test set much faster than Honeybee. Furthermore, to our surprise, the results generated by machine learning were more detailed than the results calculated by Honeybee. Because the Grid Size of Honeybee is set to 800 in the daylight analysis, its image results are blurred. The machine learning images, on the other hand, show more information in details such as windows. This is a good example of how machine learning can quickly generate more reliable analyses of factory daylight conditions under the present settings.
The generated images show that the factory's indoor built environment is well illuminated during working hours. This contradicts the poor working conditions noted in the oral interviews. As a result, the oral content is doubtful, and its authenticity needs to be further verified. Further research was therefore conducted on the built environment of the factory: former employees were asked additional questions, and the sunshine conditions of the factories were confirmed. The interviewees recalled that the poor environmental conditions may have been due to excessive heat in the room. Combined with this study, it can be judged that the ventilation performance of the factory is poor. The interviewees responded in the affirmative to this judgement (Figure 9). The comprehensive analysis and supplementary questions therefore lead to a conclusion: the factory has good lighting performance but poor ventilation performance. The factory cannot effectively organize natural ventilation and cooling, and long-term direct sunlight made employees feel uncomfortable when working.
Figure 9. The oral content recording of local staff.

CONCLUSION
This study explores an efficient method for verifying the authenticity of oral interviews. Combining architectural heritage data with digital analysis techniques, spatial analysis is used as evidence to help determine the authenticity of oral interview content through rapid analysis of daylight conditions in the built environment. A feature element matrix is created by extracting feature element information from the basic data. Two data augmentation methods, feature element combination and image rotation, were used to expand the basic dataset, yielding a total of 1043 samples.
Using Honeybee in Rhinoceros/Grasshopper, local meteorological data are selected and daylight conditions are calculated to obtain analysis images that are paired with the cases. After image processing and pairing, we obtained 1043 images of 1280*640 pixels. The Pix2Pix model is then trained on these pairs. After 100 epochs of training, the output matches the real results. This demonstrates that the model can quickly verify the accuracy of the daylight evaluation during oral research.
Artificial intelligence collaboration in heritage research is crucial. A hybrid intelligent workflow is created by combining artificial intelligence with traditional research methodologies. This is critical for increasing research efficiency and maximizing the value of current datasets. In this study, the acquired architectural drawings are used to construct the datasets and train the Pix2Pix. Following that, by virtue of the output, the daylight condition in the oral interviews is verified and the authenticity of the evaluation is judged.
In addition, several neural network models were established during the research. On this basis, the daylight comfort of surrounding industrial heritage of the same type can also be predicted. In future heritage research, we should use existing data to build datasets and a comprehensive hybrid intelligence workflow tailored to the study subjects. This method, represented by "space evidence", has proven extremely valuable. It can help people better understand the current state of heritage and assess its potential value, so that greater contributions can be made.