Does a presentation Media Influence the Evaluation of Consumer Products? A Comparative Study to Evaluate Virtual Reality, Virtual Reality with Passive Haptics and a Real Setting

Image-based technologies offer a high potential to present consumers with products by focusing on their visual characteristics, but do not allow physical interaction with an object, which can compromise how consumer products are evaluated. The present study aims to analyse the influence of different presentation media on how users perceive a product by comparing the evaluations of a piece of furniture made by a sample of 203 users, to whom it was presented in one of three settings: a real setting (R), a Virtual Reality setting (VR) and a Virtual Reality with Passive Haptics setting (VRPH). To evaluate the product in the different settings, a semantic differential scale was built that comprised 12 bipolar pairs of adjectives. To study the results, the descriptive statistics for the semantic differential scales were analysed, the frequency with which each evaluation was repeated was studied, a Kruskal-Wallis test was conducted and Dunn’s post hoc tests were performed. The results showed that the presentation medium of a piece of furniture influenced the evaluation of how users perceived it. These results also revealed that haptic interaction with a product influenced how users perceived it compared to an exclusively visual interaction.

movement of a product's image influence how consumers perceive its degree of usability, which conditions their decision making.
Artacho-Ramirez et al. [16] found a significant influence of the mode of representation on product perception, although differences were less numerous than expected (only 3 out of the 11 semantic axes employed for the evaluation). It is worth mentioning that differences decreased as more sophisticated visualization media were employed, such as a navigable 3D model. Naderi et al. [17] studied the combined effects of product design and environment congruence on consumers' aesthetic, affective and behavioral responses. The experimental stimuli used in their study were presented in a 3D simulation environment using a large TV and a stereoscopic virtual reality headset. They found that while most of the findings were similar across the two presentation media, there were a few discrepancies attributed to the use of different navigation methods and to the much closer experience to reality with the VR headset.
The Augmented Reality (AR), Virtual Reality (VR) and Mixed Reality (MR) technologies have long been changing the way products are presented to consumers. These technologies allow us to go beyond 2D screen limitations by offering consumers a more immersive and interactive experience. Depending on the technology employed and its limitations, users can be immersed in a 3D virtual world, move around it, and can even interact with some elements represented in it using different devices.
It is interesting to study the use of different technologies to virtually present products to consumers without them having to travel to a physical point of sale, while also guaranteeing the correct visualization of the products' 3D characteristics. Verhagen et al. [18] stress that using Virtual Mirrors improves how products are perceived in relation to using 360 spin or images. Suh and Lee [19] point out that using VR increases consumer knowledge and their purchase intention. Grewal et al. [20] point out that the greater immersion and interactivity provided by the product's VR representations allow more information to be obtained about the product and improve the user's experience.
Nonetheless, completely virtual technologies that offer a high potential to present consumers with products by focusing on their visual characteristics lack the capacity to provide physical interaction with an object. This limitation can compromise how consumer products are evaluated in completely virtual settings [21]. Accordingly, different research works have studied consumers' need to touch a product to make a purchasing decision [22], [23], [24]. In order to overcome this barrier, some physical objects can be included in a controlled virtual setting so that users can have a more immersive experience in the VR setting by interacting with and feeling some of the virtual objects they see. We refer to such settings as Virtual Reality with Passive Haptics (VRPH).
Interactions with VRPH settings can provide the advantages of coming into haptic contact with the object, along with the possibility of interacting with and modifying the virtual setting. This allows the textures, colours, surface finishings or materials of the presented product to be altered in real time, so that the range of physical products needed to offer users the brand's whole physical showroom catalogue can be reduced. Instead, only one product would need to be made physically available to provide its shape, tactile texture, materials and real reliefs, regardless of visual finishing touches, so that users could perceive all its other characteristics (colour, pattern, finishing touches, etc.) thanks to VR contents.

II. Related Work
Passive haptics can be defined as the use of physical objects to provide feedback to users through their shape [25]. Several studies have shown that feeling the touch of physical objects in virtual environments can improve global immersion, knowledge about the spatial environment and users' sense of presence, particularly when these virtual objects react to touch just as their physical equivalents would [26]-[29].
To achieve satisfactory user experience in a virtual environment with passive haptics, the position of physical objects needs to be synchronized with virtual objects. It is also necessary to consider that perception of the size of a space and the position of the represented objects can be affected by several factors in a VR environment, such as technology or lack of an avatar, which may affect presence [30]-[33].
In a VRPH environment, users' haptic exploring can be done both passively and actively. With passive exploring, the surface reacts to touch and provides users with information. With active exploring, users explore the surface with their fingers and the palms of their hands. Recent studies have demonstrated that the second method facilitates users perceiving surfaces and helps them to better recognize the represented objects [34], [35]. This is why active exploring might be more suitable to evaluate consumer products.
Visual and haptic exploring strongly influences consumer product evaluations [36] and might also be relevant for online shopping experiences [37]. On the one hand, it has been demonstrated how a product's visual description can influence the opinion that consumers form about it and, thus, influences their purchasing decision [38], [39]. This visual information can also help consumers to mentally simulate how a product is used [38], [40] by, in turn, facilitating the appearance of product-related cognitive activities, which could impact product evaluations [41]. On the other hand, the haptic information that results from coming into physical contact with products can help consumers to form an opinion about them [42], [43], and can even improve consumers' capacity to evaluate their quality [44].
Although visual and haptic information can have a separate influence on how a product is perceived and evaluated, recent studies also demonstrate that some visual characteristics can influence how physical characteristics are perceived. Accordingly, research [45], [46] into how color (cold-warm) can be related to some physical properties, such as weight (light-heavy) or size (big-small), demonstrates that perceptual color experiences form part of the mental representation of tactile object attributes, and these findings are applied in several fields like Tangible User Interfaces (TUIs).
To date, some works have investigated the different potentials of passive haptics. Lim and Follmer [47] created an application of small remote-control robots capable of transmitting physical sensations through several haptic patterns to different body parts, depending on the number of robots, movement or force of contact, among other parameters. Carvalheiro et al. [48] developed a sensor system to map users' hands and real objects, and to represent them in a synchronized manner, in real time, in a virtual environment, which is useful for simulating physical interactions. Using low-resolution passive haptics combined with high-resolution VR images has enabled HMI dashboards to be developed [49] and applied to simulation booths in the aerospace sector [50], which can help to study how to reduce learning times. Other works have studied the importance of vibrotactile feedback on touchscreen devices [51]-[53], which is capable of returning confirmation feedback of a virtual button and transmitting meanings. Other research works have focused on physical objects capable of being reconfigured to adopt distinct basic physical shapes to be used as passive haptics in VR environments [54], [55].
Although some studies defend VR as a means to evaluate products in different development stages [56], [57], and others have analyzed the possibilities of distinct haptic devices to help evaluate products' usability via VR environments [58], very few works have either studied the effect of haptic sensations on how a consumer evaluates a product presented by means of a VR environment or simultaneously compared this evaluation with that obtained by other means. Our article attempts to extend knowledge in this field by comparing the evaluation of the same product by three different means: VR with visual, but no tactile inputs; VR with visual and tactile inputs (VRPH); the real product with visual and tactile inputs (R).

III. Research Aim and Hypotheses
The present study aims to analyse the influence of different virtual presentation media (VR and VRPH) on how users perceive the product by comparing it to its traditional perception (a real product). To do so, a case study was done in which several users had to interact with a product in three settings. Evaluations of their perceived impressions in each setting were made using a semantic differential scale, which was subsequently analysed to detect any significant differences in users' evaluations.
This study posed the following hypotheses:
• H1: The medium used to present a piece of furniture influences how the users evaluate their perception of it.
• H2: The haptic interaction with the product (real or VRPH), as opposed to only the visual interaction (VR), influences the evaluation made of how users perceive the product.

A. Case Study Approach
To test the posed hypotheses, a case study design was used in which users had to interact with the same product presented by different media, in such a way that each user interacted with it through only one medium.
The product selected to conduct the present study was a chair as it is a common piece of furniture with general characteristics known by all users. To enhance their haptic experience in some of the designed scenes, a round rug was placed below the chair so that when users moved closer to the product, they could stand on it and notice its touch.
With the means selected to present the product, the following scenes were created:
1. Scene Room 1 (SR1): Real environment, in which the product was placed along with some neutral physical furnishing elements to contextualise the scene. Users were able to see and touch the real product, but could neither touch any other element in the scene, nor move the product. They could stand on the real rug.
2. Scene Room 2 (SR2): VR simulated environment. A completely virtual setting represented by means of a VR headset. Users could see the product and the neutral furnishing elements by VR, could move around this scene, and even crouch to see hidden parts of the product, but could not touch anything. SR2 simulates all the SR1 conditions (furnishings, arrangement, lighting, etc), but via VR. In this scene, users could not stand on the real rug.
3. Scene Room 3 (SR3): VRPH simulated environment. A completely virtual-simulated setting represented via a VR headset, where the product to be studied was physically located. Users could see the product and the neutral furnishing elements via VR, move around the scene as in SR2, and touch any part of the product they had to evaluate without moving it. They could even sit on it, but could not touch any other element in the scene, except for the real rug. SR3 was exactly the same as SR2, but the product under study and the rug were physically added.

B. Semantic Scale for Product Evaluations
To evaluate the product in each presentation setting, a semantic differential scale [59] was used based on bipolar pairs of adjectives about the product, which acted as product descriptors. Such scales are widely used to evaluate how products are perceived when many parameters need to be evaluated [60]-[62].
A semantic differential scale was created that contained 12 bipolar pairs of adjectives in Spanish, which was the mother tongue of the participants in the experimental phase. Researchers generally adapt a semantic differential scale in accordance with the nature of the product to be evaluated [63], and each researcher follows the research team's criterion to do so. As this criterion can be somewhat biased in some cases, the present study considered it more suitable to base the scale on a methodology already used by [64], [65]. This methodology sets three stages with which to draw up a list of bipolar pairs of adjectives by providing a list of the images of product examples taken from commercial websites (step 1) to then collect users' adjectives from these websites (step 2). Finally, the adjectives are classified and filtered according to the four pleasure categories [12], [13] (step 3). Selecting the most common adjectives used to describe a chair according to Jordan's model allows us to take a representative sample of adjectives from each of the four categories (physio-pleasure, socio-pleasure, psycho-pleasure, ideo-pleasure), which provides information related to a wider spectrum of aspects that define the product and allows a more complete and global evaluation of it. This may be of interest in order to better understand how the means of representation can influence some categories of adjectives more than others. By contrast, if only the most common adjectives had been selected without considering these categories, the information obtained with the study could have been more limited in scope.
With this method, [64] drew up a semantic differential scale made up of five bipolar pairs of adjectives, and [65] prepared a scale made up of sixteen bipolar pairs. This methodology has been adapted to consider other variables to be able to obtain a suitable semantic differential scale with which to evaluate how an industrial product is perceived by the users or potential consumers of this product typology.
In our study, information was collected from four different sources (designers, users, manufacturers and distributors) so that the selection of bipolar pairs would match the more general criterion that adequately represents the descriptive terms employed by all the involved stakeholders. Eleven designers (9 men and 2 women, an average age of 35.8 years, with an average professional experience of 10 years) and 61 users (34 men and 27 women, with an average age of 21.8 years) were contacted and asked to answer a questionnaire. The websites of the manufacturers and distributors of the products in the studied category (12 in total: Ikea, Andreu World, Viccarbe Habitat, Cappellini, Cassina, Akaba, Barcelona Design, Gandia Blasco, De Padova, Bonaldo, Fornasarig, Amazon) were systematically analysed to collect the adjectives employed to describe the products.
To devise the questionnaire to be used by professional designers and users to collect the descriptive adjectives of the examples of products in the studied category, a search was done on websites specialising in the manufacturing or distribution of these products, which resulted in 50 images. Of these, the 15 most representative ones were selected from the whole range of studied product typologies (Fig. 1). These images were edited to homogenise their appearance so that the way they looked would not condition designers and users. To collect adjectives, the 15 examples were presented one by one using Google Forms, and five descriptive adjectives were requested for each presented example. Every participant was also asked to rate how much they liked each chair on a 5-interval Likert scale, where 1 was the lowest value ("I don't like it at all") and 5 was the highest value ("I like it very much").
It is worth pointing out that when completing questionnaires, designers and users did not need to make much effort with the first adjectives because they were generally the most evident characteristics of the presented product. In many cases, however, the last two adjectives involved more effort as they were more singular and varied than the previous ones, and gave way to a richer, more varied collection of terms. In this case, the collected adjectives had both positive ("nice", "elegant", etc.) and negative ("ugly", "uncomfortable", etc.) connotations. Likewise, it is worth stressing that no adjectives with negative connotations about products were given by manufacturers and distributors. Having collected 5,611 adjectives (825 adjectives from 11 designers: each designer provided 75 adjectives, the result of writing 5 adjectives for each of the 15 chairs analysed; 4,575 adjectives from 61 users, following the same procedure as the designers; 141 adjectives from 8 manufacturers' websites; 70 adjectives from 4 distributors' websites), the list was homogenised by eliminating gender and number; that is, only the root of the term was considered, but differentiation in original sources was maintained. Then the frequency with which each adjective was repeated was counted, and those with the same meaning were grouped; e.g., "resistant" and "sturdy". Antonyms were also grouped to build the most frequent bipolar pairs of adjectives on the list by considering only the 25 most frequent ones from each source of origin. In those cases in which no antonym was available for one of the most frequent terms because they had only a positive or negative sense, they were added by the research team to create a bipolar pair, but no frequency value was added. To homogenise the order of magnitude of the repetition frequencies across the sources of origin (designers, users, etc.), the number of repetitions was weighted according to the sample size of each source.
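The counting and weighting step described above can be sketched as follows. This is only an illustrative sketch: the toy adjective lists are hypothetical, and the weighting rule (repetitions divided by the number of respondents in each source) is an assumption, since the text does not state the exact weighting formula.

```python
from collections import Counter

# Hypothetical toy data: adjective roots collected per source (not the study's data).
collected = {
    "designers": ["comfortable", "elegant", "comfortable", "resistant"],
    "users": ["comfortable", "nice", "sturdy", "nice", "elegant", "nice"],
}
# Number of respondents behind each source, used for the assumed weighting.
sample_size = {"designers": 2, "users": 3}
# Terms with the same meaning are mapped to one representative term.
synonyms = {"sturdy": "resistant"}


def weighted_frequencies(collected, sample_size, synonyms):
    """Return {source: {adjective: repetitions per respondent}}."""
    result = {}
    for source, words in collected.items():
        # Group synonyms first, then count repetitions within the source.
        counts = Counter(synonyms.get(word, word) for word in words)
        # ASSUMPTION: weight by dividing by the source's sample size.
        result[source] = {w: c / sample_size[source] for w, c in counts.items()}
    return result


freqs = weighted_frequencies(collected, sample_size, synonyms)
```

Keeping the sources separate until after weighting mirrors the text's requirement that differentiation in original sources be maintained.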
Each resulting bipolar pair of adjectives was classified according to the four pleasures categories [12], [13] and placed in order of their frequency. The three most frequent ones from each category were selected. In order to ensure that when the semantic differential scale was used one of the extremes would not be taken as positive and the other as negative, some of the bipolar pairs of adjectives were randomly reversed. Finally a 7-interval scale was included, following a Likert scale, on all 12 bipolar pairs of adjectives (Table I) by taking 0 as a neutral value and 3 as the maximum value of both extremes. The purpose was to express that a higher value involved a greater extent of identifying the evaluated product with the corresponding adjective, but by avoiding taking one of the two extremes as being positive or negative. A consensus has been reached about this scale magnitude having a sufficient degree of reliability without users having difficulties to make evaluations [63].
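As a rough illustration of this selection step, the sketch below keeps the three most frequent pairs in each pleasure category, randomly reverses the poles of some pairs and attaches the 7-interval range. The pair names, frequencies, the 50% reversal probability and the fixed seed are all hypothetical choices made for the example, not values from the study.

```python
import random

categories = ["physio", "socio", "psycho", "ideo"]
# Hypothetical pairs: four candidates per category with made-up weighted frequencies.
pairs = [
    {"left": f"{cat}-neg{i}", "right": f"{cat}-pos{i}", "category": cat, "freq": freq}
    for cat in categories
    for i, freq in enumerate([5.0, 2.5, 4.0, 1.0])
]


def build_scale(pairs, per_category=3, seed=0):
    """Select the most frequent pairs per category and randomly reverse some."""
    rng = random.Random(seed)  # fixed seed only to make the example repeatable
    scale = []
    for cat in sorted({p["category"] for p in pairs}):
        ranked = sorted(
            (p for p in pairs if p["category"] == cat),
            key=lambda p: p["freq"],
            reverse=True,
        )
        for p in ranked[:per_category]:
            left, right = p["left"], p["right"]
            if rng.random() < 0.5:  # ASSUMED reversal probability
                left, right = right, left
            scale.append(
                {"left": left, "right": right, "category": cat,
                 "values": list(range(-3, 4))}  # -3 .. 0 .. 3, seven intervals
            )
    return scale


scale = build_scale(pairs)
```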

C. Preparing Rooms for the Case Study
Three scenes were created in different rooms. In SR1, a series of real neutral furnishing elements was placed. The considered neutral furnishing elements came in basic forms, were white, grey or beige, and displayed no further decorative details. They included a medium-height shelving unit, two small pictures on the wall and a short-pile round rug placed beneath the product. The product selected for the case study was one of those selected to build the semantic differential scale (model 5 in Fig. 1). The beige-coloured Ikea Odger model was selected because, according to the evaluations made by designers and users when collecting adjectives to build the semantic differential scale, this model obtained an average score compared to the rest of the sample.
Based on SR1, a 3D scene was modelled to generate SR2 and SR3. To model the virtual scene, the following tools were used: Solidworks 2018, with which building elements (walls, floor tiles, ceilings, lighting, etc.) and auxiliary furnishing elements (shelving unit, pictures and rug) were produced; Autodesk 3ds Max 2018, with which the product to be evaluated was generated and with which all the textures, colours, materials, lighting, etc., were included to make the scene as real as possible; Unity 2017.3.1f1, with which the executable VR model was generated to immerse users in the virtual room (SR2 and SR3). The equipment employed in SR2 and SR3 consisted of a graphics workstation (HP Z420 Workstation x64, Intel Xeon processor CPU E5-1660 v2 @ 3.70GHz, 6 CPU cores, 32GB RAM and NVIDIA Quadro K5000 graphics card), an Oculus Rift VR headset, two position sensors placed at the front of the scene and two Oculus Touch controllers, which were employed only to calibrate the scene.

D. Sample (Participants)
To run the experimental phase, 203 voluntary users participated. Gender distribution was 111 men and 92 women aged between 18 and 40 years, with a mean age of 22.77 years. All the voluntary participants were studying the Degree in Industrial Design and Product Development Engineering. Regarding sample size calculation, an a priori power analysis was conducted with G*Power [66] assuming a one-way ANOVA statistical test with these input parameters: effect size: 0.25, α=0.05, (1-β)=0.85 and 3 groups. G*Power provided a total sample size of 180. To guarantee a power of at least 0.85, as set in G*Power, the total sample size used in the experiment was 203. Although a Kruskal-Wallis test was finally applied due to the non-normality of the data, we are confident that a power of 0.85 was achieved, considering that with non-symmetrical distributions the non-parametric Kruskal-Wallis test yields a higher power than the classical one-way ANOVA [67].
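For orientation, the quantities behind this a priori analysis can be written out as below. This is a sketch that assumes the usual noncentral F formulation for a fixed-effects one-way ANOVA, with f the effect size, k the number of groups and N the total sample size; the symbols are introduced here for illustration and do not appear in the original text.

```latex
\begin{align}
\lambda &= f^{2} N = 0.25^{2} \times 180 = 11.25, \\
df_{1} &= k - 1 = 2, \qquad df_{2} = N - k = 177, \\
1 - \beta &= P\bigl( F'_{df_{1},\, df_{2}}(\lambda) > F_{1-\alpha;\, df_{1},\, df_{2}} \bigr) \geq 0.85 .
\end{align}
```

With equal group sizes, a total of 180 would correspond to 60 participants per condition, so the groups actually used (65, 68 and 70) all exceed that figure.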
An initial survey was conducted with the participants to learn about their experience with VR devices: 96 (47.29%) users had no experience, 98 (48.28%) had some former experience and only 7 (3.44%) stated they were very familiar with VR devices. Two people did not answer this survey question (0.99%).

E. The Experiment Protocol
The experiment was carried out on three days within the same week to limit, as much as possible, comments about the actions performed in the experiment among users who might know one another. Participants were also asked to maintain the confidentiality of the actions they performed, at least until the experimental phase had ended.
To design the experimental phase, the sample was divided into three groups of users according to the built Scene rooms: Group 1, R (65); Group 2, VR (68); Group 3, VRPH (70).
To build the Scene rooms, two rooms were used whose size and characteristics were similar. Each scene was configured according to the presented conditions. The same room was used for SR2 and SR3, with the only difference appearing in SR3 (VRPH), with a rug and a real chair standing in the centre so that users could touch the chair. In SR2 (VR) all the elements were virtual and, hence, the real chair and rug were removed.
A protocol was written to perform the experiment in all the Scene rooms so that the sequence of steps to follow and the instructions given to participants were independent of the researcher involved in each case.
The experiment's sequence was as follows:
Stage 1. Welcome Room (2 Min.)
Step 1. The users came to the Welcome room (outside the Scene rooms), were identified to determine the assigned Scene room and signed an informed consent to participate in the experiment. An informal friendly conversation was held so that the participants would not feel worried and they were accompanied to the corresponding Scene room.

Stage 2. Scene Rooms (5 Min.)
Step 2. In SR2 and SR3, a VR headset was placed on each user and adjusted to their characteristics. Under no circumstances were users allowed to view the real scene beforehand, to avoid conditioning their subsequent evaluation. To this end, a screen was positioned to separate the area in which the VR headset was put on from the scene. It was explained to the participants that they were about to see a VR scene, that they had to respect the room's limits and that a researcher would be at their side at all times to prevent them from becoming entangled in the VR headset cable. In SR1, they simply entered the room.
Step 3. It was explained to all the users that they had to observe (and, in SR1 and SR3, touch or sit on) the product (the chair) located in the scene; they could move around it, crouch or move closer to see details. They were also told that when the observation phase had ended, they would be handed a survey to give their opinion about the product's characteristics.
Step 4. Each user had 2 minutes to experiment with the product in accordance with the conditions of each scene.
Step 5. In SR2 and SR3, the VR headset was removed behind the screen. The participants were asked about their first impression or any outstanding observation, and were then asked to go to the survey room to complete the survey and, if necessary, to provide details about the aforementioned observations.

Stage 3. Survey Room (5 Min.)
Step 6. All the users completed the survey without saying anything to anyone. A researcher remained to clarify any doubts they had about the survey's questions.
In the questionnaire employed to evaluate the studied product in each Scene, some questions were included about possible viewing problems that users may have had (myopia, astigmatism, use of glasses or contact lenses), which could have conditioned the use of the VR headset according to previous experience with VR technology. The participants had to rate the chair in accordance with all 12 semantic pairs using a 7-point semantic differential scale ("Rate the chair you just saw according to whether you think it is closer or further away from the following adjectives"). Next they had to indicate how much they liked the chair globally by scoring their answer on a 5-point Likert scale that went from 1 ("I do not like it at all") to 5 ("I like it very much"). An open space was left for the participants to include comments about the experience. Fig. 3 provides some examples of users in step 4 of stage 2. It shows the layout of the equipment utilised for the experiment in both the VR and VRPH scene rooms, namely the position sensors of the VR headset. It also shows the cable linking the VR headset and the PC, which the researchers had to supervise at all times so that users neither became entangled nor damaged the equipment.
In both SR1 and SR3, the users could touch the product they had to evaluate, and they even sat on it, but were asked to not move it from where it was placed.

V. Results
To help interpret the data collected by surveys in each setting (R, VR and VRPH), an inferential statistical method was used with which the posed hypotheses were tested. It was possible to distinguish two collected datasets: those corresponding to the differential semantic scale evaluations for each semantic pair of adjectives, and those corresponding to the "I like it" evaluation.
Regarding the first dataset, Table II includes the descriptive statistics for differential semantic scales. It is noteworthy that data were collected by a 7-interval Likert scale with a central neutral value of 0 and two extreme values of 3 (in absolute values), where a higher value indicates a better correspondence with the adjective represented on this extreme. For data processing, values on the left-hand side of the survey were coded as negative; the sign simply indicates that the adjective at that extreme fits better and has no further connotation. As the scale values were discrete, the value of the median was also discrete. Thus the values of the means and standard deviations are also indicated because they may better represent the distribution of the collected values. There were two semantic differentials for which users gave similar scores for all three conditions, which came very close to the neutral score (0), namely semantic differentials "versatile-invariable" and "fun-serious". According to the remarks collected by the evaluators in stage 3 in the survey room, these were the adjectives that users hesitated over the most when relating them to the presented product, which could have led to a weakly polarized, near-neutral score. Overall, considering only the means and standard deviations, the score for the product presented in VRPH could be regarded as more positive than those given in VR and R. Fig. 4 presents the box plots for the semantic scales, showing the distribution of the values of the collected samples for all the semantic pairs for all the studied conditions. It is noteworthy that as the discrete values corresponded to a reduced 7-interval scale, both the boxes and whiskers took the positions of the integers corresponding to this interval. Thus their interpretation had to be complemented with other values such as the mean or standard deviation.
Kolmogorov-Smirnov and Shapiro-Wilk tests (α=0.05) (Table IV) showed that semantic scales did not follow a normal distribution in all cases. Consequently, an ANOVA test proved unsuitable for testing, and Kruskal-Wallis was selected. This is a non-parametric method for testing whether samples originate from the same distribution. The viewing condition was taken as the independent variable (with levels R, VR and VRPH) and the scores for each semantic pair as the dependent variables.
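The descriptive statistics discussed here (median, mean and standard deviation per condition) could be computed along the following lines. The scores below are hypothetical values coded from -3 to 3 for a single semantic pair, used only to show the calculation; they are not the data in Table II.

```python
from statistics import mean, median, stdev

# Hypothetical scores for one semantic pair, coded from -3 (left adjective)
# to 3 (right adjective); not the study's data.
scores = {
    "R": [1, 2, 0, -1, 2, 1],
    "VR": [0, 1, -1, 0, 1, -2],
    "VRPH": [2, 3, 1, 2, 0, 2],
}


def describe(values):
    """Median, mean and sample standard deviation of a list of scores."""
    return {"median": median(values), "mean": mean(values), "sd": stdev(values)}


summary = {condition: describe(values) for condition, values in scores.items()}
for condition, stats in summary.items():
    print(condition, stats)
```

Because the scale has only seven integer positions, medians tend to land on (or halfway between) integers, which is why the means and standard deviations are reported alongside them.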
The null hypothesis of the Kruskal-Wallis test stated that the mean ranks of semantic scales scores in the three experimental conditions were the same. Firstly, four assumptions had to be checked:
1. The dependent variable should be measured at the ordinal or continuous level. In our case, semantic scale scores were measured from -3 to 3.
2. The independent variable should consist of two categorical independent groups or more. In our case, we had three independent groups (R, VR and VRPH).
3. There was no relationship between the observations in each group or between the groups themselves.
4. The distributions in each group should have a similar shape and variability (as seen in Fig. 4).
The Kruskal-Wallis test results (Table VI) revealed that the null hypothesis was not confirmed (significance level .05) on many semantic scales. As shown in bold, significant differences appeared among some semantic differentials when comparing the three conditions.
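A minimal sketch of the Kruskal-Wallis computation for three groups is given below, using average ranks and the standard tie correction. The function names and the toy scores are hypothetical. The closed-form p-value relies on the fact that, with three groups, H is referred to a chi-square distribution with 2 degrees of freedom, whose upper tail is exp(-H/2); statistical packages may differ in details such as exact versus asymptotic p-values.

```python
import math
from collections import Counter


def average_ranks(values):
    """Map each distinct value to its 1-based average rank in the pooled data."""
    counts = Counter(values)
    rank_of, start = {}, 1
    for value in sorted(counts):
        rank_of[value] = start + (counts[value] - 1) / 2
        start += counts[value]
    return rank_of, counts


def kruskal_wallis(groups):
    """Return (H, p) for exactly three independent groups of scores."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value used here needs three groups")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    rank_of, counts = average_ranks(pooled)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    tie_correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    if tie_correction == 0:
        raise ValueError("all values are identical; H is undefined")
    h /= tie_correction
    p = math.exp(-h / 2)  # chi-square upper tail with 2 degrees of freedom
    return h, p


# Hypothetical scores (-3..3) for one semantic pair in R, VR and VRPH.
toy = [[1, 2, 0, -1, 2, 1], [0, 1, -1, 0, 1, -2], [2, 3, 1, 2, 0, 2]]
h_stat, p_value = kruskal_wallis(toy)
print(f"H = {h_stat:.3f}, p = {p_value:.3f}")
```

The tie correction matters for this kind of data, since a 7-interval scale inevitably produces many tied scores.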
In order to study whether these significant differences appeared among all three conditions or were due to differences between only some of them, Dunn's post hoc test was performed (α=0.05) with adjusted p-values, to make pairwise comparisons following the Kruskal-Wallis test. The results are shown in Table VII. As multiple tests were carried out, Bonferroni adjustment was applied to all the Dunn's p-values, as presented in the last column of Table VII (adjusted p-value), which also includes the effect size (in its calculation, N = total number of observations).
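The pairwise procedure can be sketched as follows: Dunn's z statistic is computed from the mean ranks of the pooled data with a tie-corrected variance, the two-sided p-value comes from the normal distribution, and the Bonferroni adjustment multiplies each p-value by the number of comparisons (capped at 1). The effect size r = |z| / sqrt(N) used here is a common convention and is an assumption on our part; the function name and the toy scores are likewise hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def dunn_bonferroni(groups):
    """groups: {name: [scores]}. Returns {(a, b): (z, p, p_adjusted, r)}."""
    pooled = [v for g in groups.values() for v in g]
    n = len(pooled)
    counts = Counter(pooled)
    # Average (mid) rank of each distinct value in the pooled sample, 1-based.
    rank_of, start = {}, 1
    for value in sorted(counts):
        rank_of[value] = start + (counts[value] - 1) / 2
        start += counts[value]
    mean_rank = {k: sum(rank_of[v] for v in g) / len(g) for k, g in groups.items()}
    ties = sum(t**3 - t for t in counts.values())
    variance = n * (n + 1) / 12 - ties / (12 * (n - 1))  # tie-corrected
    pairs = list(combinations(groups, 2))
    results = {}
    for a, b in pairs:
        se = math.sqrt(variance * (1 / len(groups[a]) + 1 / len(groups[b])))
        z = (mean_rank[a] - mean_rank[b]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        p_adjusted = min(1.0, p * len(pairs))  # Bonferroni adjustment
        r = abs(z) / math.sqrt(n)  # ASSUMED effect-size convention
        results[(a, b)] = (z, p, p_adjusted, r)
    return results


# Hypothetical scores (-3..3) for one semantic pair; not the study's data.
toy = {
    "R": [1, 2, 0, -1, 2, 1],
    "VR": [0, 1, -1, 0, 1, -2],
    "VRPH": [2, 3, 1, 2, 0, 2],
}
for pair, (z, p, p_adj, r) in dunn_bonferroni(toy).items():
    print(pair, f"z={z:.3f} p={p:.4f} p_adj={p_adj:.4f} r={r:.3f}")
```

With three conditions there are three pairwise comparisons per semantic pair, so each raw p-value is multiplied by 3 in this sketch.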
In Table VII, 13 pairwise comparisons show statistically significant differences, and the VRPH condition is present in 10 of them. Only "Uncomfortable-Comfortable" presented significant differences in all three pairwise comparisons between conditions: the VRPH score was better than the other two, while the R score was better than that for VR.
For "Light-Heavy", differences were found only between R and VR, where VR was better. Thus no differences appeared between users' scores for "Light-Heavy" when comparing VRPH with the other two conditions, so the product's evaluation was not harmed. Conversely, differences were observed between users' evaluations for "Fragile-Resistant" or "Vulgar-Elegant" when comparing VRPH to the other two conditions, as their score for VRPH was better, as previously observed (Table II and Fig. 4). Table VII also shows semantic differentials, such as "Practical-Useless", "Ugly-Nice" or "Handmade-Industrial", for which significant differences were found only when comparing VRPH to either of the other two conditions, with the best score going to VRPH.
Regarding the second dataset, Table III includes the descriptive statistics for the overall evaluation given in response to the "I like it" question. In this case, the evaluation scale was a 5-interval Likert scale, whose minimum value was 1 and whose maximum value was 5.
The descriptive statistics in Table III reveal that, on a scale from 1 to 5, the scores for all three conditions are very close, being slightly higher in R and lower in VR. In Fig. 5, the box plot only shows the medians and many outliers because more than 50% of the values are 4. Thus it was not possible to draw boxes in the figure.
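The collapse of the boxes can be reproduced with a small Python sketch. It assumes that outliers are flagged with the usual 1.5 x IQR rule, which may not be exactly how Fig. 5 was drawn; the scores and the helper `box_summary` are hypothetical. When enough responses share the value 4 that both quartiles land on it, the interquartile range is zero and every other response falls outside the fences.

```python
import statistics

def box_summary(scores):
    """Quartiles and 1.5*IQR outliers, as used in a conventional box plot."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sorted(x for x in scores if x < low or x > high)
    return q1, median, q3, outliers

# Hypothetical 1..5 "I like it" scores in which most responses are 4.
scores = [1, 2, 3, 3] + [4] * 12 + [5] * 4
q1, median, q3, outliers = box_summary(scores)
print(q1, median, q3, outliers)
```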
To study the differences between the scores for the question "I like it", a Kruskal-Wallis test was applied because the data were not normally distributed (Table V). It showed no statistically significant difference between the three viewing conditions (R, VR, VRPH), χ²(2)=2.085, p=.353.
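The reported p-value can be checked directly from the test statistic. With three conditions the statistic is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability has the closed form exp(-x/2). The function name below is hypothetical.

```python
import math

def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)

p_value = chi2_upper_tail_df2(2.085)
print(round(p_value, 3))  # 0.353
```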

VI. Discussion
As Fig. 5 reveals, the overall evaluation that users made of the presented furnishing product is practically the same in all three conditions. So we conclude that the means employed to present this particular product does not influence its overall evaluation.
When the evaluations of the semantic differentials were analyzed for the three studied conditions (Table II), VRPH stood out in most cases with a higher mean, as previous work found using a similar product [60]. In Table VII we see that the largest significant differences in the pairwise comparisons tended to appear between VRPH and one of the other two means, and VRPH was evaluated the best. So we conclude that when this furnishing product is presented by VRPH, users evaluate it more favorably than in the R or VR condition. As the statistical differences refer only to the tested product, we cannot confidently transfer these results to other chair models or product categories. But if further studies evaluating other products obtained similar results, this could be an advantage when presenting a product in a showroom because potential buyers could gain a better impression of it. Yet if we were to employ this means in the design phase to predict future users' responses, mistaken design decisions might be made, leading to a product that users could evaluate worse when it is presented by other means.
According to the results in Table VI, some semantic differentials present significant differences depending on the viewing conditions. Thus the choice of means to present this product will depend on the category for which we wish to obtain reliable evaluations.
The semantic differentials that correspond to the categories Physio and Ideo are those with the most significant differences overall when comparing the various viewing conditions, and VRPH is the condition that obtained the best evaluations. If we bear in mind that Physio refers to the pleasures deriving from sensorial organs like touch, and Ideo refers to esthetic values, it would be logical to think that resorting to passive haptics in an interaction condition to evaluate furnishing products would lead to a better evaluation of the related semantic differentials. So VRPH would be more suitable for presenting furnishing products, where the tactile interaction and the esthetic value are important. However, since this study is limited to a single model of chair, it would be necessary to carry out other studies to check that these conclusions are also applicable to other furniture products.
On the other hand, in the categories of Socio, with a social connotation, and Psycho, with an emotional connotation, no viewing condition stood out from the rest. Therefore, these results may suggest that VR could be used to present products with a social or emotional character without harming users' evaluations. So making a physical product available for evaluation would not be necessary, as the evaluation could be done from home without going to a physical store. This could also be useful in some design process phases to evaluate product alternatives without having to build a physical prototype. However, given the limitations of this study, these assumptions need to be tested by further research.

VII. Conclusions
What the present study demonstrates is that the way in which this piece of furniture is presented (R, VR, VRPH) influences how users perceive it (H1). It also demonstrates that differences are found between the scores given when this product is presented by a virtual means (VR or VRPH) and by a physical means (R). Finally, it demonstrates that users' haptic interaction with this product (R or VRPH), as opposed to only their visual interaction (VR), influences how users perceive and evaluate it (H2).
The semantic pairs were selected and grouped into four categories according to Jordan's model in order to analyse accurately how the presentation means influences users' responses. It is worth stressing that the experimental results showed that not all four categories behaved in the same way when the presentation type varied. This is very important if we wish to use these technologies in the initial phase of the design cycle, where the purposes are to evaluate several design alternatives, and to use VR technologies to avoid building physical prototypes and to speed up decision making. However, the experiment would need to be extended to include other product samples and categories before these conclusions could be generalised.
The scores given when observing this product in VRPH were higher than the R and VR scores in most semantic pairs. So it is worth stressing that observing and interacting with products using VRPH could result in the product being positively valued, which could favor purchasing decisions. VRPH was also the means in which certain physical characteristics of this chair, like "comfortable" or "resistant" (Physio), were more positively evaluated than in a means in which touching is not allowed (VR). Thus using VRPH as a presentation means seems suitable if higher evaluations are sought for consumer products that relate closely to physical and tactile characteristics, like chairs, but it does not necessarily influence the evaluation of products with marked social (Socio) or emotional (Psycho) characteristics more positively than other means.
Again, the conclusions drawn from this study are limited by the fact that only one product was analysed. Further research is therefore necessary to draw more general conclusions that can also be applied to other similar products, or to other product categories.