Glossiness perception and its pupillary response

Recent studies have discovered that pupillary response changes depend on cognitive factors, such as subjective brightness caused by optical illusions and luminance. However, it remains unclear how the cognitive factor derived from the glossiness perception of object surfaces affects pupillary response. We investigated the relationship between glossiness perception and pupillary response through a gloss rating experiment that included recording pupil diameter. For the stimuli, we prepared general object images (original) and randomized images (shuffled) that comprised the same images with randomized small square regions. The image features were then controlled by matching the luminance histogram. The observers were asked to rate the perceived glossiness of the stimuli presented for 3,000 ms, and changes in their pupil diameter were recorded. As a result, when the glossiness of an original image was rated as high, that of its shuffled counterpart was rated as low, and vice versa. High-gloss images constricted the pupil more than low-gloss ones near the peak of the pupillary light reflex. By contrast, the shuffled images dilated the pupil more than the original images at a relatively later stage. These results suggest that local features comprising specular highlights involve the cognitive factor for pupil constriction, and that this process is faster than the pupil dilation derived from the inhibition of object recognition.


Introduction
Glossiness perception is essential for estimating the surface properties of objects and plays an important role in the human visual system (Adelson, 2001; Anderson, 2011; Chadwick & Kentridge, 2015; Fleming, 2014, 2017; Komatsu & Goda, 2018). Glossy objects contain specular highlights on their surfaces (Beck & Prazdny, 1981; Fleming et al., 2003; Shinya & Nishida, 1998), and these brighter regions, which represent simple features derived from the luminance histogram, serve as a cue for the perceived glossiness (Motoyoshi et al., 2007; Nishida, 2019; Sawayama & Nishida, 2018; Sharan, Li, et al., 2008; Wiebel et al., 2015). In contrast, several studies have noted that the three-dimensional structure of an object is essential for glossiness perception (Anderson & Kim, 2009; Kim & Anderson, 2010; Marlow et al., 2012; Marlow & Anderson, 2013). Thus, opinions vary regarding the cues used by the visual system to estimate the surface glossiness. In addition, the perceived glossiness is affected by components other than bright specular reflection, such as the dark region of an object surface that is caused by specular reflection (Kim et al., 2012; Kiyokawa et al., 2019) and image spatial frequency components (Kiyokawa et al., 2021). Therefore, glossiness perception is a complex phenomenon that involves luminance changes on the retina and requires mid-level visual processing (Fleming, 2014) in the visual system.
Glossiness perception is triggered by the physical and image-based characteristics of glossy objects, which has been elucidated through neurophysiological findings. Certain human behavior and eye movements are associated with glossy objects (Phillips et al., 2010; Qi et al., 2018; Sharan, Rosenholtz, et al., 2008; Toscani et al., 2019), which suggests that the visual system efficiently obtains the information source from the external world at the input stage. Subsequently, the information is represented in the inferior temporal cortex through the ventral stream, which has been reported using functional MRI and physiological techniques in humans (Sun et al., 2015; Wada et al., 2014) and macaques (Baba et al., 2021; Komatsu et al., 2021; Nishio et al., 2012, 2014; Okazawa et al., 2012). However, the temporal dynamics that link glossiness perception processing (that is, how networks share roles in the visual system) remain unclear. The temporal reactions while evaluating glossiness have been reported in behavioral studies (Sharan et al., 2008; Nagai et al., 2015). If a hierarchical structure of material perception exists in the visual system (Komatsu & Goda, 2018), the visual (e.g., glossiness and transparency) and non-visual material features (e.g., heaviness and hardness) would be processed through different pathways and their physiological reactions would be temporally differentiated.
In this study, we focused on recording the pupillary response to demonstrate the perceptual processing affected by glossy objects. The pupil constricts and dilates in reaction to brighter and darker ambient lights, respectively, as a physiological reflex. Several studies have revealed that these pupil changes can be observed without fixation to the physical stimuli and suggested that the amount of covert attention can be tracked by recording the pupils (Binda et al., 2013a; Bombeke et al., 2016; Mathôt et al., 2013; Naber et al., 2013). Additionally, previous studies have demonstrated that pupillary responses reflect not only the physical factors derived from luminance but also perceptual factors, with emotional (Bradley et al., 2008; Kuraguchi & Kanari, 2020, 2021; Laeng et al., 2013) and attractive (Liao et al., 2021) stimuli dilating the pupil. Many studies have focused on the amount of pupillary light reflex (PLR). For example, subjective brightness that is induced by high-level image contents constricts the pupil (Binda et al., 2013b; Naber & Nakayama, 2013). The presentation of photographs of the sun was found to reduce the pupil size relative to the control conditions, such as equal luminance, phase scrambled images, or the photographs of the moon (Binda et al., 2013b). Similarly, upright images with the sun caused more pupil constriction compared to an inverted image in both natural scenes (photographs) and artificial scenes (cartoons) (Naber & Nakayama, 2013). The effects on the pupil size by subjective brightness were approximately 0.1 mm (Binda et al., 2013b), 0.13 mm, and 0.07 mm (inversion effect and sun effect by Naber & Nakayama, 2013) and the latencies of the peak constriction were less than or close to 1,000 ms from the stimulus onset.
Furthermore, cognitive factors are believed to influence pupil constriction, particularly through the illusory brightness induced by the self-luminosity of the glare illusion (Kinzuka et al., 2021; Suzuki, Minami, & Nakauchi, 2019; Suzuki, Minami, Laeng, et al., 2019). Interestingly, the peak pupil constriction caused by the illusory brightness from a blue-colored glare illusion was approximately 0.1 mm (Suzuki, Minami, Laeng, et al., 2019), which almost corresponds to the aforementioned effect of subjective brightness. Additionally, the pupil constricts in response to both internally perceived subjective brightness and externally observed perceptual brightness (Laeng et al., 2018; Laeng & Endestad, 2012; Suzuki, Minami, & Nakauchi, 2019; Suzuki, Minami, Laeng, et al., 2019; Zavagno et al., 2017). Thus, the glossiness of an object may likewise act as a perceptual factor and affect pupil responses. On the basis of these studies, we hypothesized that glossiness perception induces changes in pupil responses. Specifically, we expected specular highlight components that are structured by brighter regions in an image to decrease the pupil diameter.
Therefore, this study aimed to elucidate how pupil responses reflect the perceived glossiness. We hypothesized that an image with high perceived glossiness causes pupil constriction even if its physical luminance is equalized. Alternatively, there may be no significant difference because the effectiveness of a perceptual factor influenced by perceived glossiness is smaller than that of a physical factor based on luminance. We conducted an experiment to investigate how the perceived glossiness relates to pupil responses through a glossiness rating task in human psychophysics with pupillometry recordings. The following stimuli were employed for the experiment: 1) images of common objects in daily life and 2) pixels that were shuffled while maintaining the local features of the images of the objects. The second case was included to validate that the pupil responds even if the stimuli are only textures and lack any semantic aspect. If the original and shuffled conditions change the perceived glossiness but do not change the pupil response, it could be expected that the cognitive factor (how glossiness is perceived) does not affect the pupil response. Thus, we tested the modulation of the perceived glossiness based on the global features of objects and how they reflect pupil responses. Furthermore, we used an exploratory approach incorporating linear mixed-effects models (LMEMs) to predict the pupillary responses that were derived from 1) the perceived glossiness, 2) the physical luminance, 3) original/shuffled, 4) the luminance histogram features (Motoyoshi et al., 2007; Sharan, Li, et al., 2008; Wiebel et al., 2015), and 5) the stimulus size (Gao et al., 2020) to validate the impact of potential factors for glossiness and physical luminance on pupil responses.

Observers
Twenty-four naïve observers participated in the experiment. Two observers were excluded owing to eye-tracking calibration failure for one and an eyesight issue with the other. Thus, data from 22 observers were used for further analysis. Their ages ranged from 20 to 25 years (average 22.9 ± 1.2 years) and they had normal or corrected-to-normal acuity. All experimental protocols were approved by the Institutional Review Board of the Toyohashi University of Technology for their use on humans in experiments, in accordance with the Declaration of Helsinki. Written informed consent for publication of their details was obtained from the study participants.

Apparatus
The experiment was conducted in a dark booth with dim lighting (approximately 60 lx). The stimuli were displayed on a 27-inch liquid crystal display (ColorEdge 27 CS2731, EIZO) with a resolution of × 1080 pixels and a refresh rate of 60 Hz (calibrated by SpyderX Elite, ImageVISION). Each observer viewed the stimuli after being seated on a chair in the booth with their head secured on a chin rest to maintain a constant distance of 86 cm. The pupillary response was recorded using an eye tracker (EyeLink Portable Duo, SR Research) with a sampling rate of 500 Hz. A five-point calibration of the eye tracker was performed prior to each experimental session. The stimulus presentation was controlled by MATLAB using Psychtoolbox 3.0 (Brainard, 1997; Kleiner et al., 2007; Pelli, 1997).

Stimuli
We obtained 60 images of general daily life objects from the THINGS database (Hebart et al., 2019, 2020). First, a naïve participant who did not join the experiment was asked to classify 1,854 object names from the THINGS database into three groups: object, creature, and food. Subsequently, 209 creatures and 284 foods were excluded because we focused on general objects that are likely to possess gloss-related surfaces. Second, the remaining 1,361 object names were divided into high- and low-glossiness groups. We obtained 221 high-glossiness and 1,140 low-glossiness objects. Finally, 30 images were selected from each of the high- and low-glossiness groups, each containing a sufficient area of the object whose appearance was to be evaluated.
Each image was selected from a different category that was defined in the database. We scaled the images down to 512 × 512 pixels, transformed each original RGB color image into grayscale, and trimmed the background while retaining the main object. Thereafter, the image features (mean, variance, skewness, and kurtosis) from the pixel intensity histogram of these images were equalized by the histogram matching of the SHINE toolbox (Willenbockel et al., 2010), in which we set the target histogram as the average of the 60 images. The matched image features were as follows: mean = 27.24 ± 0.68 cd/m², variance = 795.00 ± 38.78, skewness = 1.25 ± 0.06, and kurtosis = 3.71 ± 0.20. We defined these 60 images as "original" (see Fig. 1A).
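The four histogram features named above can be computed from the pixel values of the object region. The following is a minimal sketch, not the SHINE toolbox procedure: it assumes population (divide-by-n) moments and non-excess kurtosis, neither of which is stated in the text, and the function name is hypothetical.

```python
def histogram_moments(pixels):
    """Return (mean, variance, skewness, kurtosis) of a flat list of pixel values.

    Assumptions (not stated in the paper): population moments and
    non-excess kurtosis, which equals 3 for a normal distribution.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return mean, m2, skewness, kurtosis
```

A positive skewness, as in the matched stimuli, means that the histogram has a long tail toward bright pixel values.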
A "shuffled" image was generated from each original image to validate whether the pupil constricts owing to the perceived glossiness, irrespective of the semantic effect of the object. Several studies in the field of visual material perception have applied pixel shuffling in a 1 × pixel resolution (Kuriki, 2015; Miyakawa et al., 2017; Nishio et al., 2012; Okazawa et al., 2011; Yang et al., 2019), which can completely retain the original histogram features. However, these shuffled images contain more high-frequency components and are likely to cause pupil reactions (e.g., Cocker & Moseley, 1996). Thus, we performed a different type of shuffling to suppress this effect and generate images with different glossy impressions. Fig. 1B depicts the flow of this procedure. The object region of each image of 512 × 512 pixels was divided into 64 × 64 pixel patches and the square-shaped regions were randomly shuffled. The remaining region inside the object contour was further divided into 32 × 32 pixel patches, which were then randomly shuffled. We repeated this shuffling process, starting with a patch size of 64 × 64 pixels, until 1 × 1 pixel patch sizes were obtained. The output images from this manipulation maintained the local features, whereas the global features of the object were inhibited. Finally, we used 120 images (60 images in each of the original and shuffled conditions) for the stimuli in the experiment.
The stimulus area was 6 × 6 degrees, and one object existed inside this area. The original and shuffled images had the same image features with the same luminance histogram because the shuffled images only scrambled the pixels of the original images in different region sizes. We expected that the perceived glossiness would be modulated by the difference between the original and shuffled stimuli.
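The coarse-to-fine shuffling described above can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' implementation: it assumes that patches are aligned to a regular grid, that a patch is used only if it lies entirely inside the not-yet-shuffled object region, and that the patch size halves at each step. The function name and its parameters are hypothetical.

```python
import random

def shuffle_patches(image, mask, start=64, seed=0):
    """Shuffle square patches inside an object mask, from coarse to fine.

    image: list of rows of pixel values; mask: list of rows of booleans
    (True inside the object). Grid alignment and size halving are assumptions.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    remaining = [row[:] for row in mask]  # True where pixels still await shuffling
    size = start
    while size >= 1:
        # Grid-aligned patches that lie entirely in the remaining region.
        corners = [
            (y, x)
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)
            if all(remaining[y + dy][x + dx]
                   for dy in range(size) for dx in range(size))
        ]
        # Copy the patch contents, then write them back in a random order.
        patches = [[out[y + dy][x:x + size] for dy in range(size)]
                   for y, x in corners]
        rng.shuffle(patches)
        for (y, x), patch in zip(corners, patches):
            for dy in range(size):
                out[y + dy][x:x + size] = patch[dy]
                for dx in range(size):
                    remaining[y + dy][x + dx] = False
        size //= 2
    return out
```

Because pixels are only moved, never altered, the output keeps exactly the same luminance histogram as the input, and pixels outside the mask are left untouched.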
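As a worked example of the geometry, the physical extent of a 6-degree stimulus at the 86 cm viewing distance follows from the standard visual-angle relation, size = 2 d tan(θ/2). The helper name below is hypothetical, and the sketch assumes a flat screen viewed head-on.

```python
import math

def visual_angle_to_cm(angle_deg, distance_cm):
    """Physical size subtending angle_deg at distance_cm (flat screen, viewed head-on)."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg / 2))

stimulus_cm = visual_angle_to_cm(6, 86)
print(round(stimulus_cm, 2))  # → 9.01
```

Each side of the stimulus therefore spans roughly 9 cm on the display.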

Procedure and task
Fig. 1C shows the trial sequence of the experiment. After a 1,000 ms inter-stimulus interval (ISI), the stimulus was presented on a gray background (9.19 cd/m²) with a black fixation cross (0.15 cd/m²; 0.6 × 0.6 degrees) at the center of the screen for 3,000 ms. Subsequently, a response-receiving screen was displayed. The observers were asked to rate the perceived glossiness of the stimulus using a seven-point scale (1: lowest and 7: highest) with a numerical keypad. The next trial began immediately after the response was received. The pupil response was recorded during the experiment. Each session consisted of 120 trials (one trial × 60 objects × two image conditions) and each observer participated in two sessions separated by a break.

Data analysis
We computed the average glossiness ratings of all images in the original and shuffled conditions from the responses of the observers. Responses in the trials that were excluded from the pupil analysis were not included in the analysis.
We analyzed the pupil changes in the interval from the beginning of the fixation point to the end of the stimulus presentation. Intervals with a pupil response velocity greater than 0.011 mm/ms were considered as eye blinks. These intervals, as well as 20-ms margins before and after each interval, were excluded from the analysis. In addition, trials that included eye blinks exceeding 1 s, eye blinks with a ratio over 0.3 in a trial, and pupils that were not detected at the beginning or end of the trial were excluded. An observer with a rate of rejected trials over 50% was excluded and we used the data of the remaining 21 observers (mean and standard deviation of rejected trials: 9.2 ± 11.4%). Baseline correction was performed using the same procedure as that in Nakakoga et al. (2021). The baseline was defined as the average pupil diameter for 200 ms prior to the beginning of the stimulus presentation, and we computed the diameter change of each pupil by subtracting the baseline from the original pupil diameter. Thereafter, a moving average filter with a 10 ms window size was applied to the responses.
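The per-trial preprocessing steps above (velocity-based blink rejection with 20-ms margins, subtractive baseline correction over 200 ms, and a 10-ms moving average) can be sketched as below. This is a simplified illustration under stated assumptions: samples are 2 ms apart (500 Hz), rejected samples are marked as `None` rather than interpolated, and the moving average is centered. The paper does not specify these details, and all function names are hypothetical.

```python
def mask_blinks(pupil, dt_ms=2.0, max_velocity=0.011, margin_ms=20.0):
    """Set samples whose velocity exceeds max_velocity (mm/ms), plus margins, to None."""
    n = len(pupil)
    margin = int(round(margin_ms / dt_ms))
    out = list(pupil)
    for i in range(1, n):
        if abs(pupil[i] - pupil[i - 1]) / dt_ms > max_velocity:
            for j in range(max(0, i - margin), min(n, i + margin + 1)):
                out[j] = None
    return out

def baseline_correct(pupil, onset_index, dt_ms=2.0, baseline_ms=200.0):
    """Subtract the mean of the baseline_ms window preceding stimulus onset.

    Assumes onset_index is at least baseline_ms / dt_ms samples into the trace.
    """
    k = int(round(baseline_ms / dt_ms))
    window = [v for v in pupil[onset_index - k:onset_index] if v is not None]
    baseline = sum(window) / len(window)
    return [None if v is None else v - baseline for v in pupil]

def moving_average(pupil, dt_ms=2.0, window_ms=10.0):
    """Centered moving average that skips masked (None) samples."""
    half = max(1, int(round(window_ms / dt_ms))) // 2
    out = []
    for i, v in enumerate(pupil):
        vals = [u for u in pupil[max(0, i - half):i + half + 1] if u is not None]
        out.append(sum(vals) / len(vals) if v is not None and vals else None)
    return out
```

A trial would be passed through the three functions in that order; the trial-level exclusion criteria (blink duration, blink ratio, missing samples at the trial boundaries) would be applied separately.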
LMEMs were determined to estimate the association between the pupillary responses and presumed factors quantitatively (see LMEMs).

Glossiness ratings and pupil diameters: Comparisons of the original and shuffled images
Fig. 2A shows the average glossiness ratings for each stimulus as a histogram. The ratings of the original images form two peaks and are broadly distributed (average: 3.91 ± 1.33, range: 1.57 to 6.14). However, the shuffled ratings exhibit a center-peaked distribution (average: 3.82 ± 0.77, range: 2.20 to 6.05). These findings indicate that the perceived glossiness changes with the shuffling operation despite the fact that both image types possess the same image features. All stimuli and their average rating scores are listed in Appendix 1.
Fig. 2B shows the average pupillary responses for all original and shuffled images. Both indicate similar trends of pupil constriction with its peak near 1,000 ms after the stimulus onset, hereafter referred to as the PLR. The minimum peaks of pupil responses for the original and shuffled images among the observers indicated no significant difference (t(20) = -1.62, p = 0.120, CI = [-0.042, 0.005], Cohen's d = -0.355). Fig. 3A depicts the pupillary responses averaged by each rating provided by the observers (1: lowest, 7: highest). We found that the pupil constricted more when the observer rated the images higher, even when the luminance of the stimuli was precisely controlled. The curves in Fig. 3A show gradual qualitative changes in the peak constrictions of the PLRs according to the rating score. Fig. 3B and 3C show the average pupillary responses in the original and shuffled conditions, respectively. The pupil responses of the original images indicate a clearer gradual change depending on the perceived glossiness, whereas those of the shuffled images are intermingled.
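The comparison of the minimum peaks is a within-observer comparison with 20 degrees of freedom, which is consistent with a paired t-test over 21 observers. The sketch below shows that computation together with the paired-samples effect size (mean difference divided by the standard deviation of the differences). Which variant of Cohen's d the authors used is an assumption, and the function name is hypothetical.

```python
import math
import statistics

def paired_t_and_d(a, b):
    """Paired t statistic, degrees of freedom, and Cohen's d for paired samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)       # sample SD of the differences
    t = mean_d / (sd_d / math.sqrt(n))
    d = mean_d / sd_d                    # so that t = d * sqrt(n)
    return t, n - 1, d
```

Under this definition t = d·√n, so with n = 21 a t of -1.62 corresponds to a d of about -0.35, in line with the reported effect size.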

LMEMs
We employed LMEMs to quantify the potential factors affecting pupil responses using the lme4 package (Bates et al., 2015) in R. We developed models under the maximal random effects structure to determine the factors that contribute the most to predicting the pupil responses (Barr et al., 2013; Brauer & Curtin, 2018). The model started with a maximally complex formula, and unnecessary variables were automatically eliminated based on the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC). Thus, the arbitrariness of the model design was minimized.
The dependent variable (Pupil) was the minimum peak of the pupil diameter for the stimulus duration. The model incorporated fixed effects for 1) Rate: glossiness ratings, 2) GazeLum: gazed local luminance, 3) Category: stimulus category (original/shuffled), 4-7) four image features of the stimulus (mean, variance, skewness, and kurtosis), and 8) Area: the ratio that the object area occupied in the stimulus domain. The glossiness ratings were obtained directly from the responses of the observers.
The gazed local luminance was defined as the average luminance of the regions within one degree of the visual angle of the gaze point for the stimulus duration (Liao et al., 2021). This factor was included to test whether the local luminance at the gaze position of the participant contributed to predicting the peak constriction, even when we instructed the participant to fixate on the center of the stimulus and controlled the image features of the luminance histogram. The stimulus category indicated whether the stimulus was original or shuffled as a dichotomous predictor variable (1: original, 2: shuffled). The image features were simple image histogram statistics, namely the mean, variance, skewness, and kurtosis of the object regions of the stimulus. The stimulus area was adopted to verify whether the size of the luminance-controlled stimuli affected the pupil size, as in Gao et al. (2020). In addition, the model included random effects for the observers (Observer) and object variations (Object) in the stimuli.
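For reference, the two information criteria used for model selection follow their standard definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood. The helper names in this short sketch are hypothetical.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; penalizes parameters more heavily for large n_obs."""
    return n_params * math.log(n_obs) - 2 * log_likelihood
```

Both criteria reward fit and penalize complexity, so a reduced model is preferred when dropping a term lowers them.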
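The gazed local luminance predictor can be illustrated as follows: for each gaze sample, average the luminance of the pixels within a fixed radius of the gaze position, then average over the samples recorded during the stimulus. This is a sketch only. The conversion from one degree to pixels depends on the display geometry, so `radius_px` is a hypothetical parameter, as are the function name and the data layout.

```python
import math

def gazed_local_luminance(luminance, gaze_samples, radius_px):
    """Mean luminance within radius_px of each (x, y) gaze sample, averaged over samples.

    luminance: list of rows of luminance values (e.g., cd/m²).
    Samples whose disc falls entirely outside the image are skipped;
    at least one sample must overlap the image.
    """
    h, w = len(luminance), len(luminance[0])
    r = int(math.ceil(radius_px))
    per_sample = []
    for gx, gy in gaze_samples:
        total, count = 0.0, 0
        for y in range(max(0, int(gy) - r), min(h, int(gy) + r + 1)):
            for x in range(max(0, int(gx) - r), min(w, int(gx) + r + 1)):
                # Keep only pixels inside the circular window around the gaze point.
                if (x - gx) ** 2 + (y - gy) ** 2 <= radius_px ** 2:
                    total += luminance[y][x]
                    count += 1
        if count:
            per_sample.append(total / count)
    return sum(per_sample) / len(per_sample)
```

With a uniform image the function returns the image luminance regardless of where the observer looks, which is why this predictor only matters when gaze drifts toward locally brighter or darker regions.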
First, we set random slopes and intercepts for all variables, without any interactions. Thereafter, we simplified the random effect structure by suppressing the correlation parameters to avoid convergence failure, singularity (Brauer & Curtin, 2018), and incomplete computation. The dichotomous predictor (Category) was centered and the other continuous predictors underwent cluster-mean centering and rescaling (Brauer & Curtin, 2018). The initial model was defined as follows: The AIC and BIC of the model were 12,708 and 12,830, respectively. Second, the fixed and random effect structures were determined by the step function in the lmerTest package (Kuznetsova et al., 2017), using a backward reduction method to find the best fit by maximum likelihood. The final model (AIC = 12,698 and BIC = 12,776) was represented as follows: According to the model, the minimum peak of the pupil diameter was predicted by the glossiness rating, stimulus category, variance of the luminance histogram, and stimulus area, with by-observer random slopes for the glossiness rating, gazed local luminance, stimulus category, skewness of the luminance histogram, and stimulus area as well as by-object random intercepts.
As shown in Table 1, the estimates of the fixed effects of the glossiness rating, variance of the luminance histogram, and stimulus area were negative values, which means that increasing these factors reduced the pupil diameter. Furthermore, the estimate of the fixed effect of the stimulus category was a positive value, which means that the original images constricted the pupil more strongly than the shuffled images. Notably, the model did not show multicollinearity among the variables because all variance inflation factors were less than 10.
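The prose description of the final model can be written out as a sketch, with i indexing trials, o observers, and j(i) the object shown on trial i. The symbols β, u, w, and ε are notation introduced here, not the authors' own. Only the terms named in the text are included: a by-observer random intercept is not mentioned and is therefore omitted, and the random effects are taken as uncorrelated, following the stated simplification.

```latex
\begin{aligned}
\mathrm{Pupil}_{io} ={}& \beta_0 + \beta_1\,\mathrm{Rate}_{io} + \beta_2\,\mathrm{Category}_{io}
  + \beta_3\,\mathrm{Variance}_{io} + \beta_4\,\mathrm{Area}_{io} \\
&+ u_{1o}\,\mathrm{Rate}_{io} + u_{2o}\,\mathrm{GazeLum}_{io} + u_{3o}\,\mathrm{Category}_{io}
  + u_{4o}\,\mathrm{Skewness}_{io} + u_{5o}\,\mathrm{Area}_{io} \\
&+ w_{j(i)} + \varepsilon_{io},
\qquad u_{ko} \sim \mathcal{N}(0, \sigma_k^2),\quad
w_{j} \sim \mathcal{N}(0, \sigma_w^2),\quad
\varepsilon_{io} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

The first line holds the fixed effects, the second the by-observer random slopes, and the third the by-object random intercept and the residual.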
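The multicollinearity check can be illustrated with a small variance inflation factor (VIF) computation. For standardized predictors, the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix. The sketch below is a plain-Python illustration, not the R routine the authors may have used, and it assumes that no predictor is constant or an exact linear combination of the others. All function names are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long predictor columns."""
    n = len(columns[0])
    centered = [[v - sum(c) / n for v in c] for c in columns]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def variance_inflation_factors(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]
```

With two predictors this reduces to 1 / (1 - r²), so uncorrelated predictors give a VIF of 1 and values approaching the threshold of 10 indicate strongly overlapping predictors.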

Discussion
This study investigated the relationship between glossiness perception and pupillary responses through a glossiness rating experiment with pupil recordings. The stimuli included original and shuffled images, wherein image features that were derived from the luminance histogram were controlled to the same values. We hypothesized that the pupillary responses were either constricted by the perceptual factors relating to higher glossiness perception or that there was no difference owing to the control of the physical factors (the image features).
The experimental results demonstrated that the ratings of the shuffled images were more centered and clustered than those of the original objects (Fig. 2A). The observers may have found it harder to judge the glossiness of the shuffled images. This suggests that the shuffling process inhibited the global features of the images, such as the congruency of the specular highlights, while retaining more local features. This finding supports previous studies that showed that the congruency of specular highlights serves as a cue for glossiness perception (Kim et al., 2011; Marlow et al., 2011; Marlow & Anderson, 2013; Todd et al., 2004).
The pupillary recordings revealed that higher-glossiness-rated images trigger stronger constriction than lower-glossiness-rated images. We found a difference of approximately 0.1 mm at the minimum peak of pupil constriction between the lowest and highest ratings (1 and 7), as illustrated in Fig. 3A. This difference was almost the same as several previously reported effects on the pupil responses, such as those of subjective brightness according to high-level image contents (Binda et al., 2013b; Naber & Nakayama, 2013) and illusory brightness (Laeng & Endestad, 2012; Suzuki, Minami, Laeng, et al., 2019). The fitted model (Table 1) showed that the pupil diameter could be predicted by the glossiness rating, stimulus category, variance of the stimulus luminance histogram, and stimulus area. In terms of the first predictor, the object surfaces were perceived as shinier, which resulted in pupil constriction, similar to subjective brightness (Binda et al., 2013b; Bombeke et al., 2016; Laeng et al., 2018; Laeng & Endestad, 2012; Naber & Nakayama, 2013). In other words, higher-glossiness-rated images may induce an illusory brightness, leading to pupil constriction (Kinzuka et al., 2021; Suzuki, Minami, & Nakauchi, 2019; Suzuki, Minami, Laeng, et al., 2019; Zavagno et al., 2017).
We might be able to interpret these findings in terms of eye movements or attention. People tend to direct their gazes more around specular reflections of glossy objects than those of matte ones (Lavoué et al., 2018; Phillips et al., 2010). In addition, several behavioral studies have consistently shown that the visual system focuses on brighter (darker) regions and elicits pupil constriction (dilation) (Binda et al., 2013a; Mathôt et al., 2013; Naber et al., 2013), which suggests that the pupil response can indicate the amount of covert attention. We considered the change in the gaze points and the local luminance during the stimulus duration and confirmed that the observers viewed the center of the stimulus because of the fixation point. Nevertheless, the pupils constricted in the case of high-glossiness images, suggesting that specular highlights may be obtained by covert attention.
Another possible interpretation of the reduced pupil size is the near pupil response, which induces accommodation, convergence, and pupil constriction to obtain the best focus on stimuli (e.g., Kasthurirangan & Glasser, 2006). We considered the amplitudes of the spatial frequencies of the high- and low-glossiness rated original stimuli because more glossy objects would contain sharper components, which would aid accommodation and constriction. Contrary to this expectation, the amplitudes of the low-glossiness images were higher than those of the high-glossiness images, suggesting that the pupil constriction according to the glossiness rating cannot be explained by the near pupil response alone.
The second predictor, namely the stimulus category, indicated that the pupil diameter varied according to whether the stimulus was original or shuffled. The shuffled stimuli elicited a smaller pupil constriction even when the image features were the same as those of the original. This suggests that the observers may have had difficulty recognizing the object in the shuffled condition. In particular, a previous study showed that the pupils responded to unknown objects (Beukema et al., 2019), with some cognitive processing dilating the pupil diameter, such as the mental workload (Klingner et al., 2011). Likewise, the shuffled images contained unfamiliar objects, which led to pupil dilation similar to the old/new effect (Kafkas & Montaldi, 2011; Otero et al., 2011). Although the total area of higher luminance pixels in the original images was the same as that of the shuffled images, the shuffling broke the object structure but increased the number of clusters in the pixels that constituted specular highlights. These clusters would be rated as having high glossiness and would induce pupil constriction even if the object was not identified. We believe that the perceptual factor influences the changes in the pupil size according to whether a target is an object or simply a texture.
The third predictor of the fitted model, namely the variance of the stimulus luminance histogram (i.e., contrast), also contributed to predicting the pupillary responses. A previous study reported on the importance of contrast for glossiness perception (Wiebel et al., 2015), in which the pupil would respond to a high-contrast image that was derived from specular highlights. Although positive skewness is also known as a cue for high-glossiness perception (Motoyoshi et al., 2007; Sharan, Li, et al., 2008), our results indicate that the stimuli contain both high- and low-glossiness perception (see Fig. 2 and Appendix 1). This suggests that more top-down factors, such as the memory relating to the object, affect glossiness judgement in the visual system. An alternative aspect is that other cues such as darker regions contribute to the perceived glossiness (Kim et al., 2012; Kiyokawa et al., 2019). The fourth predictor, namely the stimulus area, revealed that a larger stimulus size contributed to pupil constriction. This could be simply explained by the increase in the incoming light to the eyes with the stimulus size. In addition, such pupil constriction was observed in different stimulus sizes, even when the average luminance was equally matched across the set of stimuli (Gao et al., 2020). Crucially, the other three factors in addition to the stimulus area (Rate, Category, and Variance) in the fixed effect contributed to predicting the peak constriction. We confirmed that even when the stimulus area (Area) was removed from the initial model, these predictors remained as fixed effects in the final model through the backward reduction method. Thus, we suggest that our approaches are sufficiently effective to account for the candidate predictions of the pupillary responses with combinations of several physical and perceptual factors.
Furthermore, the random effects of the fitted model suggest that variations in the observers and objects affected the pupil size. Specifically, individual differences in the functions of the visual pathways could explain the by-observer random slopes for the variables in Table 1. The object variations only changed the baseline of the pupil diameter, as indicated by the by-object random intercepts. Thus, in addition to the fixed effects, the random effect structure of the fitted model presumably predicted the change in the pupil diameter.
Pupil changes relating to subjective brightness have been discussed in terms of latency as well as the size of the peak constriction. We examined the effect of latency using the same procedure with the LMEMs as that of the peak size, where Pupil, which is the dependent variable in Eq. (1), was replaced with latency. The results showed that only the main effect of Category was significant (the latency for the original images was longer than that for the shuffled images), and Rate did not remain as a fixed effect. We also developed another model that included the interaction between Rate and Category. This model yielded the same results. These models suggest that viewing more glossy images increases the amount of constriction but does not affect latency. In other words, viewing glossier images increases the velocity of the pupil response, similar to the way in which viewing higher illusory brightness increases the velocity of the peak constriction (Suzuki, Minami, Laeng, et al., 2019). In terms of the perceived glossiness, we speculate that signals from cortical processing may contribute to the PLR changes multiplicatively. In contrast to such a perceptual trigger, when viewing stimuli with rather high physical luminance (>10,000 cd/m²), the latency of the pupil constriction is reduced (Bergamin & Kardon, 2003). Although such a quantitative difference in the incoming light is difficult to use as a direct comparison, the latency shifts in the PLRs would depend on the nature of the source of the pupil constriction (perceptual or physical factors).
Several limitations remain in the current study. First, general objects in the real world with low- or high-glossiness surfaces were used in the experiment; it is unclear as to how the semantic factors of these objects from memory affected the glossiness rating. For example, the image of a diamond (right column in Fig. 1A) was rated as a high-glossiness object that constricted the pupil size. This object exhibits not only physical surface properties, such as glossiness, but also more emotional or sensitive factors (e.g., preference, luxury, and desire). Such factors provide high-arousal stimulation and cause pupil dilation (Bradley et al., 2008). For example, perceived cuteness, a positive emotional factor, causes pupil dilation (Kuraguchi & Kanari, 2020, 2021). Thus, the stimuli used in this study are likely to elicit both constriction and dilation owing to their physical and sensitive properties, respectively. Therefore, segregating a combination or hierarchy of these properties (Komatsu & Goda, 2018) to understand the visual process further is highly recommended. If a continuous change in the pupillometry could be observed with the perceptual factors from glossiness to the higher-stage features (i.e., the visual to non-visual material features), the hierarchical relationship could be verified in terms of pupillary responses. The possibility of sharing networks with the pupil regulation system (Mathôt, 2018; Wang & Munoz, 2015) and the inferior temporal cortex relating to glossiness perception (Baba et al., 2021; Komatsu et al., 2021; Nishio et al., 2012, 2014) would then be revealed. Therefore, further investigation through pupillometry is required to understand the connection between glossiness perception and other perceptual factors.
Second, although this study combined pupillometry recordings and glossiness ratings, it is unclear whether pupil constriction occurs in high-glossiness images without rating tasks. If the high-glossiness attribute plays an important role in the visual system for obtaining information from the external world, the pupils will possibly constrict with simply passive viewing. For example, regarding pupil constriction in visual material perception, Tanaka et al. (2017) reported that the pupil size became smaller when the observers carefully observed the surface of the marble stone, even when the stimuli did not contain particularly brighter regions. They even suggested that the pupillary responses could depend on the task. Therefore, future studies are required to clarify how materials and their surface properties affect the physiological signals, even under passive conditions.
In conclusion, our findings suggest that high-glossiness rated images elicit constriction of the pupil size compared to the low-glossiness rated images. Further studies are required to explore the relationship between the pupillary response and other perceptual qualities, such as transparency, and whether the changes in the pupil responses are dependent on specific physical materials (e.g., wood, metal, and glass). Moreover, connecting more emotional qualities beyond physical surface qualities with pupillometry would aid in understanding the dynamics of material perception in the visual system.


Fig. 1. Stimuli and procedure. (A) Examples of the stimuli. Top: original; bottom: shuffled. (B) The flow of the shuffling process. The patch size shrank in a step-by-step manner from 64 × 64 to 1 × 1 pixels. In the binary images of the two middle steps, the white regions indicate remaining pixels for further shuffling. (C) The sequence of one trial of the experiment. Note that the ratio of the screen to the fixation point and the stimulus in this panel differs from the actual ratio used in this study.

Fig. 2. Original vs. Shuffled. (A) Distribution of glossiness ratings of all stimuli. The horizontal axis indicates the average glossiness rating and the vertical axis indicates the frequency with a 0.5 bin width. (B) Average pupillary responses for the original and shuffled conditions of all images (60 images in each). The horizontal axis indicates the time since the stimulus onset and the vertical axis indicates the pupillary response. The error bars represent the standard error of the mean.

Fig. 3. Pupillary responses. (A) Average pupillary responses across each rating provided by the observers. The horizontal axis indicates the time since the stimulus onset and the vertical axis indicates the pupillary response. The different colors indicate each rating score (1: lowest, 7: highest glossiness). Note that the numbers of sample trials for each rating were different (516, 722, 652, 800, 997, 604, and 288 trials from 1 to 7). The error bars represent the standard error of the mean. (B) and (C) show the average pupillary responses for the original and shuffled images, respectively. The sampling and other formats are the same as those in (A).