Testing three coping strategies for time pressure in categorizations and similarity judgments



Abstract
This article compares three psychological mechanisms to make multi-attribute inferences under time pressure in the domains of categorization and similarity judgments. Specifically, we test if people under time pressure attend to fewer object features (attention focus), if they respond less precisely (lower choice sensitivity), or if they simplify a psychological similarity function (simplified similarity). The simpler psychological similarity considers the number of matching features but ignores the actual feature value differences. We conducted three experiments (two of them preregistered) in which we manipulated time pressure: one was a categorization task, which was designed based on optimal experimental design principles, and the other two involved a similarity judgment task. Computational cognitive modeling following an exemplar-similarity framework showed that the behavior of most participants under time pressure is in line with lower choice sensitivity, that is, less precise response selection, especially when people make similarity judgments. We find that the variability of participants' behavior increases with time pressure, to a point where participants are unlikely to make inferences anymore but instead start choosing readily available response options repeatedly. These findings are consistent with related research in other cognitive domains, such as risky choices, and add to growing evidence that time pressure and other forms of cognitive load do not necessarily alter core cognitive processes themselves but rather affect the precision of response selection.
The human mind treats similar objects in a similar fashion. If objects have similar feature values, people likely group them together (Nosofsky, 1986). For instance, animals can be readily categorized as ''harmless'' or ''dangerous'' based on their similarity in features such as color, size, and shape. In addition to being fundamental for categorization (Goldstone, 1994; Medin & Schaffer, 1978; Nosofsky, 1984, 1986; Nosofsky & Zaki, 2002; Smith & Minda, 2002), similarity is used in other higher-order cognitive processes and inferences of all kinds (Goldstone & Son, 2012), including problem-solving (Holyoak & Koh, 1987; Ross, 1987), reasoning (Hahn & Chater, 1998), memory (Raaijmakers & Shiffrin, 1981), and quantitative judgments (Albrecht, Hoffmann, Pleskac, Rieskamp, & von Helversen, 2020; Juslin, Olsson, & Olsson, 2003; von Helversen & Rieskamp, 2009). An understanding of psychological similarity is thus essential to understanding human cognition. This article investigates how psychological similarity is affected by time pressure and tests three strategies by which people cope with time pressure when making categorizations and similarity judgments.

✩ This work was supported by the University of Basel, Switzerland. Informed consent for scientific publication and publishing data was obtained for all experiments. The data, analysis code, experimental code, and preregistrations are publicly available at https://osf.io/94e6u/. We thank Amélie Herweg and Vanessa Hoffmann for their help with conducting Experiment 3. * Correspondence to: Center for Economic Psychology, Department of Psychology, University of Basel, Missionsstrasse 60/62, 4055 Basel, Switzerland.
Broadly speaking, people are exceptionally good at categorization. Past research has found that people can learn various category structures of different difficulty with over 80% accuracy after a few hundred training trials (Ashby, Ell, & Waldron, 2003; Erickson & Kruschke, 1998; Nosofsky, Gluck, Palmeri, McKinley, & Glauthier, 1994; Nosofsky & Palmeri, 1996; Shepard, Hovland, & Jenkins, 1961). Nosofsky (1986) has demonstrated that a learner who relies on the similarity between individual category members can learn many category structures, including environments in which categories are separated by one feature, multiple features, or a rule-plus-exception scheme. This finding has been substantiated in a series of experiments (e.g., Nosofsky, 1989), showing that people can achieve good categorization performance by relying on similarity.
Often, people have little cognitive capacity available to categorize objects. For instance, people need to quickly decide if a long, meandering animal is a harmless slow worm or a dangerous snake. Even in such situations, people still make stellar categorizations (DeCaro, Thomas, & Beilock, 2008; Lamberts, 1995, 2002; Lamberts & Brockdorff, 1997; Nosofsky & Alfonso-Reese, 1999; Zeithamova & Maddox, 2006). For instance, when participants had less than 600 ms per categorization, they categorized two thirds of familiar stimuli correctly in at least 70% of cases (Lamberts, 1995). People's ability to categorize despite cognitive load has been explained by cognitive coping mechanisms. One well-studied mechanism is that people under cognitive load narrow their focus of attention (Lamberts, 1995; Lamberts & Brockdorff, 1997; Milton, Viika, Henderson, & Wills, 2011; Wills, 2013), meaning that they attend only to a subset of the possible features, which reduces the computational complexity of the categorization problem. An alternative mechanism to cope with cognitive load is lower choice sensitivity (e.g., Olschewski, Rieskamp, & Scheibehenne, 2018), meaning a reduced sensitivity to the differences between the alternative categories, which results in less precision and less effort in selecting the category response. A third, yet untested, coping mechanism follows from the assumption that the similarity to previously seen category members governs people's categorizations of objects (for an overview, see Kruschke, 2008). To cope with cognitive load, people may assess a simplified similarity between objects. A simplified similarity may be based on coarse, binary ''same-or-different'' comparisons of the objects' feature values instead of more sophisticated, continuous comparisons.

https://doi.org/10.1016/j.cognition.2022.105358 Received 7 September 2021; Received in revised form 5 December 2022; Accepted 6 December 2022
In this article, we use cognitive modeling to compare the three coping mechanisms (attention focus, lower choice sensitivity, and simplified similarities) as accounts of human categorizations and similarity judgments under time pressure.

Inferential choice with cognitive load
People can perform accurate inferences even if they have reduced cognitive capacities available (Beilock & DeCaro, 2007; Fischer & Holt, 2017; Hoffmann, von Helversen, & Rieskamp, 2013; Lamberts, 1995; Lamberts & Brockdorff, 1997; Mormann, Malmaud, Huth, Koch, & Rangel, 2010). Regarding categorization, cognitive load in the form of a concurrent task or time pressure has been shown to reduce people's categorization accuracy by about 10 percentage points; yet, people can still achieve considerable categorization accuracy between 65% and 75% (Lamberts, 1995, 2002; Lamberts & Brockdorff, 1997; Smith et al., 2015; Zeithamova & Maddox, 2006) and occasionally up to 100% (Nosofsky & Alfonso-Reese, 1999; Nosofsky & Palmeri, 1997). Further, evidence suggests that cognitive load need not always decrease categorization accuracy, but may sometimes also increase it (DeCaro et al., 2008; Markman, Maddox, & Worthy, 2006). It is not fully clear how the cognitive system maintains this performance under cognitive load. Thus, we experimentally investigate three cognitive mechanisms that simplify inference problems and thus facilitate inferences under increased cognitive load: an attention focus, lower choice sensitivity, and a simplified similarity (see Fig. 1 for a visualization of the coping mechanisms in a similarity judgment task).

Attention focus
One way to cope with the burden of cognitive load is a narrower focus of attention (Lamberts, 1995; Lamberts & Brockdorff, 1997; Milton et al., 2011; Wills, Inkster, & Milton, 2015a). Focusing attention means that decision makers allocate their attention to a subset of all object features and ignore the remaining features, thereby reducing cognitive processing effort. The categorization accuracy that results from a focus of attention depends on how much evidence the feature a person focuses on provides for category membership. For instance, when categorizing an animal as ''harmless'' or ''dangerous'', a slow worm may be correctly classified as ''harmless'' if attention focuses on the small size, but it may be incorrectly classified as ''dangerous'' if attention focuses on the meandering shape. Importantly, although an attention focus can reduce the performance of category inferences if relevant features are ignored, it saves cognitive effort irrespective of which features are ignored. This renders it one way of coping with cognitive load.
Evidence from multiple-feature categorization experiments supports the idea that people focus their attention under cognitive load (Lamberts, 1995, 2002; Lamberts & Brockdorff, 1997; Milton et al., 2011). For instance, Lamberts and Brockdorff (1997) found in a categorization task involving two multivalued features that people considered both features with almost 100% probability in the absence of a response deadline, but only with 76% and 22% probability given response deadlines of 700 ms and 400 ms, respectively. In a related triad task, where participants had to identify the most dissimilar stimulus out of three, the percentage of participants taking into account only one of the two available features increased from 28% under stimulus presentation times of 7500 ms to 75% under stimulus presentation times of 2048 ms or lower (Wills et al., 2015a). Similar results have also been found in inference tasks, where high time pressure increased the probability that people use a lexicographic heuristic that is based on the minimum number of features (ordered by their validity) necessary to make a clear response prediction (Rieskamp & Hoffrage, 2008). These findings indicate that cognitive load can lead to a narrow focus of attention and that the number of features being processed depends on the amount of cognitive load.

Choice sensitivity
Another way by which decision makers can cope with cognitive load is a decrease in choice sensitivity (Diederich, 2003; Olschewski & Rieskamp, 2021; Olschewski et al., 2018; Smith, 1990). Lower choice sensitivity means that people under cognitive load are less sensitive to the differences between alternative categories and reduce their decision threshold, so that they make a category decision on the basis of less evidence. This may translate, for instance, into a higher error in response selection (e.g., a trembling hand error, Olschewski & Rieskamp, 2021). Returning to our animal example, consider a person who believes a slow worm is harmless with 80% probability and dangerous with 20% probability. In this case, the person will almost deterministically select the likelier category ''harmless'' under high choice sensitivity; with decreasing choice sensitivity, however, the less likely category ''dangerous'' will be selected more often. In such a binary categorization task, decreasing choice sensitivity gradually shifts categorization towards random choice. Importantly, while decreasing choice sensitivity lowers categorization accuracy if the underlying belief is correct, it saves response selection effort irrespective of the belief and is thus another way to cope with cognitive load.
Studies have shown that cognitive load causes people to respond less consistently across various cognitive domains (Diederich, 2003; Olschewski & Rieskamp, 2021; Olschewski et al., 2018; Smith, 1990). Categorization research has found that higher cognitive load gradually decreases choice sensitivity by shifting categorization towards the chance level (e.g., Lamberts, 1995, although people's categorization accuracy in general remained above chance level). Similarly, lower choice sensitivity due to cognitive load has been found in other domains, namely risky choices, temporal discounting, and strategic interactions (e.g., Olschewski et al., 2018). Furthermore, the pattern that lower cognitive capacities are associated with less consistent responses has been found for externally induced cognitive load such as time pressure and for internal cognitive abilities such as intelligence. For instance, Burks, Carpenter, Goette, and Rustichini (2009) found that lower scores on an IQ test were associated with lower choice sensitivity in risky gambles and intertemporal choices. These findings suggest that cognitive load can impede the effort of a deterministic response selection in inferential choice tasks, thereby lowering choice sensitivity.

Fig. 1. Visualization of the three coping mechanisms in a similarity judgment task. Shown are two stimuli with a magenta-red and a cyan-blue feature. To cope with time pressure, people may focus their attention on one (e.g., the red) feature, simplify the stimulus comparison, or lower their choice sensitivity, leading to less precise responses (see below for more details on the formal implementation). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Simplified similarity
Following the view that objects are categorized based on their similarity to previously experienced objects (e.g., Nosofsky, 2011), a third, to date untested mechanism to cope with cognitive load may be the assessment of a simplified similarity. Traditionally, theories of psychological similarity assume that people compare pairs of objects by summing up the objects' feature value differences (Nosofsky, 1986), which hereafter will be referred to as elaborate similarity. The larger the differences between feature values, the less similar the two objects are perceived to be. To cope with cognitive load, people may simplify their similarity assessment: rather than computing continuous feature value differences, they may assess if the feature values of two objects are identical or not, resulting in a binary ''same-or-different'' assessment per feature. This will be referred to as simplified similarity (for related approaches, see Hamming, 1950, and Tversky, 1977). For example, to categorize a slow worm as ''harmless'' or ''dangerous'', people can assess exactly how similar a slow worm's short size is to the sizes of pythons, cobras, and other members of the category ''dangerous'', which corresponds to an elaborate similarity. Alternatively, people could assess if the ''dangerous'' category members are short as well or not, and ignore the precise differences to the slow worm's size, which would be a simplified similarity. Because the latter comparisons are arguably simpler than the former ones (see below), the computation of simplified similarities is an additional mechanism, yet to be empirically tested, through which people may cope with cognitive load.
The binary ''same-or-different'' checks that are implied by simplified similarities are known to have low computational complexity (e.g., Stabili, Marchetti, & Colajanni, 2017). Using simplified instead of elaborate similarities might therefore reduce the computational complexity of decisions under cognitive load. Specifically, simplified similarities, for which the number of features on which objects differ is central, allow neglecting the precise, continuous feature value differences and may thus require less precision than elaborate similarities. Related literature on same-or-different judgments shows that people can typically tell very quickly if two objects are identical (fast-same effect, Goulet & Cousineau, 2020; Krueger, 1978; Proctor, 1981). For instance, when comparing two lines, people can readily tell if the lines have the same length (provided that any differences are perceptible). Assessing the precise length difference, however, may require much more estimation processing, as one needs to gauge how many units need to be added to the shorter line to obtain the longer line. In other words, computing continuous feature value differences demands a different mode of processing, likely involving more computational complexity than binary ''same-or-different'' checks based on a quick visual glance. In addition to reducing computational complexity, the resulting simplified similarities can strongly correlate with the corresponding elaborate similarities and can therefore lead to the same qualitative categorization predictions. As an extreme case, for binary features (features taking only two possible values), which are widely used in categorization tasks (e.g., Lamberts & Brockdorff, 1997; Shepard et al., 1961), elaborate and simplified similarities match perfectly, and both similarities produce identical category inferences.
Hence, simplified similarities, which sum up the number of differing features but ignore the precise, continuous feature value differences, may not only save computational effort but also maintain categorization accuracy, and thus are another psychologically plausible coping mechanism for cognitive load.

Formal model framework
We investigated how people cope with cognitive load by means of cognitive modeling within the framework of the generalized context model, a highly prominent psychological model of categorization (Nosofsky, 1986, 2011). In addition to the original model (Nosofsky, 1986), we implemented three model versions representing an attention focus, the simplified similarity, and a combination thereof. All model versions include a choice rule that maps model predictions to more or less deterministic choices, which allows modeling choice sensitivity. We will now outline the generalized context model.

Generalized context model
The generalized context model assumes that people categorize objects based on their similarity to previously experienced category members (i.e., exemplars). The more similar an object is to the exemplars of one category, the more likely the object belongs to this category. For a binary categorization problem, given the similarity between an object $i$ and previous exemplars $j$ with category labels $A$ and $B$, the generalized context model computes the evidence that a new object $i$ belongs to category $A$ as

$$v(A \mid i) = \frac{\sum_{j \in A} s_{ij}}{\sum_{j \in A} s_{ij} + \sum_{j \in B} s_{ij}} \tag{1}$$

where $s_{ij}$ is the psychological similarity between object $i$ and exemplar $j$. The evidence for belonging to category $A$ is thus the share of the similarity between $i$ and all category-$A$ members relative to the similarity between object $i$ and the members of all categories.
The similarity $s_{ij}$ is computed from the distance between an object pair $i$ and $j$ by means of Shepard's universal law of generalization (Shepard, 1987)

$$s_{ij} = e^{-\lambda \, d_{ij}^{\,q}} \tag{2}$$

where $d_{ij}$ is the distance between object $i$ and exemplar $j$. The formula has two parameters: the sensitivity $\lambda$ to the distance (with $\lambda \geq 0$) and the exponent $q$ that determines how distance relates to psychological similarity. We used an exponential decay function (i.e., $q = 1$), which has been shown to describe categorizations of discriminable stimuli like the ones we used in our experiments (see the method sections below; Ennis, 1988; Nosofsky, 1985).
In the original generalized context model, the distance between a pair of objects $i$ and $j$ follows an elaborate similarity, that is, a weighted Minkowski distance function given by

$$d_{ij} = \left( \sum_{m} w_m \, \lvert x_{im} - x_{jm} \rvert^{\,r} \right)^{1/r} \tag{3}$$

where $x_{im}$ and $x_{jm}$ are the $m$th feature's values of object $i$ and exemplar $j$, respectively. There are two parameters: the attention weight $w_m$ to feature $m$ (with $0 \leq w_m \leq 1$ and $\sum_m w_m = 1$), which models individual differences in relative attention across object features, and the norm $r$ of the distance (with $r \geq 1$). We used the Manhattan norm (i.e., $r = 1$), which has been shown to describe categorizations of stimuli with highly separable features like the ones we used in our experiments (see the method sections below; e.g., Garner, 1974; Shepard, 1964). In what follows, we call the original version of the generalized context model the multidimensional Minkowski model (abbreviated in figures and tables as MULTI-MINK).
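To make Eqs. (1)-(3) concrete, they can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation (the analyses in the article used the cognitivemodels R package); all function and variable names are our own.

```python
import numpy as np

def minkowski_distance(x, e, w, r=1.0):
    """Weighted Minkowski distance between object x and exemplar e (Eq. (3))."""
    x, e, w = np.asarray(x, float), np.asarray(e, float), np.asarray(w, float)
    return np.sum(w * np.abs(x - e) ** r) ** (1.0 / r)

def similarity(x, e, w, lam=1.0, q=1.0, r=1.0):
    """Shepard similarity s = exp(-lam * d**q) (Eq. (2)); q = 1 gives the
    exponential decay function used in the article."""
    return np.exp(-lam * minkowski_distance(x, e, w, r) ** q)

def category_evidence(x, exemplars, labels, category, w, lam=1.0):
    """Evidence that x belongs to `category` (Eq. (1)): summed similarity to
    that category's exemplars relative to summed similarity to all exemplars."""
    sims = np.array([similarity(x, e, w, lam) for e in exemplars])
    return sims[np.asarray(labels) == category].sum() / sims.sum()
```

For a binary problem, the evidence values for the two categories sum to 1, so only one of them needs to be computed.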

Attention focus: Unidimensional Minkowski model
Within the generalized context model, we implemented a model version that represents a narrow attention focus, which restricted the attention weight parameter in the distance function (Eq. (3)) to attend to only a subset of the available features. We implemented a strong attention focus according to which the model attends to one feature (say feature $f$) and ignores the remaining features by setting $w_f = 1$ and $w_{f^*} = 0$ for all features $f^* \neq f$. Which feature receives attention was a free parameter (i.e., $f \in \{1, 2, 3\}$, as our stimuli had three features; see the method sections below); this allowed us to model individual differences in the focus of attention. We call the model version with a narrow attention focus the unidimensional Minkowski model (UNI-MINK).
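The one-hot weight restriction of the unidimensional model can be illustrated as follows (again an illustrative sketch with our own function names, not the authors' code): with all attention on one feature, differences on the ignored features no longer contribute to the distance.

```python
import numpy as np

def minkowski_distance(x, e, w, r=1.0):
    """Weighted Minkowski distance (Eq. (3))."""
    x, e, w = np.asarray(x, float), np.asarray(e, float), np.asarray(w, float)
    return np.sum(w * np.abs(x - e) ** r) ** (1.0 / r)

def attention_focus_weights(focal_feature, n_features=3):
    """One-hot attention weights of UNI-MINK: weight 1 on the focal feature,
    weight 0 on all remaining features."""
    w = np.zeros(n_features)
    w[focal_feature] = 1.0
    return w

# With a focus on feature 0, large differences on features 1 and 2 vanish:
# minkowski_distance([1, 9, 9], [1, 0, 0], attention_focus_weights(0)) -> 0.0
```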

Simplified similarity: Multidimensional discrete model
We implemented the simplified similarity in the generalized context model by a new distance function, which is given in Eqs. (4) and (5). Unlike the Minkowski distance, which computes continuous differences between two objects' feature values (elaborate similarity), the new distance function checks for each feature if the two objects' feature values are the same or not (simplified similarity). This distance function is a special instance of the discrete distance (Deza & Deza, 2009), namely one that is applied to every feature (see also the Hamming distance, Hamming, 1950). For simplicity, we will adopt the name ''discrete distance''. The discrete distance function $d_m(i, j)$ checks if the $m$th feature value is identical for object $i$ and exemplar $j$:

$$d_m(i, j) = \begin{cases} 0 & \text{if } x_{im} = x_{jm} \\ 1 & \text{otherwise} \end{cases} \tag{4}$$
The discrete distance ignores the magnitude of the feature value difference and therefore makes binary, feature-wise ''same-or-different'' checks between object pairs. To implement the discrete distance in the generalized context model, we substituted the Minkowski distance function (Eq. (3)) by the discrete distance function, resulting in

$$d_{ij} = \left( \sum_{m} w_m \, d_m(i, j)^{\,r} \right)^{1/r} \tag{5}$$

where $w_m$ is the attention weight attributed to feature $m$, $r$ is the norm of the distance (as before, we restricted $r = 1$), and $d_m(i, j)$ is defined in Eq. (4). The model version with the simplified similarity is called the multidimensional discrete model (MULTI-DISC).
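The same-or-different aggregation of Eqs. (4) and (5) can be sketched in Python (an illustrative sketch with our own function names, not the authors' implementation):

```python
import numpy as np

def discrete_distance(x, e, w, r=1.0):
    """Simplified-similarity distance (Eqs. (4)-(5)): a binary per-feature
    same-or-different check, weighted and aggregated like Eq. (3)."""
    mismatch = (np.asarray(x) != np.asarray(e)).astype(float)   # Eq. (4)
    return np.sum(np.asarray(w) * mismatch ** r) ** (1.0 / r)   # Eq. (5)
```

Note that the distance is insensitive to the magnitude of a mismatch: a feature value difference of 5 contributes exactly as much as a difference of 1. For binary features, this coincides with the weighted Manhattan distance, consistent with the observation above that elaborate and simplified similarities agree in that case.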

Attention focus and simplified similarity: Unidimensional discrete model
We further implemented a model that represents a combination of attention focus and simplified similarity by including both unidimensional attention and the discrete distance in the model. This represents a decision maker who copes with cognitive load by both narrowing their attention and using a simplified psychological similarity. Specifically, this model version included the discrete distance function as introduced in Eqs. (4) and (5), but attention weights were set to $w_f = 1$ and $w_{f^*} = 0$ for all features $f^* \neq f$ (again, which feature has exclusive attention was a free parameter; see also the cognitive modeling sections below). In the following, we call the model version with an attention focus and the simplified similarity the unidimensional discrete model (UNI-DISC).

Choice sensitivity
To model choice sensitivity, we implemented choice rules for all model versions that govern how deterministically the model predictions are executed as manifest responses. We used a different choice rule in each experimental task: the softmax function (Bishop, 2006) in the categorization task (Experiment 1) and the normal distribution in the similarity judgment task (Experiments 2 and 3).

Softmax function.
To predict binary categorizations, we transformed the predictions of each model version (Eq. (1)) with the softmax function given by

$$p^*(A \mid i) = \frac{e^{\,v(A \mid i)/\tau}}{e^{\,v(A \mid i)/\tau} + e^{\,v(B \mid i)/\tau}}$$

where $p^*(A \mid i)$ is the softmax-transformed probability to assign object $i$ to category $A$. The softmax function has the free parameter $\tau$, called temperature (with $\tau > 0$), which governs the determinism of response selection; large values for $\tau$ shift model predictions towards random choice (i.e., $p^*(A \mid i) = p^*(B \mid i) = .50$), low values shift predictions towards deterministic choice (i.e., $p^*(A \mid i) = 1$ if $v(A \mid i) > .50$ and $p^*(A \mid i) = 0$ otherwise).

Normal distribution.
We modeled people's continuous similarity judgments between pairs of objects $i$ and $j$ as being sampled from a normal distribution with the mean equaling the similarity between the objects (Eq. (2)) and the standard deviation $\sigma$ being a free model parameter, $y_{ij} \sim \mathcal{N}(s_{ij}, \sigma)$. Higher values for $\sigma$ allow the similarity judgments of object pairs to vary more around the assessed psychological similarity $s_{ij}$, representing lower choice sensitivity.
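Both choice rules can be sketched in Python (an illustrative sketch under our own naming, not the authors' implementation). The softmax example shows how the temperature parameter moves a fixed belief between near-deterministic and random responding, mirroring the 80%/20% animal example above.

```python
import numpy as np

def softmax_choice_prob(evidence_a, tau):
    """Softmax probability of assigning an object to category A in a binary
    task; evidence_a is v(A|i) from Eq. (1), and tau > 0 is the temperature.
    Large tau pushes the probability towards .5 (random choice), small tau
    towards 0 or 1 (deterministic choice)."""
    v = np.array([evidence_a, 1.0 - evidence_a])
    ex = np.exp(v / tau)
    return ex[0] / ex.sum()

def noisy_similarity_judgment(s_ij, sigma, rng):
    """Continuous similarity judgment sampled around the assessed similarity
    s_ij (Eq. (2)); larger sigma represents lower choice sensitivity."""
    return rng.normal(loc=s_ij, scale=sigma)
```

For example, with a belief of v(A|i) = .8, a temperature of .01 yields a choice probability close to 1, whereas a temperature of 100 yields a probability close to .5.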

Relation to other models
A related approach to modeling an attention focus is the extended generalized context model (Lamberts, 1995), which is based on the generalized context model but additionally incorporates the idea of a time-dependent inclusion of features in the categorization process. This time-dependent feature inclusion starts with the assumption that people need to perceptually process a feature in order to use it for further categorization. The probability that a given feature is perceptually processed increases with the time people devote to the stimulus (and with the perceptual salience of the feature; see Lamberts & Brockdorff, 1997). As processing time increases, so does the number of features that are processed and included in the categorization. Thus, with increasing cognitive load, decision makers can process and include fewer features in the categorization process. In a series of experiments with binary and multivalued features, two or more features, and various response deadlines that were either predictable or unpredictable, the authors were able to predict people's categorizations at the stimulus level (Lamberts, 1995, 1998, 2002; Lamberts & Brockdorff, 1997; Lamberts & Freeman, 1999a, 1999b). Importantly, the authors could account for the fact that time pressure decreased categorization accuracy for stimuli for which many features needed to be included for accurate categorization, while not affecting accuracy for other stimuli for which fewer features were necessary for categorization (Lamberts, 1995). Note that the feature inclusion process of the extended generalized context model does not directly model a focus on the attentional level as we have implemented, but rather a feature inclusion restriction on the perceptual level. Yet, the extended generalized context model describes the similar idea that people under time pressure categorize stimuli based on a subset of all available features.
We did not implement the extended generalized context model as we did not model the explicit time course of speeded categorization.

Summary
Within the framework of the generalized context model for categorization (multidimensional Minkowski model), we implemented model versions that formalize a focus of attention (unidimensional Minkowski model), the simplified similarity (multidimensional discrete model), and a combination thereof (unidimensional discrete model). For each model, we included a choice rule (the softmax function for categorizations and the normal distribution for similarity judgments) to investigate choice sensitivity. We compared the mechanisms for coping with cognitive load using data from a categorization task (Experiment 1) and a similarity judgment task (Experiments 2 and 3) by means of inferential statistics with the lme4 package (v1.1.23, Bates, Mächler, Bolker, & Walker, 2015) and cognitive modeling with the cognitivemodels package (v0.0.12, Jarecki & Seitz, 2020) in the statistical programming framework R (v3.6.1, R Core Team, 2017).

Reanalysis of Wills et al. (2015)
While past research has found evidence that people can cope with cognitive load by focusing their attention (Lamberts & Brockdorff, 1997) and reducing their choice sensitivity (Olschewski & Rieskamp, 2021), simplifying psychological similarity to cope with cognitive load has to date not been empirically tested. As a first test of the empirical plausibility of simplified similarities under cognitive load, we reanalyzed Wills et al.'s (2015a) data from a triad task, in which participants chose the odd one out of three two-feature stimuli under different levels of time pressure ([dataset] Wills, Inkster, & Milton, 2015b, Experiment 1; originally reported in Milton, Longmore, & Wills, 2008, Experiment 5). Time pressure was manipulated by randomly assigning each participant to one of five levels of stimulus presentation duration ranging from 640 ms to 7500 ms.

Method
Data from 145 participants were available. Stimuli were drawings of boats, varying in the length of the hull base and in the length of the sail base (for an example, see Wills et al., 2015a, Fig. 4A). Three boats were shown in each trial; two boats had one feature value in common and one feature value that was maximally different. The third boat's feature values corresponded to one of the other boats but shifted by 1 unit on each feature (the resulting stimulus structure is given in Wills et al., 2015a, Fig. 3). In each trial, participants saw a triad of boats for a certain duration, followed by a mask, and then pressed a key to select the boat they regarded as the odd one out. Wills et al. (2015a) found that increasing time pressure was associated with people focusing their attention on one of the two available features. Accordingly, a unidimensional model described the majority of participants (75%) best in the conditions with shorter stimulus presentation duration (≤ 2048 ms), but it described only a minority of participants (28%) in the condition with the longest stimulus presentation duration (7500 ms; see Wills et al., 2015a, Table 2).

Results
In our reanalysis, we tested if a simplified similarity can describe people's choices better with increasing time pressure. To this end, we implemented the models that Wills et al. (2015a) tested (overall similarity model, unidimensional model, and identity model) in the formal exemplar-modeling framework detailed above: the overall similarity model equals the multidimensional Minkowski model, the unidimensional model equals the unidimensional Minkowski model, and the identity model equals the unidimensional Minkowski model that focuses in each trial on the feature on which two stimuli of the triad match. We compared these models to the multidimensional discrete model to test the psychological plausibility of simplified similarities under time pressure. For each model, we computed the similarities (as per Eq. (2), with the sensitivity parameter $\lambda$ set to 1) of all stimulus pairs in the triads, resulting in three pairwise similarities per triad and model. The three similarities of stimulus pairs within triads were then rescaled to sum up to 1. The rescaled similarity between any stimulus pair denoted the ''odd one out'' choice probability for the remaining stimulus in the triad. For instance, the rescaled similarity between the first two stimuli equaled the predicted choice probability to select the third stimulus. Given the models' choice predictions, we then computed the log likelihood of each participant's observed choices in the data and selected for each participant the model with the highest evidence strength (i.e., Akaike weights, Wagenmakers & Farrell, 2004). Fig. 2 shows how many participants each model described best in the different time pressure conditions. In general, the results are in accordance with the findings of Wills et al. (2015a): With increasing time pressure, participants' responses were more in line with the predictions of the unidimensional Minkowski model, representing an attention focus.
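The rescaling procedure described above, from pairwise similarities to odd-one-out choice probabilities, can be sketched as follows (an illustrative sketch with our own function names, shown here for the multidimensional Minkowski model; the actual reanalysis code is available at the OSF repository):

```python
import numpy as np

def odd_one_out_probs(triad, w, lam=1.0):
    """Predicted choice probabilities in the triad task. Pairwise similarities
    (Eqs. (2)-(3), sensitivity lam = 1) are rescaled to sum to 1; the rescaled
    similarity of a pair is the probability of choosing the *remaining*
    stimulus as the odd one out."""
    triad = np.asarray(triad, float)
    w = np.asarray(w, float)
    pairs = [(1, 2), (0, 2), (0, 1)]  # pair (a, b) -> choose the third stimulus
    sims = np.array([np.exp(-lam * np.sum(w * np.abs(triad[a] - triad[b])))
                     for a, b in pairs])
    return sims / sims.sum()
```

If two stimuli of the triad are very similar, the model predicts that the third, dissimilar stimulus is most likely selected as the odd one out.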
However, the results also indicate that the multidimensional discrete model, which implements the simplified similarity, describes more participants with increasing time pressure. In the condition with the highest time pressure (640 ms), the multidimensional discrete model describes 11 of 30 participants best (36.67%), with an average evidence strength of M = .84 (Md = .82, SD = .16). In this condition, it performs similarly to the best model in the original analysis, the unidimensional Minkowski model, which in our reanalysis describes 12 participants best (40%), with an average evidence strength of M = 1 (Md = 1, SD = .01).
These results provide preliminary evidence that simplified similarities might be a viable mechanism to perform cognitive inferences under time pressure. In the following, we specifically compare simplified similarities to an attention focus and lower choice sensitivity as mechanisms to cope with time pressure in the domains of categorizations (Experiment 1) and similarity judgments (Experiments 2 and 3).

Experiment 1: Categorization
Experiment 1 used a trial-by-trial supervised, binary category learning task (e.g., Nosofsky, 1989), followed by a speeded transfer task with new feature value combinations to test how people categorize under cognitive load (e.g., Lamberts, 1995). We manipulated cognitive load by assigning half of the participants to an individually calibrated level of time pressure in the transfer task after category learning. The data from the transfer task was used to test the coping mechanisms for time pressure (attention focus, choice sensitivity, simplified similarities). The category structure and the transfer stimuli were selected using a simulation-based optimal experimental design (Jarecki, Meder, & Nelson, 2018); the experimental design, the cognitive models, and the analyses were preregistered (https://osf.io/t84b2). Our preregistration formulates the analyses for the attention focus and the simplified similarity; the subsequent analyses that test choice sensitivity are exploratory and designated accordingly.

Participants
On the basis of a model-based power simulation (Gluth & Jarecki, 2019, for details, see Appendix A), this experiment aimed for 60 participants. In total, 71 psychology students from the University of Basel, recruited online, participated in a laboratory experiment in exchange for course credit. Ten participants were excluded, n = 2 for not reaching the category learning accuracy criterion within one hour (see also below) and n = 8 for reporting that the task was somewhat or absolutely unclear. This yields a final sample of N = 61 (43 women, M_age = 24.13 years, SD_age = 6.39 years, age range: 19-50 years), in which n = 30 participants had individual time pressure during transfer and the remaining n = 31 participants had no time pressure. The experiment lasted on average about 45 min and was approved by the ethics board of the psychology department of the University of Basel (#025-18-2).

Task design: optimal experimental design
In the experiment, participants assigned stimuli consisting of three features with four possible values each to one of two categories. Participants first learned a category structure through trial-by-trial feedback and then categorized transfer stimuli with new feature value combinations without feedback (e.g., Nosofsky, 1989). The category structure (feature value combinations and their categories) and transfer stimuli were designed using simulation-based optimal experimental design (Jarecki et al., 2018; Myung, Cavagnaro, & Pitt, 2013) that aimed to optimally discriminate between the Minkowski and the discrete models. To this end, we simulated model learners from both cognitive models who learned each possible category structure and inferred the remaining possible stimuli's category membership given the learning stimuli, using a range of model parameters for the simulation. We searched for (a) a category structure in the learning task that both the Minkowski and the discrete models can learn and (b) new stimuli in the transfer task for which the Minkowski and the discrete models make on average maximally opposite category predictions across parameter values (details are given in Appendix B). Fig. 3 shows the resulting optimized task design; the model predictions for the learning and transfer stimuli in the optimized task design are given in Tables 1 and 2, respectively. Table 1 shows that the multidimensional Minkowski model and the multidimensional discrete model can learn the true category membership of the learning stimuli: Both models assign all learning stimuli to the true categories. The two models' predictions are identical because the learning stimuli differ from each other by 0 or 1 unit per feature, with the result that the Minkowski metric's similarity equals the discrete metric's similarity, avoiding any bias towards one of the models during learning. The unidimensional model versions, in turn, cannot learn the categories of the learning stimuli because the category structure is not linearly separable (see Fig. 3 and Table 1). This means that participants needed to attend to multiple features during learning (these tasks are generally well learnable, e.g., Levering, Conaway, & Kurtz, 2020), which could cause them to try and attend to multiple features during transfer as well and slightly bias them against an attention focus. However, a design that requires multiple-feature attention allows us to control for inter-individual differences in feature attention during learning and thereby avoids floor effects, where participants attend to a single feature during learning (and are unable to focus their attention any further under time pressure). Table 2 displays the six new transfer stimuli that discriminate best between the model versions after they learn the category structure of the learning phase. For instance, Table 2 shows that the multidimensional Minkowski model assigns stimulus ''003'' to category B with p(A|003) = .09, which implies p(B|003) = .91, whereas the multidimensional discrete model assigns it to category A with p(A|003) = .63.

Table 1 note: Median model predictions for classifying the learning stimuli into category A. The Minkowski (MINK) and discrete (DISC) models make equal predictions; they classify the learning stimuli correctly in the multidimensional (MULTI) but not in the unidimensional (UNI) version. Stimuli with an asterisk were also shown in the transfer task as filler stimuli.

Fig. 4 caption: Experiment 1, illustration of the learning task. Participants categorized by pressing a key and then received feedback about the true category (a smiley for a correct categorization as in panel (a); a frowning face for an incorrect categorization as in panel (b)).

Table 2 note: Median predictions for classifying the transfer stimuli into category A according to the discrete (DISC, simplified similarity) and the Minkowski (MINK, elaborate similarity) models with multidimensional attention (MULTI) or unidimensional attention focus (UNI).
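The logic of the design search can be illustrated with a toy version: score each candidate transfer stimulus by how strongly the two similarity metrics disagree about its category. The exemplar sets, the single parameter setting, and the exponential similarity below are simplifying assumptions (the actual search averaged over many parameter values and evaluated full category structures):

```python
import itertools
import math

# Hypothetical category exemplars; the real structure is given in Table 1.
cat_a = [(0, 0, 0), (1, 1, 0)]
cat_b = [(3, 3, 3), (2, 2, 3)]

def p_a(stim, metric, c=1.0):
    # Exemplar model: P(A) from summed similarity to each category's exemplars.
    def sim(x, y):
        if metric == "minkowski":
            d = sum(abs(u - v) for u, v in zip(x, y))  # value differences matter
        else:  # "discrete": count mismatching features, ignore difference sizes
            d = sum(u != v for u, v in zip(x, y))
        return math.exp(-c * d)
    s_a = sum(sim(stim, e) for e in cat_a)
    s_b = sum(sim(stim, e) for e in cat_b)
    return s_a / (s_a + s_b)

# Rank candidates by model disagreement |P_MINK(A) - P_DISC(A)|.
candidates = [s for s in itertools.product(range(4), repeat=3)
              if s not in cat_a + cat_b]
best = max(candidates, key=lambda s: abs(p_a(s, "minkowski") - p_a(s, "discrete")))
```

Stimuli like `best` are the most diagnostic between the models and therefore the most informative transfer items.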

Materials
The computerized experiment was programmed in Expyriment (Krause & Lindemann, 2014) using Python 2.7 (Van Rossum & Drake, 1995). The stimuli were based on Albrecht et al. (2020); each feature was represented by a gray beam in which feature values were displayed as one to four colored squares; for an example, see Fig. 4. The cover story described the stimuli to participants as products with three ingredients, and participants were asked to classify the products into two brands (brand L and brand R) by pressing the left and right arrow keys. The category labels, the key-label association, the color-feature association, and the visual mapping of features to positions on the screen were randomized across participants. For all analyses, the category labels were derandomized to equal Table 1; category A was coded as 1 and category B as 0.

Procedure
Participants' task was to categorize stimuli into two categories. At the beginning of the experiment, participants were informed about a learning phase with feedback and a transfer phase without feedback. Then, participants were familiarized with the possible feature values and feature value combinations. Participants in the time pressure condition were informed about a response deadline in the transfer phase; time pressure was individually set to 400 ms plus 30% of each participant's median response time in the final 100 learning trials. Participants were informed about their response deadline (in seconds) before the beginning of the transfer phase.
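The calibration rule for the individual deadline can be written out directly (the RT series below is hypothetical):

```python
import statistics

def transfer_deadline_ms(learning_rts_ms):
    # Individually calibrated transfer deadline: 400 ms plus 30% of the
    # median response time over the last 100 learning trials.
    last_100 = learning_rts_ms[-100:]
    return 400 + 0.3 * statistics.median(last_100)

deadline = transfer_deadline_ms([900.0] * 120)  # toy RT series in ms
```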
Learning and transfer phases. Participants first learned the category structure (Table 1) by repeatedly categorizing the stimuli. Participants categorized one stimulus at a time and got feedback (a smiley or frowning face and a notification ''correct'' or ''wrong'' in German, see Fig. 4). The eight learning stimuli were shown in blocks with random order within blocks. There was no time pressure. After the first 100 learning trials, participants also received performance feedback every 50 trials (% correct in the last 100 trials) to encourage learning. Learning ended when participants correctly classified 80% of the last 100 trials and 100% of the last 24 trials (similar to Jarecki et al., 2018). If participants did not meet this accuracy criterion in 60 minutes, the experiment was discontinued.
After learning, half of the participants experienced time pressure. All participants started the transfer phase by classifying four blocks of the learning stimuli; this served to familiarize the participants in the time pressure condition with the speeded categorization. Then, participants categorized the six new transfer stimuli (Table 2) and four of the learning stimuli (Table 1) in 14 blocks (140 trials), with the order randomized within blocks. There was no category feedback. The next stimulus appeared automatically after 500 ms. Participants with time pressure who exceeded the time limit were informed that they were too slow and then continued with the next trial.

Table 3 note: Coefficient estimates are on the log-odds scale, and the model uses sum-to-zero contrasts (Singmann & Kellen, 2017), meaning the intercept is the grand mean and the fixed effects are the differences from the intercept (e.g., the coefficients for stimulus ''003'' in the time pressure condition sum to its log-odds).

Categorization of the transfer stimuli
In the following, we analyze how time pressure affected participants' responses to the new stimuli in the transfer task (i.e., the transfer stimuli). Participants classified the individual transfer stimuli differently depending on the time pressure condition, as was shown by a linear mixed model with logit link; see Table 3 for the resulting coefficients.1 Specifically, the interaction term between time pressure condition and stimulus was significant (see Table 3), and the model outperformed a restricted model without the interaction term in a likelihood ratio test, χ²(2) = 145.51, p < .001.
Analyses of the effects of time pressure on the categorization of the individual transfer stimuli suggest that participants in general did not choose opposite categories as a function of time pressure, as we would expect from the simplified similarity hypothesis. Rather, participants selected the same categories with and without time pressure, but with less determinism under time pressure (see Fig. 5). Fig. 5 shows that irrespective of time pressure participants tended to assign stimulus ''003'' to category B, stimulus ''100'' to category A, and stimuli ''221, 231, 321, 331'' to either category.
This response pattern is reflected in post-hoc analyses of the log-odds coefficients2 of the linear mixed model in Table 3. Specifically, in the condition without time pressure the coefficients were β100 = 2.52 > β221,231,321,331 = 0.52 > β003 = −3.02, and in the condition with time pressure they were β100 = 0.73 > β221,231,321,331 = 0.30 > β003 = −1.24; all coefficient differences within conditions were significant in post-hoc contrasts with the Holm-Bonferroni alpha-level correction (Holm, 1979), ps ≤ .001. To test for decreasing determinism in category choices with time pressure, we calculated exploratory Holm-Bonferroni corrected post-hoc contrasts across the time pressure conditions, finding less deterministic categorizations for stimulus ''003'' (OR = 5.93, SE = 2.45, z = 4.32, p < .001) and stimulus ''100'' (OR = 0.17, SE = 0.07, z = −4.44, p < .001), but not for the stimuli ''221, 231, 321, 331'' (OR = 0.80, SE = 0.29, z = −0.61, p = .54). We thus do not find that time pressure simplified participants' way of computing psychological similarity; rather, time pressure seemed to reduce their choice sensitivity in the sense of less deterministic choices.

1 As the model did not converge in its preregistered form, the number of estimated parameters was reduced by aggregating the stimuli ''221'', ''231'', ''321'', and ''331'' with analogous model predictions (see Table 2) and by fitting a random intercept but no random slope per participant. The final model had the participant response as criterion; the time pressure condition, the stimulus (with three levels), and their interaction as fixed effects; and a participant-wise random intercept.

2 The log-odds β = 0 means equal numbers of category A and B responses, β > 0 means more A responses.
Cognitive modeling. To gain further insights into the cognitive processes underlying participants' categorizations, we used cognitive modeling within the framework of the generalized context model (Nosofsky, 1986). The free parameters of the multidimensional Minkowski model and the multidimensional discrete model (i.e., three attention weights w, sensitivity c with 0 ≤ c ≤ 5, and temperature τ with 0.1 ≤ τ ≤ 10) were fit with maximum likelihood to individual participants' last 100 learning trials.3 The estimates of the attention weight parameter indicate more attention to the second feature (M = .57, Md = .64, SD = .24) than to the first feature (M = .23, Md = .20, SD = .13) or to the third feature (M = .21, Md = .15, SD = .15). Furthermore, the parameter estimates are high for sensitivity (M = 4.52, Md = 5, SD = .87) and low for temperature (M = .17, Md = .10, SD = .09), which suggests a deterministic classification of well-identified stimuli in line with the accuracy criterion of the learning phase. The unidimensional Minkowski model and the unidimensional discrete model were fit to individual participants' responses to the learning stimuli in the practice and transfer trials after learning to test if time pressure shifts the attention to a single feature. Importantly, the model parameters c and τ were fixed to each participant's parameter estimates from the multidimensional models to avoid over-fitting, and therefore only the attention weight parameter (which equals 1 for the attended feature) was fit.4 The unidimensional models' best-fitting attention weight parameter estimates were such that the attention of 26 of 61 participants was placed on the first feature, and for 35 participants it was placed on the third feature.
To test if people under time pressure focus their attention or simplify their similarity, we compared the performance of the cognitive models and of a baseline random choice model (predicting categorizations of p(A|stimulus) = .50 and abbreviated as RANDOM in tables and figures) on the new transfer stimuli. For each participant and model, the best-fitting parameter estimates were used to make predictions for the new transfer stimuli (i.e., the hold-out data), which were then used to compute the log likelihood based on the participant's categorizations. The individual log likelihoods were transformed into evidence strengths (i.e., Akaike weights, Wagenmakers & Farrell, 2004). Within each time pressure condition, the models were then compared pairwise by means of the ratio of their Akaike weights (called the evidence ratio, i.e., the normalized probability of one model over the other, see Wagenmakers & Farrell, 2004, p. 194). The winning model of a pair was defined by an evidence ratio above .90; otherwise, the evidence was inconclusive. Summing up the winning models across all pairwise model comparisons yielded a rank order of models for each time pressure condition.
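The comparison machinery (Akaike weights and pairwise evidence ratios; Wagenmakers & Farrell, 2004) can be sketched as follows; the log likelihoods and parameter counts are toy values, and how the parameter-count penalty interacts with hold-out predictions is glossed over here:

```python
import math

def akaike_weights(log_likelihoods, n_params):
    # AIC per model, then Akaike weights across the model set.
    aics = [2 * k - 2 * ll for ll, k in zip(log_likelihoods, n_params)]
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

def evidence_ratio(w1, w2):
    # Normalized probability of model 1 over model 2; a winner needs > .90.
    return w1 / (w1 + w2)

w = akaike_weights([-50.0, -55.0], [3, 3])  # toy two-model comparison
```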

Model comparison at the aggregate level.
At the aggregate level, models were compared using the mean Akaike weight across the participants within time pressure conditions.5 The model comparisons indicate that participants' categorizations in both conditions are on average not well described by a single model (see Table 4 for the Akaike weights and additional fit indices). Table 4 shows that in both conditions the mean Akaike weights are similar for the different models, except for the underperforming unidimensional discrete model. Consequently, the evidence ratios of the pairwise model comparisons do not reveal a clear winning model in either time pressure condition (see the aggregate rank order in Table 4). These findings do not suggest an evident conclusion for coping with time pressure and are somewhat surprising in the condition without time pressure, given that central categorization literature assumes the multidimensional Minkowski model (e.g., Nosofsky, 1986). The aggregated analyses, however, need to be interpreted with caution, as different people may rely on different cognitive processes, leading to a similar performance of the cognitive models in the aggregate analyses (this potential confound is addressed by the individual analyses below).

Model comparison at the individual level.
At the individual level, models were compared for each participant using the evidence strengths (i.e., the Akaike weights) of the respective participant; Fig. 6 shows the results. The pairwise model comparisons based on evidence ratios yielded a rank order of models for each time pressure condition, reported below (see also Table 4). Additionally, we assigned each participant to the model with the highest Akaike weight, provided that the Akaike weight exceeded .70, and then computed how many participants each model describes in each time pressure condition.6 Note that even if two models describe the same number of participants, one model could still outperform the other in the pairwise model comparisons.
In the condition without time pressure, the following rank order of models was found (with the number of described participants in brackets): unidimensional Minkowski model (n = 11, 35.48%) > multidimensional Minkowski model (n = 8, 25.81%) > random choice model (n = 8, 25.81%) > multidimensional discrete model (n = 3, 9.68%) > unidimensional discrete model (n = 0, 0%); the remaining participants could not be assigned to a model (n = 1, 3.23%). However, none of the models described a significantly larger number of participants than the other models, χ²(3) = 4.40, p = .22. The family of Minkowski models described a majority of participants (n = 19, 61.29%), which is in line with previous categorization literature without time pressure (Nosofsky, 1984, 1986, 1989), and outperformed the remaining models. Interestingly, the random choice model still described a substantial proportion of participants (n = 8, 25.81%).

In the condition with time pressure, no model was able to describe a majority of participants; rather, different participants seemed to follow different cognitive models. The following rank order was found: multidimensional Minkowski model (n = 7, 23.33%) = random choice model (n = 7, 23.33%) > multidimensional discrete model (n = 7, 23.33%) = unidimensional Minkowski model (n = 6, 20%) > unidimensional discrete model (n = 0, 0%); the remaining participants could not be assigned to a model (n = 3, 10%). The number of described participants did not vary significantly across models, χ²(3) = 0.11, p = .99, suggesting that the cognitive processes used during speeded categorization may have varied substantially across participants, which impedes a clear-cut conclusion concerning the effect of time pressure on transfer categorization.
A model comparison across the time pressure conditions showed that the models performed quite similarly irrespective of time pressure. Specifically, the models did not differ in the number of participants they described across time pressure conditions, as shown by a Fisher's Exact Test for Count Data, p = .42. There was only a tendency for participants with time pressure to follow the multidimensional discrete model more often (n = 7 of 30) than participants without time pressure (n = 3 of 31) and to follow a Minkowski model less often with time pressure (n = 13 of 30) than without (n = 19 of 31), but this tendency was not significant, OR = 3.31, 95% CI = [0.61, 23.56], p = .15. Our findings, although somewhat inconclusive, thus suggest that a stable psychological similarity may have driven categorization behavior with and without time pressure, which most often corresponded to elaborate similarities in line with a Minkowski model.
Exploratory choice sensitivity analyses. The previous model comparisons allowed us to test if time pressure leads to an attention focus or to a simplified similarity, for each of which we found no evidence. To further investigate the effect of time pressure on choice sensitivity, we conducted additional exploratory analyses regarding the softmax parameter τ. Specifically, for all participants assigned to a version of the generalized context model (n = 22 without time pressure and n = 20 with time pressure), the parameter estimates of the respective model were used to make raw predictions for the trials following category learning (as per Eq. (1), i.e., the softmax function in Eq. (6) was omitted). Based on these raw predictions, τ was then refit to individual participants' responses. Estimates for τ were higher for participants with time pressure (M = .95, Md = .38, SD = 2.15) than for participants without time pressure (M = .24, Md = .27, SD = .11), which was corroborated by a t-test (using the log of τ to ensure a normal distribution), t(30.91) = 3.50, 95% CI = [0.35, 1.31], p = .001. Because higher estimates for τ denote less deterministic responses, this indicates that the participants who most likely followed a similarity-based categorization process did so with less choice sensitivity in the condition with time pressure than in the condition without time pressure.
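This refitting step can be sketched as follows. The exponentiation below is a common GCM-style response rule and is an assumption for illustration (the paper's exact Eq. (6) is not reproduced here); it shows how a larger temperature flattens choice probabilities while the similarity-based raw predictions stay fixed:

```python
import math

def softmax_choice(p_raw, tau):
    # Map a raw similarity-based prediction onto a choice probability;
    # larger tau -> flatter, less deterministic responding.
    a = p_raw ** (1 / tau)
    b = (1 - p_raw) ** (1 / tau)
    return a / (a + b)

def neg_log_lik(tau, p_raw_list, choices):
    # Refit tau alone: the similarity model's parameters stay fixed and only
    # the precision of response selection is re-estimated.
    ll = 0.0
    for p, y in zip(p_raw_list, choices):
        q = softmax_choice(p, tau)
        ll += math.log(q if y == 1 else 1 - q)
    return -ll
```

Minimizing `neg_log_lik` over the temperature per participant yields the estimates compared across the time pressure conditions.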

Relation to a decision tree
Our category structure (Table 1) can also be learned perfectly with a decision tree in which people first attend to the second feature and then to the one remaining feature that perfectly discriminates the categories given the value of the second feature (the third feature if the second feature has a value of 0, and the first feature otherwise; see Fig. 3). We therefore additionally implemented this decision tree and investigated whether it would describe participants' responses. However, it accounted for the transfer categorizations of only a few participants (with time pressure: n = 2, without time pressure: n = 3; the procedure was the same as in the individual analyses above).
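A sketch of such a decision tree, with illustrative feature-value tests and category labels (the exact mapping depends on Table 1, which is not reproduced here):

```python
def decision_tree_category(stimulus):
    # Inspect the second feature first; if it equals 0, the third feature
    # discriminates the categories, otherwise the first feature does.
    # The value tests and the labels "A"/"B" are hypothetical stand-ins.
    f1, f2, f3 = stimulus
    if f2 == 0:
        return "A" if f3 == 0 else "B"
    return "A" if f1 == 0 else "B"
```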

Discussion
This experiment investigated how people perform category inferences under cognitive load (operationalized as between-subjects time pressure during category transfer), testing the new hypothesis that time pressure may lead to a simplified psychological similarity metric. A comparison of three coping strategies (attention focus, lower choice sensitivity, and simplified similarities) using inferential statistics and cognitive modeling showed that participants predominantly coped with time pressure through lower choice sensitivity during category transfer. In turn, there was no credible evidence that time pressure induced a focus of attention or simplified the computation of similarities. A substantial proportion of participants in each experimental condition were best described by a cognitive model using elaborate similarities (the Minkowski distance; without time pressure: 61.29%, with time pressure: 43.33%), with a general tendency to focus their attention irrespective of time pressure. Yet, the results were not clear-cut, as a substantial number of participants in both time pressure conditions were described by the random choice model. One reason for the mixed results is that participants ambiguously classified the analogous stimuli ''221, 231, 321, 331'' (Fig. 5). Some participants did not assign all four stimuli to the same category on average (n = 12, 38.71%, without time pressure; n = 4, 13.33%, with time pressure), a response pattern that the cognitive models cannot predict (Table 2). Despite this mixed evidence, the experiment in sum suggests that time pressure does not lead to a specific change in the cognitive categorization process. Rather, to a large extent people seem to follow elaborate similarities but map similarity less consistently onto a category response with the onset of time pressure, indicating lower choice sensitivity.

Experiment 2: Similarity judgments
To extend the evidence of Experiment 1, Experiment 2 used a similarity judgment task, in which participants rated the similarity of pairs of stimuli. This served to investigate psychological similarity under cognitive load more directly than in a categorization task. In Experiment 2, we manipulated cognitive load within participants by letting participants judge the similarity of the same stimulus pairs with and without a participant-specific time pressure. The experimental task was generated to discriminate the Minkowski models and the discrete models; the experimental design, the cognitive models, and the analyses were preregistered (https://osf.io/deqw9). Again, all analyses concerning choice sensitivity were not preregistered and are designated as exploratory in the results.

Participants
The experiment aimed for 60 participants, predetermined by a model-based power simulation (Gluth & Jarecki, 2019, see also Appendix A). In total, 66 participants recruited online participated in the laboratory experiment, which lasted about 30 minutes on average and was approved by the ethics board of the psychology department of the University of Basel (#025-18-2). Participants received a reimbursement of CHF 5.00 (equal to USD 5.56 at the time of study realization) for every quarter of an hour. Two participants were excluded from data analysis, as they reported the task was somewhat or absolutely unclear to them. This leaves a final sample of N = 64 (42 women, M_age = 27.08 years, SD_age = 11.17 years, age range: 18-64 years).

Design
In the experiment, participants rated the similarity of pairs of stimuli consisting of three features with five possible values each. Participants experienced four critical types of stimulus pairs that discriminated well between the Minkowski and the discrete models by trading off the number of differing features against the amount of difference on a feature (see Table 5). Specifically, there were stimulus pairs which we call the two-large type, which differed maximally on two features and matched on the third feature (e.g., ''315-351''); pairs of the one-large type differed maximally on one feature and matched on two features (e.g., ''153-553''), pairs of the all-small type differed on every feature by one unit (e.g., ''234-343''), and pairs of the two-small type differed on two features by one unit each and matched on the third feature (e.g., ''523-532''). Table 5 illustrates the task design as well as the predicted similarity judgments according to each model. The table shows, for instance, that the multidimensional Minkowski model assigns a higher similarity to two-small pairs than to two-large pairs, whereas the multidimensional discrete model assigns the same similarity to pairs from these two types as they have the same number of differing features.
For each critical type (stimuli that discriminated the models well), each participant experienced three different, randomly selected stimulus pairs. The design counterbalanced within participants which of the features matched and which features differed. This means that each feature differed in one of the pairs of type one-large (and matched in the two remaining pairs) and matched in one of the pairs of types two-large and two-small (and differed in the two remaining pairs). For all critical types that had at least two differing features (i.e., all except for the one-large type), the first stimulus had a larger value and a smaller value than the second stimulus on at least one feature each.
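The contrast between the elaborate and the simplified similarity on these pair types can be made concrete (the exponential form and c = 1 are illustrative assumptions):

```python
import math

def mink_sim(x, y, c=1.0):
    # Elaborate similarity: decays with the summed feature value differences.
    return math.exp(-c * sum(abs(u - v) for u, v in zip(x, y)))

def disc_sim(x, y, c=1.0):
    # Simplified similarity: counts mismatching features, ignores difference sizes.
    return math.exp(-c * sum(u != v for u, v in zip(x, y)))

two_large = ((3, 1, 5), (3, 5, 1))  # two features differ maximally ("315-351")
two_small = ((5, 2, 3), (5, 3, 2))  # two features differ by one unit ("523-532")
```

Here the discrete metric assigns identical similarity to both pair types (two mismatching features each), whereas the Minkowski metric rates the two-small pair as far more similar, which is exactly what makes these pairs diagnostic.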

Materials
The materials resembled those in Experiment 1, but features had five possible values and differed in both color and shape to increase feature discrimination (see Fig. 7). The color-feature and shape-feature associations were randomized across participants.

Procedure
Participants' task was to judge the similarity of pairs of stimuli on a visual slider ranging from ''completely different'' to ''identical'' (see Fig. 7). At the beginning of the experiment, participants were informed about a phase with time pressure and a phase without time pressure and were familiarized with the feature values, the feature value combinations, and the slider. They were told to respond at the ''identical'' mark for identical stimuli and that their subjective similarity assessment should determine their response for all non-identical stimuli. On any trial, participants could first contemplate the stimulus pair (contemplation stage). Then, they clicked on a button to make the stimulus pair disappear and reach the slider, and finally clicked on the slider to enter their similarity judgment (response stage). If participants exceeded any possible deadline to click on the button that leads to the slider or to enter their judgment on the slider, the trial expired and participants continued with the next trial after 500 ms.
Familiarization and test phases. Participants first familiarized themselves with the task by judging the similarity of 34 familiarization stimulus pairs in random order with no contemplation deadline and a 3 s response deadline to prevent further processing of the stimulus pair after having accessed the slider. After familiarization, participants judged the similarity of new stimulus pairs in two test phases, with counterbalanced order of the test phases across participants. One test phase was without time pressure as during familiarization, and the other test phase contained participant-specific time pressure in which the contemplation and response deadlines equaled 50% of the median contemplation time and the 90th percentile of the response times, respectively, of the last 15 familiarization trials. Participants read about the contemplation and response deadlines (in seconds) and familiarized themselves with time pressure by rating once all stimulus pairs from the familiarization phase just before the start of the test phase with time pressure. If either the contemplation or the response deadline was exceeded, the trial expired. Participants experienced the same stimulus pairs in both test phases. Specifically, participants rated the similarity of twelve stimulus pairs (i.e., three different pairs for each of the four critical types) four times per test phase; the sequence within each of these four blocks was randomized.

Table 5 note: To make predictions, the values of the three model parameters were set to 1. Attention was allocated equally to all features (all w = 1/3) in multidimensional models (MULTI) and to either a differing feature (w_diff = 1) or a matching feature (w_match = 1) in unidimensional models (UNI, attention focus). MINK = Minkowski model (elaborate similarity); DISC = discrete model (simplified similarity).
Additionally, the nine pairs of the all-equal and all-different types were interspersed twice across all four blocks, resulting in a total of 66 trials per test phase.

Results
In the test phase without time pressure, participants contemplated the stimulus pairs for on average M = 4044.96 ms (Md = 3062, SD = 3941.15) before accessing the slider; in contrast, in the test phase with time pressure, contemplation time was restricted to on average M = 2017.61 ms (Md = 1937.25, SD = 685.32). Similar to Experiment 1, participants exceeded the time limit on average in 15% of the practice trials (Md = .12, SD = .09) but responded in time on almost all test trials (misses: M = .06, Md = .05, SD = .05), indicating that they had adapted to the time pressure. Participants used the complete response scale from 0 (completely different) to 1 (identical) to rate the similarity of the stimulus pairs (across participants, the minimum rating was M = .02, Md = 0, SD = .06, and the maximum rating was M = 1, Md = 1, SD = 0). Nevertheless, we rescaled each participant's ratings to the range of 0 (lowest rating) to 1 (highest rating) to ensure comparability (see Footnote 7).
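The participant-wise rescaling described above is a standard min-max transformation; a minimal sketch (the function name is ours):

```python
import numpy as np

def rescale_ratings(ratings):
    """Min-max rescale one participant's similarity ratings to [0, 1].

    The lowest rating maps to 0 and the highest to 1, making ratings
    comparable across participants who used different portions of the
    slider. Sketch of the rescaling described in the text.
    """
    r = np.asarray(ratings, dtype=float)
    lo, hi = r.min(), r.max()
    if hi == lo:                 # degenerate case: participant never varied
        return np.zeros_like(r)
    return (r - lo) / (hi - lo)

# lowest rating -> 0, midpoint -> 0.5, highest rating -> 1
print(rescale_ratings([0.2, 0.5, 0.8]))
```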

Similarity judgments of the critical stimulus pair types
In general, participants judged the critical stimulus pairs to be slightly more similar in the condition with time pressure (M = .50, Md = .50, SD = .11; values are computed across each participant's mean similarity rating) than without time pressure (M = .46, Md = .46, SD = .11), as was shown in a linear mixed model with identity link (see Table 6 for the resulting regression coefficients; see Footnote 8). The coefficients indicate that participants responded to the critical stimulus pair types differently depending on the time pressure condition, and the model outperformed a restricted model without the interaction between time pressure condition and stimulus pair type in a likelihood ratio test, χ²(3) = 86.19, p < .001.

Footnote 7: We deviated from the preregistered rescaling to the range of 0 to 100, as model predictions are within the range of 0 to 1.

Footnote 8: To achieve convergence, the number of estimated parameters was again decreased with respect to the preregistration by estimating only a random intercept but no random slopes per participant. We also excluded the order of the test phases as a predictor, as it did not influence participants' similarity judgments. The final model had the participant response as criterion; the time pressure condition, the stimulus pair type (with four levels), and the interaction thereof as fixed effects; and a participant-wise random intercept.

Note (to Table 6). The model used sum-to-zero contrasts (Singmann & Kellen, 2017), meaning the fixed effects are the differences from the grand mean intercept (e.g., for type ''two-large'' in the time pressure condition: 0.48 + 0.02 − 0.17 − 0.01 = 0.32).
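The worked example in the note to Table 6 illustrates how sum-to-zero (effect) coefficients combine into a cell mean; the arithmetic can be sketched directly:

```python
# Reading sum-to-zero coefficients, using the values from the note to
# Table 6: under effect coding, a cell mean is the grand-mean intercept
# plus the relevant main effects and their interaction.
intercept = 0.48          # grand mean across conditions and pair types
time_pressure = 0.02      # main effect of the time pressure condition
pair_type = -0.17         # main effect of the "two-large" pair type
interaction = -0.01       # condition x type interaction
cell_mean = intercept + time_pressure + pair_type + interaction
print(round(cell_mean, 2))  # 0.32, the mean rating for "two-large"
                            # pairs under time pressure
```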
Irrespective of time pressure, participants' mean similarity judgments were highest for ''two-small'' pairs and lowest for ''two-large'' pairs (see Fig. 8). This response pattern is in line with the predictions of the multidimensional Minkowski model (see Table 5 and Fig. 8) and is reflected by the coefficients derived from the linear mixed model in Table 6 (e.g., .32 for ''two-large'' pairs in the test phase with time pressure). All pairwise coefficient differences within test phases were reliable in planned post-hoc contrasts with the Holm-Bonferroni alpha-level correction (Holm, 1979), ps < .001, except for the difference between ''all-small'' and ''one-large'' pairs in the phase without time pressure, t(5897) = −0.08, p = .94. Corroborating the results of Experiment 1, we thus do not find evidence that time pressure changed the way participants compute psychological similarity. Rather, elaborate similarities as postulated by the multidimensional Minkowski model seem to describe participants' similarity judgments well both with and without time pressure.
To investigate choice sensitivity, we conducted exploratory analyses of the variability of participants' similarity judgments within each time pressure condition. Specifically, within each time pressure condition we calculated the standard deviation of the similarity judgments per participant and stimulus type (resulting in one variability score across 12 similarity judgments per participant, stimulus type, and time pressure condition). The variability scores were higher in the condition with time pressure (M = .17, Md = .17, SD = .07) than in the one without time pressure (M = .14, Md = .13, SD = .06), suggesting that the onset of time pressure induced more variability in participants' similarity judgments, t(255) = 7.67, 95% CI = [0.03, 0.05], p < .001.
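The variability score described above can be sketched as one sample standard deviation per participant-by-type-by-condition cell (the data layout and function name are our assumptions):

```python
import numpy as np

def variability_scores(judgments):
    """One SD per (participant, condition, stimulus type) cell.

    judgments: dict mapping (participant, condition, stim_type) to that
    cell's 12 rescaled similarity ratings. Sketch of the exploratory
    variability analysis described in the text.
    """
    return {key: float(np.std(vals, ddof=1)) for key, vals in judgments.items()}

# Illustrative cell: 4 distinct ratings repeated across 3 pairs of one type.
demo = {("p1", "pressure", "two-small"): [0.6, 0.8, 0.7, 0.9] * 3}
print(variability_scores(demo))
```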
Cognitive modeling. The same cognitive modeling framework as in Experiment 1 was used; however, as choice sensitivity was modeled with a normal distribution in this experiment, the different models now had a free variance parameter (constrained to be nonnegative) instead of the softmax parameter. The free model parameters were fit to individual participants' test phase data, separately for both time pressure conditions, using a constrained leave-p-out cross-validation (Shao, 1993). 9 The constraint required the hold-out data to contain all data from the filler types and half of the data from the four critical types (2 out of 4 judgments per stimulus pair), yielding 1296 cross-validation sets per participant and test phase, each containing 24 fitting trials and 42 hold-out trials.
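One way to obtain exactly 1296 cross-validation sets consistent with the reported trial counts is to choose, per critical stimulus-pair type, which 2 of the 4 repetitions enter the fitting half; this per-type grouping is our assumption, made only to reproduce the reported numbers:

```python
from itertools import combinations, product

# Sketch of the constrained leave-p-out scheme: per critical type,
# 2 of the 4 repetitions go into the fitting half. Choosing per type
# yields C(4,2)^4 = 6^4 = 1296 splits, matching the reported count.
reps_per_type = list(combinations(range(4), 2))   # 6 ways to pick 2 of 4
splits = list(product(reps_per_type, repeat=4))   # one choice per critical type
print(len(splits))                                # 1296 cross-validation sets

fitting_trials = 4 * 3 * 2       # 4 types x 3 pairs x 2 kept repetitions = 24
holdout_trials = 4 * 3 * 2 + 18  # remaining 24 critical + 18 filler = 42
print(fitting_trials, holdout_trials)
```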
Similar to Experiment 1, we compared the performance of the cognitive models and of a baseline random choice model (predicting a uniformly distributed response probability, again abbreviated as RANDOM) to test if time pressure leads to an attention focus or a simplified similarity. For each model and participant, the optimal parameter estimates in each cross-validation set were used to make predictions and compute the log likelihood for the critical types of the hold-out data (the filler types do not discriminate between models and were thus discarded). The log likelihoods were then averaged across the 1296 cross-validation sets for each participant and time pressure condition using the median. Based on the resulting log likelihoods (one per participant, time pressure condition, and model), we conducted the same pairwise model comparison procedure as in Experiment 1, computing evidence ratios of the models' Akaike weights (Wagenmakers & Farrell, 2004).
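The Akaike-weight procedure (Wagenmakers & Farrell, 2004) can be sketched as follows; the log likelihoods and parameter counts below are illustrative placeholders, not values from the experiments, and applying the standard AIC penalty to cross-validated log likelihoods is our simplifying assumption:

```python
import numpy as np

def akaike_weights(log_likelihoods, n_params):
    """Akaike weights from per-model log likelihoods.

    AIC = -2*LL + 2*k; weights are exp(-0.5 * delta-AIC), normalized to
    sum to 1 (Wagenmakers & Farrell, 2004). Minimal sketch.
    """
    ll = np.asarray(log_likelihoods, dtype=float)
    k = np.asarray(n_params, dtype=float)
    aic = -2.0 * ll + 2.0 * k
    delta = aic - aic.min()           # difference to the best (lowest) AIC
    w = np.exp(-0.5 * delta)
    return w / w.sum()

w = akaike_weights([-40.0, -43.0, -45.0], [3, 3, 2])  # illustrative values
ratio = w[0] / w[1]   # evidence ratio: best vs. second-best model
print(np.round(w, 3), round(ratio, 1))
```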

Model comparison at the aggregate level.
At the aggregate level, the evidence ratios were computed using each model's mean Akaike weight across participants, as in Experiment 1. The results from the pairwise model comparisons show that the multidimensional Minkowski model predicts the out-of-sample similarity judgments well, with an Akaike weight of .67 in the condition without time pressure and of .72 in the condition with time pressure (see Table 7 for the fit indices; see Footnote 10). The remaining models, each implementing a form of coping, were not able to describe the aggregate responses in the test phase with time pressure (nor in the test phase without time pressure; see the aggregate rank order in Table 7). Again, we thus do not find that time pressure causes an attention focus or a simplified similarity, and the good performance of the multidimensional Minkowski model is in line with previous literature (Nosofsky, 1984, 1986, 1989).

Model comparison at the individual level.
At the individual level, the evidence ratios were computed using participants' individual Akaike weights for the hold-out data (shown in Fig. 9), and each participant was assigned to the model with the highest Akaike weight (as in Experiment 1, the Akaike weight had to exceed .70; for the results with the preregistered criterion of .90, see Appendix C). The results corroborate the finding from the aggregate analyses by showing that the multidimensional Minkowski model describes the majority of the 64 participants (n = 41, 64.06%, in the condition without time pressure and n = 45, 70.31%, in the condition with time pressure; see Fig. 9).
For the condition with time pressure, a similar rank order of models was observed: multidimensional Minkowski model (n = 45, 70.31%) > random choice model (n = 11, 17.19%) > multidimensional discrete model (n = 5, 7.81%) > unidimensional Minkowski model (n = 0, 0%) = unidimensional discrete model (n = 0, 0%); the remaining participants could not be assigned to a model (n = 3, 4.69%). Again, the number of described participants varied across models, χ²(2) = 45.77, p < .001. In line with our findings from Experiment 1, the models performed similarly across conditions, corroborated by a Fisher's Exact Test for Count Data, p = .08, suggesting that similar cognitive processes comprising elaborate similarities in line with a Minkowski model may have driven similarity judgments irrespective of time pressure.

Footnote 10: In the results with the preregistered Akaike weight of the median log likelihood, the multidimensional Minkowski model strongly outperforms all remaining models, with an Akaike weight of 1 in the condition without time pressure and of .99 in the condition with time pressure.
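The reported test statistic can be re-derived by hand under the assumption of a uniform expectation over the three models that described at least one participant (45, 11, and 5 assignments under time pressure):

```python
# Goodness-of-fit chi-square on the model-assignment counts under time
# pressure, against a uniform expectation over the three models; the
# uniform null is our assumption, chosen because it reproduces the
# reported chi2(2) = 45.77.
counts = [45, 11, 5]
expected = sum(counts) / len(counts)   # 61 / 3 participants per model
chi2 = sum((c - expected) ** 2 / expected for c in counts)
df = len(counts) - 1
print(round(chi2, 2), df)              # 45.77 with df = 2
```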
Exploratory choice sensitivity analyses. To gain deeper insights into people's choice sensitivity under time pressure, we conducted exploratory analyses using the variance parameter of the normal distribution that mapped the model predictions to the similarity judgments. For each time pressure condition, the analyses were conducted on the participants described by one of the versions of the generalized context model (n = 53 in the condition without time pressure and n = 50 in the condition with time pressure; among them, n = 44 participants were described by a version of the generalized context model in both time pressure conditions). For each of these participants, we computed the median estimate of the variance parameter of the respective model version across the cross-validations, resulting in 53 values in the condition without time pressure and 50 values in the condition with time pressure. Participants' median variance estimates were higher in the condition with time pressure (M = .16, Md = .16, SD = .03) than in the condition without time pressure (M = .14, Md = .14, SD = .03), indicating that the onset of time pressure induced more variance in similarity judgments, t(99.43) = 3.36, 95% CI = [0.01, 0.03], p = .001 (a paired t-test using only the 44 participants described in both time pressure conditions by a version of the generalized context model was significant, too, p < .001).

Discussion
In our second experiment, we used a similarity judgment task to directly investigate whether psychological similarity is simplified under time pressure in similarity-based cognitive processes. Clarifying the findings of Experiment 1, the results from inferential statistics and cognitive modeling provide strong evidence that the onset of time pressure reduces choice sensitivity. We do not find a qualitative change in psychological similarity, such as an attention focus or a simplified similarity. Rather, a majority of participants seemed to adopt a cognitive process based on elaborate similarities (the multidimensional Minkowski model; 64.06% in the condition without time pressure and 70.31% in the condition with time pressure), but translated psychological similarity less consistently into a manifest judgment in the condition with time pressure.

Experiment 3: Generalizing to stronger time pressure levels
The results of Experiments 1 and 2 suggest that when multiple salient features are relevant to solving an inference task, time pressure affects people's inferences quantitatively by making responses more variable (a lower choice sensitivity) rather than qualitatively (in terms of an attention focus or a simplified similarity). However, these experiments tested only one relatively mild level of time pressure (i.e., 30% of each participant's median response time during learning in Experiment 1 and 50% of each participant's median contemplation time during familiarization in Experiment 2). To generalize our findings and test to what extent time pressure makes similarity-based inferences more random, we conducted a replication of the similarity judgment task but varied time pressure more strongly, with deadlines of 15% and 30% of each participant's median contemplation time during familiarization (subsequently called the 15%-condition and the 30%-condition, in contrast to the time pressure condition of Experiment 2, which we will call the 50%-condition).
There are multiple ways in which stronger time pressure could make similarity judgments more random. It may simply increase the variability of people's responses, leaving the mean responses unchanged (e.g., Olschewski & Rieskamp, 2021). In that case, people rate the similarity of a stimulus type on average the same irrespective of time pressure but become noisier as time pressure increases. Additionally, stronger time pressure could make people's similarity judgments less extreme (e.g., Ashourian & Loewenstein, 2011), meaning that people's responses drift towards the center of the response scale as time pressure increases, ultimately leveling out the differences across stimulus types. To this end, Experiment 3 tests how various levels of time pressure affect the distribution of people's similarity judgments, extending the results of the first two experiments.
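The two candidate mechanisms can be illustrated with a small simulation; all numerical values (cell means, noise levels, shrinkage) are illustrative assumptions, not estimates from the data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two ways time pressure could make judgments more random: pure noise
# leaves the means (and their spread across types) intact, whereas drift
# to the scale center compresses differences across stimulus types.
true_means = {"two-small": 0.7, "all-small": 0.6, "one-large": 0.5, "two-large": 0.3}

def simulate(mean, sd, shrink=0.0, n=1000):
    """shrink pulls the mean toward the scale center 0.5 before adding noise."""
    m = (1 - shrink) * mean + shrink * 0.5
    return np.clip(rng.normal(m, sd, n), 0.0, 1.0)

noisy = {k: simulate(m, sd=0.20).mean() for k, m in true_means.items()}
central = {k: simulate(m, sd=0.20, shrink=0.8).mean() for k, m in true_means.items()}

def spread(d):
    return max(d.values()) - min(d.values())

# Under pure noise the spread across types survives; under drift it shrinks.
print(round(spread(noisy), 2), round(spread(central), 2))
```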

Participants
Based on Experiment 2, we aimed for data from 120 participants (n = 60 in each time pressure condition). In total, 121 participants recruited online took part in the laboratory experiment, which was approved by the ethics board of the psychology department of the University of Basel (#025-18-7). Ten participants were excluded from data analysis, as they reported that the task was somewhat or absolutely unclear to them. This leaves a final sample of N = 111 (77 women; mean age = 29.06 years, SD = 10.45 years, age range: 18-61 years), with n = 55 in the 30%-condition and n = 56 in the 15%-condition.

Design, materials, and procedure
Experiment 3 used the same design, materials, and procedure as Experiment 2 (see Table 5 and Fig. 7), except that in the test phase with participant-specific time pressure, the contemplation deadlines equaled 15% or 30% of the median contemplation time of the last 15 familiarization trials.

Similarity judgments of the critical stimulus types
Participants by and large judged the critical stimulus pairs similarly across time pressure conditions, with the largest similarity ratings given to ''two-small'' pairs, then to ''all-small'' pairs, then to ''one-large'' pairs, and finally to ''two-large'' pairs (see Table 8, upper half). As in Experiment 2, this response pattern is in line with the predictions of an elaborate similarity that integrates continuous feature value differences across multiple features (see the multidimensional Minkowski model in Table 5). The response pattern proved robust in a linear mixed model with identity link that had the participant response as criterion; the time pressure condition (four levels), the stimulus pair type (four levels), and the interaction thereof as fixed effects; and a participant-wise random intercept. The coefficients derived from the model reflected this pattern (e.g., .43 in the 15%-condition). All pairwise contrasts between the stimulus pairs of a time pressure condition were reliable, with Holm-Bonferroni corrected ps ≤ .02. Furthermore, participants' ratings did not differ between the 50%-condition and the 30%-condition, with p = 1 for ''two-small'' pairs, p = .50 for ''all-small'' pairs, p = .31 for ''one-large'' pairs, and p = .31 for ''two-large'' pairs (all ps Holm-Bonferroni corrected). Only the 15%-condition differed from the remaining time pressure conditions, and Table 8 indicates that participants' similarity judgments became less extreme in this condition, reducing the judgment differences across stimulus types.
The smaller judgment differences across stimulus types in the 15%-condition compared to the other conditions proved robust in interaction contrasts, all Holm-Bonferroni corrected ps < .001, except for the ''two-large''-''one-large'' difference between the 15%-condition and the 30%-condition (p = .10) and between the 15%-condition and the 50%-condition (p = .08), the ''all-small''-''two-small'' difference between the 15%-condition and the 30%-condition (p = 1) and between the 15%-condition and the 50%-condition (p = 1), the ''one-large''-''two-small'' difference between the 15%-condition and the condition without time pressure (p = 1), and the ''one-large''-''all-small'' difference between the 15%-condition and the condition without time pressure (p = 1). The reduction of judgment differences across stimulus types was specific to the 15%-condition; the only other cases in which more time pressure reliably reduced judgment differences across stimulus types were the ''two-large''-''one-large'' difference between the 30%-condition and the condition without time pressure (p < .001) and between the 50%-condition and the condition without time pressure (p < .001), as well as the ''all-small''-''two-small'' difference between the 30%-condition and the condition without time pressure (p = .003) and between the 50%-condition and the condition without time pressure (p = .01). These findings suggest that participants judged the stimulus pairs similarly across conditions, yet less extremely in the strongest time pressure condition, reducing judgment differences across stimulus types.

Note (to Table 8). The table shows the mean (upper half) and the standard deviation (lower half) of individual participants' similarity ratings for different time pressure conditions and stimulus types, further aggregated across participants using the mean.

Note (to Table 9). Coefficients were calculated using treatment contrasts (Singmann & Kellen, 2017) to highlight how the time pressure conditions increase response variability in comparison to the baseline condition without time pressure.
In the following, we analyze how stronger time pressure affects the variability of people's similarity judgments. The lower half of Table 8 suggests that the variability of individual participants' similarity judgments, measured as the standard deviation across stimulus pairs of the same type, increased with time pressure, seeming to peak in the 30%-condition. This was corroborated in a linear mixed model with identity link that had the standard deviation of the similarity judgments per participant as criterion; the time pressure condition (with four levels) and the stimulus pair type (with four levels) as fixed effects; and a participant-wise random intercept. The resulting regression coefficients in Table 9 indicate that participants judged the stimulus pairs more variably in each time pressure condition than in the baseline condition without time pressure. The coefficients derived from the linear mixed model reflect this, with coefficients of .19 (30%-condition), .18 (15%-condition), and .17 (50%-condition) versus .13 (without time pressure) for the ''two-small'', ''all-small'', and ''two-large'' types, and .20, .19, and .18 versus .14 for the ''one-large'' type. The coefficients of the condition without time pressure differed reliably from the coefficients of any time pressure condition in planned post-hoc contrasts with the Holm-Bonferroni alpha-level correction (Holm, 1979), ps < .001. In contrast, among the conditions with time pressure, the pairwise coefficient differences were reliable in planned post-hoc contrasts only between the 30%-condition and the 50%-condition, t(1338) = 2.86, p = .01, but not between the 15%-condition and the 30%-condition, t(1338) = −1.14, p = .25, or between the 15%-condition and the 50%-condition, t(1338) = 1.67, p = .19. This suggests that people's response variability increases with time pressure and plateaus at relatively strong, but not necessarily maximal, time pressure.
The results so far suggest that participants' similarity judgments do not change but become more variable with increasing time pressure, at least up to the 30%-condition. Interestingly, in the 15%-condition participants' responding seems to change somewhat; the judgment differences across stimulus types are less pronounced, and the rating variability within a stimulus type stagnates (see Table 8). One possibility is that the time pressure is so high in the 15%-condition that participants start failing to evaluate similarity and instead repeatedly click on an easily accessible position on the slider (e.g., the center of the scale). To test this possibility, another linear mixed model with identity link was run, which had the closeness of the participant response to the response scale center (i.e., 1 − |response − .5|) as criterion. For some stimulus types, participants' mean similarity judgments were already approaching .5 in the condition without time pressure (see Table 8) and could thus hardly get any more moderate in the 15%-condition. Therefore, the model also included the ''all-equal'' and ''all-different'' (filler) stimulus types, for which the similarity ratings in the condition without time pressure differed substantially from .5. The ''all-different'' type was further split by the amount by which stimuli differed on each feature (i.e., 1, 3, or 4, leading to ''all-1-different'', ''all-3-different'', and ''all-4-different'' types). The final model had the time pressure condition (with four levels), the stimulus pair type (now with eight levels), and the interaction thereof as fixed effects, and a participant-wise random intercept.
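The closeness-to-center criterion used above is a one-line transformation; a minimal sketch:

```python
# Closeness-to-center criterion: 1 - |rating - .5|, which is 1 for a
# rating exactly at the scale center and .5 at either endpoint of the
# 0-to-1 response scale.
def closeness_to_center(rating):
    return 1.0 - abs(rating - 0.5)

for r in (0.0, 0.25, 0.5, 1.0):
    print(r, closeness_to_center(r))
```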
In line with the previous results, only the 15%-condition shifted participants' similarity ratings to the center of the response scale, with reliable contrasts between the 15%-condition and any other condition for all stimulus types that were not rated close to the center of the response scale in the condition without time pressure (i.e., the ''two-large'', ''all-equal'', ''all-1-different'', ''all-3-different'', and ''all-4-different'' types), Holm-Bonferroni corrected ps ≤ .02. To exemplify, the resulting closeness coefficients in the 15%-condition versus the condition without time pressure were .79 versus .76 for the ''two-large'' type, .73 versus .52 for the ''all-equal'' type, .78 versus .75 for the ''all-1-different'' type, .75 versus .67 for the ''all-3-different'' type, and .72 versus .58 for the ''all-4-different'' type. In contrast, the similarity judgments for the remaining stimulus types were already close to the response scale center in the condition without time pressure, namely .82 without time pressure versus .78 in the 15%-condition for the ''one-large'' type, p = 1; .82 versus .81 for the ''all-small'' type, p = .91; and .81 versus .78 for the ''two-small'' type, p = 1. Taken together, these results suggest that time pressure at first increases people's response variability (consistent with a reduced choice sensitivity) and then starts shifting people's judgments to the center of the response scale, which may explain why response variability plateaued in the 30%-condition.

Cognitive modeling
The same cognitive modeling framework as in Experiment 2 was used to conduct additional analyses with the variance parameter of the normal distribution that mapped the cognitive model predictions to the similarity judgments. Again, for each condition the analyses were conducted on the participants described by one of the versions of the generalized context model (n = 152, 86.86%, in the condition without time pressure; n = 50, 78.13%, in the 50%-condition; n = 32, 58.18%, in the 30%-condition; and n = 21, 37.5%, in the 15%-condition; see Appendix C for the complete numbers of participants described by the individual models). Participants' variance estimates were M = .14, Md = .14, SD = .03 in the condition without time pressure; M = .16, Md = .16, SD = .03 in the 50%-condition; M = .16, Md = .17, SD = .03 in the 30%-condition; and M = .16, Md = .16, SD = .03 in the 15%-condition. Again, the onset of time pressure induced more variance in similarity judgments, showing in the form of higher variance estimates. This was reflected by the coefficients from a linear mixed model with identity link that had participants' variance estimates as criterion, the experimental condition (four levels) as fixed effect, and a participant-wise random intercept, with coefficients of .17 (30%-condition), .16 (15%-condition), and .16 (50%-condition) versus .14 (without time pressure). All pairwise coefficient differences between the condition without time pressure and a condition with time pressure were reliable in planned post-hoc contrasts with the Holm-Bonferroni alpha-level correction, ps ≤ .006. However, there were no credible differences between the coefficients of the different time pressure conditions, all ps = 1. This analysis thus corroborates the evidence that time pressure lowers people's choice sensitivity (thereby making their responses more variable) but does not indicate that responses are most variable in the strongest time pressure condition (see also Table 9).

Discussion
Our third experiment generalized the findings from the first two experiments to other time pressure levels. We replicated the similarity judgment task of Experiment 2 and manipulated time pressure more strongly, with deadlines equal to 30% and 15% of each participant's median contemplation time during familiarization. Again, the results suggest that the onset of time pressure reduces choice sensitivity. The variability of participants' similarity judgments increased with time pressure but stagnated at a relatively strong time pressure level, probably because participants started to fail at evaluating similarity and instead repeatedly clicked on the easily accessible center of the response scale, reducing the judgment differences across stimulus types.

General discussion
In a categorization task (Experiment 1) and a similarity judgment task (Experiments 2 and 3), we investigated how people cope with cognitive load, implemented as time pressure, during similarity-based cognitive inferences. While previous research has shown that restricting cognitive capacities is associated with reduced effort in information intake (attention to fewer features, e.g., Lamberts & Brockdorff, 1997) and reduced effort in response selection (lower choice sensitivity and precision, e.g., Olschewski & Rieskamp, 2021), this article additionally tested whether time pressure reduces the complexity of the psychological similarity assessment in the sense of a simplified similarity measure between objects. The simplified similarity measure assesses only whether the feature values of a pair of objects are identical and computes a binary ''same-or-different'' value per feature. This same-or-different value differs from a continuous, metric difference between two feature values, which is traditionally assumed by more elaborate psychological similarity measures. We compared these three strategies for coping with time pressure (attention focus, lower choice sensitivity, and simplified similarities) at the aggregate and individual levels, using inferential statistics as well as cognitive modeling within the framework of the generalized context model (Nosofsky, 1986). In what follows, we summarize our main findings and their implications for theory. First, in our experiments we find no evidence that time pressure affects how psychological similarity is computed when multiple features are relevant to solving a similarity-based inference task. We found no credible evidence that time pressure is associated with a reduction of effort in the sense of information intake (attention focus) or information processing (simplified similarities).
For instance, participants with time pressure judged stimulus pairs to be more similar if they differed by a small amount on each feature (the all-small type) than if they differed on only one feature but by a large amount (the one-large type). This behavior is not in line with the simplified similarity, which considers only the number of differing features. It is also difficult to reconcile with an attention focus on one feature, which in all but one case attends to a matching feature in one-large pairs and would therefore on average assign a higher similarity to one-large pairs than to all-small pairs (see Table 5). In turn, this behavior is compatible with elaborate similarities, which integrate the number of differing features and the precise feature value differences. Accordingly, similarity judgments under time pressure were better described by a cognitive model implementing elaborate similarities (the multidimensional Minkowski model) than by a model implementing simplified similarities (the multidimensional discrete model) or an attention focus (the unidimensional Minkowski model); see Appendix C. Although less clear, the results from the categorization experiment point in the same direction: none of the cognitive models implementing an attention focus or simplified similarities were able to outperform the multidimensional Minkowski model in the time pressure condition, nor were they able to consistently describe more participants with than without time pressure. Yet, it is important to note that in the categorization experiment we found evidence that quite a few people focus their attention irrespective of time pressure (the unidimensional Minkowski model; n = 11 out of 31 participants without time pressure and n = 6 out of 30 participants with time pressure).
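The diverging predictions for all-small versus one-large pairs can be sketched in a few lines, assuming an exponential similarity gradient with all parameters set to 1 (as in the predictions table) and equal attention weights; the concrete difference vectors (small = 1 step, large = 4 steps) are our assumption, chosen only to show the ordering:

```python
import math

# Elaborate (Minkowski, city-block) vs. simplified (discrete) similarity
# for illustrative feature-difference vectors; one way to formalize the
# discrete model is to replace each metric difference by a 0/1 mismatch.
def minkowski_similarity(diffs, weights):
    """Elaborate similarity: integrates metric feature value differences."""
    return math.exp(-sum(w * abs(d) for w, d in zip(weights, diffs)))

def discrete_similarity(diffs, weights):
    """Simplified similarity: only whether each feature matches or not."""
    return math.exp(-sum(w * (d != 0) for w, d in zip(weights, diffs)))

equal = [1 / 3, 1 / 3, 1 / 3]   # multidimensional (equal) attention
all_small = [1, 1, 1]           # small difference on every feature
one_large = [4, 0, 0]           # large difference on one feature only

# Minkowski rates all-small pairs as MORE similar (matching the data);
# the discrete model predicts the reverse ordering.
print(minkowski_similarity(all_small, equal) > minkowski_similarity(one_large, equal))
print(discrete_similarity(all_small, equal) < discrete_similarity(one_large, equal))
```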
Furthermore, past research has shown that people can focus their attention to cope with time pressure (Lamberts, 2002; Lamberts & Brockdorff, 1997; Milton et al., 2011; Wills et al., 2015a), in particular on features that are perceptually more salient or more diagnostic for category membership (e.g., Lamberts, 1995). Our results suggest that time pressure leads to an attention focus only in specific settings and that otherwise psychological similarity remains substantially elaborate, continuing to integrate the continuous feature value differences of multiple features.

Time pressure lowers choice sensitivity, making choices more variable
Second, although we did not find that time pressure affects psychological similarity itself in our tasks, time pressure seemed to lower the precision with which people mapped psychological similarity to a response. In other words, our experiments suggest that time pressure reduces the effort at the response selection stage, leading to less deterministic and more variable categorizations and similarity judgments. For example, participants with time pressure tended to categorize the individual transfer stimuli in the categorization task less deterministically than participants without time pressure (see Fig. 5). This behavior suggests that participants with time pressure may have been less sensitive to the differences between alternative categories, in line with a lower choice sensitivity. In a similar vein, as time pressure increased, participants rated the similarity of stimulus pairs more variably, to a point where variability stagnated at high time pressure, probably because participants began to click only on easily accessible central positions of the slider. These findings suggest that time pressure lowered the precision with which people respond and are in line with related research in other cognitive domains, such as risky choices with a dual task (e.g., Olschewski et al., 2018). More generally, our findings add to growing evidence that time pressure (and other forms of cognitive load) does not necessarily change the way core cognitive processes such as similarity assessments are performed but rather lowers the precision of response selection (Olschewski & Rieskamp, 2021).

The results are clearer for similarity judgments than for categorizations

Third, whereas time pressure seemed to consistently reduce choice sensitivity in our experiments, we found some differences between categorizations and similarity judgments.
While the results of the similarity judgment experiments suggest that people compute elaborate similarities irrespective of time pressure, the results of the categorization experiment were more mixed in this regard. Specifically, no model was able to reliably outperform the remaining models in the condition with time pressure, precluding a clear conclusion about coping with time pressure in similarity-based categorizations. The cognitive differences between psychological similarity and categorization are themselves the subject of a research tradition (e.g., Clapper, 2019) initiated by the study of Rips (1989), which showed that an object can be similar to one category yet be more likely to belong to a second category. This raises the question to what extent time pressure equally affects similarity judgments and categorizations. One difference between our experiments was that the perceived psychological similarity could be reported directly in the similarity judgment task but needed to be processed further to perform the categorization task (e.g., the similarities to the exemplars needed to be integrated into a summed similarity to each category). Provided that this further processing of similarity during categorization is time-consuming, it is conceivable that time pressure affects categorization in ways other than through pure similarity computation (e.g., by reducing the number of exemplars retrieved from memory). Furthermore, in contrast to the complex categorization process, it has been argued that similarity computation can be an automatic, nonanalytic process that requires few cognitive capacities (e.g., Ward, 1983; but see Wills et al., 2015a for an alternative viewpoint). As such, our results correspond to previous literature in showing that time pressure did not primarily affect the computation of similarity but rather the precision with which responses are selected.

Limitations and further research
In the following, we discuss potential limitations of the present experiments and how these limitations may be addressed by future research.

Manipulation of cognitive load
Concerning the experimental induction of cognitive load, our studies used one specific form of cognitive load, namely participant-specific time pressure. By determining the precise amount of time pressure separately for each participant based on the participant's previous reaction times, we maximized the probability that the subjective level of time pressure was comparable across participants within each experiment. While this procedure has the advantage of capturing the cognitive processes involved at one precise level of time pressure with a high degree of comparability across participants, our design did not expose individual participants to various levels of time pressure (as, for instance, Wills et al., 2015a, did). However, past findings have shown that psychological similarity may depend on the precise amount of time pressure induced (e.g., Lamberts & Brockdorff, 1997; Wills et al., 2015a) and that this dependency can be non-monotonic (e.g., Milton et al., 2008). Furthermore, our first two experiments used rather moderate time pressure compared to some other research with similar designs (e.g., Lamberts & Brockdorff, 1997). Experiment 3 aimed to mitigate this drawback and generalized the similarity judgment task to stronger time pressure levels, corroborating our results from the first two studies. Future research could test to what extent our findings also generalize to other forms of cognitive load (e.g., a concurrent task).

Choice of material
In our experiments, we used objects consisting of perceptually separable and equally salient features with multiple discrete values. Using multivalued features was a deliberate choice to discriminate elaborate from simplified similarities and to extend previous research, which primarily focused on objects with binary features (e.g., Lamberts, 1995; Milton et al., 2008, 2011; Wills, Milton, Longmore, Hester, & Robinson, 2013; but see Lamberts & Brockdorff, 1997; Wills et al., 2015a). Discrete feature values minimize the risk that cognitive models process psychologically unnoticeable feature value differences. However, by being directly countable (see, e.g., Fig. 4), discrete feature values might narrow the gap between the cost of computing continuous feature value differences and the cost of binary feature-wise "same-or-different" checks, thereby reducing the potential computational benefit of simplified similarities. Future research may investigate to what extent our findings generalize to objects with continuous, integral features (e.g., colors), which, compared to discrete, separable features, might increase the computational challenge of assessing elaborate similarities. Furthermore, the features of our stimuli had similar perceptual salience and were often equally relevant to solving the experimental task. In such a case, people might be reluctant to focus their attention and favor one feature over the others. In other tasks, in which some features are, for instance, more diagnostic for category membership (e.g., a rule-based task), people with little cognitive capacity available might readily focus their attention on the relevant features. In other words, multiple external characteristics, such as the task environment and the choice of material, shape similarity-based inferences, and these dependencies require further research in order to get a full picture of how people compute similarity under cognitive load.
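The contrast between the two candidate similarity functions can be made concrete with a small sketch (function names are ours, for illustration only): the elaborate version accumulates actual feature value differences, whereas the simplified version merely counts mismatching features. With binary features the two measures coincide, which is why multivalued features are needed to tell them apart.

```python
import numpy as np

def elaborate_distance(a, b):
    """City-block distance over actual feature values."""
    return float(np.sum(np.abs(np.asarray(a) - np.asarray(b))))

def simplified_distance(a, b):
    """Count of mismatching features, ignoring difference magnitudes."""
    return float(np.sum(np.asarray(a) != np.asarray(b)))

# Multivalued features dissociate the two measures:
x, y = [1, 4, 2], [1, 1, 3]
d_elab = elaborate_distance(x, y)   # differences of 0, 3, and 1 sum to 4
d_simp = simplified_distance(x, y)  # two mismatching features

# ... whereas for binary features every mismatch contributes exactly 1,
# so the two measures agree:
u, v = [0, 1, 1], [1, 1, 0]
```

Both distances would then feed into the same similarity function (e.g., exponential decay), so the models differ only in whether the magnitude of each feature difference enters the computation.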

Relation to other models
From a methodological perspective, we conducted all cognitive modeling within the framework of the generalized context model (Nosofsky, 1986). The generalized context model is among the most influential categorization models (Kruschke, 2008) and assumes that people use the psychological similarity between objects for category inference; therefore, it presents a natural modeling framework for investigating whether cognitive load induces changes in psychological similarity. Still, it is important to underline that alternative approaches, which are not necessarily based on the psychological similarity of objects, can describe cognitive inferences. For instance, rule-based models define categories by their content, meaning that an object needs to satisfy a set of features in order to belong to a specific category (Kruschke, 2008). To address this issue, we implemented a formal decision tree that can perfectly learn the category structure in Experiment 1; however, the decision tree failed to accurately predict participants' categorizations for the new transfer stimuli. Furthermore, the similarity judgment task used in Experiment 2 directly limited the applicability of cognitive processes that are not based on similarity. In light of this methodological choice, our findings suggest that people do use psychological similarity to make cognitive inferences and that they assess similarity by relying mostly on continuous object comparisons irrespective of time pressure. Nevertheless, future research will be needed to see to what extent these findings generalize to different categorization tasks (e.g., when categories can be readily separated by a unidimensional rule).

Conclusion
We investigated within the framework of the generalized context model (Nosofsky, 1986) whether people under cognitive load simplify the way they assess similarity, from continuous feature value differences to binary "same-or-different" assessments between objects. Results from a categorization experiment and two similarity judgment experiments showed no evidence that people compute simplified similarities under cognitive load. Rather, people seem to compute elaborate similarities akin to those they presumably use without cognitive load. Yet, people under cognitive load select responses with lower choice sensitivity and thus with lower precision. Overall, we find no evidence that the amount of cognitive capacity available strongly influences the computation of psychological similarity when multiple features are relevant for an inference task. Rather, people seem to assess similarity in a consistent and stable way, and cognitive load influences at most the consistency with which people map their similarity representation to a manifest response.

CRediT authorship contribution statement
Florian I. Seitz: Developed the study concepts, contributed to the design, performed the data collection, analysis, and interpretation, and wrote the manuscript. Bettina von Helversen: Developed the study concepts, contributed to the design, and wrote the manuscript. Rebecca Albrecht: Developed the study concepts, contributed to the design, and provided significant revisions to the manuscript. Jörg Rieskamp: Developed the study concepts, contributed to the design, and provided significant revisions to the manuscript. Jana B. Jarecki: Developed the study concepts, contributed to the design, and wrote the manuscript.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
The data, analysis code, experimental code, and preregistrations are publicly available at https://osf.io/94e6u/.