Marine Cleaning Mutualism Defies Standard Logic of Supply and Demand

Supply and demand affect the values of goods exchanged in cooperative trades. Studies of humans and other species typically describe the standard scenario that an increase in demand leads to a higher price. Here, we challenge the generality of that logic with empirical data and a theoretical model. In our study system, “client” fishes visit cleaner wrasse (Labroides dimidiatus) to have ectoparasites removed, but cleaners prefer client mucus, which constitutes “cheating.” We removed 31 of 65 preselected cleaners from a large isolated reef patch. We compared cleaner-client interactions at the reef and a control reef before removal and 4 weeks after removal. Cleaner fish from the experimental treatment site interacted more frequently with large clients (typically visitors with access to alternative cleaning stations), but we did not observe any changes in service quality measures. A game-theoretic analysis revealed that interaction duration and service quality might increase, decrease, or remain unchanged depending on the precise relationships between key parameters, such as the marginal benefits of cheating as a function of satiation or the likelihood of clients responding to cheating as a function of market conditions. The analyses show that the principle of diminishing return may affect exchanges in ways not predicted by supply-to-demand ratios.


Introduction
The values of goods exchanged in human economic markets follow the rules of supply and demand (Smith 1776), wherein goods in high demand become more expensive. How supply and demand predict payoff distribution among cooperating individuals in human markets has been successfully applied to other species within the framework of biological market theory (Noë et al. 1991;Noë and Hammerstein 1994). The theory focuses on how partner choice shapes payoff distributions among partners in cooperative interactions (for reviews, see Barclay 2016;Hammerstein and Noë 2016). Mathematical models of biological market theory typically predicted that an increase in the demand for a given service or commodity would cause an increase in the price through partner choice (Noë and Hammerstein 1995;Schwartz and Hoeksema 1998;Johnstone and Bshary 2008;Mazancourt and Schwartz 2010;Akcay et al. 2012; for partial exceptions, see Wyatt et al. 2014). While these models invariably explore evolutionary timescales, evidence from short-term empirical studies also conforms to the demandprice relationship. Examples range from interspecific mutualisms involving mycorrhiza and plant interactions (Kiers et al. 2011), fig trees and their pollinators (Herre and West 1997), and lycaenid larvae and ants (Leimar and Axén 1993) to intraspecific interactions in meerkats (Kutsukake and Clutton-Brock 2010), hyenas (Smith et al. 2007), Mediterranean wrasse (Hellmann et al. 2020), and paper wasps (Grinsted and Field 2017), as well as in mating markets (Metz et al. 2007;Noë 2017) and primate grooming markets (Barrett and Henzi 2001;Fruteau et al. 2009).
Various experiments are designed to test crucial predictions of biological market theory (Leimar and Axén 1993;Bshary and Grutter 2002b), although studies that manipulate the number of one class of traders relative to the other class in the wild are rare (Fruteau et al. 2009). The marine cleaning mutualism involving the cleaner wrasse Labroides dimidiatus and its variety of "client" coral reef fish is a promising study system to manipulate the biological market over an extended period. Cleaners are obligate mutualists that feed on clients' ectoparasites and dead skin, with an interaction rate up to 2,000 times per day (Grutter 1996). This service benefits clients' health, growth, and abundance (Ros et al. , 2020Waldie et al. 2011;Triki et al. 2016;Demairé et al. 2020). Nevertheless, conflicts may arise when cleaners bite skin mucus, which is preferred over ectoparasites (Grutter and Bshary 2003;Oates et al. 2012). This behavior constitutes "cheating," because mucus protects the client's skin (Eckes et al. 2008), and biting events are noticeable by a client's sudden body jolt immediately after a skin-mouth touch, as opposed to no jolt at ectoparasites removal (Bshary and Grutter 2002a). Also, clients often terminate the interaction upon a cheating event, either by swimming away or by chasing the cleaner (Bshary and Grutter 2002b). These partner control mechanisms cause cleaners to eat more ectoparasites and refrain from biting the client's mucus (Bshary and Grutter 2005). On the other hand, cleaners manipulate clients' decisions by providing tactile stimulation with their pelvic fins (i.e., to stop passing clients that are unwilling to interact, prolong an ongoing interaction, or reconcile with a client after a cheating event; Bshary and Würth 2001). The provisioning of tactile stimulation is a veritable service, as it lowers client stress levels . Overall, cleaners do not bite all the time but rather maintain a certain frequency of mucus bites (Bshary 2001).
The L. dimidiatus mutualism has been repeatedly used to model biological market theory (Johnstone and Bshary 2008;Quiñones et al. 2020) and to test predictions derived from this theory. Foremost, it has been shown that cleaners give service priority to "visitor" species over "resident" species (Bshary 2001). The former are large-bodied fish with larger home ranges and hence access to several cleaning stations, while the latter are small-bodied fish with access to the local cleaner only (Bshary 2001). Cleaners prioritize visitors because these exert partner choice by switching to another cleaner if made to wait for the cleaning service (Soares et al. 2013). The costs for choosing a cleaner partner are tightly linked to the magnitude of the cleaning service supply, such that the cost is higher at a low supply level (i.e., a small number of cleaners). If cleaner densities are low, visitors become more willing to wait and lose preferential treatment (Triki et al. 2018. Partner choice also explains client responses to cheating. By lacking partner choice options, residents rely on aggressive chasing (Bshary and Grutter 2002b), whereas visitors switch to a different cleaner partner (Bshary and Schäffer 2002). Thus, cleaners must outbid each other (see Noë and Hammerstein 1995) to attract visitors by cheating less and giving them service priority. Also, individual cleaners can flexibly adjust their service quality to changes in their social environment (Triki et al. 2018Wismer et al. 2019). Indeed, cleaners ap-pear to be under selection to show social competence )-an individual's ability to adjust social behavior flexibly to local social conditions (Taborsky and Oliveira 2012). Therefore, we expected cleaners to be able to respond to experimental changes in market conditions adaptively.
In the current study, we conducted an experimental manipulation of the cleaner fish biological market in the wild and fitted two game-theoretical models to understand how such changes can affect cleaner fish behavior over time. For the experiment, we reduced cleaner density by ∼50% on one reef site (at Lizard Island, Australia), while an adjacent but distinct reef site served as a control. The two reefs were separated by at least 130 m of sandy space, preventing most coral reef fish species from crossing, including cleaners . We then tested whether the remaining cleaners on the experimental site would change their cleaning patterns and adjust their service quality after 1 month from the manipulation. We expected a decline in the supplied cleaning services, increasing the demand-to-supply ratio. First, we expected cleaners to become choosy and interact more often with visitors than residents, mainly because visitors offer larger food patches (Bshary 2001). Second, the average interaction duration should decrease based on the logic of the marginal value theorem (Charnov 1976), wherein clients as food patches will get depleted during a cleaning interaction and the optimal stopping point depends on when the next client invites for cleaning. In a higher-demand scenario, the intervals between interactions should become shorter, leading to shorter interactions. Third, the client jolt rate-as a correlate of cheating by cleaners-would increase. Fourth, cleaners would reduce the amount of tactile stimulation they provide to clients. Finally, we had no specific predictions regarding the amount of time cleaners would spend interacting. This is because we could not predict to what extent cleaners may aim for a higher food intake. If cleaners are generally limited in food intake, then the total cleaning time should increase. On the other hand, the increase in client demand for cleaning should cause an increase in the cleaners' foraging efficiency, and hence cleaners would need to spend less time cleaning to reach a similar energy input.
The game-theoretical models complemented the empirical part aimed at predicting how changes in supply and demand might affect two important cleaning interactions parameters: mean interaction duration and client jolt rate. We derived a reward function from the first principle that captures the diminishing marginal return to the cleaner with increasing interaction duration. We investigated how shorter intervals between interactions may negatively affect interaction duration, while an increase in client ectoparasite load may have positive effects. An increase in ectoparasite load due to reduced supply should also affect the marginal benefits of cheating. Cleaner fish cheating rates should further be influenced by clients showing increased tolerance and cleaners being on average more satiated.

Material and Methods
Empirical Study Field Site. Fieldwork was conducted at the Lizard Island Research Station (LIRS), Great Barrier Reef, Australia (147 40 0 08.0 00 S, 145727 0 34.0 00 E), between July and September 2015. The two study reef sites were Deep Vicki (14741 0 17.8 00 S, 145726 0 33.2 00 E) and Horseshoe (14741 0 13.2 00 S, 145726 0 37.9 00 E), and they cover a surface area of 7,000 m 2 and 8,700 m 2 , respectively. They were 135 m apart and entirely isolated from each other by a sandy area. Deep Vicki is separated from all nearest reefs by 1100 m, while horseshoe has a close neighboring reef (10-m minimal distance). The depth at these reefs ranges from 2 to 8 m.
Experimental Design. Deep Vicki and Horseshoe served as experimental and control reef sites, respectively. At either site, we randomly selected 23 female cleaners for repeated observations while ensuring their stations were evenly spread over each reef. Choosing female cleaners helped identify cleaners to the individual level based on their territory (i.e., cleaning station). On the other hand, males can be challenging to find repeatedly given that their territory is far larger than the females' (Robertson 1973). Therefore, each female was identified to the individual level by its cleaning station, tagged with a small numbered float attached by a fishing line and plastic zip tie. On July 23, two scuba divers estimated the natural population size of cleaners at Deep Vicki (hereafter, "reduced density site") by counting all adult cleaners, yielding a total of 66 cleaners. On the following day, the divers caught 31 not previously observed cleaners using a barrier net (2 m#1 m; 0.5-cm 2 mesh size) and hand nets, thus causing a reduction of cleaner densities by 47%. The Horseshoe reef (hereafter, "control site") remained undisturbed. The captured cleaners were transported and housed at LIRS facilities. We know from previous research that total cleaner removal can lead to a loss in client fish numbers after several months or even years after removal . We therefore opted for partial removal of cleaners and for a limited period (1 month). At the end of the experiment, captured cleaners were released at their respective capture sites, and all plastic tags were successfully removed from the two reef sites.
Behavioral Observations. Behavioral observations were collected in the format of video recordings of 30 min each. Videos were recorded between 08:30 and 16:00, using Canon G15 and Sony DCR-SR50 cameras. To assess potential changes in the cleaner-client interaction patterns, behavioral observations were collected at either site right before (July 21 and 22) and 1 month after (August 25) the manipulation. The order of data collection at either the reduced site or the control site was randomized, and the morning and afternoon sessions were counterbalanced. In total, we had repeated behavioral observations from the n p 22 female cleaners tagged at the control site and n p 23 at the reduced density site.
Afterward, from the video recordings we obtained the following information: genus and species of clients, interaction frequency, the average duration of a cleaner-client interaction, the proportion of time spent by a cleaner in interactions, the presence/absence of tactile stimulations by the cleaner fish, the frequency of client's jolt, and the client's response to a cleaner bite, such as chasing the cleaner, swimming away, or no reaction. Chasing a cleaner or swimming away was then summed up as the client's negative response, which was calculated as a proportion of the total jolts. Finally, clients were categorized into two classes: large-bodied clients (total body length more than 10 cm) and small-bodied fish (total body length 10 cm or less). For further details, see .
Statistical Analysis. All statistical analyses were carried out in R version 3.6.2 (R Core Team 2020). To test for potential changes in cleaner-client interactions, we first deducted all of the values obtained from behavioral observations collected before the experimental manipulation from those collected after the manipulation. This step resulted in negative and positive values in a way such that they could be interpreted as a decline or an increase, respectively, in the measured variable over the 1-month gap. In contrast, zero values indicate no change. We then fitted a set of linear regression models (LMs) using the type II sum of squares Anova() function in the car package in R (Fox et al. 2012). The response variables were the different behavioral observations describing the cleaner-client interaction patterns and quality, while treatment (i.e., control vs. reduced density) was the predictor variable. Post hoc analyses, using the emmeans() function in the emmeans package in R (Lenth and Lenth 2018), allowed comparison of reduced density with control as well as testing whether there was any change from the baseline behavioral recordings (i.e., null hypothesis of zero change). Model assumptions were assessed using standard diagnostic plots (quantile-quantile plots, plots of residuals vs. fitted values) and statistical tests (Kolmogorov-Smirnov test). We provide supplementary statistical tests comparing the two study reef sites before the manipulation to capture potential preexisting differences between the two sites in the supplemental PDF (available online).

Mathematical Modeling
The Optimal Interaction Duration. We make the simplifying assumption that only cleaners terminate interactions based on optimal foraging decisions, while clients are passive agents. In reality, clients may terminate the interaction prematurely upon cleaner cheating, and therefore the optimal interaction time from the perspective of cleaners predicted by our model should set an upper limit to the interaction's duration. During cleaner-client interactions, cleaners experience the law of diminishing return. As the parasite density on the client decreases, it takes longer and longer for a cleaner to remove one more parasite on the same client. Assuming that the reward-gaining rate of a cleaner from cleaning a given client is proportional to the remaining number of parasites on a client, we could derive that the accumulated reward for the cleaner over time follows the function where t is the time spent on cleaning and a and b are shape parameters (see the supplemental PDF for detailed derivations). Because of the diminishing marginal reward from cleaning the same client over time, the cleaner should terminate the current interaction at some point and start to search for the next client. Assuming that the cleaners try to maximize the efficiency of gaining reward, the optimal interaction time depends on the shape of the reward curve and the waiting time during which the cleaner searches for the next client.
Assuming that client density is constant, a sudden reduction in cleaner density can lead to a shortage of supply on the service market so that each cleaner now has more clients on average. Furthermore, as long as the remaining cleaners do not fully compensate for the reduction of conspecifics, the clients will be cleaned less and hence become better food patches because of parasite accumulation (Grutter 1996;Grutter et al. 2018;Demairé et al. 2020). This leads to a shorter waiting time for cleaners before finding the next client and a change in shape of the reward curve characterized by a slower/delayed diminishing of return (the new reward curve after cleaner density reduction should lay above the original curve). We analyzed whether the optimal interaction time increases or decreases after cleaner density reduction.
The Biting Rate Dynamics. A cleaner might dynamically balance the benefit and cost to adjust how often it bites (i.e., increasing the biting rate when the benefit of biting exceeds the cost and vice versa). We therefore model the dynamics of the biting rate r of cleaners using where B and C are functions representing the benefit and cost of biting, respectively. We use a negative Gompertz function to model the benefit of biting as satiation increases (eq. [3]). The function has the desirable feature of having a monotonically decreasing S shape as satiation (s) increases so that the rate of decrease is slower when a cleaner is either sufficiently hungry or satiated while the rate of decrease is faster at intermediate satiation. The parameter b 1 controls the minimal level of satiation before the benefit of biting starts to decrease, and g 1 controls the overall speed of decrease. The values of b 1 and g 1 are supposed to be speciesspecific features of the cleaner. Assuming that the satiation level is proportional to the amount of resources available to each competing individual, we model satiation simply as the reciprocal of cleaner density: We expect the cost of cheating by cleaners to decrease monotonically with the parasite load (p) on the clients with an S shape and therefore use a negative Gompertz function (eq. [4]) again to model the relationship. Here, the parameters b 2 and g 2 capture different patterns of client reaction to cleaner cheating (i.e., the way a client adjusts the severity of its negative responses [fleeing or chasing] when being bitten as a function of its current parasite load; Johnstone and Bshary 2008). Assuming that the total number of parasites in the environment is constant and clients are equally likely to be infected, we model the parasite load (p) on each individual as the reciprocal of cleaner density: Results

Empirical Results
The reduced density site and control site had some differences in cleaners' interaction patterns before the experimental manipulation. These differences were found in the interactions with small clients and not with large clients. As such, cleaners at the control site often had higher cleaning interaction frequency, longer time spent cleaning, more interactions with tactical stimulations, and more chasing and switching, but only with small clients (see the supplemental PDF; fig. S1; table S1; figs. S1-S5, table S1 are available online). One month after the cleaner density reduction, we found that the only change was the increase in the frequency of interaction with large clients at the reduced density site (emmeans: N p 23, estimate p 23:8, t ratio p 2:597, P p :012; fig. 1a), with no change at the control site (emmeans: N p 22, estimate p 20:56, t ratio p 20:06, P p :952). However, when comparing the reduced density site directly to the control site, the probability of this increase showed only a trend (LM: N p 45, F 1, 43 p 3:54, P p :069, adjusted R 2 p 0:052; fig. 1a). The relatively modest sample size employed in this study may have yielded limited statistical power. A post hoc power analysis indicated that given our effect size, we had a 46% probability of finding a significant result, while a minimum sample size of N p 50 per study site would have been needed to obtain statistical power at the recommended level of 80%. The remaining behavioral measures were not significantly affected by reducing cleaner fish densities (figs. 1b, 1c, 2). All of the statistical outcomes are summarized in table 1.

Mathematical Model
The Optimal Interaction Time. By assuming that the cleaners maximize the efficiency of gaining reward, given the reward function in equation (1) and a waiting time t w before they find the next client, the optimal interaction time t * i can be calculated by finding the tangent of the reward curve that goes through the point (2t w , o) on the taxis. See figure S2 for a graphic illustration. The optimal interaction time t * i satisfies the condition b(e t * i =b 2 1) p t * i 1 t w : This tells us that the optimal interaction time is determined by the shape parameter b of the reward function and the waiting time t w . The reward function before cleaner density reduction is y(t) (eq. [1]). We denote the reward function after cleaner density reduction as ky(t=m), where k and m are positive constants, and the shape of this function is obtained by proportionally stretching/squeezing the original reward function y(t) along the y-axis and the t-axis k and m times, respectively. Stretching or squeezing the reward function along the y-axis increases or decreases the reward at any given interaction time t, while stretching or squeezing the reward function along the t-axis would require a longer or shorter interaction time to obtain any given level of reward. To ensure that the new reward curve lays above the original one, the values of k and m must satisfy the conditions k ≥ m 1 0 and k ≥ 1, excluding k p m p 1 (see the supplemental PDF for the proof ). Now we show how a decrease in the waiting time and a shape change in the reward curve affect the optimal interaction time t * i . Assuming that the initial waiting time before cleaner density reduction is t w1 , given the initial reward curve (shown in red in fig. 3a), the optimal interaction time is t * i . After reducing cleaner density, the reward curve changes its shape (now in blue), and the new waiting time changes to t w ∈ (0, t w1 ). In the boundary case when t w p t w1 , the optimal interaction time becomes t † i ( fig. 3a; the magenta point of tangency on the blue curve), which is longer than the previous optimal interaction time t * i . In x-axis labels refer to the client's body size (large p total body length more than 10 cm; small p total body length 10 cm or less). Pairwise comparisons show no significant differences (P 1 :05) between the control and reduced sites. However, tested against the null hypothesis (zero line as an indicator of no change), cleaner fish from the reduced site increased their interaction frequency with large clients. All statistical outcomes for this figure are reported in table 1.
addition, there exists a threshold waiting time t w2 such that if the new waiting time t w p t w2 , the new optimal interaction time remains the same as before ( fig. 3a; the orange point of tangency on the blue curve and the green point of tangency on the red curve are located on the same vertical line). Therefore, when the waiting time after cleaner density reduction is between the two boundaries (t w2 ! t w ! t w1 ), the optimal interaction time t i is then between t * i and t † i , which is longer than the previous optimal interaction time t * i before cleaner density reduction. In this parameter region, we obtain a novel result that the new optimal interaction time increases. When t w ! t w2 , the new optimal interaction time decreases, in agreement with our intuition.
The counterintuitive result of an increased optimal interaction time can occur only when the threshold waiting time t w2 is shorter than the original waiting time t w1 . This depends on the shape of the reward curve ky(t=m) after cleaner density reduction. Only the parameter m that stretches/squeezes the reward curve along the t-axis plays a role, while the parameter k does not affect the location of t w2 . Figure 3b shows that the case of t w2 ! t w1 occurs only when m 1 1. This means that the new reward curve is formed by first stretching the original curve along the t-axis to the right, then stretching it up along the y-axis with the same proportion or larger (to ensure that the new reward curve lays above the original one, we need k ≥ m). The larger the value of m, the broader the parameter space that increases the optimal interaction time. When m ! 1 (i.e., the new reward curve is formed first by squeezing the original curve to the left along the t-axis and then either stretching or squeezing along the y-axis), the optimal interaction time will always decrease.
Biting Rate Dynamics. We calculated the equilibrium biting rate r * by setting ṙ p 0 in equation (2). It has a closed-form solution r * p 1 2 exp(2b 1 e 2g 1 =n ) 1 2 exp(2b 2 e 2g 2 =n ) : When the density of cleaners is large (n → 1∞), lim n→1∞ r * p 1 2 e 2b 1 1 2 e 2b 2 1 0: When the density of cleaners is very small (n → 0), lim n→0 r * p ( 1∞ when g 2 1 g 1 , b 1 =b 2 when g 2 p g 1 , 0 when g 2 ! g 1 : Therefore, as the density of cleaners decreases, the equilibrium biting rate r * may increase, decrease, or stay unchanged depending on the relative values of the shape parameters of cleaner and client traits. When g 2 1 g 1 or when g 2 p g 1 and b 2 1 b 1 , r * has a general trend toward increasing; when g 2 ! g 1 or when g 2 p g 1 and b 2 ! b 1 , r * has a general trend toward decreasing, although the change may not be monotonic. In the following, we use some numerical examples to show how clients' responses to cleaner cheating influence r * under the special case of b 1 p b 2 . Detailed mathematical derivations of the results and more numerical illustrations of the more general cases when b 1 ( b 2 can be found in the supplemental PDF. As shown in figure 4a, we keep all other parameters fixed and only change the shape of the cost function C in equation (4) by varying g 2 . In this way, the severity of negative responses decreases more or less quickly as the parasite load on clients (represented by the reciprocal of cleaner density, 1=n) increases. When g 2 is relatively large, the decrease is more rapid, and we consider those clients to be more "easygoing," while when g 2 is relatively small, the decrease is slower, and we consider those clients more "cautious." Figure 4b shows that the equilibrium biting rate varies little when cleaner density is relatively high (eq. [7]). But at relatively low cleaner density, the biting rate may increase, decrease, or stay constant as cleaner density decreases (eq. [8]). Thus, if the clients are relatively easygoing, the biting rate increases (service quality decreases) after cleaner density reduction (similar to the classic human society market prediction), but if the clients are more cautious and reluctant to reduce negative responses even at relatively high parasite loads, the biting rate decreases as the density of cleaners decreases-in this case, cleaners avoid the risk of facing negative consequences and improve service quality when they are more satiated.

Discussion
On the basis of the standard logic of the biological market theory, we had predicted that the increase in demand, as a consequence of reduced supply, should yield a decrease in service quality. Our findings fitted the predictions that supply reduction would increase interaction frequency with visitor clients through a reduction in cleaner fish density. Still, unexpectedly this was not accompanied by a change in interaction duration. Also, service quality, such as jolt frequency and tactile stimulation, remained stable. Thus, while clients a b Figure 4: a, In the C function, keeping the other parameters and variables fixed (b 2 p 10, r p 1), varying the value of g 2 changes the steepness of the negative response probability decline as parasite load (1=n) increases. Values of g 2 were set to 7, 9, 10, 12, and 15 from right to left. b, Equilibrium biting rate r * as a function of the density of cleaners. The parameters of the B function are fixed with b 1 p 10 and g 1 p 10; curves of the same color across panels share the same parameters for the C function (b 2 and g 2 ).
Figure 3: a, After cleaner density reduction, the reward curve changes from red to blue. At the previous waiting time t w1 , the optimal interaction time shifts to a larger value t † i . At the threshold waiting time t w2 , the optimal interaction time t * i remains the same as before. The optimal interaction time is between t * i and t † i when t w2 ! t w ! t w1 . Parameter values are a p 1, b p 1, k p 2, and m p 2. b, Threshold waiting time t w2 given the initial waiting time t w1 at different values of the shape-changing parameter m. The condition t w2 ! t w1 occurs only when m 1 1. suffered from receiving overall less cleaning because of our manipulation, the service remained of "standard" quality. Importantly, the mathematical models showed that this could be expected when considering optimal foraging theory (Cuthill and Houston 1997) and the logic of variable investment in repeated games (Johnstone and Bshary 2008). Below, we discuss first the empirical data, then the mathematical model, and finally the general implications of the study.

Empirical Data
With any negative result, a valid question is whether the data and methodologies are of sufficient quality to yield confidence in the conclusions. For instance, we manipulated only one reef site because of both logistical and ethical constraints. Nevertheless, we had a nearby control reef, which helped rule out potential confounding variables that might have caused changes in interaction patterns after 1 month (i.e., control vs. zero change; table 1). In addition, we know that cleaners readily learn to adjust their cleaning behavior to changes in the social environment (Bshary and Grutter 2002a;Triki et al. , 2020. We expected our matched observations on 23 cleaners to yield sufficient power in detecting potential changes in cleaning interactions at the manipulated site. A power analysis revealed that we would have detected intermediate effect sizes ( f p 0:42) with 80% probability.
Another potential limitation is that we did not monitor changes in clients' parasite load. The classic protocol to quantify such a variable in nature involves killing the fish, while alternative methods prove unreliable (Grutter 1995). We would have needed to kill a large number of fish to acquire such information. The number of fish needed to assess baseline levels of parasite loads would have negatively affected the client's densities and thus the demand for cleaning. The overall sample size to measure parasite load would have raised serious ethical concerns. Since we know much about the triadic parasites-hosts-cleaners dynamics, we opted to not directly measure clients' parasite load. For instance, cleaners efficiently reduce the average client's parasite loads (Grutter et al. 2018 but not the densities of the free-living parasites . Gnathiid isopods, the main ectoparasites eaten by cleaners (Grutter 1997), stay and feed for only about 30 min on a client (Grutter 2003). Thus, iterated cleaning visits help clients get rid of their current parasites, but they do not prevent reinfections (Grutter et al. 2018. Clients benefit from accessing cleaning services, such as having improved body condition (Clague et al. 2011;Waldie et al. 2011;Ros et al. 2020) and higher hematocrit levels (Demairé et al. 2020). The study by Demairé et al. (2020) was a parallel project involving the very same market manipulation we have in our study. Demairé and colleagues found that resident clients living nearby the cleaning station, from which we removed the cleaner, had reduced hematocrit levels (i.e., an indicator of high hematophagy by gnathiid parasites; Triki et al. 2016). Also, we know from previous work that exposure to these parasites for a day, without access to cleaners, significantly reduces hematocrit levels (Jones and Grutter 2005). This suggests that cleaners from the reduced density site obtained more food overall by interacting more frequently with large clients, without changing their efforts. These clients must have had increased parasite loads from the first day of manipulation, since reduced cleaning interaction rates cause a higher standing parasite load.
Our experimental manipulation reduced cleaner density from 0.9 to 0.5 cleaners per 100 m 2 of reef area. This resulting density was lower than that at high-quality reefs yet was still higher than natural densities at other nearby reefs . Furthermore, major coral bleaching in 2016 (Ceccarelli et al. 2016;Hughes et al. 2017;Pizarro et al. 2017) created similar changes in the biological market, where cleaner densities were reduced by 80% while overall client densities were reduced only by about 60% (Triki et al. 2018;. This caused a decrease in the cleaner-to-client ratio, with an increase in the interaction frequency of cleaners with visitor clients (Triki et al. 2018), similar to what we found here. Thus, we are confident that the results are biologically meaningful and warrant an extended biological market theory. In addition, the gametheoretical models offer potential explanations for the results and provide predictions for further empirical testing.

Game-Theoretical Models
In our model, we have introduced diminishing return functions to classic biological market models. That is, the logic of the marginal value theorem was applied to clients as food patches that are depleted during interactions (Charnov 1976;Bshary et al. 2008), and cleaners' overall benefits were limited by satiation. This approach illustrates that qualitative predictions are difficult to make, as the use of different reasonable functions (i.e., the reward curve ky(t=m) after cleaner density reduction) yields opposing optimal responses by cleaners toward supply reduction (i.e., the optimal interaction time t * i may increase, decrease, or remain unchanged). According to the model, the main reason why reduced cleaner densities do not automatically lead to shorter interactions is that the remaining cleaners do not fully compensate for the amount of service by doubling their number of interactions. Subsequently, intervals between interactions do not become shorter, and clients likely harbor more parasites (see above). According to the marginal value theorem, shorter intervals would make shorter interactions more beneficial to cleaners, while higher parasite loads should prolong interactions. The net effect on the resulting interaction duration depends on the relative marginal changes in these two variables.
The effects of reduced cleaning supply on cheating rates are also nontrivial. On the one hand, increased demand for cleaning and increased parasite loads lower the marginal benefits of biting mucus. On the other hand, clients should be more tolerant of cheating if demand is high and if they harbor more parasites, proven in laboratory-based experiments involving ectoparasite manipulation (Bshary and Grutter 2002a). Increased tolerance promotes cheating, while reduced temptation lowers cheating (Johnstone and Bshary 2002;Gingins and Bshary 2014). As with interaction duration, the marginal changes in the benefit of cheating function and the cost of cheating function will determine whether cheating rates change and, if so, in which direction. Future studies should thus test whether variation in tolerance increase or decrease cheating rates at a higher demand as a function of client species. A working hypothesis would be that species with high maneuverability are "cautious" client species, which are more likely to respond to poor service with evasive action and therefore receive a service above market value (Roche et al. 2021).

General Discussion
Ecological and evolutionary theoretical models often aim at generalizing their predictions and apply them to a wider range of species. For example, there is a current urgent need to predict how climate change will affect between-species relationships with consequences on communities and ecosystems (Brooker et al. 2007;Cabral et al. 2016). However, more general models that explain enhanced positive feedback between species under environmental stress (stress gradient hypothesis based on Bertness and Callaway 1994) fail at specifying the mechanics of individual interactions that cause these positive feedbacks (Brooker et al. 2007). As soon as potential mechanics are specified, these models become more idiosyncratic by default. Alternatively, theoretical models are developed to capture the specificity of a given system, as exemplified by Wyatt et al. (2014) exploring very specifically the evolutionary dynamics of exchanges of carbon for nitrogen in plant-mycorrhizae mutualisms. Our model provides another example. Nevertheless, some general aspects (discussed below) can be inferred from our model and applied to other mutualisms, potentially improving predictions regarding the effect of climate change.
Most importantly, benefit functions in markets invariably follow the logic of diminishing returns. In essence, there is only so much a cleaner fish or an ant or a pollinator can eat. Similarly, the benefit functions for clients (reduction in parasite load), ant partner species (protection from herbivores/predators), and plants (pollen transfer, nitrogen transfer) are not linear. The same logic holds for intraspecific markets, such as baboon females grooming a mother to obtain access to her infant (Barrett et al. 1999). There is only so much grooming a mother needs before she would be better off using her time differently. Our model shows that the diminishing return function's precise shape is crucial to predict how shifts in supply-to-demand ratios affect exchange rates. The function shape can be influenced by external factors, such as parasite abundance in our study or nitrogen availability in soils in plant-microorganism mutualisms. Therefore, precise knowledge of the shape of benefit functions may explain, for example, why manipulations of available nitrogen and bacterial strains in leguminous plant-rhizobia mutualisms yield rather variable results. Although there is documentation that plants are choosier when plant demand for nitrogen is low (Kiers et al. 2010), there are cases in which plants in nitrogen-rich soil appear to be more tolerant toward inefficient partners (Bertness and Callaway 1994;Zhang et al. 2020), and there can be no apparent adjustment (Regus et al. 2014;Grillo et al. 2016). The evolutionary model of plant-mycorrhizae mutualisms (Wyatt et al. 2014) also provides a few seemingly counterintuitive results. For example, the model predicts that an increase in plant partners per mycorrhiza has a small positive effect on mycorrhizal cooperativeness. The authors identified two opposing effects: receiving more carbon from plants per unit phosphorus transferred increases cooperativeness, while receiving more plant carbon reduces the carbon's value and hence lowers the exchange rate. The authors did not vary their diminishing return functions, though. Our model results suggest that varying the shape of the diminishing return functions could eventually produce conditions in which mycorrhizae become less cooperative when the number of plant partners increase, the opposite prediction of their system. Previous biological market models make the key assumption that individuals play evolved strategies in a stable market in which supply and demand are in equilibrium. This assumption can rarely be met in nature. Instead, individuals can grow up in a patchy environment with locally differing market conditions or may experience significant shifts in supply and demand during their lifetime (Neuhauser and Fargione 2004;Sachs and Simms 2006;Werner et al. 2014;Noë 2016). Our model is not set up to find evolutionarily stable strategies. It can thus be interpreted as individuals reacting to changes in market conditions over time, as documented in cleaning mutualism (Triki et al. 2018). How well individuals of different species may adapt during their lifetime to variable conditions will depend on the underlying mechanisms (i.e., learning mechanisms and/or phenotypic plasticity). Incorporating these mechanisms into market models will be a challenging future task. For example, learning as a major adaptation of animals to cope with a variable environment may be constrained to varying degrees based on a species' perception limitations, memory capacities, and so on. Indeed, while learning models sometimes yield similar results to evolutionary models (Dridi and Lehmann 2014;Dridi and Akcay 2018), they may also help identify essential constraints on animals finding the optimal solution (Quiñones et al. 2020). Regarding phenotypic plasticity, experiments on plants and microorganisms have shown divergent responses to changes in market conditions-that is, sometimes there are market effects, and sometimes there are none or even the reverse effect (Regus et al. 2014;Grillo et al. 2016;Zhang et al. 2020). Again, the challenge will be to determine the constraints of the reaction norms. For example, the few studies documenting that some plants in nitrogenrich soil appear to be more tolerant toward inefficient partners (Bertness and Callaway 1994;Zhang et al. 2020) could suggest that these plants have evolved phenotypic plasticity mainly to cope with harsh conditions. Thus, a new generation of market models should incorporate mechanisms to explore ecological and evolutionary timescales to cover the ecological diversity of exchanges in intraspecific cooperation and interspecific mutualism.