Regular positive reinforcement training increases contact-seeking behaviour in horses



Introduction
Horses are commonly trained with negative reinforcement, where an aversive stimulus such as physical pressure is removed contingently on a response to increase the likelihood of future behaviour (McLean, 2005; Murphy and Arkins, 2007). For example, to teach a horse to stop, a rider will pull on the reins and then release when the horse responds by decelerating. Recently, training horses with positive reinforcement, where an appetitive stimulus is presented contingently on a behaviour to increase the future likelihood of that behaviour, has become increasingly popular. A growing body of evidence supports positive reinforcement as an efficient training method for horses (Innes and McBride, 2008; Sankey et al., 2010a, 2010b; Hendriksen et al., 2011; Briefer Freymond et al., 2014; Fox and Belding, 2015; Dai et al., 2019). For example, horses trained with positive reinforcement show fewer stress-related behaviours, fewer avoidance behaviours and more signs associated with positive emotions compared to those trained with negative reinforcement (Hendriksen et al., 2011; Briefer Freymond et al., 2014; Dai et al., 2019). Positive reinforcement training is also suggested to improve the horse-human relationship, as horses trained with positive reinforcement show increased contact-seeking behaviour towards an unfamiliar person (Sankey et al., 2010a, 2010b; Lundberg et al., 2020) and have a more positive attitude towards their trainers (Briefer Freymond et al., 2014) compared to horses trained with negative reinforcement.
There are different methods of using positive reinforcement in horse training, and horse trainers often mix positive and negative reinforcement. As this is a relatively new research field, there is still limited knowledge about the effects of combining positive and negative reinforcement in horse training. While previous studies as exemplified above show promising results with regards to the effects of positive reinforcement training, they are mostly based on controlled training in an experimental setting. There is a need to study the long-term effects of adding regular positive reinforcement training as a supplement to negative reinforcement training on horses' perception of humans, their emotional state, and their stress levels. In the present study we used the motionless human test (Sankey et al., 2010a, 2010b) to study the horse's behaviour and willingness to interact with an unfamiliar person.
In order to assess the horse's emotional state, we used the cognitive bias test (Harding et al., 2004; Mendl et al., 2009). This test is commonly used to compare emotional states in animals living in different conditions or to measure changes in emotional state over time in the same individual. In some versions of this test the animal is trained to distinguish a rewarded location from an unrewarded location (Lagisz et al., 2020). Once it reliably differentiates between the two locations, an ambiguous location in between the rewarded and unrewarded location is presented. The animal's response to this ambiguous location is suggested to be affected by its current emotional state: a positive emotional state will result in a quick approach to the ambiguous location, whereas a negative emotional state will result in a slow approach (Harding et al., 2004; Mendl et al., 2009). Indeed, previous research has found that animals with good welfare are in a more positive emotional state, with a more optimistic cognitive bias (Harding et al., 2004; Mendl et al., 2009).
Welfare and stress are closely related, and therefore assessing long-term stress levels can give an indication of the animal's welfare. A reliable method to assess stress levels is by measuring cortisol concentrations in, for example, saliva or hair (Russell et al., 2012; Roth et al., 2016; Heimbürge et al., 2019; Sauveroche et al., 2020). While saliva cortisol is a biomarker for acute stress responses (Russell et al., 2012; Vieira de Castro et al., 2020), hair cortisol concentration is a non-invasive method to assess cortisol release over an extended period, forming a retrospective calendar of long-term stress levels (Roth et al., 2016; Heimbürge et al., 2019; Sundman et al., 2019; Sauveroche et al., 2020). Hence, by analysing hair cortisol concentrations in the mane we can assess long-term stress in horses (Roth et al., 2016; Heimbürge et al., 2019; Sauveroche et al., 2020).
Even though training horses with positive reinforcement is becoming more popular, there is still limited research on potential long-term benefits in terms of stress, emotional state, and the horse-human relationship. The aim of this study was therefore to investigate the effects of adding a small but regular amount of positive reinforcement training to privately owned horses' normal negative reinforcement training. Our hypotheses were that owners who added positive reinforcement training would experience an improvement in their relationship to their horses, and that their horses would become more optimistic, show more contact-seeking behaviour towards an unfamiliar person, and decrease their long-term stress levels.

Subjects
All horse owners were informed about the study and gave their written consent to participate voluntarily. No special ethical permit was needed according to Swedish regulations.
We recruited 40 horses (19 mares and 21 geldings) of various breeds through social media and personal contacts, according to the following criteria: the horse was stabled within geographical proximity to Linköping University in the southeast of Sweden, the stable had a confined space such as a riding arena, the owner visited the horse at least five to six days a week, and the horse was not normally rewarded with treats during any kind of training.More specifically, only owners who initially reported that they used food rewards during training less frequently than once a month were included.The participating horses were allowed to follow their normal feeding routines throughout the study.
All horses were pseudo-randomly assigned to either a training group or a control group and the groups were balanced for sex, age, and type (horse or pony).Three horses from the training group and one horse from the control group did not complete the study.In total, 36 horses participated in the study, 17 horses (nine mares, eight geldings) in the training group (mean age 11.2 years ± 1.1 SE) and 19 horses (nine mares, ten geldings) in the control group (mean age 10.2 years ± 1.3 SE).

Study design
The owners of the horses in the training group received detailed written instructions for a nine-week positive reinforcement training program.It was developed specifically for this study by the experimenter, who is an experienced clicker trainer, to suit horse owners with no previous experience of using food rewards in training.No other tools or materials were provided.The program focused on progressive foundation behaviours, such as standing still, touching a target, following a target, and going to a target (see Supplementary section for the detailed training program).The owners were instructed to train according to the program for at least five minutes, four times a week, in addition to their normal negative reinforcement training.Owners in the control group did not receive the written training program until after the study, and were instructed to continue with their normal negative reinforcement training for the duration of the study.
The training program was designed to have a clear, logical progression of behaviours, with each new exercise building on the previous exercises. The aim of the first week was to reinforce a default behaviour of turning the head away from the owner, in order to prevent mugging for treats and other unwanted behaviours often associated with food rewards. It was also an opportunity for those owners who decided to use a verbal marker signal as a secondary reinforcer to teach their horses to associate it with a food reward, in preparation for the next exercises. During the second week, the horses learned to touch a target with their muzzle. The owners were instructed to ask for the default behaviour first, before presenting the target to their horse, in order to further reinforce this behaviour. During the third week, the owners were instructed to hold the target in different positions around their horse and reinforce when their horse touched the target. During the fourth week, the owners were instructed to repeat the exercises from weeks two and three in different environments, and during the fifth week they were asked to introduce different objects as targets to their horses. During the sixth week, the owners were instructed to train their horse to follow a target, and during the seventh week they were asked to have their horses follow a target over a simple obstacle course of poles and cones. During the eighth week the owners were instructed to teach their horses to go to a stationary target. A bonus exercise for the ninth week was to introduce a leg target, where the horse uses one of its front legs to touch a target in front of it.
The owners were instructed to follow the training program and make sure their horse was consistently trained using positive reinforcement up until the second test occasion.The experimenter contacted the owners regularly during the training period to answer questions, receive updates and encourage the owners to follow through with the program.
All horses were tested in their home environments, once at the start of the training period and a second time after the training period was completed.All horses were subjected to a motionless human test and a cognitive bias test.Furthermore, mane hair samples were taken from all horses at the start of the training period and again at the end of the training period.All owners filled out a relationship questionnaire at the start of the training period and again at the end of the training period.The questionnaire was the same on both occasions.
All horses were handled by their owners throughout the study.The training period started immediately after the first test occasion in September 2020 and ended on the second test occasion in December 2020, after eight or nine weeks.As the horses were tested on different days, the exact weeks of the training period were different for each horse.

Test arena
On each of the two testing occasions a portable arena (6 × 8 m) of plastic poles and non-electrified fence tape was built in a grass-free location where possible, such as gravel, dirt, or arena substrate, and two Canon Legria cameras were placed on each side facing the arena (Fig. 1). If no grass-free location was available, a location with as little grass as possible was chosen. The horses were allowed to familiarize themselves with the arena before testing on both occasions: each horse was released in the arena to explore on its own. Once it stopped exploring and stood calmly, the motionless human test began.

Motionless human test
At the start of the motionless human test, the owner was asked to enter the arena and lead the horse by the halter to position it by the side closest to the cameras (Fig. 1), facing the outside. The experimenter, who was unfamiliar to the horse, positioned herself on the opposite side of the arena, facing the horse but standing in a neutral position with a stopwatch in her hand.
The test started when the owner released the horse and exited the arena to stand behind one of the cameras, facing away from the arena.Two minutes were video recorded and later analysed using a predetermined ethogram (Table 1).During this time the experimenter (and the owner) were standing completely motionless and did not interact with the horse.After two minutes the experimenter asked the owner to return and lead the horse out of the arena.

Cognitive bias test
Once the motionless human test was completed the arena was set up for the cognitive bias test.The short side closest to the cameras was opened and plastic cones were placed at the entrance to mark the starting line (Fig. 1).The experimenter prepared a test bucket by stacking two shallow grey buckets on top of each other with a thin slice of carrot placed between them, to control for potential olfactory cues.
Every time the bucket was placed in a particular corner of the arena it would contain a few carrot slices.Every time it was placed in the opposite corner it would be empty.Each horse was pseudo-randomly assigned either the left or right corner opposite the entrance to the arena as their rewarded corner.The rewarded and unrewarded corners remained the same for each horse throughout the test, on both test occasions.

Initial training
The cognitive bias test was divided into three phases: initial training, a learning phase, and a test phase.The initial training phase consisted of eight trials with the bucket in the rewarded corner to teach the horse to approach the bucket.First, the owner led the horse into the arena and to the bucket, walking on the horse's left side, so the horse could eat the carrots.This was repeated five times, and between each trial the experimenter refilled the bucket with carrot slices while the owner and the horse faced away from the arena.
During the last three trials of the initial training phase the owner released the horse at the starting line and allowed the horse to approach the bucket alone.If a horse was hesitant to leave the owner's side the owner was instructed to take another few steps into the arena to encourage the horse to approach the bucket.The distance from the bucket was then gradually increased over the consecutive two trials until the horse would approach from the starting line.All horses completed the initial training phase of eight trials successfully.

Learning phase
The learning phase started immediately after the initial training phase.During the learning phase the experimenter alternated the location of the bucket between trials according to a predetermined pattern: sometimes it would be placed in the rewarded corner and sometimes in the unrewarded corner.The bucket was never placed in the same corner for more than two trials in a row.In the rewarded corner it always contained a few slices of carrot; in the unrewarded corner it was always empty.
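The constraint on bucket placement can be illustrated with a short sketch. The study used a predetermined pattern that is not reproduced here, so the generator below is only one hypothetical way of producing a sequence in which neither corner occurs more than two times in a row; the function names, the seed and the 'R'/'U' labels are assumptions made for illustration.

```python
import random


def bucket_sequence(n_trials, max_run=2, seed=None):
    """Return a list of 'R' (rewarded) and 'U' (unrewarded) placements
    in which no corner occurs more than max_run times in a row.

    Hypothetical helper: the study's actual pattern is not specified."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_trials):
        options = ["R", "U"]
        # If the last max_run placements were all the same corner, force a switch.
        if len(sequence) >= max_run and len(set(sequence[-max_run:])) == 1:
            options.remove(sequence[-1])
        sequence.append(rng.choice(options))
    return sequence


def longest_run(sequence):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = None
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


if __name__ == "__main__":
    pattern = bucket_sequence(20, seed=1)
    print("".join(pattern), "longest run:", longest_run(pattern))
```

Fixing the seed makes the pattern reproducible, which is the sense in which such a sequence could be prepared in advance and reused across horses.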
The owner released the horse at the starting line, as in the last three trials of the initial training phase.The experimenter placed herself to the right of the starting line and recorded the time from the moment one of the horse's front legs crossed the starting line to the moment the horse's muzzle disappeared into the bucket.The trial was stopped after a maximum duration of one minute or if the horse exited the arena, in which case the maximum time was assigned.
The learning phase was completed once the horse approached the bucket at least one second faster when it was placed in the rewarded corner compared to when it was placed in the unrewarded corner, over six consecutive trials (Mendl et al., 2010; Duranton and Horowitz, 2019).
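As a rough illustration of how such a criterion could be checked, the sketch below scans the trial record with a sliding window of six trials. The text does not spell out exactly how the one-second difference was evaluated, so the rule used here (every rewarded latency in the window is at least one second shorter than every unrewarded latency) is an assumed reading, and the function and variable names are hypothetical. Trials that were stopped are scored with the maximum time of 60 s, as described above.

```python
MAX_LATENCY_S = 60.0  # stopped trials are assigned the maximum time of one minute


def trials_to_criterion(trials, window=6, margin_s=1.0):
    """Return the number of trials completed when the criterion is first met.

    trials: list of (corner, latency_s) tuples, corner being 'R' or 'U'.
    Assumed rule: within `window` consecutive trials, the slowest rewarded
    approach is at least `margin_s` faster than the fastest unrewarded one.
    Returns None if the criterion is never met."""
    for end in range(window, len(trials) + 1):
        recent = trials[end - window:end]
        rewarded = [min(t, MAX_LATENCY_S) for corner, t in recent if corner == "R"]
        unrewarded = [min(t, MAX_LATENCY_S) for corner, t in recent if corner == "U"]
        if rewarded and unrewarded and max(rewarded) + margin_s <= min(unrewarded):
            return end
    return None


if __name__ == "__main__":
    record = [("R", 10), ("U", 10), ("R", 8), ("U", 12),
              ("R", 7), ("U", 15), ("R", 6), ("U", 20)]
    print(trials_to_criterion(record))
```

A mean-based comparison over the same window would be an equally plausible reading; only the comparison line would change.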

Test phase
The test phase followed immediately after the learning phase and consisted of just one trial.The experimenter placed the bucket in the ambiguous location, in the centre of the short side halfway between the rewarded and unrewarded corners (Fig. 1).The owner released the horse at the starting line and the experimenter recorded the time it took for the horse to approach the bucket.

Relationship questionnaire
All owners completed a short questionnaire about their relationship to their horse on both test occasions. The questions were the same both times and were scored using a seven-point Likert scale (Table 2). The scores for questions five and seven were reversed before the sum was calculated.
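The scoring rule can be sketched as follows. This assumes that the seven-point scale is coded from 1 to 7, so that a reversed answer becomes 8 minus the raw answer; the coding and the function name are assumptions, not details given in the text.

```python
REVERSED_ITEMS = {5, 7}          # questions whose scores are reversed
SCALE_MIN, SCALE_MAX = 1, 7      # assumed coding of the seven-point Likert scale
N_QUESTIONS = 7


def total_score(answers):
    """Sum the seven answers after reversing questions five and seven."""
    if len(answers) != N_QUESTIONS:
        raise ValueError("expected exactly seven answers")
    total = 0
    for number, answer in enumerate(answers, start=1):
        if not SCALE_MIN <= answer <= SCALE_MAX:
            raise ValueError(f"answer {answer} to question {number} is off the scale")
        if number in REVERSED_ITEMS:
            # Reverse the item: 1 becomes 7, 2 becomes 6, and so on.
            answer = SCALE_MIN + SCALE_MAX - answer
        total += answer
    return total


if __name__ == "__main__":
    print(total_score([6, 5, 7, 6, 2, 6, 1]))
```

Under this coding the total ranges from 7 to 49.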

Sample collection
Hair samples were collected from all horses on the two test occasions, in September and December 2020. The samples were cut from the mane at the poll, close to the follicles (Sauveroche et al., 2020), using scissors and put in plastic bags for later analysis. Only a small amount of hair is needed for the cortisol analysis, and it was not possible to find the exact location of the first samples on the second test occasion.

Sample analysis
The samples were analysed in January and February 2021 according to the following protocol (Karlén et al., 2011; Roth et al., 2016; Sauveroche et al., 2020): one centimetre of the proximal hair shafts was cut into 1-2 mm long pieces and weighed. The mane is reported to have a relatively constant growth rate and one centimetre reflects approximately three weeks (Dunnett and Lees, 2003). The cut hair samples were frozen in liquid nitrogen for two minutes, then pulverised with steel beads using a TissueLyser II at 23 hertz for another two minutes. One mL of methanol was added to each sample, then they were placed in a tube shaker overnight.
On the following day the samples were first centrifuged for one minute at 23 G and four degrees Celsius.Eight hundred microliters of supernatant were extracted from each sample and then evaporated in a Savant Speed-Vac Plus SC210 for one and a half hours.The remaining pellets were dissolved in 150 µL each of radioimmunoassay buffer with a pH of 7.4.Then, fifty µL were extracted from each sample and mixed with 100 µL of primary antibody, giving a 30-40 per cent rate of binding to the radioligand.
After an incubation period of 48 h, 100 µL of radioactive 125I-conjugated cortisol were added to each sample. They were then incubated for another 24 h, after which 75 µL of Anti-Rabbit IgG SAC-CEL were added to each sample. The binding reaction was stopped after half an hour by adding two mL of water to each sample. The samples were then centrifuged for 15 min at four degrees Celsius and 3000 revolutions per minute. Finally, the water was decanted from each sample, and they were prepared for final analysis.
The samples were analysed using a PerkinElmer 2470 Wizard gamma counter. All samples were analysed in duplicate. The values were converted from counts per minute and nanomoles per liter to picograms of cortisol per milligram of hair, then normalized with the recorded weights of the initial hair samples.
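The unit conversion in the last step can be sketched as below. The volumes (1 mL of methanol, 800 µL of supernatant, 150 µL of buffer) are taken from the protocol above, and the molar mass of cortisol (about 362.46 g/mol) is a standard value. Whether the original calculation corrected for the fraction of methanol extract that was carried forward is not stated, so that step, like the function name, is an assumption.

```python
CORTISOL_MOLAR_MASS_G_PER_MOL = 362.46  # C21H30O5
METHANOL_ADDED_ML = 1.0                 # methanol added to each hair sample
SUPERNATANT_TAKEN_ML = 0.8              # supernatant extracted for evaporation
BUFFER_VOLUME_ML = 0.150                # buffer used to dissolve the pellet


def hair_cortisol_pg_per_mg(conc_nmol_per_l, hair_mass_mg):
    """Convert an assay concentration (nmol/L) to pg cortisol per mg hair."""
    if hair_mass_mg <= 0:
        raise ValueError("hair mass must be positive")
    # nmol/L times g/mol gives ng/L, and 1 ng/L equals 1 pg/mL.
    pg_per_ml = conc_nmol_per_l * CORTISOL_MOLAR_MASS_G_PER_MOL
    # Amount of cortisol present in the dissolved pellet.
    pg_in_buffer = pg_per_ml * BUFFER_VOLUME_ML
    # Assumed correction: only 800 of the 1000 µL of extract was carried forward.
    pg_in_extract = pg_in_buffer * METHANOL_ADDED_ML / SUPERNATANT_TAKEN_ML
    # Normalise by the weight of the hair sample.
    return pg_in_extract / hair_mass_mg


if __name__ == "__main__":
    print(round(hair_cortisol_pg_per_mg(1.0, 10.0), 3))
```

The input concentration is assumed to have already been read off the assay's standard curve from the counts per minute.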

Data analysis
All statistical analyses were performed with the software IBM SPSS Statistics (version 28). We used non-parametric tests as the data were not normally distributed. We used Mann-Whitney U tests to compare the training group to the control group and Wilcoxon signed rank tests to compare differences within the groups. Means ± 1 SE are reported in the results, and the level of statistical significance was set at P ≤ 0.05.
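For readers unfamiliar with the between-group comparison, the sketch below shows how the Mann-Whitney U statistic is obtained from midranks of the pooled data. It is an illustration only: the analyses in this study were run in SPSS, the example values are invented, and the sketch computes the statistic without a P value.

```python
def midranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


if __name__ == "__main__":
    # Invented example values, not data from the study.
    training = [12.0, 0.0, 30.5, 4.0]
    control = [0.0, 2.0, 8.0, 1.5, 0.0]
    print(mann_whitney_u(training, control))
```

The two values always sum to the product of the group sizes, which gives a quick sanity check on any reported U.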

Motionless human test
We found that horses in the training group (N = 17) spent more time in physical contact with the experimenter after the training period compared to before (Fig. 2a; Z = −1.96, P = 0.050). There was no difference in proximity between test occasions (Z = −1.48, P = 0.14).
For the horses in the control group there was no difference in either physical contact (N = 18, one horse was excluded because it was very stressed during the test; Fig. 2a; Z = −1.58, P = 0.11) or proximity (Z = −1.57, P = 0.12). Nine of the 17 horses (53%) in the training group spent more time in physical contact after the training period compared to before, while two horses in the training group (12%) spent less time in physical contact after the training period. Seven of the 18 horses (33%) in the control group spent more time in physical contact after the training period compared to before, while three horses (17%) spent less time in physical contact after the training period.
Eleven horses in the training group did not engage in any physical contact with the experimenter on the first test occasion, before the training period.On the second test occasion, after the training period, five of them (45%) had physical contact with the experimenter.Nine horses in the control group did not engage in any physical contact with the experimenter on the first test occasion, and only one of them (11%) had physical contact with the experimenter on the second test occasion.
We found no difference between the training and control groups in the amount of time they spent in proximity to or in physical contact with the experimenter, either before (Fig. 2a; physical contact: U = 186.5, P = 0.27; proximity: U = 177.0, P = 0.44) or after (Fig. 2a; physical contact: U = 148.00, P = 0.88; proximity: U = 142.00, P = 0.73) the training period.

Cognitive bias test
Three horses (two from the training group and one from the control group) did not complete the cognitive bias test on both occasions and were therefore excluded from the analysis.For both groups, the learning phase was shorter on the second test occasion.The horses in the training group (N = 15) needed a mean of 18.87 ± 1.41 trials to reach the learning criterion on the first occasion, but only 14.27 ± 1.14 trials on the second occasion (Z = − 1.99, P = 0.046).Similarly, the horses in the control group (N = 18) needed a mean of 18.0 ± 1.36 trials on the first occasion but only 12.28 ± 1.14 trials on the second occasion (Z = − 2.55, P = 0.011).We found no difference between the groups in the number of trials the horses needed to learn the task either before (U=124.5,P = 0.71) or after (U=94.0,P = 0.15) the training period.
On the first test occasion, the horses in the training group showed a mean latency time of 6.69 ± 0.23 s to approach the rewarded corner during the last three trials.On the second test occasion, they spent a mean time of 6.63 ± 0.26 s approaching the rewarded corner during the last three trials.For the control group the corresponding mean approach times were 7.41 ± 0.82 s on the first test occasion and 6.72 ± 0.22 s on the second test occasion.We found no difference in the latency to approach the rewarded corner between the two groups on either occasion (Fig. 2b; before training period: U=146.00,P = 0.71; after training period U=140.00,P = 0.86).
Neither the horses in the training group (Z = − 0.14, P = 0.89) nor the horses in the control group (Z = − 0.76, P = 0.45) approached the ambiguous bucket faster on the second test occasion, after the training period (Fig. 2b).Also, we found no difference between the groups in time to approach the ambiguous bucket (Fig. 2b; before training period: U=140.50,P = 0.85; after training period: U=146.00,P = 0.71).

Relationship questionnaire
The owners in the training group (N = 16, one owner did not complete the questionnaire at the end of the training period and was excluded from the analysis) showed a tendency towards a higher total score in the questionnaire at the end of the training period (39.88 ± 0.83) compared to the one they filled out at the start of the training period (38.63 ± 0.77; Z = 71.0, P = 0.072).
The owners in the control group (N = 19) did not differ significantly in their questionnaire scores, with a mean score of 42.37 ± 0.99 at the start and 41.84 ± 1.11 at the end of the training period (Z = 63.0,P = 0.52).
We found that the owners in the control group scored higher than the owners in the training group on both occasions (U = 230.5, P = 0.008 at the start of the training period; U = 211.0, P = 0.052 at the end of the training period).

Hair cortisol concentration
Hair cortisol concentrations increased in both groups during the training period.The horses in the training group (N = 17) had mean hair cortisol concentrations of 8.96 ± 1.01 pg/mg hair in the samples collected in September 2020 and 19.66 ± 2.92 pg/mg hair in the samples collected in December 2020 (Z = − 2.68, P = 0.007).Similarly, the horses in the control group (N = 19) had mean hair cortisol concentrations of 7.04 ± 0.80 pg/mg hair in September 2020 and 16.49 ± 2.50 pg/mg hair in December 2020 (Z = − 2.94, P = 0.003).We found no difference in hair cortisol concentrations between the groups either before (U=110.00,P = 0.11) or after (U=137.00,P = 0.45) the training period.

Discussion
The aim of our study was to investigate effects of positive reinforcement training on horses' stress levels, emotional state, and the horse-human relationship.We found that a small but regular addition of positive reinforcement training increased horses' contact-seeking behaviour towards an unfamiliar person in a motionless human test.However, there was no effect on emotional state as assessed through a cognitive bias test or on long-term stress levels as assessed through hair cortisol concentrations.

Motionless human test
The horses in the training group spent more time in physical contact with the experimenter after the training period compared to before, and more horses in the training group showed an increase in physical contact with the experimenter after the training period compared with the control group.This could be an indication that the horses in the training group were more curious and less fearful of an unfamiliar person after being trained with positive reinforcement.This aligns with Lundberg et al. (2020) who found that horses trained with positive reinforcement showed more contact-seeking behaviour towards an unfamiliar person than those trained with negative reinforcement.
However, a potentially confounding factor is that the experimenter was the same person on both test occasions.The increase in physical contact could therefore be an indication that the experimenter was not viewed as an unfamiliar person on the second test occasion.It is also possible that the horses habituated to the test arena and were more comfortable focusing on the experimenter on the second test occasion.Still, as only the training group increased their physical contact with the experimenter, the addition of positive reinforcement training seems to have had an effect on their contact-seeking behaviour.
There were, however, large individual variations in our study which might in part be explained by the study design.To maximize the number of horses and minimize the familiarization time for each horse the tests were performed in the horses' respective home stables.This means that we could not control for environmental distractions that could have affected the horses' attention and performance during the test.
While we consider physical contact to be a sign of increased curiosity and decreased fearfulness, there may be a downside with large animals such as horses engaging in more physical contact such as biting or pushing.However, we did not observe any agonistic behaviour towards the experimenter in this study.If the positive reinforcement training is done properly, increased curiosity from the horse should therefore not be a problem.
Future studies might benefit from keeping the test environment as homogeneous as possible between horses and test occasions, or perform the test indoors, to avoid confounding distractions.Also, using a different experimenter when testing the same horses twice could be a way of minimizing any effects of familiarity on the horses' behaviour.

Cognitive bias test
Both groups were quicker to reach the learning criterion on the second test occasion, indicating that all horses remembered the task eight to nine weeks after learning it the first time.On average, the horses needed 18 trials to significantly discriminate between the rewarded and unrewarded corners on the first test occasion, and 13 trials on the second.Previous studies (Sankey et al., 2010a(Sankey et al., , 2010b;;Hendriksen et al., 2011;Dai et al., 2019) have found that horses trained with positive reinforcement learn a task faster and remember it better than horses trained with negative reinforcement.The cognitive bias test taught the horses to distinguish between locations using a food reward, and thus positive reinforcement.This could explain why the horses seemed to remember the task after several weeks, independently of belonging to the training or the control group, even though they only performed it on one occasion.
In previous studies, horses that experience an improvement in welfare become more optimistic (Löckener et al., 2016;McGuire et al., 2018).We therefore hypothesized that the addition of regular positive reinforcement training would improve the welfare of the horses in the training group, and consequently their performance in the cognitive bias test.Contrary to our hypothesis, however, the horses in the training group did not approach the ambiguous bucket faster on the second test occasion, indicating that the positive reinforcement training did not improve their emotional state.
Briefer Freymond et al. (2014) found similar results. In fact, they found that horses trained with negative reinforcement were more optimistic compared to horses trained with positive reinforcement, even though behavioural measurements suggested a more positive emotional state in the latter. As an explanation for this, Briefer Freymond et al. (2014) suggest that the horses trained with negative reinforcement were more food motivated than those trained with positive reinforcement, as the positive reinforcement horses had been given food rewards during the whole training period while the negative reinforcement horses had not. Hence, the novelty of food rewards might bias the results. Burman et al. (2011) found similar results in dogs. In their study, dogs that had not been given food rewards prior to a cognitive bias test were more optimistic than those that had been given food rewards. Taken together, these results indicate that motivational effects of previous training with food rewards could make the cognitive bias test less suited to investigating effects of training and more suited to assessing the emotional state of animals living in different environmental conditions.
An additional explanation for our results is that the training period was not long or intensive enough to have an effect on the horses' emotional state. The owners were instructed to follow the positive reinforcement training program for five minutes, four times a week, in addition to their normal negative reinforcement training. This could have been too little to alter the emotional state of the horses in a significant way. It is also possible that the skill of the owners in using positive reinforcement varied, creating a frustration effect in some of the horses, or that the combination of positive and negative reinforcement training affected the results. In a similar study on dogs, for example, individuals trained with a mix of positive reinforcement, negative reinforcement and punishment were less optimistic than those trained with only positive reinforcement (Vieira de Castro et al., 2020). Hence, further research is needed to confirm the effectiveness of the cognitive bias test for measuring the effect of different training methods and to compare different aversive methods such as negative reinforcement and positive punishment, which may have different impacts on the animal's emotional state.
Our results could also have been affected by the study design. According to Mendl et al. (2009), a low cost of approaching the ambiguous location can make the animals more inclined to explore it regardless of emotional state. Our test setup involved no cost for the horses to approach the ambiguous bucket besides the energy to walk from the starting line, a distance of less than eight meters. In comparison, Briefer Freymond et al. (2014) used a distance of 15 m. It is possible that a longer distance, or otherwise increased cost of approaching the buckets, could make horses less inclined to explore an ambiguous location and thus give a better assessment of their emotional state.
Finally, looking at the latencies to approach the ambiguous bucket on the first test occasion, it seems that the horses in our study may have been optimistic from the beginning, offering limited room for improvement.Recruiting privately owned horses tends to skew the selection towards enthusiastic and motivated owners who already work hard to fulfil their horses' needs.Hence, the absence of improvement in the cognitive bias test might simply be because the horses in our study were generally quite optimistic already, regardless of training method.

Relationship questionnaire
We formulated the seven questions to capture the owners' perception of their relationship and interaction with their horses. As the questionnaire has not been validated or used in any previous study, the scores need to be interpreted cautiously. Still, the owners of the horses in the training group might show a tendency to score higher after the training period compared to before. This could indicate that their perception of their relationships with their horses improved after regular positive reinforcement training. However, since the study could not be blinded to the participants, and the owners knew which group they belonged to, this might have influenced their self-assessment on the second occasion.
Both groups had overall high scores. We believe that this reflects the fact that owners who volunteer to participate in a study of this scope are enthusiastic and motivated horse owners who spend considerable time interacting with their horses.
The owners in the control group scored higher than the owners in the training group on both occasions. This indicates that they had an overall better perception of their relationship with their horses. As we divided the horses into the training and control groups pseudo-randomly before the results from the relationship surveys had been analysed, the groups were not balanced for relationship scores.

Hair cortisol concentration
Contrary to our hypothesis, the hair cortisol concentrations were similar in the two groups of horses after the training period. One possible explanation for this is that, as mentioned before, the length and intensity of the positive reinforcement training were not enough to affect measurable stress levels. The addition of five minutes of positive reinforcement training four times a week may not be sufficient to offset the effects of stabling and management routines on long-term stress levels. Sauveroche et al. (2020) and Mazzola et al. (2021) found, for example, that hair cortisol concentrations can differ significantly between stables. As most horses in our study lived in different locations, we were not able to control for the effect of stabling.
Furthermore, the second hair sample was obtained on the day of the second test occasion. As cortisol is deposited in the hair follicle under the skin's surface, the cortisol concentrations at the end of the training period would not show up in the hair shafts until weeks later (Dunnett and Lees, 2003; Heimbürge et al., 2019). Therefore, our samples show the hair cortisol concentrations during earlier stages of the training period rather than after the full training period.
We found that all horses had higher hair cortisol concentrations in the second sample, taken in December 2020, compared to the one taken in September 2020. This is most likely due to seasonal changes in hair cortisol concentrations: increases in hair cortisol during the winter months have been seen in horses, dogs, cows, and hares (Roth et al., 2016; Uetake et al., 2018; Lavergne et al., 2020; Banse et al., 2020; Mazzola et al., 2021). One way to account for seasonal changes would be to extend the study period to a full year. Future studies should also aim to control for environmental factors such as type of stabling, and account for hair growth rate when deciding when to take samples.

Conclusions
We found that adding a small but regular amount of positive reinforcement training to horses' normal negative reinforcement training made them more interested in interacting with an unfamiliar person. This supports previous studies which have found that horses trained with positive reinforcement have a more positive perception of humans.
We found no measurable effects on emotional state or long-term stress levels. It is possible that a longer or more intensive period of positive reinforcement training may yield different results. Future studies could explore differences between, for example, horses trained mainly with positive reinforcement and those trained mainly with negative reinforcement.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1. Schematic picture of the test arena during the cognitive bias test. During the motionless human test, the arena was empty and closed on all sides.

Table 1
Ethogram of contact-seeking behaviours during the motionless human test.

Table 2
Relationship questionnaire; questions and scores.
Fig. 2. Boxplots with medians for the training and control groups. A, time spent in proximity of (light blue) and in physical contact with (dark blue) experimenter during the motionless human test. B, the latencies to approach the rewarded (light green), ambiguous (yellow) and unrewarded (red) buckets, before and after the training period. Circles indicate outliers and crosses specify extreme outliers (three times the interquartile range from a quartile). *P ≤ 0.05, ***P < 0.001. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)