Social relationship-dependent neural response to speech in dogs

In humans, social relationship with the speaker affects the neural processing of speech, as exemplified by children's auditory and reward responses to their mother's utterances. Family dogs show human-analogue attachment behavior towards the owner, and neuroimaging has revealed auditory cortex and reward center sensitivity to verbal praise in dog brains. Combining behavioral and non-invasive fMRI data, we investigated the effect of dogs' social relationship with the speaker on speech processing. Dogs listened to praising and neutral speech from their owners and from a control person. We found a positive correlation between dogs' behaviorally measured attachment scores towards their owners and the neural activity increase for the owner's voice in the caudate nucleus, and an activity increase in the secondary auditory caudal ectosylvian gyrus and the caudate nucleus for the owner's praise. By identifying social relationship-dependent neural reward responses, our study reveals similarities in the neural mechanisms modulated by infant-mother and dog-owner attachment.


Introduction
In humans, neural processes supporting the perception of vocal social stimuli and their communicative content are affected by the social relationship with the vocalizer. The attachment relationship between mothers and their children provides a good model for investigating the underlying brain mechanisms of such modulatory factors (Imafuku et al., 2014; Liu et al., 2019; Purhonen et al., 2004). The social relationship with the vocalizer is reflected both in auditory regions and in regions associated with reward and motivational processes within the corticostriatal circuits (Balleine et al., 2007; Kalivas and Kalivas, 2016), as evidenced by neuroimaging studies of children listening to their mother's voice (Abrams et al., 2016). The involvement of corticostriatal reward- and motivation-related regions in preferentially processing important individuals is further supported by visual studies on different attachment relationships, including mother-infant (Strathearn et al., 2009) and romantic (Scheele et al., 2013) dyads. Furthermore, the neurally encoded reward value of communicative content in social stimuli is modulated by the characteristics of the attachment relationship between perceiver and emitter (Vrtička et al., 2008).
There is ample behavioral evidence that multiple species, including human infants, differentiate important individuals based on their voices. Companion dogs rely on their owners both as a 'secure base' during exploration in a novel environment (Palmer and Custance, 2008) and as a 'safe haven' in case of danger (Gácsi et al., 2013). Dogs' social relationship with their owners can thus be considered functionally analogous to the infant-mother relationship in humans (Topál et al., 1998; Topál and Gácsi, 2012).
Behavioral studies revealed that dogs can identify their owners based not only on visual and olfactory cues (e.g. Polgár et al., 2015) but also on vocal cues (Adachi et al., 2007; Gábor et al., 2019). Dog neuroimaging revealed the involvement of a secondary auditory region (caudal ectosylvian gyrus) in voice identity processing (Boros et al., 2020). Further, dog brains process vocal emotional valence cues from both humans and dogs in non-primary auditory regions.
Neuroimaging studies provide evidence that in dogs, key regions of the reward- and motivation-related corticostriatal circuits are sensitive to human stimuli of high relevance. Specifically, the caudate nucleus (CN) responds to human hand signals associated with food reward (Berns et al., 2012, 2013), to a familiar human's odor (Berns et al., 2015), and to verbal praise spoken by a trainer (Andics et al., 2016); and the amygdala (AM) responses to seeing the owner interact with a fake dog depend on dog temperament. Moreover, various dog brain regions, including the AM and the CN, respond preferentially to the owner's vs. other persons' faces (Karl et al., 2020). It is unclear, however, whether and how the characteristics of the social relationship between the dog and the speaker modulate neural responses in dogs.
Thus, in a combined behavioral-fMRI study on dogs, we aimed to reveal to what extent the social relationship with a person affects brain activity evoked by this person's praising speech. We hypothesized that in dogs, the owner's voice, compared to a control person's voice, elicits greater neural responses in regions involved in auditory, reward- and motivation-related processes, and that this sensitivity is modulated by both speech content and the characteristics of the dog's attachment relationship with the owner. We also predicted that in dogs, similarly to humans, social relationship-dependent processing of vocal stimuli entails secondary rather than primary auditory regions and is independent of the acoustic processing of speech content.

Participants
Sixteen family dogs of 7 breeds were involved in this study (6 golden retrievers, 5 border collies, 1 Chinese crested, 1 labradoodle, 1 Cairn terrier, 1 Hungarian vizsla, 1 German shepherd). Two of the originally involved dogs (a border collie and a German shepherd) could not complete all scans successfully and were excluded from all analyses. The final sample consisted of 14 dogs (mean ± SD age = 5.4 ± 3.37 years, range: 1-12; 6 males, 8 females).
All dogs were family dogs, living with their owners for 56.3 months on average (range: 7 to 105.6 months), and had various levels of training independent of the research (none, basic obedience, agility, or service dog training). Before the test, dogs had been trained to lie motionless for around 10 min in the fMRI scanner using social learning and positive reinforcement (for details see Andics et al., 2014). Rajecki et al. (1978) defined the measurable criteria of attachment as displaying 1) proximity- and contact-seeking behaviors towards the attachment figure during exploration (secure base effect) and when experiencing danger (safe haven effect), 2) separation distress in the absence of the attachment figure, and 3) specific behavioral changes upon reunion (which are different from what is shown towards a stranger). Based on these criteria, to assess dog-owner attachment, we used an adapted version of the Strange Situation Test (SST) (Kovács et al., 2018; Lenkei et al., 2021).

Table 1
Detailed protocol of the Strange Situation Test.
The location of the test was an unfamiliar room and the procedure consisted of 6 two-minute-long episodes, which presented different levels of stress for the subjects ('Strange Situation') depending on the presence/absence of the owner and a female stranger. Two chairs (one for the owner and one for the stranger) were placed in front of each other in the middle of the room. In addition, there were two tables near two adjacent walls with building blocks on them. A playing area was also designated in the corner where dog toys (e.g. balls) were placed. The owner and the stranger performed different behaviors during the 6 episodes. For the detailed description of the SST protocol and of the owner's and stranger's behavior during the test episodes, see Table 1 , Kovács et al. (2018) and Lenkei et al. (2021) .
The dogs' relevant behaviors (i.e. exploration, play, greeting the entering person, vocalization, physical contact, standing by the door, following the leaving person, etc.) were coded both in the presence of the human partners and during separation. The analysis resulted in three scores based on the subjects' behaviors during the test: Attachment (towards the owner), Anxiety (related to the unfamiliar place), and Acceptance (of interaction with a stranger). In the present study, we used only the Attachment score, which shows the dogs' specific preference for their owners (compared to the stranger), manifesting mainly in exploration/play in the presence of the owner; proximity- and contact-seeking behaviors (i.e. gazing at, being in physical contact or playing with the owner), especially after separation; following the leaving owner; greeting the owner during reunions; and showing signs of separation anxiety (i.e. vocalization, standing by the door) in the absence of the owner (even in the presence of the stranger). The range of the score was 0-11; its value was higher for dogs that relied more often on the owner as an attachment figure.

Table 2
Description of the scored dog behaviors used to assess attachment. For the description of the SST episodes, see Table 1. This scoring is adapted from Kovács et al. (2018) and Lenkei et al. (2021). D: dog, O: owner, S: stranger.
To check inter-rater reliability, an independent coder re-scored 50% of the videos. We calculated Cohen's Kappa values for each behavior item and averaged them for the attachment composite score. The mean value was 0.77, indicating substantial agreement.
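The reliability check above (a Cohen's kappa per behavior item, averaged into a composite value) can be reproduced with a short script. A minimal pure-Python sketch; the function names and example codings are hypothetical, not taken from the study:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters' categorical codings of the same items."""
    assert len(rater1) == len(rater2)
    n = len(rater1)
    # Observed agreement: proportion of items coded identically by both raters.
    p_obs = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Expected chance agreement, from each rater's marginal category frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_exp = sum(c1[k] * c2[k] for k in c1) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

def mean_kappa(item_codings):
    """Average the per-item kappas into one composite agreement value."""
    kappas = [cohens_kappa(r1, r2) for r1, r2 in item_codings]
    return sum(kappas) / len(kappas)
```

By the widely used Landis and Koch (1977) benchmarks, the reported mean of 0.77 falls in the "substantial agreement" band (0.61-0.80).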

fMRI stimuli
Stimuli were praising and neutral (meaningless for the dogs) speech spoken by the dogs' owner and by a control person (see Supplementary Materials for examples; Appendix, Stimuli Example S1-4). The control person for each dog was the owner of another dog. Control persons were familiar to the dogs, as they met frequently during the fMRI training sessions, but the dogs were never alone with them and had no interaction with them in real-life situations. Following our training method, several dogs were trained in parallel, and all persons present praised the dog that was lying in the scanner. Therefore, praising interactions in the fMRI setting with both the owner and the control person (other owners) were common contexts for each dog. Dogs were paired up, and both dogs in a pair listened to the very same set of stimuli: the owner of dog A played the role of the control person for dog B and vice versa. Thus, speech stimuli from one owner constituted the control person's speech stimuli for the other dog in the pair. This way we fully controlled for acoustic differences between owner and control person conditions across pairs. Each dog-pair had its own stimulus set (6 stimuli from each of the two speakers, 12 stimuli altogether). To obtain real-life stimuli, praising speech was recorded in live praising situations: all owners were asked to give a moderately hard task to their dogs and, upon successful completion, praise them as they usually do. To maximize naturalness, the words used to praise the dogs were not standard phrases, so they could differ somewhat across speakers. For the neutral speech, all owners were asked to read out recipe sentences naturally. For neutral stimuli, the same sentences were spoken by both speakers of each dog-pair, but different sentences were used across dog-pairs. All speakers were female, as each dog had a female owner. We selected 3 praising and 3 neutral stimuli from each speaker.
Stimulus length (praising stimuli: mean ± SD = 5.402 ± 0.273 s, range 4.800-5.790 s; neutral stimuli: mean ± SD = 5.391 ± 0.268 s, range 4.840-5.790 s) was counterbalanced across conditions (T(52) = -1.083, P = 0.284). Praising and neutral stimuli differed in their mean fundamental frequency (F0, praise: mean ± SD = 252.33 ± 26.7 Hz; neutral: mean ± SD = 178.67 ± 24.81 Hz; T(13) = 11.06, P < 0.001) and in F0 range (praise: mean ± SD = 380.63 ± 45.92 Hz; neutral: mean ± SD = 279.52 ± 74.12 Hz; T(13) = 4.21, P = 0.001) (Fig. 1A).
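The per-dog F0 comparisons above are paired-samples t-tests (df = n - 1 = 13 for the 14 dogs). A minimal sketch of the test statistic, assuming two matched lists of per-dog condition means (pure Python, no stats library; a hypothetical helper, not the study's actual code):

```python
import math

def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for two matched lists."""
    assert len(x) == len(y)
    d = [a - b for a, b in zip(x, y)]     # per-subject differences
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    t = mean_d / math.sqrt(var_d / n)     # t = mean difference / its standard error
    return t, n - 1
```

With 14 per-dog praise/neutral F0 means, this yields the T(13) statistics reported above; the P value would then come from the t distribution with 13 degrees of freedom.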

Fig. 1. Experimental design.
A Acoustic parameters of praising (P) and neutral speech (N) stimuli. F0: fundamental frequency. *: P < 0.001. Error bars, SEM. B Illustration of the fMRI experimental setting. During the whole scanning procedure, the owner and a control person sat in front of the dog, covering their mouths with their hands and avoiding direct eye contact with the dog. C Sparse scanning design. Stimuli (∼5.3 s on average) were presented within 6-second-long scanning gaps. Acq: volume acquisition. PO: praise by the owner, NO: neutral speech by the owner, PC: praise by a control person, NC: neutral speech by a control person, Sil: silence. N = 14.

fMRI experimental design
To provide a congruent situational context for the dogs, both the owner and the control person were present during scanning. They sat in front of the dog's head (on randomized sides), did not look directly into the dog's eyes, and covered their mouths with their hands during the measurement to avoid confusing the dogs while the dogs listened to their voices (Fig. 1B). We used a 2 × 2 factorial design with factors speech content (praising, neutral) and social relationship (owner, control person), resulting in 4 auditory conditions: (1) praising by the owner (PO), (2) neutral speech by the owner (NO), (3) praising by a control person (PC), and (4) neutral speech by a control person (NC). A silent condition (Sil) was also added. We applied three 6-minute-long runs with three semi-randomized (evenly distributed) condition orders. Specifically, every block of five stimuli contained one stimulus per condition in a random order. Stimuli from the same condition were never consecutive. Each of the three different stimuli per auditory condition was repeated three times per run, resulting in 9 trials per condition in each run. In total, each run consisted of 45 trials (including silent trials).
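The semi-randomized trial order described above (blocks of five containing each condition exactly once, with no condition repeated back-to-back across block boundaries, 9 blocks giving the 45 trials per run) can be sketched as a small rejection-sampling generator. This is an illustrative reconstruction, not the script used in the study:

```python
import random

CONDITIONS = ["PO", "NO", "PC", "NC", "Sil"]

def make_run_order(n_blocks=9, seed=None):
    """Semi-randomized order: each block of five holds every condition once,
    and the same condition never appears twice in a row (across block edges)."""
    rng = random.Random(seed)
    order = []
    for _ in range(n_blocks):
        block = CONDITIONS[:]
        rng.shuffle(block)
        # Re-shuffle if the block would start with the previous block's last condition.
        while order and block[0] == order[-1]:
            rng.shuffle(block)
        order.extend(block)
    return order
```

Because each block is a permutation of all five conditions, repeats can only occur at block boundaries, which the re-shuffle loop rules out.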

Scanning procedure
Stimulus presentation was controlled using Matlab (version 9.1) Psychophysics Toolbox 3 (Kleiner et al., 2007), synchronized with the volume acquisitions through TTL trigger pulses. Dogs wore MR-compatible headphones, through which they listened to the stimuli and which also protected them from loud scanner noises. For signal detection, we used a 3T Siemens MAGNETOM Prisma MRI with a single loop coil. For the functional measurements, a single-shot gradient-echo planar imaging (EPI) sequence acquired the volumes of the entire brain (31 coronal slices of 2.0 mm thickness in a R >> L sequence; TE: 30.0 ms; TR: 8000 ms, including 2000 ms volume acquisition and a 6000 ms silent gap; flip angle: 90°; matrix: 64 × 64). Sparse temporal sampling was used, during which each stimulus was played in a 6-second-long silent gap (onsets at 0.02 s) between 2-s-long volume acquisitions (Fig. 1C). This approach is preferable to continuous sampling for detecting sound-evoked activity both in dogs (Bach et al., 2013) and humans, especially when using speech stimuli (Adank, 2012; Blackman and Hall, 2011). A similar scanning procedure was successfully applied in our previous auditory dog fMRI studies (Andics et al., 2016; Boros et al., 2020). In the current study, we acquired 45 volumes per run. A T1-weighted anatomical template brain image was acquired on a 3T whole body scanner with a single loop coil (d = 11 cm) (Czeibert et al., 2019). An important consideration was to use the brain template that provided the most detailed label map (Czeibert et al., 2019), for reliable and exact identification of the reported anatomical locations. Although this template is generated from a single subject, it shows proper correspondence with other averaged brain templates (i.e. that of Nitzsche et al., 2019). The use of the same template brain (Czeibert et al., 2019) in this study and in many of our previous studies (e.g.
Boros et al., 2020; Bálint et al., 2020; Bunford et al., 2020; Szabó et al., 2020) allows for a direct comparison of the coordinates and brain regions. As a result of our several-month-long fMRI training, dogs were able to lie motionless in the scanner (for > 6 min) without restriction. The exclusion threshold for overall motion was 3 mm and 3° for each direction. Runs in which dogs exceeded this threshold were excluded. Data collection continued until all dogs had 3 successful runs. A run was considered successful if suprathreshold motion did not happen, or happened only in the last 10% of the run. In 6 cases, suprathreshold motion happened in the last volumes; in these cases, at most the last 5 volumes were cut, but the run was kept. The average number of dogs' attempts per run was 1.57. For the data included in the analyses, framewise displacement (FD) values (Parkes et al., 2018; Power et al., 2014) (mean ± SD FD = 0.241 ± 0.302 mm) were comparable to typical FD values of human adults (Siegel et al., 2014).
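Framewise displacement in the Power et al. (2014) sense sums the absolute frame-to-frame changes of the six realignment parameters, converting rotations to millimeters on a 50 mm sphere. A minimal sketch; the parameter layout (translations first, then rotations in radians, as SPM stores them) is an assumption:

```python
HEAD_RADIUS_MM = 50.0  # Power et al. (2014) convention for converting rotations to mm

def framewise_displacement(params):
    """FD per volume from realignment parameters.
    params: one (dx, dy, dz, pitch, roll, yaw) tuple per volume,
    translations in mm, rotations in radians."""
    fd = [0.0]  # FD is undefined for the first volume; set to 0 by convention
    for prev, cur in zip(params, params[1:]):
        trans = sum(abs(c - p) for c, p in zip(cur[:3], prev[:3]))
        rot = HEAD_RADIUS_MM * sum(abs(c - p) for c, p in zip(cur[3:], prev[3:]))
        fd.append(trans + rot)
    return fd
```

Averaging these per-volume values over the retained runs gives summary figures comparable to the mean FD of 0.241 mm reported above.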

fMRI analysis
MATLAB R2016b (http://www.mathworks.com/products/matlab/) and SPM12 (www.fil.ion.ucl.ac.uk/spm) were used to preprocess and analyze data. First, the functional images of the 3 runs per dog were realigned. Second, all realigned functional and mean functional files were reoriented manually to correct for the different orientation of dog and human heads in the scanner. Third, via the Amira 3D software platform, mean functional images were transformed into a template brain (a preselected, average-sized individual brain; for details, see Czeibert et al., 2019), which resulted in a normalized mean functional image for each dog. Fourth, we performed a second normalization step in SPM12, using the normalized mean functional file as a template and the original non-normalized mean functional file as the source image, and applied the resulting transformation matrix to all realigned functional files, thus transforming all images into a common space. Finally, the realigned, normalized functional files were convolved with an isotropic 3-D Gaussian kernel (FWHM = 4 mm) for spatial filtering. All images were centered on the rostral commissure. The horizontal (longitudinal) axis was defined as the line connecting the rostral and caudal commissures, with the zero coordinate aligned with the rostral commissure. The vertical axis was identified with the midsagittal plane.
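For reference, the 4 mm FWHM smoothing kernel corresponds to a Gaussian standard deviation of about 1.70 mm, via the standard conversion sigma = FWHM / (2 * sqrt(2 * ln 2)):

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's full width at half maximum to its standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
```

This is the conversion SPM-style smoothing routines apply internally; with 2 mm voxels, a 4 mm FWHM kernel thus has a sigma of slightly under one voxel.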
A general linear model and statistical parametric mapping were used for data analyses. The model was built with condition regressors for the conditions (PO, NO, PC, NC, Sil; modelled as 5.3-s-long blocks) and for the 3 runs. We added realignment parameters as regressors of no interest. A whole-brain inclusive mask was used in the individual-level analyses. We had two group-level models: one built with the 5 condition regressors only, and another in which we added individual attachment scores as covariates. This statistical analysis was based on many human fMRI studies investigating behavioral-brain response correlations to understand the neural mechanisms mediating social behaviors and relationships (e.g. Scheele et al., 2013; Spitzer et al., 2007; Vrtička et al., 2008). In each group-level model, we report both whole-brain and small-volume corrected results.
Beyond whole-brain tests, we also performed small-volume corrected analyses in key regions of the reward- and motivation-related corticostriatal circuits, namely the bilateral CN and AM. We specified liberal anatomical masks based on Czeibert et al. (2019). The single-subject fixed-effects analysis was followed by group-level random-effects tests. For both whole-brain and small-volume corrected analyses, an overall voxel threshold of P < 0.001 was applied, and only clusters surviving cluster-level FWE-correction for multiple comparisons at P < 0.05 with at least 3 voxels are reported.
To analyze possible hemispheric asymmetries in the peaks resulting from the small-volume corrected tests, we first mirrored the peaks of the L CN (PO > PC: -12, -2, 8) and L AM (PO > NO: -14, -6, -4) to the right hemisphere (R CN: 12, -2, 8; R AM: 14, -6, -4), then selected 4-mm-radius spheres around each of them, and calculated average individual beta values for the 4 conditions compared to silence (PO > Sil, NO > Sil, PC > Sil, NC > Sil). Finally, a repeated-measures ANOVA with three factors (hemisphere, speech content and social relationship) was applied to the individual beta values to specify hemisphere effects or their interactions with other factors.
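The peak-mirroring and sphere-selection steps above can be sketched as follows: mirroring negates the x (left-right) coordinate with respect to the midsagittal plane, and the 4-mm sphere collects the voxel centers around each peak over which betas are averaged. A 2 mm isotropic grid aligned to the peak is assumed here purely for illustration; the actual extraction was done on the SPM beta images:

```python
def mirror_peak(peak):
    """Mirror a peak coordinate across the midsagittal plane (negate x)."""
    x, y, z = peak
    return (-x, y, z)

def sphere_mask(center, radius_mm=4.0, voxel_mm=2.0):
    """Voxel centers (in mm, on a voxel_mm grid around center) within radius_mm."""
    cx, cy, cz = center
    steps = int(radius_mm // voxel_mm) + 1
    voxels = []
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            for k in range(-steps, steps + 1):
                v = (cx + i * voxel_mm, cy + j * voxel_mm, cz + k * voxel_mm)
                if sum((a - b) ** 2 for a, b in zip(v, center)) <= radius_mm ** 2:
                    voxels.append(v)
    return voxels
```

Per-condition beta values would then be averaged over the voxels returned by `sphere_mask` for each of the four original and mirrored peaks before entering the repeated-measures ANOVA.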
In a follow-up test, we investigated whether the attachment score covariate effects on neural responses also appear independently of basic acoustic parameters known to be central for processing vocal cues of emotion and identity in a range of species (F0, F0 range) (Kriengwatana et al., 2014; Latinus et al., 2013). First, for each dog, we calculated mean F0 and F0 range differences for the contrasts resulting in attachment-dependent neural responses. Then, we calculated partial correlations between attachment scores and neural responses (i.e. individual beta values for the covariate effect peaks), controlling for mean F0 and F0 range differences.
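The partial-correlation control can be sketched with the standard first-order formula, which removes the linear effect of a single covariate z (e.g. the per-dog mean F0 difference) from both the attachment score x and the neural response y (illustrative pure Python, not the study's code):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```

If the attachment-response association survives this adjustment, as reported in Table 6, it cannot be attributed to the covaried acoustic parameter alone.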

Ethics statement
The fMRI experiment was conducted at the Brain Imaging Centre of the Research Centre for Natural Sciences of the Eötvös Loránd Research Network, the template brain image was recorded at the MR Research Centre of the Semmelweis University, Budapest and the SST took place at Eötvös Loránd University in Budapest, Hungary.

Results
In an fMRI experiment, dogs listened to four auditory conditions varying in speech content and social relationship (praising by the owner: PO, neutral speech by the owner: NO, praising by a control person: PC, neutral speech by a control person: NC) and a silent condition (Sil) as a baseline. Besides whole-brain analyses, we anatomically defined dog brain regions associated with reward- and motivation-related processes within the corticostriatal circuits (CN, AM) and used them as regions of interest (ROIs). In a behavioral test, based on the behavioral criteria of attachment (Rajecki et al., 1978), we assessed the dog-owner relationship using an adapted version of the Strange Situation Test (SST), which resulted in an attachment score for each dog (Kovács et al., 2018). This score was used to test covariate effects on brain responses.
The whole-brain random-effects analysis (Fig. 2A, C) revealed robust bilateral auditory responses to speech (PO + NO + PC + NC > Sil) in subcortical (caudal colliculus: CC, medial geniculate body: MGB) and cortical regions (portions of the Sylvian and ectosylvian gyri: SG, ESG). We found a main effect of speech content, i.e. stronger response to praising than neutral speech overall (PO + PC > NO + NC), in the right thalamus (R THA), the R caudal (c) SG, and a cluster involving the L mid (m) ESG and L cSG. Specific contrasts revealed stronger responses to the owner's praising compared to neutral speech (PO > NO) in bilateral subcortical regions (including parts of the right mesencephalic tegmentum: MTg, MGB and THA; and L MTg/MGB), a bilateral cluster involving the mESG and the cSG, and in the L CN; and stronger responses to the control person's praising compared to neutral speech (PC > NC) in a cluster expanding to the R mESG and the R cSG. At the whole-brain level, we found no main effect of social relationship (PO + NO > PC + NC), but a speech content by social relationship interaction effect (PO - PC > NO - NC) was revealed in the L cESG and in a cluster involving the L CN and L THA. Specific contrasts revealed stronger responses to the owner's compared to the control person's praising speech (PO > PC) in the L cESG, but no social relationship-sensitivity for neutral speech. In addition, we found a preference for the owner's praising speech over all other speech conditions (PO > NO + PC + NC) in bilateral auditory regions (L and R cESG, left mid ectosylvian sulcus: L mESS) and in the L CN. We found no suprathreshold effects for the reverse contrasts in any of the above comparisons. Significant main effects, interactions and specific contrasts from the random-effects whole-brain analyses are summarized in Table 3.
The small-volume random-effects analyses (Fig. 2B, C; Table 4) with anatomically defined ROIs (L/R CN and L/R AM) revealed speech responsivity (PO + NO + PC + NC > Sil) in the L AM. We found a main effect of speech content (PO + PC > NO + NC) in the L AM. Specific contrasts for speech content effects revealed stronger responses to the owner's praising compared to neutral speech (PO > NO) in the L CN and L AM, but in no ROIs to the control person's praising compared to neutral speech. We found no main effect of social relationship (PO + NO > PC + NC) in any of the ROIs, but a speech content by social relationship interaction effect (PO - PC > NO - NC) was revealed in the L CN. Specific contrasts revealed stronger responses to the owner's compared to the control person's praising speech (PO > PC) in the L CN. We found no suprathreshold effects for the reverse contrasts in any of the above comparisons, or for any contrasts in the R CN or R AM. Lateralization tests in the bilateral ROIs (CN, AM) revealed a hemisphere by speech content effect (F(1,13) = 7.130, P = 0.019) in the AM, indicating a left bias for speech in general, and for praising speech in particular. We also found a hemisphere by speech content by social relationship effect (F(1,13) = 15.397, P = 0.002) in the CN, indicating a left bias for the owner's praising speech.

Table 3
Whole-brain random-effects tests of speech content and social relationship processing. (Columns: Contrast, Brain region, x, y, z, cluster size.)
The effects of behaviorally measured attachment characteristics on neural responses were tested by adding dogs' attachment score as a covariate to both whole-brain and ROI GLMs ( Fig. 2 D and Table 5 ). On the whole-brain level, we found that individual attachment scores positively correlated with the neural preference to the owner's voice for neutral speech stimuli (NO > NC) in a single cluster involving the L CN (but also extending to the dorsolateral reticular THA). A marginal effect for the same association was found for the owner's voice overall (PO + NO > PC + NC). ROI analyses revealed positive correlation between the attachment scores and neural preference for the owner's voice for neutral speech stimuli (NO > NC) in the bilateral CN, and a significant effect for the same association for the owner's voice overall (PO + NO > PC + NC) in the L CN. We found no further covariate effects in either positive or negative direction, for any contrasts, in either the whole-brain or the ROI analyses. Positive partial correlations between attachment scores and CN responses for each of these two contrasts (i.e. NO > NC; PO + NO > PC + NC) survived controlling for the contribution of either F0 or F0 range ( Table 6 ).

Discussion
In this combined behavioral and fMRI study on dogs, we presented evidence that, when listening to speech, dogs' social relationship with the speaker affects their neural responses in auditory regions and corticostriatal regions associated with reward and motivational processes. Regarding the auditory network, we revealed neural activity increase for the owner's voice (vs. a control person) in a secondary auditory region, and for praising speech (vs. neutral speech) in subcortical and multiple auditory cortical regions. Regarding the corticostriatal reward and motivational network, we demonstrated neural activity increase for praising speech in the caudate nucleus (CN) and the amygdala (AM), and for the owner's praise in the CN. Moreover, we revealed a positive correlation between dogs' attachment scores and CN-sensitivity to the owner's voice, which survived controlling for acoustic effects.

Table 4
Random-effects tests of speech content and social relationship processing within the predefined ROIs. (Columns: Contrast, Brain region, x, y, z, cluster size.) Threshold for reporting for all contrasts was P < 0.001 and small-volume cluster-corrected P < 0.05 for FWE. The single strongest peak is reported per cluster. All tests were performed within both of the predefined ROIs (L/R CN, L/R AM). L: left, R: right, CN: caudate nucleus, AM: amygdala, n.s.: no significant clusters in either ROI. PO: praise by the owner, NO: neutral speech by the owner, PC: praise by a control person, NC: neutral speech by a control person, Sil: silence. N = 14.

Table 5
Threshold for reporting the covariate effect for all contrasts was P < 0.001 and whole-brain or small-volume cluster-corrected P < 0.05 for FWE. The single strongest peak is reported per cluster. All tests were performed over the whole brain and within both of the predefined ROIs (L/R CN, L/R AM). Italics indicate a tendency. L: left, R: right, CN: caudate nucleus, PO: praise by the owner, NO: neutral speech by the owner, PC: praise by a control person, NC: neutral speech by a control person, Sil: silence. N = 14.

There are multiple potential underlying mechanisms controlling dogs' social relationship-dependent auditory cortex responses. On the one hand, the owner's speech is clearly more familiar to the dogs than that of the control person. On the other hand, recent behavioral evidence indicates a cross-species voice identity recognition capacity in dogs (Adachi et al., 2007; Gábor et al., 2019). The secondary auditory area (L cESG), which exhibited social relationship-dependent activity, is located in the anterior/caudal temporal cortex. Anterior temporal cortex involvement in differentiating voices based on either voice familiarity (Stevenage, 2018) or voice identity (Belin et al., 2004) has been reported in humans (Andics et al., 2010, 2013b; Belin and Zatorre, 2003), macaques (Petkov et al., 2009), and recently also in dogs (Boros et al., 2020).
While the present study was not designed to disentangle the contribution of voice familiarity and voice identity processing to the present findings, we argue that other explanations, namely low-level acoustic and novelty/expectancy accounts, are improbable. The former can be excluded because acoustics was balanced across dogs for social relationship contrasts (owners and control persons were the same individuals across the sample). In support of this, social relationship-dependent effects were found in the secondary auditory cortex, but not in lower-stage auditory regions (Pannese et al., 2015). A novelty/expectancy account for the social relationship-dependent effects can also be excluded here, because the control persons were also well known to the dogs and were present during each test session, so the control person's voice was presumably perceived neither as novel nor as unexpected. Furthermore, in contrast with the reduced neural response to the control person's speech reported here, novel/unexpected stimuli often elicit greater, not weaker, brain responses (e.g. dogs: Prichard et al., 2018; humans: Schomaker and Meeter, 2012). We therefore suggest that the social relationship-dependent neural effects reported here are modulated by previous experience with the speaker, i.e. by the degree and quality of the social relationship.
The neural effects for speech content processing in dogs were robust: activity increase for praising compared to neutral speech was found for both the owner's and the control person's voice, in large clusters of the bilateral auditory cortex, and also subcortically. Praise-sensitive peaks were located more dorsally in the subcortical brain and more caudally in the auditory cortex than the auditory peaks (from the all sounds vs. silence contrast). These speech content effects are in line with the differences between the praising and neutral conditions, both in acoustics (i.e. praising speech had a higher mean F0 and a greater F0 range) and in meaningfulness (praising speech contained more words and prosodic patterns that may have been meaningful for the dogs). Indeed, different primary auditory regions (e.g. mESG) in dogs are sensitive to acoustic features (i.e. F0) that are central in coding emotional valence (Andics et al., 2016). Attentional and relevancy accounts may also contribute both to the praising (over neutral) speech preference in auditory regions and to the preference for the owner's praise (vs. the control person's praise) in the cESG and in the CN.
According to behavioral studies, dogs attend more to dog-directed than to adult-directed speech (Jeannin et al., 2017) and tend to follow instructions more readily if these are addressed to them (Virányi et al., 2004). Neurally, attended stimuli elicit stronger responses (Sevostianov et al., 2002), also during voice processing (Andics et al., 2013a).
Activity in the a priori selected subcortical structures within the corticostriatal circuits, namely the CN and the AM, was modulated by speech content, but the two regions' response profiles differed. The CN differentiated praising and neutral speech more when they were spoken by the owner. The CN's reward sensitivity in dogs has also been shown by other studies, related to either food (Berns et al., 2012, 2013) or social rewards (Andics et al., 2016; Cook et al., 2014). The AM exhibited an overall speech content effect (greater activity to praising than to neutral stimuli). This is in line with lesion studies in other mammals showing the role of the AM in reward- and social behavior-related processes (Aggleton and Young, 2000). It seems, however, that while the AM is sensitive to emotional vocalizations in dogs (as in humans; Ethofer et al., 2009), this sensitivity is independent of the speaker's identity. In both the CN and the AM, these activity differences between conditions were found only in the left hemisphere, with the AM also exhibiting a significant side bias. The left bias in the AM is in line with human neuroimaging results (Baas et al., 2004). The modulatory effects of the social relationship with the speaker on processing praising speech in dogs are analogous to those in humans. Increased CN and secondary auditory responses to the attachment figure's (owner's) compared to the control person's praising speech are in line with neuroimaging data from human children suggesting increased auditory, reward- and motivation-related responses to the voice of the attachment figure (i.e. mother) (Balleine et al., 2007; Imafuku et al., 2014; Purhonen et al., 2004).
This study revealed a positive relationship between the attachment score and the CN's neural preference for the owner's (vs. the control person's) voice. We argue that this positive correlation reflects that hearing the owner's voice is more rewarding for dogs with higher attachment scores, rather than representing merely acoustic- or familiarity-related differences. We controlled for acoustics at two levels: across dogs for the social relationship contrasts (the owner of one dog served as the control person for another) and at the analysis level. As higher pitch is linked to emotional valence in both dogs (Pongrácz et al., 2005, 2006; Yin and McCowan, 2004) and humans (Busso et al., 2009; Liscombe et al., 2003), pitch variability could plausibly explain the above effect (either because dogs are more attached to a person with a higher/lower pitched voice, or because a higher pitched voice with larger F0 variation is more rewarding; Fernald, 1989). The attachment score-CN response correlation, however, survived controlling for fundamental frequency at the analysis level, showing that this association is not simply driven by acoustics. Neither can the association be explained by familiarity, as the familiarity difference between the owner and the control person is presumably similar for each dog, or at least does not differ systematically. Thus, it seems that the neural mechanisms mediating interspecific attachment to the owner are connected to the reward processing system in the dog brain, similarly to the neural mechanisms of intraspecific attachment in other mammals (Numan and Young, 2016), including humans (Abrams et al., 2016; Atzil et al., 2017). An interesting indirect implication of the present findings is that secondary auditory regions of the dog brain may play an important role in recognizing individual owners by voice alone.
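The logic of controlling for fundamental frequency at the analysis level corresponds to a residual-based partial correlation: regress both variables of interest on the covariate and correlate the residuals. The following minimal sketch illustrates that logic only; the variable names and synthetic data are hypothetical and do not reproduce the study's actual analysis or measurements.

```python
import numpy as np

def partial_corr(x, y, z):
    """Pearson correlation between x and y after removing the
    linear effect of covariate z from both (residual method)."""
    design = np.column_stack([np.ones_like(z), z])  # intercept + covariate
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]

# Synthetic illustration: attachment score, CN owner-voice preference,
# and a pitch covariate (all values hypothetical).
rng = np.random.default_rng(0)
f0 = rng.normal(size=30)
attachment = 0.4 * f0 + rng.normal(size=30)
cn_pref = 0.5 * attachment + 0.3 * f0 + rng.normal(size=30)

r = partial_corr(attachment, cn_pref, f0)
```

If `r` remains substantial after partialling out the pitch covariate, the attachment-CN association cannot be attributed to that acoustic variable alone, which is the inference made in the text above.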
In both human infants and dogs, one function of attachment is to keep the attached individual in the proximity of the attachment figure, which can be facilitated by making social interaction with that figure a rewarding experience. For companion dogs, not only interaction with, but even the mere presence of, their owner is rewarding (Feuerbacher and Wynne, 2016). Owing to its relevant meaning and intonation, the owner's praise could be rewarding for all dogs, while for dogs with higher attachment scores even the owner's neutral speech could be rewarding. This may explain why the positive correlation between attachment measures and owner-voice sensitivity in the CN was more pronounced in the neutral speech condition. This interpretation is supported by a recent study showing that hearing the owner's voice attenuates the rise in dogs' cortisol levels during separation (Shin and Shin, 2016).
Overall, our findings identify dog auditory brain regions sensitive to speech content and to the social relationship with the speaker (owner/non-owner), and show that, for dogs, the reward value of speech is influenced by the attachment to the owner. The attachment-dependence of dogs' neural reward responses suggests similarities in the brain mechanisms modulated by intraspecific infant-mother and interspecific dog-owner attachment.

Declaration of Competing Interest
The authors declare no competing interests.

Acknowledgments
We are grateful to the owners and their dogs for their voluntary participation in this research. We thank Borbála Ferenczy for the illustration in Fig. 1.

Data and code availability statement
The data that support the findings of this study are openly available in Google Drive at https://drive.google.com/drive/folders/1tJ5CzSAmJixB6kEhZAoj5tsMwmZ2Jr2m?usp=sharing, or requests for materials may be addressed to A.G. (annagabor33@gmail.com).

Author contributions
AG: study design, stimulus collection and editing, methodology, data collection, data analyses, interpretation of data, manuscript writing. AA: study design, methodology, data analyses, interpretation of data, critical revision, manuscript writing. AG and AA contributed equally to this study. KC: definition of anatomic masks, specification of brain regions. CC: data collection, data analyses. AM: interpretation of data, critical revision. MG: study design, data collection, data analyses, interpretation of data, critical revision.

Supplementary materials
Supplementary materials associated with this article can be found in the online version at doi:10.1016/j.neuroimage.2021.118480. Video abstract of the study: https://youtu.be/7ba01ggFUXg.