Speech and neuroimaging effects following HiCommunication: a randomized controlled group intervention trial in Parkinson’s disease

Abstract Speech, voice and communication changes are common in Parkinson's disease. HiCommunication is a novel group intervention for speech and communication in Parkinson’s disease based on principles driving neuroplasticity. In a randomized controlled trial, 95 participants with Parkinson’s disease were allocated to HiCommunication or an active control intervention. Acoustic analysis was performed pre-intervention, post-intervention and six months after the intervention. Intention-to-treat analyses with missing values imputed in linear multilevel models, and complementary per-protocol analyses, were performed. The proportion of participants with a clinically relevant increase in the primary outcome measure of voice sound level was calculated. Resting-state functional MRI was performed pre- and post-intervention. Spectral dynamic causal modelling and the parametric empirical Bayes method were applied to resting-state functional MRI data to describe effective connectivity changes in a speech-motor-related network of brain regions. From pre- to post-intervention, there were significant group-by-time interaction effects for voice sound level in text reading (unstandardized b = 2.3, P = 0.003), voice sound level in monologue (unstandardized b = 2.1, P = 0.009), the Acoustic Voice Quality Index (unstandardized b = −0.5, P = 0.016) and the Harmonics-to-Noise Ratio (unstandardized b = 1.3, P = 0.014). For 59% of the participants, the increase in voice sound level after HiCommunication was clinically relevant. There were no sustained effects at the six-month follow-up. In the effective connectivity analysis, there was a significant decrease in inhibitory self-connectivity in the left supplementary motor area and increased connectivity from the right supplementary motor area to the left paracentral gyrus after HiCommunication compared with after the active control intervention. 
In conclusion, the HiCommunication intervention showed promising effects on voice sound level and voice quality in people with Parkinson’s disease, motivating investigations of barriers and facilitators for implementation of the intervention in healthcare settings. Resting-state brain effective connectivity was altered following the intervention in areas implicated in speech motor control, possibly reflecting reorganization of brain networks.


Adherence and missing data
Acoustic pre- or post-intervention data were missing for some participants due to defective speech recordings. In the HiCommunication group, one participant was missing voice sound level (text reading, monologue and noise) pre-intervention, one was missing voice sound level in noise post-intervention and one was missing the Vowel Articulation Index (VAI) pre- and post-intervention. In the active control group, one participant was missing all acoustic data post-intervention, one was missing voice sound level (text reading, monologue and noise) pre-intervention, one was missing voice sound level in monologue post-intervention, one was missing F0 variability post-intervention and two were missing VAI pre- and post-intervention.
Six-month follow-up acoustic data were available for 20 participants (43%) in the HiCommunication group and 23 (48%) in the active control group. Thus, only per-protocol follow-up analyses were performed. Among these participants, some were missing one or more of the acoustic variables. In the HiCommunication group, one was missing voice sound level in monologue, F0 variability, VAI, the Acoustic Voice Quality Index (AVQI) and the Harmonics-to-Noise Ratio (HNR), one was missing voice sound level (text reading, monologue and noise) and one was missing VAI. In the active control group, two were missing voice sound level in monologue, F0 variability, VAI, AVQI and HNR, one was missing AVQI and HNR and one was missing VAI.
Of the 28 HiCommunication participants with pre- and post-intervention resting-state functional MRI (rsfMRI), one had missing pre-intervention data on voice sound level and was excluded. One further HiCommunication participant was excluded from the hierarchical model due to missing Levodopa-equivalent daily dosage (LEDD) data. Of the 32 active controls, two had missing LEDD data and were excluded.

Voice function
The averaged speech loudness within a predefined time segment (text reading, monologue, or text reading in noise) and speech breathing (the maximum phonation time of a sustained vowel) were chosen to represent the speech domain voice function.

Voice sound level in text reading, monologue, and noise
Analysis was performed using the software Sopran (version 1.0.22 © Tolvan Data). For analysis of text reading, the entire text without the title was used. For analysis of monologue, a 30-second interval from the mid portion of the monologue was used. In cases where the monologue was < 30 seconds, the entire monologue was used. For analysis of voice sound level in noise, the initial 215 syllables of the text were used. Participants read the text aloud while pink noise (70-72 decibels (dB)) was played in headphones (Sony MDR-ZX660AP). To reduce the impact of low-frequency background noise on the sound level, C-weighted decibels (dBC) were used to report the voice sound level for all measures.
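The mean level of a speech segment follows from its root-mean-square amplitude. The sketch below is illustrative only, not the Sopran procedure: it computes an unweighted level in dB relative to an arbitrary reference amplitude, whereas the reported dBC values additionally require a calibrated recording chain and a C-weighting filter.

```python
import numpy as np

def mean_sound_level_db(samples, ref=1.0):
    """Mean sound level (dB) of a speech segment re an arbitrary reference.

    Illustrative sketch: no calibration constant or C-weighting is applied.
    """
    rms = np.sqrt(np.mean(np.square(samples)))
    return 20.0 * np.log10(rms / ref)
```

For example, a full-scale sine wave has an RMS of 1/sqrt(2), i.e. a level of about −3.0 dB re 1.0.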

Maximum Phonation Time
Analysis was performed using the software Sopran (version 1.0.22 © Tolvan Data). The spectrogram was visually inspected to ensure that stable phonation was analysed. In cases where several repetitions of the sustained vowel were recorded, the best (i.e., longest) attempt was used for analysis.
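Maximum phonation time is, in essence, the duration of the longest stretch of sustained phonation in the recording. As a hypothetical sketch (the study used visual inspection of the spectrogram in Sopran, not automatic detection), the longest contiguous run of frames whose RMS exceeds a fraction of the peak RMS can serve as a rough estimate:

```python
import numpy as np

def max_phonation_time(signal, sr, frame=0.01, rel_thresh=0.1):
    """Longest contiguous phonated stretch (seconds), via frame-wise RMS.

    Illustrative sketch; frame and threshold values are assumptions.
    """
    n = int(frame * sr)
    env = np.array([np.sqrt(np.mean(signal[i:i + n] ** 2))
                    for i in range(0, len(signal) - n + 1, n)])
    voiced = env > rel_thresh * env.max()
    best = run = 0
    for v in voiced:
        run = run + 1 if v else 0
        best = max(best, run)
    return best * frame
```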

Voice quality
The Acoustic Voice Quality Index (AVQI) and the Harmonics-to-Noise Ratio (HNR) were chosen to represent the speech domain voice quality. The AVQI (version 01.03, Phonanium, 2021) is a composite measure that combines several acoustic parameters into a single score for the estimation of dysphonia1,2. The AVQI equation includes the smoothed cepstral peak prominence, HNR, local shimmer, local shimmer in decibels (dB), the general slope of the spectrum, and the tilt of the regression line through the spectrum. The parameters are weighted together through linear regression analysis and converted to a score on a linear scale from 0 to 10. The limit for what is considered a dysphonic voice according to the AVQI varies across languages. Since the AVQI has not yet been evaluated for Swedish, the limit validated for Dutch speakers was used (AVQI score = 2.95).
Scores below the limit value are considered to represent a non-dysphonic voice quality. The HNR, measured in decibels, is the proportion of harmonic sound to noise in the voice and quantifies the relative amount of additive noise: the lower the HNR, the more noise in the voice3.
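The principle behind the HNR can be illustrated with a simplified, hypothetical autocorrelation sketch (the study's values come from the AVQI/Praat analysis, which uses Boersma's more refined method): the peak of the normalized autocorrelation over candidate pitch periods estimates the harmonic fraction of the signal energy, and the harmonic-to-noise ratio in dB follows from it.

```python
import numpy as np

def hnr_db(signal, sr, f0_min=75, f0_max=500):
    """Harmonics-to-Noise Ratio (dB) via normalized autocorrelation.

    Simplified illustrative sketch; not the Praat/AVQI implementation.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    ac0 = float(np.dot(x, x))
    lo, hi = int(sr / f0_max), int(sr / f0_min)
    # normalized autocorrelation over candidate pitch periods
    r = np.array([np.dot(x[:-tau], x[tau:]) for tau in range(lo, hi)]) / ac0
    r_max = float(np.clip(r.max(), 1e-6, 1 - 1e-6))
    # r_max estimates the harmonic fraction of the signal energy
    return 10.0 * np.log10(r_max / (1.0 - r_max))
```

A pure tone yields a high HNR; white noise yields a low (negative) HNR.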

Acoustic Voice Quality Index and Harmonics-to-noise ratio
The analysis tool AVQI (version 01.03, Phonanium, 2021) was used for AVQI analysis. The middle three seconds were extracted from a sustained vowel [a:], with a margin of 0.10 seconds. In recordings that included repeated attempts at the sustained vowel, the last attempt was consistently used for analysis. In addition, extracts of a pre-chosen 45 syllables of the text reading were analysed. HNR analysis was performed within the AVQI analysis in the same manner.

Prosody
Pitch (fundamental frequency (F0)) variability was chosen to represent the speech domain prosody. Pitch variability reflects the natural changes in voice pitch.

Fundamental Frequency standard deviation
Analysis was performed using the software Praat (version 6.0.36)4. The entire text without the title was used for analysis. Because F0 detection with default settings in Praat may be error-prone, a Praat script ("Get_speakers_register.praat") using an algorithm for automatic estimation of the pitch floor and pitch ceiling was used5.
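Once an F0 contour has been extracted, its variability in semitones is straightforward to compute. A minimal sketch, assuming the voiced F0 samples are already available as a sequence of Hz values (the choice of semitone reference is an assumption; since the reference only shifts the scale, it does not affect the standard deviation):

```python
import numpy as np

def f0_sd_semitones(f0_hz):
    """Standard deviation of F0 in semitones over the voiced samples.

    Illustrative sketch: each F0 value is converted to semitones relative
    to the speaker's geometric-mean F0 before taking the SD.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    ref = np.exp(np.mean(np.log(f0)))  # geometric mean as reference
    semitones = 12.0 * np.log2(f0 / ref)
    return semitones.std(ddof=1)
```

For example, F0 values of 100 and 200 Hz lie one octave (12 semitones) apart, giving an SD of 12/sqrt(2) semitones.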

Articulation
Measures of articulatory diadochokinesis (DDK) as well as of vowel articulation were chosen to represent the speech domain articulation. DDK measures are designed to estimate the rate and regularity of consonant-vowel syllable repetitions. Alternating motor rates (AMR), measured in syllables per second, are considered to reflect the motor abilities of the speech articulators and to reveal their movement limitations6. Sequential motor rates (SMR; repetition of the sequence /pa-ta-ka/) are generally more challenging to perform because of the alternation between bilabial, alveolar, and velar places of articulation. The DDK SMR has been shown to be altered in participants with PD compared to healthy controls6. For Swedish healthy adults, the normative median value is 6.4 (SD 1.0) syllables/second for DDK AMR (/pa-pa-pa/) and 5.8 (SD 1.0) syllables/second for DDK SMR7. Measures of vowel space, including the Vowel Articulation Index (VAI), may capture a reduced articulatory range of motion in hypokinetic dysarthria8. The method makes it possible to obtain an overall picture of a person's articulation by measuring only the formant frequency values of the corner vowels /a/, /i/ and /u/. The VAI is a theoretically driven and empirically tested metric developed to represent vowel formant centralization, i.e., formants that normally have high center frequencies tend to have lower frequencies, and formants that normally have low center frequencies tend to have higher frequencies. The VAI has shown promise to reduce interspeaker variability more effectively while maintaining high sensitivity to vowel centralization compared with the more traditional metric vowel space area (VSA)8,9. The VAI is expressed as:
VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/)    (1)

Diadochokinesis sequential motion rate and alternating motion rate
Analysis was performed in Sopran (version 1.0.22 © Tolvan Data). Repetitions of the syllables /pa-pa-pa/ and /pa-ta-ka/, respectively, were visually inspected, and a 5-second interval from a stable portion of the syllable repetition was used for the analyses. In cases where 5 seconds of stable repetition were not available, the entire portion of stable repetition was used.
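A DDK rate is simply the number of syllables produced per second within the analysed interval. As a hypothetical sketch, not the Sopran procedure: frame-wise RMS is thresholded and each rising threshold crossing is counted as one syllable onset.

```python
import numpy as np

def ddk_rate(signal, sr, frame=0.01, rel_thresh=0.5):
    """Syllables per second from a DDK recording via an intensity envelope.

    Illustrative sketch; frame length and threshold are assumptions.
    """
    n = int(frame * sr)
    env = np.array([np.sqrt(np.mean(signal[i:i + n] ** 2))
                    for i in range(0, len(signal) - n + 1, n)])
    above = env > rel_thresh * env.max()
    # count rising threshold crossings (syllable onsets)
    onsets = int(np.sum(above[1:] & ~above[:-1]) + above[0])
    return onsets / (len(signal) / sr)
```

For a recording with 12 clearly separated syllable bursts over 2 seconds, this yields a rate of 6 syllables/second, in the range of the Swedish normative AMR value cited above.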

Vowel Articulation Index
Analysis was performed using the software Praat (version 6.0.36). The aim was to extract 10 repetitions of each corner vowel from the speech material (sentences and text reading).
However, since the speech material had not been adapted in advance to facilitate VAI analysis, there were sometimes fewer than 10 repetitions of each corner vowel. Consequently, to minimise drop-out, the lower limit was set to 6 repetitions of each corner vowel. To obtain the formant frequency values, 30 milliseconds in the middle of each vowel were analysed, and each formant was visually inspected to ensure that the 30 milliseconds were extracted from a stable part of the formant. Furthermore, outliers were re-examined to ensure that the wrong formant had not been measured. The VAI was then calculated from the mean formant frequency values extracted for each vowel using formula (1).
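The final step, computing the VAI from the per-vowel mean formant values, is a direct application of formula (1). A minimal sketch (the formant values in the test are illustrative, not from the study):

```python
def vowel_articulation_index(f1_a, f2_a, f1_i, f2_i, f1_u, f2_u):
    """Vowel Articulation Index from mean formant frequencies (Hz) of /a/, /i/, /u/.

    Formants that rise with centralization go in the denominator and those
    that fall go in the numerator, so the VAI decreases as vowels centralize.
    """
    return (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)
```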

Deviations from preregistration
We deviated from the protocol by using multiple imputation to handle missing data.
Regarding the dynamic causal modelling (DCM) analyses, we additionally included LEDD as a regressor of no interest. We had planned a paired t-test comparing pre- and post-intervention but decided to first perform an analysis of the group-by-time interaction effect using a hierarchical second-level model. We did not use the healthy control cohort for any rsfMRI comparisons. We did not include an analysis of whether baseline characteristics of speech and voice predict intervention response.

Methodological discussion
Acoustic analysis is a widespread tool in clinical practice and research for analysing speech disorders and is often suitable for detecting hypokinetic dysarthria even at an early stage of disease progression, when symptoms may be relatively mild10. However, there are limited guidelines on which acoustic measures to use to specifically target disorders of the relevant speech domains.
We used guidelines developed in a study by Rusz and colleagues as one of four criteria for selecting the acoustic measures used as outcomes of HiCommunication11. To ensure that the acoustic outcomes are valid in terms of representing the speech dimensions associated with hypokinetic dysarthria as well as capturing the treatment effects post-HiCommunication, further studies in the project will investigate whether the acoustic outcomes correlate with, for example, auditory-perceptual measures of speech and voice.

Supplementary Table 2. Results of ICC using single-rating, absolute-agreement, 2-way random-effects model
F0 variability: Fundamental frequency standard deviation in semitones (log transformed). AVQI: Acoustic Voice Quality Index. HNR: Harmonics-to-Noise Ratio. DDK-AMR and DDK-SMR: diadochokinetic alternating and sequential motion rates. VAI: Vowel Articulation Index. ICC: Intraclass correlation coefficient. CI: Confidence interval. LL: lower limit. UL: upper limit. df: degrees of freedom.

Predictors for multiple imputation

Supplementary Table 3. Correlation matrix of the demographic and outcome variables
The numbers in the first row correspond to the numbered outcomes in the first column. All values are Pearson's correlation coefficients. F0 variability: Fundamental frequency standard deviation in semitones (log transformed). AVQI: Acoustic Voice Quality Index. HNR: Harmonics-to-Noise Ratio. DDK-AMR and DDK-SMR: diadochokinetic alternating and sequential motion rates. VAI: Vowel Articulation Index. MDS-UPDRS: Movement Disorder Society-Unified Parkinson's Disease Rating Scale. PDQ-39: the 39-item Parkinson's Disease Questionnaire. MoCA: Montreal Cognitive Assessment; higher scores reflect a higher level of global cognitive function. LEDD: Levodopa Equivalent Daily Dose.