Deep TMS H1 Coil treatment for depression: Results from a large post marketing data analysis

This Phase IV study evaluated Deep TMS for major depression in community settings. Data were aggregated from 1753 patients at 21 sites, who received Deep TMS (high frequency or iTBS) using the H1 coil. Outcome measures varied across subjects and included clinician-based scales (HDRS-21) and self-assessment questionnaires (PHQ-9, BDI-II). 1351 patients were included in the analysis; 202 received iTBS. For participants with data from at least 1 scale, 30 sessions of Deep TMS led to an 81.6% response and a 65.3% remission rate, and 20 sessions led to a 73.6% response and a 58.1% remission rate. iTBS led to 72.4% response and 69.2% remission. Remission rates were highest when assessed with HDRS (72%). In 84% of responders and 80% of remitters, response and remission were sustained in the subsequent assessment. The median number of sessions (days) to onset of sustained response was 16 (21 days), and to sustained remission 17 (23 days). Higher stimulation intensity was associated with superior clinical outcomes. This study shows that, beyond its proven efficacy in RCTs, Deep TMS with the H1 coil is effective for treating depression under naturalistic conditions, and that the onset of improvement is usually within 20 sessions. However, initial non-responders and non-remitters benefit from extended treatment.


Introduction
Major depressive disorder (MDD) is a highly prevalent and burdensome disease (Kesler et al., 2003; Lopez et al., 2006; Zhdavana et al., 2021). As traditional treatments are limited due to availability, costs, insufficient efficacy or limited tolerability, the development of new interventions is crucial. Repetitive transcranial magnetic stimulation (TMS) constitutes such an approach to treat MDD (Padberg and George, 2009). Deep TMS™ utilizes specially designed H-Coils to induce neuronal depolarization in deep and wide cortical regions. The H1 Coil is designed to bilaterally modulate larger cortical regions and their neuronal networks in the prefrontal cortex with a higher intensity on the left side. In contrast, traditional repetitive TMS (rTMS) protocols for MDD specifically target left DLPFC regions and are hypothesized to exert their effects via pathways between the left DLPFC and the sgACC (Siddiqi et al., 2021). However, Deep TMS allows a less focal cortical stimulation also at greater depth including projections to subcortical regions associated with the reward system, without significantly increasing the electric field induced in superficial cortical layers (Roth et al., 2002; Zangen et al., 2005; Roth et al., 2007; Zibman et al., 2021).
The safety and efficacy of monotherapy with the H1 Coil Deep TMS for MDD was shown in an international multicenter, randomized, controlled trial (RCT) with 212 subjects resulting in FDA clearance (Levkovitz et al., 2015), as well as in other RCTs (Filipcic et al., 2019; Matsuda et al., 2020). Deep TMS was also shown to be efficacious for depression and comorbid anxiety symptoms (Pell et al., 2022), which was demonstrated in several pilot studies (Levkovitz et al., 2009; Isserles et al., 2011; Harel et al., 2014; Berlim et al., 2014; Rapinesi et al., 2015). However, the majority of RCTs do not reflect the situation of mental health care in real world settings and include only relatively small samples. Thus, the possibility that outcomes of RCTs do not accurately reflect outcomes in community practice warrants collection and analysis of post-marketing data. Additionally, post-marketing data analysis may allow inferences about effects of protocol changes, variation in technical parameters, dose-response relationships or duration of effects beyond the follow-up periods of RCTs.
The present naturalistic study reports the largest data set for Deep TMS and may allow new insights into Deep TMS practice and its antidepressant effects. Several objectives were pursued in our analysis: (1) characterizing treatment response in a large naturalistic setting, (2) investigating the time course of clinical effects, (3) investigating long-term durability, and (4) identifying predictors of response over time.

Methods
The post-marketing study was designed to collect treatment information, demographic data, and outcome data on subjects treated with the Deep TMS H1 Coil for MDD. All Deep TMS clinics were asked to participate and were sent instructions along with a template Excel database to complete (template available in supplementary material). Depression severity was assessed by the 21-item Hamilton Depression Rating Scale (HDRS-21, Cusin et al. 2010), the Montgomery-Asberg Depression Rating Scale (MADRS, Cusin et al. 2010), the Patient Health Questionnaire-9 (PHQ-9, Kroenke et al. 2001), the Beck Depression Inventory-II (BDI-II, Wang and Gorenstein 2013), the Quick Inventory of Depressive Symptomatology (QIDS, Reilly et al. 2015), and/or the 30-item Inventory of Depressive Symptomatology (IDS-30, Rush et al. 1996). To incentivize participation and support the work of data entry, clinics received $5 per line of Excel data and $70 per HDRS/MADRS assessment. A line of Excel data corresponded to one treatment session with detailed treatment information.
All sites received device training and certification. The protocol was reviewed by Sterling IRB and granted exemption from informed consent provided patients were assigned only a patient code (not name/initials) and age (year, not date of birth). The study was registered at clinicaltrials.gov (NCT04679753).
Only 41 patients received fewer than 20 sessions and reached response or remission. Their inclusion had a minor impact on the results.
The main analysis set was the whole set of post-marketing data. An additional dataset comprised data of up to 12 weeks from first treatment. This timeline was consistent with the insurance approval of thirty-six treatment sessions in the US and the FDA-cleared treatment schedule of five days per week for four weeks followed by biweekly continuation treatments.

Interventions
Patients were treated with one of two Deep TMS protocols: either high frequency (HF) Deep TMS or intermittent theta burst (iTBS, Huang et al. 2005). Deep TMS was administered using the BrainsWay H1 Coil with a Magstim Rapid 2 (Magstim Company, Spring Gardens, UK) stimulator or with the BrainsWay stimulator (BrainsWay, Jerusalem, Israel), and typically administered according to the FDA-cleared protocol: 18 Hz, 120% intensity related to the resting motor threshold (rMT), 55 trains of 2 s duration, inter-train interval (ITI) 20 s, 1980 pulses per session.
The iTBS protocol typically consisted of bursts of 3 pulses at 50 Hz, a burst frequency of 5 Hz, 2 s on and 8 s off, 1800 pulses per session at 80 or 90% of the hand rMT according to the classical protocol by Huang et al. (2005), but at a threefold higher pulse number (Cheng et al., 2016; Li et al., 2020).
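The pulse counts quoted for both protocols follow directly from the stated parameters, and a short calculation makes the arithmetic explicit. The sketch below is illustrative only: the function names are hypothetical, and the session-length estimate assumes that the ITI (or "off" period) is the pause between the end of one train and the start of the next, with no pause after the final train.

```python
def hf_pulses(freq_hz: float, train_s: float, n_trains: int) -> int:
    """Pulses per session for a fixed-frequency train protocol."""
    return round(freq_hz * train_s * n_trains)


def itbs_trains(total_pulses: int, pulses_per_burst: int = 3,
                burst_rate_hz: float = 5.0, on_s: float = 2.0) -> int:
    """Number of 2 s iTBS trains needed to deliver total_pulses."""
    pulses_per_train = round(pulses_per_burst * burst_rate_hz * on_s)  # 3 * 5 * 2 = 30
    return total_pulses // pulses_per_train


def session_seconds(n_trains: int, on_s: float, off_s: float) -> float:
    """Stimulation time, assuming no pause after the last train (assumption)."""
    return n_trains * on_s + (n_trains - 1) * off_s


print(hf_pulses(18, 2, 55))        # 1980 pulses (HF protocol)
print(itbs_trains(1800))           # 60 trains (iTBS protocol)
print(session_seconds(55, 2, 20))  # 1190 s, roughly 20 minutes
print(session_seconds(60, 2, 8))   # 592 s, roughly 10 minutes
```

Under these assumptions the iTBS session takes about half the stimulation time of the HF session.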

Assessments and definition of core measures
Treatment monitoring data were available for most patients, although measured at varying time points. Analyses included remission and response rates in the whole dataset and within 12 weeks from treatment onset. Further analyses included response/remission rates for patients who had at least 30 Deep TMS sessions, sustained remission and sustained response (defined as ≥2 consecutive assessments meeting response/remission criteria) as well as number of sessions/days required to reach sustained response/remission. Numbers of patients assessed with MADRS, QIDS or IDS-30 were small, hence discrete outcomes were determined for the HDRS-21, PHQ-9, and BDI-II scales. Additionally, we recorded remission/response rates as indicated by any of the scales, as well as the median time until remission/response on any scale. Response/remission rates were analyzed for the following sub-groups: (i) patients treated with the iTBS protocol, (ii) patients treated with the FDA-cleared HF protocol, (iii) elderly patients (aged >68 years), and (iv) young patients (aged 18-22 years). A continuous severity measure was derived using standardized symptom scores (z-scores, computed using the baseline standard deviation) from the most commonly used scale for each patient, to achieve a reliable and change-sensitive representation of the individual patient's symptom trajectory. First, each available score on a depression scale was standardized by subtracting the mean of the scale's baseline values and dividing by the scale's baseline standard deviation. Zero thus represents average severity at baseline. For each patient, the most frequently rated scale was then determined and used in its standardized version.
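The standardization step can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout (`patient`, `scale`, `day`, `score`) is hypothetical, "baseline" is assumed to be each patient's earliest assessment on a scale, and ties for the most frequently rated scale are broken by first occurrence.

```python
from collections import Counter
from statistics import mean, stdev


def baseline_stats(records):
    """Mean and SD of each scale's baseline (earliest) scores across patients."""
    first = {}  # (patient, scale) -> (day, score) of the earliest assessment
    for r in records:
        key = (r["patient"], r["scale"])
        if key not in first or r["day"] < first[key][0]:
            first[key] = (r["day"], r["score"])
    by_scale = {}
    for (_, scale), (_, score) in first.items():
        by_scale.setdefault(scale, []).append(score)
    # An SD needs at least two baseline values; other scales are skipped.
    return {s: (mean(v), stdev(v)) for s, v in by_scale.items() if len(v) > 1}


def standardized_trajectories(records):
    """Per patient: z-scored trajectory on that patient's most-rated scale."""
    stats = baseline_stats(records)
    counts = {}
    for r in records:
        counts.setdefault(r["patient"], Counter())[r["scale"]] += 1
    out = {}
    for r in records:
        top_scale = counts[r["patient"]].most_common(1)[0][0]
        if r["scale"] != top_scale or top_scale not in stats:
            continue
        m, sd = stats[top_scale]
        out.setdefault(r["patient"], []).append((r["day"], (r["score"] - m) / sd))
    return {p: sorted(v) for p, v in out.items()}
```

A z-score of 0 then corresponds to the average baseline severity on that scale, and negative values indicate severity below the baseline average.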
The relations between the original and standardized scores of each scale are presented in Fig. S1 in the Supplement. Criteria of remission and response were defined for the various scales according to established benchmarks (Table S1).
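The definition of sustained response (at least 2 consecutive assessments meeting the criterion) can be illustrated with a short sketch. The response threshold used here, a reduction of at least 50% from baseline, is a common convention and is only an assumption for this example; the study's actual criteria are those listed in Table S1. Function names are hypothetical.

```python
def is_response(baseline: float, score: float, min_reduction: float = 0.5) -> bool:
    """True if the score dropped by at least min_reduction relative to baseline.

    The 50% default is an illustrative assumption, not taken from Table S1.
    """
    if baseline <= 0:
        return False
    return (baseline - score) / baseline >= min_reduction


def first_sustained(flags, run_length: int = 2):
    """Index of the first assessment that starts run_length consecutive hits."""
    run = 0
    for i, hit in enumerate(flags):
        run = run + 1 if hit else 0
        if run == run_length:
            return i - run_length + 1
    return None


# Hypothetical patient: baseline score 24, then follow-up assessments.
baseline = 24
follow_up_scores = [18, 11, 14, 10, 9]
sessions_at_assessment = [5, 10, 15, 20, 25]

flags = [is_response(baseline, s) for s in follow_up_scores]
onset = first_sustained(flags)
if onset is not None:
    print("sustained response from session", sessions_at_assessment[onset])
```

In this example the isolated response at the second follow-up does not count, because the next assessment falls back below the threshold; onset is assigned to the first assessment of the later consecutive pair.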
Durability of response over 60, 90, and 180 days after achieving response was analyzed based on the scale which was most commonly used in an individual patient. Rates were calculated as the number of patients showing response on all available measurements over the defined period following response, divided by the number of patients with at least one measurement at the end of the defined period following response.
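The durability rate described above can be sketched as a simple ratio. The sketch rests on one reading of the text, which is an assumption: a patient enters the denominator if at least one assessment falls at or after the end of the period, and enters the numerator only if every assessment up to and including that first end-of-period assessment meets the response criterion. The data layout is hypothetical.

```python
def durable_response_rate(patients, period_days):
    """Share of eligible responders who stayed responders through the period.

    patients: list of follow-up lists, each a list of
              (days_since_response, is_responder) tuples.
    """
    eligible = durable = 0
    for follow_ups in patients:
        ordered = sorted(follow_ups)
        # First assessment that reaches the end of the period, if any.
        end = next((d for d, _ in ordered if d >= period_days), None)
        if end is None:
            continue  # not assessable for this period
        eligible += 1
        if all(ok for d, ok in ordered if d <= end):
            durable += 1
    return durable / eligible if eligible else None
```

Patients without any assessment reaching the end of the period are excluded rather than counted as failures, which is why the denominators shrink for longer periods.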

Statistical analysis
All statistical analyses were performed in R (R Development Core Team, 2011). To test the overall effectiveness of Deep TMS treatment, change in depression scores over the course of all available treatments (whole dataset) and over the course of 12 weeks was analyzed using 3-level hierarchical linear mixed models (LMM, Bates et al. 2007). Measurements were considered as clustered within patients, and patients were clustered within treatment sites. By using multilevel modeling, missing data and unbalanced data structures can be handled. A continuous time factor (weeks since start of treatment) was included as a fixed effect, which was log-transformed if this improved model fit.
To account for between-patient differences in baseline scores and change rates, random effects were added for the model intercept and slope. Effect size for change until endpoint was calculated using where, t is time, as recommended by Feingold et al. (2009). Days until (sustained) response and (sustained) remission were computed (1) for patients who showed the respective outcome within 12 weeks of treatment and (2) per Kaplan-Meier estimates using failure (i.e. events increasing over time) taking censored patients into account.
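The Kaplan-Meier "failure" estimate mentioned above (cumulative probability of having reached the outcome, with censored patients leaving the risk set) can be sketched in a few lines. This is a generic textbook estimator written with the standard library only, not the authors' R code; the example data are hypothetical.

```python
from itertools import groupby


def km_cumulative_incidence(times, events):
    """Kaplan-Meier failure curve: list of (time, 1 - S(time)) at event times.

    times:  days until the outcome or until censoring
    events: 1 if the outcome (e.g. response) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    for t, group in groupby(data, key=lambda pair: pair[0]):
        group = list(group)
        n_events = sum(1 for _, e in group if e)
        if n_events:
            surv *= 1 - n_events / at_risk
            curve.append((t, 1 - surv))
        at_risk -= len(group)  # events and censored cases both leave the risk set
    return curve


def median_time(curve):
    """First time at which the cumulative incidence reaches 50%."""
    return next((t for t, f in curve if f >= 0.5), None)


# Hypothetical days to response; 0 marks patients censored without response.
curve = km_cumulative_incidence([14, 21, 21, 28, 35, 40], [1, 1, 0, 1, 0, 1])
print(median_time(curve))  # 28
```

Unlike approach (1), which looks only at patients who reached the outcome, this estimate keeps non-responders in the denominator for as long as they were observed.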

Exploration of moderators
Predictors of symptomatic improvement were investigated, for continuous and discrete clinical outcomes and for the time it took to achieve a discrete outcome.
(1) Continuous outcomes (standardized symptom score): Moderation effects of candidate predictors were assessed using the time x moderator interaction in a multiple LMM (as described above). As continuous predictors were measured on different scales, they were standardized prior to modeling and mean centered to allow interpretation of regression estimates. Extreme values were winsorized if they exceeded the upper and lower 10th percentile of the Weibull distribution.
(3) Time until discrete outcomes: Predictors were evaluated using Cox regression.
Associations for each predictor were first explored in bivariate models to avoid issues related to multicollinearity and to identify nonlinear associations with the outcome in continuous predictors. If they showed significant effects on higher order polynomials (e.g. quadratic, meaning that the best fit was not linear) an additional term for that polynomial was added henceforth and the vertices (i.e. turning points) were computed. Predictors were then entered into multiple regression models encompassing one of 3 variable classes: (i) patient variables (including baseline severity, age, gender, number of lifetime episodes and resting motor threshold (MT)), (ii) treatment administration variables (number of sessions in 12 weeks, treatment density, stimulation density defined as number of sessions divided by number of treatment days), (iii) technical variables (stimulation intensity in percentage of rMT, total number of pulses per day).
Multicollinearity was inspected. The results were considered significant if p < 0.05. To avoid issues related to multiple testing, p-values were adjusted for the false discovery rate (FDR) using the Benjamini-Hochberg method across all 3 variable classes.
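For readers unfamiliar with the Benjamini-Hochberg adjustment, the following sketch shows how adjusted p-values are obtained. It is a generic implementation of the standard step-up procedure, given here for illustration; the input p-values are made up and do not come from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative p-values only.
raw = [0.01, 0.04, 0.03, 0.20]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.2f} -> adjusted = {q:.4f}")
```

A predictor is then reported as significant when its adjusted value is below 0.05.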

Continuous change over time
On a descriptive level, patients showed a marked reduction in depression scores over time until day 20, followed by a slower but continuous further reduction (Fig. 1). Time in weeks was consequently log-transformed, as the Bayesian information criterion (BIC) of the model with log(time) (BIC = 23988) was lower than that of the model with untransformed time (BIC = 25393). A significant continuous decrease was observed until week 12 (beta = -0.69, t (df = 11.3) = -18.65, p < .0001). Regression coefficients were retransformed to natural units by multiplying them by the natural log of the treatment duration, resulting in a change of -1.71 standardized depression score points until week 12 (d = -2.42).
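The retransformation of the log-time coefficient can be checked with a one-line calculation. The sketch assumes the fixed-effect part of the model has the form score = intercept + beta × ln(weeks), which is consistent with the description above; the function name is hypothetical.

```python
import math


def change_until(beta_log_weeks: float, weeks: float) -> float:
    """Model-implied change in standardized score after the given number of weeks."""
    return beta_log_weeks * math.log(weeks)


print(round(change_until(-0.69, 12), 2))  # -1.71
```

Because the time axis is logarithmic, most of this change accrues early: the same coefficient implies roughly -0.96 points by week 4 (beta × ln 4), in line with the steep initial decline seen descriptively.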

Acute and sustained response and remission
Patients who received 20 sessions had a 73.6% response and 58.1% remission rate on any scale. This increased to an 81.6% response and 65.3% remission rate for patients who received 30 sessions of Deep TMS (Table 2; Fig. 2). When looking at the individual rating scales, the clinician-administered scale (HDRS) yielded higher remission and response rates at 20 and 30 sessions than the patient-rated scales (PHQ-9, BDI-II). The results for 12 weeks are shown in Table S2 and Fig. S2 in the Supplementary Material. Table 3 presents the numbers of patients who achieved response/remission and had a subsequent assessment allowing further analysis of sustained response and remission, the percentages of patients with sustained response/remission in the subsequent assessment, as well as the medians and the 25th and 75th percentiles of the number of sessions and number of days required to reach sustained response/remission, based on the scales used (PHQ-9, BDI-II, and HDRS-21).
On all scales, 80% to 82% of patients who achieved remission and 83% to 86% of patients who achieved response maintained their remitter/responder status. The medians of number of treatment sessions required to reach sustained response and remission were 16 to 20 and 16 to 21, respectively. The medians of days until response and remission were 21 to 29 and 21 to 35, respectively.
Cumulative survival curves of time to response, remission, sustained response, and sustained remission among patients who achieved response/remission are shown in Fig. 1 B-E. The median time to response and remission (dotted lines) was 21 days.

Durability analysis
The numbers of responders who had assessments 60, 90, and 180 days after response are shown in Table 4, along with the percentages of patients who maintained the responder status throughout the period. The rates of durable response for 60, 90, and 180 days were 57%, 54.5% and 44.3%, respectively. Table 5 presents results for patients who received the FDA-cleared protocol (Levkovitz et al., 2015). Response (remission) rates in the whole dataset were 73% (55%) and 70% (70%) on any scale and on HDRS, respectively. The results for 12 weeks are shown in Table S3 in the Supplementary Material.

Theta burst protocols
Interestingly, 202 patients received Deep TMS with iTBS protocols. Table 6 presents the rates of response and remission for the various scales and endpoints, as well as the numbers of patients assessed with each scale. Response (remission) rates in the whole dataset were 72% (69%) and 79% (75%) on any scale and on HDRS, respectively. The results for 12 weeks are shown in Table S4 in the Supplementary Material.

Older adults and young MDD patients
The subsample of older adult patients (age >68 years, mean ± SD: 74.2 ± 5.3) was analyzed separately and comprised 136 patients who received Deep TMS treatment. Response (remission) rates in the whole dataset were 70% (49%) and 80% (73%) on any scale and on HDRS, respectively (Table 7). The results for 12 weeks are shown in Table S5 in the Supplementary Material.

Predictors analysis
Results of the multiple regression analyses are shown in Table 9 for data within 12 weeks, where meaningful data were available, using the standardized depression score (z-score). The variability in stimulation frequency, train duration, inter-train interval and number of trains per session was very small, hence these variables were not included in the analysis.
Higher baseline severity was associated with larger continuous improvement but with lower remission and sustained remission rates (Fig. 3). Higher stimulation intensities (in % of the individual rMT) were associated with a significantly superior continuous change from baseline and a shorter time to response/sustained response (Fig. 4). Older age, smaller number of lifetime episodes, and lower rMT were associated with better continuous improvement. More sessions in 12 weeks were associated with less continuous improvement but with higher response rates. Treatment density was associated with less continuous improvement and lower response/remission rates. Stimulation density showed a quadratic behavior with respect to continuous improvement. Total number of pulses per day showed a quadratic behavior with respect to continuous improvement and remission rates (Fig. 5). Gender did not show associations with any of the clinical outcomes (Table 9).

Discussion
This post-marketing study included 1351 MDD patients, making it the largest naturalistic study of Deep TMS in MDD to date. The comprehensive analysis showed that for all participants with data from at least 1 scale, 20 sessions led to a 73.6% response and 58.1% remission rate, while 30 sessions led to an even greater 81.6% response and 65.3% remission rate. Various self- and observer-rating scales for depressive symptoms were applied, but the respective scales differed between individual patients. The HDRS was applied in 470 patients, and HDRS remission rates were much higher (69-76%) than those obtained with patient self-report scales (PHQ-9 and BDI-II). This difference may reflect the patients' tendency to self-classify as more severely depressed (O'Reardon et al., 2007; Zimmerman et al., 2012). Interestingly, remission rates with BDI-II were higher than with PHQ-9. This might at least partially reflect the better manifestation of anxiosomatic symptoms in the BDI-II questionnaire.
Sustained response onset typically occurred after 16 sessions (21 days) and sustained remission often occurred after 17 sessions (23 days). These are upper bounds since patients may have reached response/remission earlier than their scheduled assessment. Remission rates were particularly high for individuals who received theta burst stimulation.
When assessing a treatment for MDD, both the probability of reaching response/remission and the typical timing of response are important considerations. The STAR*D study showed that only about one third of patients reach remission with first line antidepressants (Trivedi et al., 2006), and the chance of response/remission significantly decreases with further antidepressant trials. Furthermore, response/remission often occurs after ~8 weeks. In this large-scale naturalistic study, the median onset of sustained response was after 16 Deep TMS sessions and sustained remission was achieved after a median of 17 Deep TMS sessions (i.e. after 3 to 4 weeks). It can be concluded that the vast majority of MDD patients gain clinical benefit from Deep TMS, and that the typical onset of effect is relatively fast compared to antidepressants. Furthermore, about 80% to 90% of patients who achieved response/remission maintained their status in subsequent assessments (Table 3). Additionally, MDD symptoms showed continuous improvement with increasing number of Deep TMS sessions (Fig. 1). Hence, many initial non-remitters and even non-responders may benefit from continued Deep TMS treatment. This hypothesis is further supported by previous findings (Yip et al., 2017) from the pivotal multicenter trial on Deep (H1 Coil) TMS for MDD that led to FDA clearance (Levkovitz et al., 2015).
A recent study reported the clinical outcomes from a large registry of MDD patients who underwent rTMS treatment with a figure-8 coil (Sackeim et al., 2020). For clinical rating, this registry included CGI-S (which is not specific to MDD and hence was not used in the current study) and the PHQ-9 scale. Direct comparison of results is confounded, yet response and remission rates with PHQ-9 in this study were higher than the rates reported by Sackeim et al. (response (remission) rates of 69% (42%) compared to 58% (28%)). According to Sackeim et al. (2020), patients with higher baseline severity had higher post-treatment scores and were less likely to attain response/remission. In our study, higher baseline severity was associated with larger continuous improvement, albeit with lower remission rates. This is plausible, as patients with higher baseline severity may be less likely to reach remission. In our study, older age was also associated with superior continuous improvement, while no association of age with clinical outcomes was found by Sackeim et al. (2020). Age was not associated with remission/response, and indeed, both older patients beyond an age of 68 years and young adults (range 18-22 years) showed high remission and response rates following Deep TMS (Tables 7 and 8). Gender was not associated with clinical outcome, while Sackeim et al. (2020) found a significantly superior outcome in females. A recent naturalistic study (Bouaziz et al., 2023) of figure-8 TMS in 435 MDD patients reported response and remission rates based on MADRS or HDRS of 31.0% and 22.8%, respectively. These rates are much lower than the rates found in this study for Deep TMS when assessed with HDRS (response and remission rates of 71.6% and 72.1%). Another naturalistic study (Ekman et al., 2023) of iTBS with a figure-8 coil in 542 unipolar or bipolar depression patients reported response and remission rates of 42.1% and 16.1%, respectively, assessed by CGI-I and CGI-S. Comparison is limited since these scales were not used in the current study, but those response/remission rates are much lower than the rates found in the current study for Deep TMS iTBS (Table 6) with either self-report questionnaires (PHQ-9, BDI-II) or a clinician-based scale (HDRS).
Table 9. Results of prediction models. Note: LMM Linear mixed model; GLMM Generalized linear mixed model with binomial distribution; β slope, i.e., difference in average symptom reduction until week 12 (raw coefficient multiplied with the natural log of treatment duration); continuous predictors are mean centered and scaled, thus slope represents change in outcome with 1 unit increase over the grand average value of the sample, with negative values indicating stronger symptom reduction; OR odds ratio with values >1 indicating higher odds for the event to occur with an increase in the predictor by 1 unit; HR hazard ratio with values >1 indicating higher chance of an event occurring with an increase in the predictor by 1 unit. P values were corrected for the false discovery rate (FDR).
Due to its naturalistic character, this post marketing study allowed the analysis of dose-response relationships which are normally not accessible in RCTs as protocols are more standardized. Higher stimulation intensity related to the individual rMT was associated with superior clinical outcomes. This finding is in line with our previous findings (Levkovitz et al., 2009) that Deep TMS stimulation intensity of 120% rMT resulted in 60% response and 50% remission, compared to 0% response and remission with Deep TMS at 110% rMT intensity. Similarly, a previous analysis of the multicenter MDD study (Levkovitz et al., 2015) demonstrated that none of the twelve patients who received <118% of their rMT remitted and only one responded (Gersner et al., 2018). A higher Deep TMS intensity leads to more extended neural stimulation in all dimensions of the prefrontal cortex, bilaterally recruiting more widespread circuits and networks. In contrast, Sackeim et al. (2020) observed a negative association between stimulation intensity and clinical outcomes.
Both stimulation intensity and number of pulses per day show a tendency towards an inverted U shape pattern for prediction pointing to a quadratic function for the respective dose response relationship. However, it is possible that for more resistant patients who had not responded to the standard protocol, clinicians increased the stimulation intensity as well as the number of pulses. The dose response relationship for different Deep TMS parameters should therefore be investigated in future studies along the translational pathway.
The H1 Coil produces a magnetic field with bilateral stimulation of the prefrontal cortex, with left DLPFC predominance. Previous studies have speculated that stimulating the left DLPFC and then the right DLPFC in series with traditional figure-8 coils may result in better outcomes than the left side alone (Fitzgerald et al., 2006). It is possible that simultaneous bilateral stimulation via the H1 coil (left and right prefrontal cortex at the same time, rather than sequentially) may represent another pathway of optimizing TMS treatment for depression. Our data supports both the safety and high efficacy of synchronous bifrontal rTMS with the Deep TMS H1 Coil.
The dataset showed that Deep TMS is indeed clinically most often applied at standard parameters (i.e., 18 Hz, 120% rMT intensity, 1980 pulses per session). Not fully to our surprise, some colleagues also used iTBS protocols which have a much shorter duration and are therefore more cost-effective. The iTBS protocols used here are based on classical iTBS (Huang et al., 2005), modified by a higher number of pulses (i.e. 1800 pulses) per session, as well as a lower stimulation intensity (80-90% rMT) compared to 120% rMT intensity used in the THREE-D study (Blumberger et al., 2018). Despite its lower intensity, H1 coil Deep TMS with iTBS protocols showed similar response and remission rates (Tables 5 and 6). Thus, iTBS with the H1 coil at subthreshold intensity may represent a promising treatment option to be investigated in future RCTs in comparison with standard protocols and sham conditions. Further iTBS data collection is ongoing.
There are several limitations to this study. As an uncontrolled naturalistic study, placebo effect was not accounted for. All Deep TMS providers were contacted and asked to provide data in the template Excel file. Yet, only 21 clinics provided data. Providers were reimbursed for any line of data irrespective of the results, and providers were motivated to send as much data as they could. Hence there is no reason to believe there was a bias. Yet, the data received was a small part of the whole data of patients treated with Deep TMS. As with all naturalistic studies, there was heterogeneity in the protocols given at the various sites that contributed data as well as intermittently missing values. Missing data also influence the durability results. The data reported here are likely underestimations considering that patients with high response and remission rates are less likely to return to the clinic to seek further care. Finally, these data do not account for concurrent psychotherapy and medications that may have been initiated along with Deep TMS but not fully reported in this dataset.
Figure caption (fragment): Curves represent log-transformed weeks as they provided better fit to the data according to the BIC. Shaded areas represent 95% CIs. Middle panel displays the same levels of Total number of pulses per day as they are associated with time to failure/event for Remission and Response. These curves refer to the Kaplan-Meier estimate and take all patients into account (i.e., including patients who did not reach remission/response). The right panel presents rates of response, remission, sustained response, and sustained remission as a function of the Total number of pulses per day (color coded as noted in the legend below).
In conclusion, real-world application of Deep TMS with the H1 Coil for MDD is confirmed with overall treatment results outperforming previous sham-controlled and open-label trials. Deep TMS offers treatment-resistant MDD patients the opportunity to remit in a rather short timeframe with durable clinical effects.