Routine Outcome Monitoring in Psychotherapy Treatment using Sentiment-Topic Modelling Approach

While selecting the right psychotherapy treatment for an individual patient is important, assessing the outcome of each therapy session is equally crucial. Evidence shows that continuously monitoring a patient's progress can significantly improve therapy outcomes toward the expected change. By monitoring the outcome, the patient's progress can be tracked closely, helping clinicians identify patients who are not progressing in treatment. This monitoring allows the clinician to take any necessary action as early as possible, e.g., recommending a different type of treatment or adjusting the style of approach. Currently, evaluation is based on clinician-rated and self-report questionnaires that measure patients' progress pre- and post-treatment. Although outcome monitoring tends to improve therapy outcomes, the current method poses many challenges, e.g., the time and financial burden of administering questionnaires and of scoring and analysing the results. Therefore, a computational method for measuring and monitoring patient progress over the course of treatment is needed in order to enhance the likelihood of a positive treatment outcome. Moreover, such a method could lead to an inexpensive monitoring tool for evaluating patients' progress in clinical care that could be administered by a wider range of health-care professionals.


I. INTRODUCTION
Depression is a common illness and a major contributor to the overall global burden of disease. It can affect thinking styles and behavior, with symptoms including low energy, loss of appetite, reduced concentration, and intense feelings of hopelessness and negativity. More than 264 million people of all ages are reported to suffer from depression [1]. Given this high burden of mental illness, treatment recommendations for an individual patient are highly important. The common options for treating depression, i.e., medication, psychotherapy, or a combination of both [2]-[4], depend greatly on the individual case. Psychotherapies such as cognitive-behavioral therapy (CBT) [4]-[7] and interpersonal psychotherapy (IPT) [4], [8] can be substantially effective in treating depression.
In this paper, we tackle the research challenge of monitoring the progress of psychotherapy sessions. We use individual psychotherapy counseling transcripts to identify the outcome of each therapy session. Evidence [9]-[11] shows that regularly identifying patients who are regressing during treatment can improve therapy outcomes. In contrast, a post-mortem analysis after the patient has completed the treatment leaves very little room for adjustments that could improve an unfavorable outcome (i.e., no sign of improvement, or even a worsened condition) [9], [11].
In clinical settings, the outcomes of patients' therapy sessions are commonly measured through regular self-reports, i.e., patients fill out questionnaires at the beginning of, during, and at the end of treatment [12]. This progress monitoring, known as Routine Outcome Monitoring (ROM), is important in psychotherapy because it is used to assess the patient's progress, evaluate the treatment, and decide its future course [13]. The main purpose of outcome monitoring is to serve as a reference point for evaluating a patient's progress and seeing whether the ongoing treatment has a positive or negative impact. Several rating scales based on self-report questionnaires have been introduced into routine mental health practice. These include the Outcome Questionnaire System (OQ-45) [9], the Partners for Change Outcome Management System (PCOMS) [9], and the Clinical Outcomes in Routine Evaluation (CORE) [14].
Although progressively monitoring psychotherapy sessions has been shown to increase the chances of a positive outcome, the current monitoring system faces several challenges, e.g., (i) the extra work and effort needed to administer the questionnaires, (ii) the time needed to score and analyze the results, and (iii) the unnecessary burden placed on the patient [15], [16]. Providing a better system that can progressively monitor the outcome throughout the treatment therefore remains a challenge.
We approach this problem by developing a computational method that uses the Dynamic Joint Sentiment-Topic (dJST) model [17] to measure and monitor patient treatment outcomes by tracking current and recurrent views of topic and sentiment. Specifically, by incorporating sentiment and topic analysis for each therapy session, we identify how sentiment and topic trends evolve throughout treatment at the level of the individual author (here, the client). We highlight the sentiment (positive or negative) and the topic of each therapy session for each patient. We show the potential of this computational method for clinical settings by comparing our analysis with that of a professional practitioner.

II. MATERIALS AND METHOD
To discover and track the sentiment and topic over time from each therapy session, we employed the Dynamic Joint Sentiment-Topic (dJST) model [17]. Sentiment-topic models have been used in various studies to extract coherent sentiment-bearing topics in stock market prediction [18], product reviews [19], and text clustering [20]. The model, as shown in Fig. 1, assumes that the documents at the current epoch are influenced by documents in the past, where the current sentiment-topic specific word distributions $\varphi_{l,z}^{t}$ at epoch $t$ are generated according to the word distributions at previous epochs. The time stamp for each stream of documents $\{d_1, \cdots, d_M\}$ can be an hour, a day, or a year. Each document $d$ at epoch $t$ is represented as a vector of word tokens, $w_d^t = (w_{d_1}^t, w_{d_2}^t, \cdots, w_{d_{N_d}}^t)$. We define the evolutionary matrix of topic $z$ and sentiment label $l$, $E_{l,z}^{t-1}$, where each column in the matrix is the word distribution of topic $z$ and sentiment label $l$, $\sigma_{l,z,s}$, generated for document streams received within the time slice specified by $s$, where $s \in \{t-S, t-S+1, \cdots, t-1\}$; that is, the current sentiment-topic-word distributions depend on the previous sentiment-topic specific word distributions in the last $S$ epochs. We then attach a vector of weights $\mu_{l,z}^{t} = [\mu_{l,z,0}^{t}, \mu_{l,z,1}^{t}, \cdots, \mu_{l,z,S}^{t}]^{T}$, each of which determines the contribution of time slice $s$ in computing the priors of $\varphi_{l,z}^{t}$. Hence, the Dirichlet prior for sentiment-topic-word distributions at epoch $t$ is $\beta_{l,z}^{t} = E_{l,z}^{t-1}\,\mu_{l,z}^{t}$.
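As an illustration of how a prior can be assembled from past epochs, the following minimal sketch computes a weighted combination of previously estimated word distributions for a single sentiment-topic pair. It is not the implementation of [17]; the function name `evolutionary_prior` and the toy numbers are assumptions chosen only for illustration.

```python
def evolutionary_prior(past_word_dists, weights):
    """Weighted sum of past word distributions for one sentiment-topic pair.

    past_word_dists: list of word distributions, one per past time slice,
                     each a list of probabilities over the same vocabulary.
    weights:         one non-negative weight per time slice.
    Returns a list with one prior value per vocabulary term.
    """
    if len(past_word_dists) != len(weights):
        raise ValueError("need exactly one weight per time slice")
    if not past_word_dists:
        raise ValueError("need at least one past word distribution")

    vocab_size = len(past_word_dists[0])
    prior = [0.0] * vocab_size
    for dist, weight in zip(past_word_dists, weights):
        if len(dist) != vocab_size:
            raise ValueError("all distributions must share one vocabulary")
        for w, prob in enumerate(dist):
            prior[w] += weight * prob
    return prior


# Toy example (hypothetical values): two past slices, three vocabulary terms.
past = [[0.5, 0.3, 0.2],
        [0.2, 0.2, 0.6]]
slice_weights = [0.25, 0.75]  # assumed: the more recent slice counts more
prior = evolutionary_prior(past, slice_weights)
```

A larger weight on a time slice pulls the current prior towards the word usage observed in that slice.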
Assuming we have already calculated the evolutionary parameters $\{E_{l,z}^{t-1}, \mu_{l,z}^{t}\}$ for the current epoch $t$, the generative story of dJST at epoch $t$ is as shown in Fig. 1. The Dirichlet priors $\beta$ of size $S \times T \times V$ (where $S$ here denotes the number of sentiment labels, $T$ the number of topics, and $V$ the vocabulary size) are first initialized as symmetric priors of 0.01, and then modified by a transformation matrix $\lambda$ of size $S \times V$, which encodes the word prior sentiment information. $\lambda$ is first initialized with all elements taking a value of 1. Then, for each term $w \in \{1, \cdots, V\}$ in the corpus vocabulary, the element $\lambda_{lw}$ is updated according to $f(w)$, where the function $f(w)$ returns the prior sentiment label of $w$ in a sentiment lexicon, i.e., positive or negative.
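The lexicon-based initialization of the priors can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the weights `match_weight` and `mismatch_weight` are hypothetical values, the prior is assumed to be modified by element-wise scaling, and the tiny lexicon is invented for the example.

```python
def build_sentiment_priors(vocab, lexicon, labels=("positive", "negative"),
                           num_topics=5, base_prior=0.01,
                           match_weight=0.9, mismatch_weight=0.05):
    """Return (lam, beta) for a vocabulary and a sentiment lexicon.

    lam[label][w]      transformation weight for term w under a label
    beta[label][z][w]  prior for term w, topic z, sentiment label
    match_weight and mismatch_weight are assumed values, not taken
    from the paper.
    """
    # Start with every transformation weight equal to 1.
    lam = {label: [1.0] * len(vocab) for label in labels}

    # Terms found in the lexicon are boosted under their own label
    # and damped under the other label(s).
    for w, term in enumerate(vocab):
        prior_label = lexicon.get(term)
        if prior_label is None:
            continue  # term carries no prior sentiment information
        for label in labels:
            lam[label][w] = (match_weight if label == prior_label
                             else mismatch_weight)

    # Symmetric prior scaled element-wise by the transformation weights.
    beta = {
        label: [[base_prior * lam[label][w] for w in range(len(vocab))]
                for _ in range(num_topics)]
        for label in labels
    }
    return lam, beta


# Hypothetical vocabulary and lexicon, for illustration only.
vocab = ["good", "bad", "library"]
lexicon = {"good": "positive", "bad": "negative"}
lam, beta = build_sentiment_priors(vocab, lexicon)
```

Terms absent from the lexicon, such as 'library' here, keep the symmetric prior under both sentiment labels.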

A. Experimental Setups
This section describes the dataset of counseling transcripts from Carl Rogers' therapy sessions and discusses the settings used in our experiments.

1) Dataset
We conducted the experiments using transcripts of therapy sessions by Carl Rogers, the founder of client-centered psychotherapy. We acquired 158 transcripts from 51 clients of Rogers' therapy cases from Lietaer and Brodley [21]. The cases reflect the wide range of clients with whom he worked, from schizophrenic patients to a conflicted clinical psychologist. This dataset has been made available for research purposes [21], and the Rogers psychotherapy transcripts have been used in psychotherapy research [22], [23]. For the sake of our model evaluation, we selected five cases of client session transcripts that a professional psychotherapist has analyzed, and dropped the remaining 46 cases. For each case, we extracted only the client's verbatim transcripts and did not consider the therapist's verbatim. We show a summary of clients and the number of sessions in TABLE I and summarise each case in the following paragraphs.
The case of Mr. Herbert Bryan: Bryan was a client suffering from blocking and a variety of neurotic complaints. He suffered severe pain from his blocking, which interfered with his sexual life, business life, and social life [24].
The case of Frank: Frank was a college student with serious attitude problems. He showed argumentative, attention-getting, and uncooperative behavior in the classroom [24].
The case of Mary Jane Tilden: Mary Jane was a high school graduate from an upper-middle-class family. She was brought to the hospital by her mother, who was worried after noticing that Mary had symptoms of major depression with suicidal ideation, social isolation, low self-esteem, and strong self-critical attitudes.

2) Settings
Each dataset underwent pre-processing, including conversion to lowercase, removal of non-alphanumeric characters, and removal of stop words. We empirically set the number of topics to 5 for each of the 2 sentiment labels (i.e., positive and negative), giving a total of 10 sentiment-topic clusters for each case.
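The pre-processing steps can be sketched as below. This is a minimal illustration, assuming whitespace tokenization and a small hand-picked stop-word list; the stop-word list and tokenizer used in the experiments are not specified here, so `STOP_WORDS` and `preprocess` are hypothetical names.

```python
import re

# Assumed, deliberately tiny stop-word list for illustration only.
STOP_WORDS = frozenset({
    "a", "an", "and", "the", "i", "it", "is", "to", "of", "that", "was",
})


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase, strip non-alphanumeric characters, drop stop words."""
    lowered = text.lower()
    # Replace anything that is not a letter, digit, or whitespace.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    return [token for token in cleaned.split() if token not in stop_words]


tokens = preprocess("I feel discouraged about myself.")
```

Replacing punctuation with a space rather than deleting it keeps neighbouring words from being fused together.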

III. RESULTS AND DISCUSSION
In this section, we present the results and analysis of the experimental datasets. We aimed to analyze the sentiment and topic trends across the psychotherapy sessions of each client. We first show the results for two clients, Miss Vib and Miss Int. For these two cases, we could not find any detailed analysis by an expert. Therefore, we only present the sentiment analysis from our model and compare it with the results reported by Hoffman [25], an excerpt of which is shown in Table II. The table reports an index of behavioral maturity on a scale of 1 to 4, with defined values of 1, 2, and 4 describing changes over counseling therapy. A value of 1 indicates behavior showing little or no control over oneself or the environment. A value of 2 indicates that the individual controls the environment, whereas a value of 4 indicates good behavior with self-direction, maturity, and responsibility [25].

A. Analysis of Sentiment-Topic for Miss Vib
Fig. 2 shows the sentiment trend over Miss Vib's therapy sessions. The X-axis represents the therapy session, whereas the Y-axis represents the probability of a sentiment label given a document, $P(l \mid d)$. As depicted, negative sentiment is dominant throughout the course. However, the positive sentiment value rises markedly between sessions 8 and 9. This trend is consistent with the behavior trend reported by Hoffman [25], as shown in Table II, in which the maturity level increased towards the end of the course (session 9). Apart from the sentiment analysis, we also show some of the topics extracted from Miss Vib's counselling sessions in Table III. From topic words such as 'family', 'friends', and 'married', we suggest that Topic 1-negative is related to relationships, while Topic 5-negative is related to education, based on topic words such as 'courses' and 'library'. Topic 1-positive can be interpreted as related to life goals, given common topic words such as 'aim', 'goal', and 'accomplished'. Similarly, Topic 2-positive appears to be related to education/study, based on topic words such as 'study', 'teachers', and 'scholarship'.
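A per-session sentiment probability of this kind can be estimated from word-level sentiment assignments. The sketch below assumes a smoothed count estimator of the form commonly used with Gibbs-sampled sentiment-topic models; the smoothing value `gamma` and the session counts are hypothetical and are not values from Fig. 2.

```python
def sentiment_distribution(label_counts, gamma=0.3):
    """Smoothed probability of each sentiment label for one session.

    label_counts: dict mapping a sentiment label to the number of words
                  in the session assigned to that label.
    gamma:        assumed symmetric smoothing hyperparameter.
    """
    total = sum(label_counts.values())
    num_labels = len(label_counts)
    return {
        label: (count + gamma) / (total + num_labels * gamma)
        for label, count in label_counts.items()
    }


def dominant_sentiment(label_counts, gamma=0.3):
    """Label with the highest smoothed probability in one session."""
    dist = sentiment_distribution(label_counts, gamma)
    return max(dist, key=dist.get)


# Hypothetical counts for two sessions, for illustration only.
sessions = [
    {"positive": 40, "negative": 60},
    {"positive": 70, "negative": 30},
]
trend = [dominant_sentiment(counts) for counts in sessions]
```

Plotting the per-session probabilities in order gives a trend line of the kind described for the sentiment figures.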

B. Analysis of Sentiment-Topic for Miss Int
Fig. 3 shows the sentiment trend over Miss Int's therapy sessions. As depicted, no consistent trend is evident across the 7 counseling sessions. Positive sentiment is relatively dominant over negative sentiment in the first three sessions. From session 3 onward, however, the negative sentiment rises significantly and stays dominant until the end. Compared with Hoffman [25], we found no similarity with the level of maturity shown in Table II. This discrepancy might be due to differences in how our model interprets sentiment words.
Apart from the sentiment analysis, we illustrate some of the topics extracted from Miss Int's counselling sessions in Table IV. Based on the example topic words, we suggest that Miss Int expressed positive sentiment towards topics related to education and life. In contrast, she also expressed negative sentiment on topics related to her life expectations and living. We acknowledge that it is quite challenging to interpret the topic words without background knowledge of each client.

C. Analysis Evaluation
In this section, we present further analysis of our experimental results. Specifically, we compare our findings with the overall commentary made by the professional psychotherapist published in [24], [26]. Because published analysis by domain experts is available only for some cases, we evaluate three of our analysis cases, i.e., Herbert Bryan, Frank, and Mary Jane Tilden.

1) Analysis of the Case of Herbert Bryan
Fig. 4 shows the trend of sentiment distributions over Bryan's therapy sessions. As depicted, the first five sessions showed high negative sentiment. Nevertheless, Bryan ended the therapy with positive sentiment, with the attitude starting to change from session 6 onward. This finding is consistent with the concluding remarks made by Rogers for the case of Herbert Bryan [24].
It is also worth mentioning that, as shown in Fig. 4, the negative sentiment value decreases steadily from session 1 to session 4, with a corresponding increase in the positive sentiment value. This positive change is confirmed by Rogers's analysis, as per the excerpt below from session 4 [24]:

"A comparison of these attitudes toward the self with those expressed in previous interviews indicates clearly the tremendous development in insight and the increasingly positive attitudes."
However, the negative sentiment rises markedly again in session 5. Rogers's analysis of Bryan's discouragement and loss of motivation during this session could explain this trend. Below is the excerpt [24]:
"The full measure of his discouragement, as he faces the implications of the insight achieved in the previous contact, is best shown by listing, as before, the spontaneous sentiments voiced during the interview. I haven't any motivation to choose the better way. When I have nothing but neurotic satisfactions, it is hard to feel that other satisfactions would be better. I feel discouraged about myself. I'm suffering real pain. I should like to have you pull a rabbit out of a hat for me. This whole struggle is very exhausting to me."
Apart from that, we also report the topic proportions for each therapy session in Fig. 5. Some examples of topics can be seen in Table V. From the result, we suggest that in the first session Bryan expressed his feelings and thoughts on Topic 1, which relates to the blocking symptoms he suffered from. This finding is consistent with the concluding commentary made by Rogers as published in [24]. Below is the excerpt from the summary commentary by Rogers on session 1:

"The following would seem to be a fair summary of the outstanding attitudes, which have been spontaneously expressed: I suffer from a blocking which interferes with my sexual life, my business life, my social life. I suffer excruciating pain from this blocking. My only satisfaction is voyeurism."
For the four subsequent sessions (sessions 2-5), Topic 5 was consistently discussed. As can be seen in Table V, this topic relates to neurotic feelings. This finding is consistent with Rogers's analysis in his summary commentary [24].
As depicted in Fig. 4, from session 5 onwards, the negative sentiment dropped significantly. As a result, Bryan completed the therapy with positive sentiment, in contrast to the first five sessions. In the last three sessions, Bryan discussed Topic 2 and Topic 3, which relate to satisfaction and improvement. Again, this finding is confirmed by Rogers's analysis, as shown in the excerpt below [24]:
"These attitudes show very vividly the fact that, after teetering for two interviews between neuroticism and growth, Mr. Bryan has chosen the pathway of growth with a clearness and vitality that is amazing. Between the sixth and seventh interviews the accumulated insight has been translated into a positive decision, which brings a decided feeling of release. The attitudes expressed are in sharp contrast to the weakness and helplessness which were evident in the two preceding interviews. The crisis is fully passed. The client has discovered resources within himself for making this crucial choice and moving ahead."

2) Analysis of the Case of Frank
Fig. 6 shows the trend of sentiment distributions over Frank's five therapy sessions. As depicted, Frank started the therapy with high negative sentiment. The negative sentiment dropped steadily over the following sessions, with positive sentiment reaching its highest value at session 3. However, the negative sentiment then rose again until the end of the therapy, so that Frank ended the therapy with high negative sentiment. When we compared this result with the analysis made by Rogers [24], we noticed that only the first four sessions are consistent with Rogers's analysis. Below is the excerpt by Rogers on Frank's therapy sessions:
"Although the first two interviews are largely 'talking out' processes there are a few statements which indicate the beginnings of self-understanding. In the third interview, particularly on pages 8 and 9, significant insight is achieved.... The fourth interview represents a distinct slump in progress... The fifth interview contains not only a fresh surge of insight but many new choices and positive actions."
We also report the topic proportions for each therapy session in Fig. 7 and show some example topics in Table VI. Based on the results, we suggest that in session 1 Frank mentioned topics related to procrastination. Topic 4, related to friends, was discussed in sessions 2 and 4, whereas in session 3 Frank mentioned the topic related to responsibility. For comparison, Rogers's analysis in the case of Frank is available only for sessions 1 and 5.

3) Analysis of the Case of Mary Jane Tilden
Fig. 8 shows the trend of sentiment distributions over Mary Jane's seven therapy sessions. Note that the transcripts for sessions 2, 4, 6, and 8 are not available, so we skipped those sessions in the analysis. As depicted, Mary started the therapy with very high negative sentiment, and the negative sentiment steadily declined over the course. The positive sentiment increased significantly between sessions 9 and 11, and Mary ended the therapy with significantly high positive sentiment.
Mary continued into her third session with negative sentiment, with sentiment values higher than in the previous session. However, we noticed that the positive sentiment starts to grow in this session, and the negative sentiment values are likewise reduced. Rogers's analysis of Mary's third session is reported in the excerpt below [26]:
"The client expresses an idea very basic in her thinking, the question of whether she is capable or as intelligent as other people are. Clients discuss the very significant problem of motivation, both with regard to herself and to other people."
By session 5, there is a significant rise in positive sentiment; however, the negative sentiment remains high. Below is the excerpt from Rogers's analysis [26]:
"..major attitudes expressed by Mary Jane. I'm afraid to venture, because I'm afraid I will fail. I was frightened by my jealousy when she went with a boy in whom I was interested. It made me afraid to trust my feelings."
The negative sentiment declines dramatically between sessions 7 and 9, whereas the positive sentiment increases significantly. Our result is confirmed by Rogers's analysis, as illustrated in the excerpt below [26]:
"Mary Jane discusses her plan for work in a rather discouraged way, fearful that 'something will turn up and I'll sort of lose faith all over again...'"
From session 9 to 11, the trend shows the major changes in Mary's therapy. That is, the positive sentiment becomes dominant as the negative sentiment declines further. Our trend analysis is consistent with Rogers, as per the excerpts below for sessions 9, 10, and 11, respectively:
"She voices the attitude that she is not too well connected with reality. She discusses this and realizes that many of her past problems lies within herself.... she realizes that she is always dissatisfied with anything she does-her job or any other undertaking.."
"She continues with a discussion of the fact that when she is feeling good she feels much more adventurous and is now thinking of taking a very different and interesting job during the coming summer, with a girl with whom she has become friendly at work... She..."
Fig. 9. The Summary of Analysis between Our Method and Expert Analysis. Note: N represents negative sentiment and P represents positive sentiment; entries without either label are not available.

D. Summary of Analysis Evaluation
Fig. 9 summarizes the evaluation of our computational method against the domain expert's analysis. As shown in the figure, our method produces results comparable to the expert's analysis. For the cases of Miss Vib, Miss Int, Bryan, and Mary, we achieve 100% accuracy. However, in session 5 (S5) of Frank's case, our analysis contradicts the domain expert's analysis: our model assigns negative sentiment (N) where the expert analysis indicates positive (P).
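The session-level comparison can be expressed as a simple agreement rate between model labels and expert labels. The sketch below is an assumption about how such a score could be computed, not the evaluation code behind Fig. 9; sessions without a label on either side are skipped, and the label sequences are hypothetical placeholders rather than values read from the figure.

```python
def agreement(model_labels, expert_labels):
    """Fraction of sessions where model and expert labels match.

    Labels are "N" (negative) or "P" (positive); None marks a session
    for which no label is available, and such sessions are skipped.
    """
    if len(model_labels) != len(expert_labels):
        raise ValueError("label sequences must cover the same sessions")

    compared = [
        (model, expert)
        for model, expert in zip(model_labels, expert_labels)
        if model is not None and expert is not None
    ]
    if not compared:
        raise ValueError("no session has labels from both sources")

    matches = sum(1 for model, expert in compared if model == expert)
    return matches / len(compared)


# Hypothetical labels for four sessions, for illustration only.
model = ["N", "P", "P", None]
expert = ["N", "P", "N", "P"]
score = agreement(model, expert)
```

Skipping unlabeled sessions keeps missing expert commentary from being counted as disagreement.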

IV. CONCLUSION
In this paper, we have demonstrated an approach to monitoring the progress of outcomes from individual psychotherapy counseling transcripts. Progress monitoring is important for assessing the effects of a psychotherapy intervention on knowledge, attitudes, beliefs, and behaviors. Employing a topic model to monitor the progress of counseling sessions reveals the trend of sentiments and topics evolving over each client's treatment. We evaluated the method by comparing the analysis from our model with the analysis made by the domain expert.
The experimental findings show that our model yields results comparable to the analysis made by the domain expert. Therefore, we suggest that dynamic topic models provide an opportunity for tracking a client's progress by closely analyzing the shifts of sentiment and topic in each session. This monitoring might serve as an early identification tool for judging whether changes in therapy need to be made.
We believe that our results could be improved with further enhancements to our model, for instance, by incorporating more domain-specific lexicons. In future work, we could also integrate the counselor's verbatim transcripts into the analysis. Apart from that, we observed that it is difficult to predict the outcome of the next counseling session. The number of counseling sessions is not linked to positive attitudes towards the treatment, and we found no trend that could serve as a guide for predicting sentiment outcomes. We suggest that this may differ from case to case, depending on the client's characteristics and the complexity of the client's problems.