Psychiatrists' experiences and opinions of generative artificial intelligence in mental healthcare: An online mixed methods survey

Following the launch of ChatGPT in November 2022, interest in large language model (LLM)-powered chatbots has surged, with increasing focus on the clinical potential of these tools. Missing from this discussion, however, are the perspectives of physicians. The current study aimed to explore psychiatrists' experiences and opinions on this new generation of chatbots in mental health care. An online survey including both quantitative and qualitative responses was distributed to a non-probability sample of psychiatrists affiliated with the American Psychiatric Association. Findings revealed 44 % of psychiatrists had used OpenAI's ChatGPT-3.5 and 33 % had used GPT-4.0 "to assist with answering clinical questions." Administrative tasks were cited as a major benefit of these tools: 70 % somewhat agreed/agreed "documentation will be/is more efficient." Three in four psychiatrists (75 %) somewhat agreed/agreed "the majority of their patients will consult these tools before first seeing a doctor." Nine in ten somewhat agreed/agreed that clinicians need more support/training in understanding these tools. Open-ended responses reflected these opinions, but respondents also expressed divergent opinions on the value of generative AI in clinical practice, including its impact on the future of the profession.


Introduction
Since the launch of OpenAI's "ChatGPT" in November 2022, considerable attention has been focused on advances in chatbots in healthcare (Lee et al., 2023). ChatGPT is an example of a new generation of artificial intelligence tools powered by large language models (LLMs). These chatbots are trained on vast amounts of data to generate responses. Operating like autocompletion devices, they excel at predicting the next word in a sequence, and at scale this capacity today supports impressive abilities to summarize and generate text-based content. Moreover, unlike web-based search engines, these models can "remember" previous prompts and engage with users in exchanges that resemble conversations.
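To make the autocompletion analogy concrete, the following minimal Python sketch shows the generation loop in caricature: an entirely invented probability table stands in for the trained model, and text is produced one sampled word at a time. It illustrates the principle only, not any real model's internals.

import random

def next_token_distribution(context):
    """Stand-in for a trained model: maps a context to word probabilities.
    These probabilities are invented for illustration; a real LLM scores
    tens of thousands of tokens with a learned neural network."""
    if context.endswith("low"):
        return {"mood": 0.7, "energy": 0.25, "appetite": 0.05}
    return {"and": 0.5, "the": 0.3, "today": 0.2}

def generate(prompt, n_tokens=3):
    """Repeatedly sample the next word and append it to the running text."""
    text = prompt
    for _ in range(n_tokens):
        dist = next_token_distribution(text)
        words, weights = zip(*dist.items())
        text += " " + random.choices(words, weights=weights)[0]
    return text

print(generate("the patient reports low"))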
Previous research demonstrates that documentation and administrative tasks remain a leading source of burnout among all clinicians, including psychiatrists, who seek help with these tasks, including from the assistance of AI (Blease et al., 2020a, 2020b). LLM-powered chatbots such as OpenAI's GPT-4 and Google's Bard are potentially well suited to these tasks (Haupt and Marks, 2023). Notably, for example, these tools have strengths in writing responses in a requested style, literacy level, or tone, suggesting potential to help write clinical notes. Further support comes from preliminary evidence that LLM-powered chatbots can assist with writing empathic documentation (Ayers et al., 2023; Sharma et al., 2023). Emerging research shows that ChatGPT may improve documentation by creating more comprehensive and organized clinical histories. For example, in a randomized controlled study by Baker et al., ChatGPT generated longer and more detailed documentation compared with typing or dictation methods (Baker et al., 2022). Other recent research indicates that these tools might also play a role in assisting physicians with differential diagnosis, for example, in hypothesis generation including in more complicated clinical presentations (Kanjee et al., 2023).
However, LLM tools also carry significant limitations and invite new problems, although a study by Walker et al. found that ChatGPT-4 provided medical information of comparable quality to that offered by searches of static internet information (Walker et al., 2023). Notwithstanding, risking the epithet of "garbage in, garbage out," these tools are unable to discriminate the quality of the information on which they are trained. Responses can be inconsistent, clearly wrong, and, more disturbingly, subtly false, and in many documented instances have generated harmful responses (El Atillah, 2023; Ingram, 2023). In their study, Baker et al. found that ChatGPT included erroneous information in 36 % of documentation (Baker et al., 2022). Relatedly, owing to the content and supervised training techniques used to devise these models, biases will be incorporated into responses. This "algorithmic discrimination" risks perpetuating or worsening inequities in healthcare (Teno, 2023), with evidence that responses exhibit gender, race, and disability biases (Gross, 2023; King, 2022). It is worth emphasizing that physicians also exhibit unwanted prejudicial biases in clinical practice, and whether algorithmic biases are worse than what already exists in human-mediated care is unknown (FitzGerald and Hurst, 2017; Takeshita et al., 2020). Combined, there is currently a lack of concrete, quantified evidence about whether clinicians can improve their work outcomes by using LLMs.
Beyond concerns of accuracy, veracity, and bias in the use of LLM tools, patient privacy may also be risked (Marks and Haupt, 2023). Owing to their prima facie conversational fluency, many patients and clinicians may be tempted to input sensitive clinical information (Blease, 2023). Such chatbots are not yet regulated, and in June 2023 the American Psychiatric Association (APA) issued a statement cautioning that physicians should not be using them to undertake clinical work at this time (American Psychiatric Association, 2023). Notably, other professional organizations, such as the World Psychiatric Association, the European Psychiatric Association, and the Royal College of Psychiatrists, have not yet issued position statements or guidance related to these tools.
Yet these technologies are rapidly evolving. Knowing how the field of psychiatry currently understands the potential of these tools and seeks to use them in the future can help guide the development of LLMs and ensure they are implemented safely and in alignment with the field's needs. Therefore, the aim of this study was to gauge the views of psychiatrists about these tools.

Subjects
Participants in this convenience-sample online survey were recruited from members of the APA and affiliates who had previously registered for and completed an APA-delivered informational session called "AI in Psychiatry: What APA Members Need to Know" that took place on August 16, 2023. The APA is the largest professional network of psychiatrists in the U.S., with 38,000 members in the U.S. and internationally. Approximately 811 APA psychiatrists attended the course, and between October 10, 2023, and October 25, 2023, these participants were sent an email inviting them to participate in the survey. All invited participants were assured that their identities would not be disclosed to investigators, and participants gave informed consent before taking part. Ethical approval for the study was obtained from the APA (Protocol # APAPPPDHLLMS0923). The survey was designed to take three minutes to complete, and no recompense was offered to participants.

Procedures
The study team devised an original survey to explore psychiatrists' experiences and opinions about LLM-powered chatbots in their clinical practice. The survey was pre-tested to ensure face validity with four psychiatrists who offered think-aloud feedback, and was timed to take no more than 3 min to complete.
The survey was divided into three main sections (see Supplement 1: Survey) and opened with the statement, "The following questions ask for your experiences with using OpenAI's ChatGPT (or Google's Bard or Microsoft's Bing AI if you have used those instead)." The first part included a single question asking participants to select which (if any) of these LLM-powered chatbots they had used in their clinical practice. The second part asked participants to reflect on "how your practice will be affected or already is affected if you use ChatGPT/Bard/Bing AI". This section included four items on the effects of these tools on diagnostic accuracy, disparities in healthcare, documentation efficiencies, and the need for clinician training/support in understanding these tools. Employing 4-level Likert items, we included the following response options: "disagree," "somewhat disagree," "somewhat agree," and "agree." All closed-ended questions also included a fifth "don't know" option. In the third part, again employing these response options, the survey requested participants reflect on patients' use of these tools. This part opened with the statement, "Among my patients who use ChatGPT/Bard/Bing AI a majority will…" followed by four items: "better understanding their health," "worry more about privacy," "will consult these tools before seeing a doctor," and "will use these tools to better understand their medical records." All participants were then invited to respond to an additional, optional question requesting them to "add additional comments you might have about the use of ChatGPT/Bard/Bing AI in mental healthcare." The survey closed by requesting participant gender and age.
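For readers who wish to see the instrument's structure at a glance, the following sketch restates the three parts described above as a simple Python data structure. The item wording is paraphrased from the text, and the field names are ours, not part of the survey itself.

# Sketch of the survey's closed-ended structure (see Supplement 1).
# Item wording is paraphrased; the keys and layout are our assumptions.
LIKERT_OPTIONS = ["disagree", "somewhat disagree", "somewhat agree",
                  "agree", "don't know"]

SURVEY = {
    "part_1": {"question": "Which LLM-powered chatbots have you used "
                           "in your clinical practice?",
               "type": "multi-select"},
    "part_2": {"stem": "How your practice will be affected or already is "
                       "affected if you use ChatGPT/Bard/Bing AI",
               "items": ["diagnostic accuracy", "disparities in healthcare",
                         "documentation efficiency",
                         "need for clinician training/support"],
               "options": LIKERT_OPTIONS},
    "part_3": {"stem": "Among my patients who use ChatGPT/Bard/Bing AI "
                       "a majority will…",
               "items": ["better understand their health",
                         "worry more about privacy",
                         "consult these tools before seeing a doctor",
                         "use these tools to better understand their medical records"],
               "options": LIKERT_OPTIONS},
}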

Analysis
We used descriptive statistics to examine closed-ended questions on physicians' experiences and opinions about generative AI in psychiatry. These analyses were completed using Alchemer survey software and Excel. Responses to the qualitative question were subject to content analysis (Mayring, 2015). Due to limitations with the data set (short phrases or sentence fragments), full thematic analysis was not appropriate (Joffe and Yardley, 2004). We employed the following iterative process: comments were read by CB and JT to familiarize themselves with responses. Next, the two coders (CB and JT) undertook coding. Brief descriptive labels ("codes") were applied to comments, and multiple codes were applied if comments presented multiple meanings. Following this, CB and JT met to discuss these coding decisions, and revisions and refinements of codes were undertaken. Following this process, first-order codes ("categories") were grouped into second-order themes based on commonality of meaning. CB and JT convened to review and refine the final themes.
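As a hypothetical illustration of the descriptive statistics reported below, the following Python sketch computes the proportion of respondents who chose "somewhat agree" or "agree" on a 4-level Likert item. Whether "don't know" responses were retained in the denominator is our assumption for the example; the original analyses were run in Alchemer and Excel.

from collections import Counter

def pct_agree(responses):
    """Percent of responses that are "somewhat agree" or "agree".
    Assumes every response, including "don't know", stays in the denominator."""
    counts = Counter(responses)
    return 100 * (counts["somewhat agree"] + counts["agree"]) / len(responses)

# Toy data, not the study's data set.
sample = ["agree", "somewhat agree", "disagree", "don't know", "agree"]
print(f"{pct_agree(sample):.0f} % somewhat agreed/agreed")  # -> 60 %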

Opinions about effects on practice of generative AI
See Fig. 3.

Opinions about patients' use of generative AI
See Fig. 4.

Qualitative responses
Of the 138 participants, 65 (47 %) left additional comments (e.g., "Revolution underway"), which were typically brief (a phrase, or one or two sentences) (total word count of qualitative responses = 2412 words). As a result of the iterative coding process, four major themes were identified in relation to the impact of generative AI on mental healthcare: (1) Documentation efficiencies; (2) Benefits and harms; (3) The future of the profession; and (4) The patient-doctor relationship. Labels in parentheses refer to participant identification numbers, gender (M, F, or Prefer Not to Say), and whether the respondent has used LLMs to assist in answering clinical questions (Yes or No).
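Purely as an illustration of the coding step described in the Analysis section, the sketch below shows how first-order codes might be grouped into the four reported themes. The code labels are invented for the example; as in the study, a single comment could carry multiple codes and therefore map to multiple themes.

# Illustrative grouping of invented first-order codes into the four themes.
THEMES = {
    "Documentation efficiencies": {"admin burden", "note writing"},
    "Benefits and harms": {"patient safety", "hallucination", "bias", "privacy"},
    "The future of the profession": {"replacement", "co-pilot", "training"},
    "The patient-doctor relationship": {"patient preparation", "trust"},
}

def themes_for(comment_codes):
    """Return every theme whose code set overlaps the codes on a comment."""
    return [theme for theme, codes in THEMES.items() if codes & set(comment_codes)]

print(themes_for(["note writing", "privacy"]))
# -> ['Documentation efficiencies', 'Benefits and harms']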

Documentation efficiencies
The largest theme was the perceived impact of generative AI to "help with documentation and reducing admin burden" (#11, M, Yes). Multiple comments referred to enhanced documentation efficiencies, which were predicted to "relieve administrative burdens on overworked psychiatrists" (#114, M, No). Although responses were often unspecific, some comments embedded a sense of urgency, while acknowledging a need to uphold patient privacy; for example, "long overdue. Can't wait for EPIC to incorporate" (#167, M, Yes) and "Please bring on AI dictation tools now! HIPAA compliant ASAP" (#174, F, Yes). Other respondents signaled they were already using these tools in practice; for example, "My clinic is already using AI to fully complete clinic chart and consultation notes" (#112, F, Yes), and "I've only used it for what it's actually good at: to help write letters and to give me creative suggestions" (#175, F, Yes).
One participant suggested documentation could be wholly outsourced to AI: "I think the biggest advancement will be in improving documentation for physicians (e.g., voice to text, updating notes with relevant information without human input)" (#166, M, Yes). Others saw a role for humans working in collaboration with AI: "I am more interested in AI facilitating my medical records such as uploading automatically identifying data, prescriptions, symptoms depending on the conversation with the patient which AI could record and interpret for the clinician to edit, approval and signature" (#76, M, No). Still others were vaguer about whether psychiatrists could be disintermediated: "it does […] the note-writing better than I do" (#74, M, Yes).

Benefits and harms
Another emergent theme was the contrastive expectations about generative AI. Aside from documentation efficiencies, some participants were enthusiastic and perceived multiple benefits to clinical practice. Many of these comments were vague: for example, "Overall a positive thing" (#127, F, Yes), "The tools are useful to both patients and clinicians" (#143, F, Yes). Others perceived general benefits to patient care; for example, "Improved access to care, minimize health disparities. Improve quality of care" (#50, F, Yes); "I love the idea of AI technology improving patient care and safety" (#161, F, Yes); "I suggest that 'best practices' as well as 'standard of care' can easily be incorporated into AI" (#109, M, Yes).
However, others were less sanguine about the benefits, with some respondents anticipating harm from the use of these tools, which were described by one participant as a "BAD IDEA" (#80, F, No). Several respondents emphasized that "Putting patient care in the hands of AI is seriously risky" (#148, M, No) and "unethical" (#60, F, No). A few described the tendency of generative AI to make up false information ("They are often confidently wrong" (#82, Prefer Not to Say, No)), warning that these tools embed unwanted biases in care that could lead to clinical errors. Some linked these views to personal observations and the need for regulatory control. For example: "I definitely do NOT use ChatGPT to make clinical diagnoses or answer any clinical questions requiring factual answers, such as med interactions. Unfortunately, that's exactly what I've seen people do, even docs whom I know are quite computer savvy" (#175, F, Yes); "I am very concerned about reliability of these tools. Ppl [people] talk about 'hallucinations' or 'confabulations' but let's be real, AI has been shown to be deceitful or LIE if the system doesn't know the answer. I find this to be sociopathic. Also, AI systems are created by humans and will definitely replicate the same structural racism and biases/-isms that we all carry. Unless there are REAL safeguards, we are all in trouble" (#59, Prefer Not to Say, No).

Fig. 2. Use of generative AI to answer clinical questions.

A few participants commented on confidentiality concerns: "Privacy is doubtful given the corporate owners of these products" (#133, F, No).
Notably, many respondents expressed a mixture of opinions, anticipating both positive and negative effects of these tools on patient care. For example: "I think LLMs have a lot of potential, but they are not ready for prime time" (#59, M, Yes); "The potentially wonderful benefits e.g., in summarizing extensive mental health notes… could be lost if problems of hallucinations, privacy, etc. are not solved. I watch with interest" (#71, M, Yes).

The future of the profession
Another striking theme was perceptions about the impact of generative AI tools on the profession of psychiatry. A common anticipation was the belief that these tools "will revolutionize healthcare" (#181, M, Yes) and "make a huge change in our mental health services" (#124, F, No). Some comments expressed anxieties about the potential for AI to threaten the jobs of psychiatrists. For example, "Some AI enthusiasts are excited to mostly replace doctors. And therapists" (#133, F, No); "I'm terrified of being replaced by AI and being unable to pay my bills (student loans)" (#61, F, No).
In contrast, several respondents anticipated "AI co-pilots" (#30, M, No) with "man and machine" increasingly working together. For some, this arrangement was viewed optimistically; for example, "These tools have tremendous potential to augment our profession" (#111, M, Yes). However, others were more skeptical about the potential to enhance professional competence: "My concern is that AI will actually limit our own ability to problem solve for our patients. It may limit creative thought processing… Think of the Netflix TV system, it is easier (lazier) to go with the lists of top films/popular series presented to us on our screen, isn't it? AI may be helpful in some ways, but those physicians who primarily rely on it may be the types who follow 'recipes' strictly at the cost of innovation. It may result in us physicians operating in a more robotic fashion… Medicine will become more conveyor belt like and those stimulated to think in more complex, 'human' ways will be less attracted to medicine and even psychiatry" (#154, F, No). In anticipation of the increasing use of generative AI tools in clinical practice, some proposed the need for the profession to become better prepared; for example: "AI tools will be a meaningful part of [the] clinical practice of the near future. The sooner we negotiate our own attitude towards the AI, the better prepared we will find ourselves in this inevitable partnership. I would compare it to the position of those who say they don't take interest in politics. Just because you do not take an interest in politics [AI] doesn't mean politics [AI] won't take an interest in you" (#154, M, No). Relatedly, other comments expressed the imperative for greater professional guidance and education about these tools: "We need to understand these tools" (#70, F, No); "Would love to get training in it ASAP" (#69, M, No).

The patient-doctor relationship
A final, smaller emergent theme was the impact of generative AI on the patient-doctor relationship, with participants expressing mixed opinions. Some envisaged LLM-enabled chatbots as strengthening dialogue by allowing patients to come better prepared to appointments. For example, "I think it will potentially be helpful to get them to ask better questions" (#25, F, Yes). One respondent offered a mixed perspective: "I already have many adolescent patients using AI to understand their symptoms prior to seeing me. They often are more well-informed and have good questions when they come to our visits. I've had some patients use it to self-diagnose as well, which has been somewhat problematic" (#56, F, Yes). Other respondents suggested that these tools would engender negative interpersonal effects that "will further separate us socially" (#60, F, No). For example, "I feel like people may get upset by having their problems handed off to a machine as though they weren't worthy of a human's attention" (#40, M, No); "the client needs a special human interaction which the AI can't provide" (#171, M, No). One participant hinted at the potential for strained patient trust in psychiatrists with the incursion of chatbots into care: "The use of AI tools without disclaimer's or with misleading language (using personal pronouns like 'I' or empathy/emotionally charged wording) runs the risk of confusing pt's into anthropomorphizing the AI or eroding pt-doctor relationship" (#42, M, Yes).

Main findings
Since the launch of ChatGPT in November 2022, there has been much debate but scarce exploration of the views of physicians. This exploratory, mixed methods, online survey offers insights into the experiences and opinions of APA-affiliated psychiatrists. More than half of respondents reported using "AI tools (e.g., ChatGPT/Bard/Bing AI)" "to assist in answering clinical questions." Nearly 70 % somewhat agreed or agreed that documentation will be/is more efficient, and almost 90 % somewhat agreed or agreed that clinicians need more support/training in understanding these tools.
Participants expressed mixed opinions about the use and effects of these tools on mental health patients. Three in four somewhat agreed or agreed that "the majority of their patients" would consult tools such as ChatGPT/Bard/Bing AI before first seeing a doctor; nearly two in three respondents somewhat agreed/agreed that patients would "use these tools to better understand their medical records." Among patients who used these tools, more than half of our respondents somewhat agreed or agreed that they would better understand their health after using these tools, with a similar proportion believing patients would worry more about privacy as a result of AI tool usage. Results from the qualitative section of the survey supported these findings and added nuance. Again, a dominant perspective was that LLM-powered chatbots would reduce documentation burdens and improve administrative efficiencies, though participants were conflicted about whether various documentation tasks could be completely outsourced to AI. With respect to benefits and harms, echoing disparate opinions in the closed-ended questions about whether these tools would improve diagnostic accuracy or decrease disparities in healthcare, respondents offered mixed opinions. Some expressed optimism that these tools could strengthen patient safety, access, and the quality of care, while others pointed to the potential for harm, urging that current models fabricate information, embed harmful biases, and risk patient privacy.
Psychiatrists in our survey also reflected on the impact of generative AI on the future of their job. Many believed these tools would revolutionize the delivery of mental healthcare, but participants were divided about what this might mean. While some expressed fears that these tools might outright replace clinicians, others predicted psychiatrists would increasingly work closely with AI as a co-pilot. In anticipation of this, participants pointed to the need for the profession to be better prepared for this change and called for greater training on generative AI. Finally, respondents expressed contrasting views about the impact on the patient-doctor relationship. Some foresaw benefits, anticipating that chatbots would help patients to be better prepared to engage in dialogue with doctors; others predicted that AI tools could undermine human connections with patients.
The results of this survey are in line with ongoing efforts by electronic medical record companies to integrate generative AI chatbots that comply with the privacy standards of the 1996 Health Insurance Portability and Accountability Act (HIPAA) (Adams, 2023), and with an Azure HIPAA-compliant GPT-4 service already available (Boyd, 2023). Notably, however, statements issued earlier this year by the American Psychiatric Association (American Psychiatric Association, 2023) and the American Medical Association caution against entering any patient data into these systems until there is further regulatory clarity and guarantees of patient privacy (AMA, 2023). Companies like Google are also working on healthcare-specific LLMs, such as Med-PaLM 2, that have training and tuning customized to the needs of clinical users. However, these programs have yet to be rigorously evaluated for bias and accuracy.
While our findings indicate a strong desire on the part of psychiatrists to take advantage of these tools for administrative tasks, clinicians may require stronger guidance on patient privacy risks in relation to HIPAA compliance (Blease, 2023; Marks and Haupt, 2023), and on best practices. In other regions, advances are under way to strengthen privacy. In December 2023, the European Union (EU) reached the world's first agreement regulating ChatGPT and other LLMs with its "AI Act," which declared these technologies a "medium risk" to consumers and required that companies be transparent about how they work, allowing consumers to decide whether to use them (European Council of the European Union, 2023). The European Parliament will vote on the AI Act proposals in early 2024, but the legislation will not take effect until at least 2025. Moreover, how these rules will intersect in a practical way with healthcare contexts and patients is still not fully understood.
The issue of data privacy is paramount, especially in the context of psychiatric care, where personal information is highly sensitive. We suggest that in the era of LLMs in healthcare, fundamental issues pertaining to privacy will need to be considered (Mulligan et al., 2016), including whether individual patient privacy can be considered a relic of the past or whether technological advances can effectively guard sensitive data without compromising the confidentiality of individuals. Such ethical dilemmas will need to be resolved if the aggregate benefits of these models in healthcare are to be optimized and balanced with privacy preservation. Relatedly, the adequacy of informed consent provisions in relation to LLM tools demands ongoing scrutiny. In the EU, where regulations on data harvesting are stricter than in the US, consumers who are presented with the choice to consent to data collection through applications must also navigate and comprehend complex terms and conditions. Put another way, the standard for consent is unfeasibly onerous for patients and consumers, a concern that persists with LLM tools and health privacy (Blease, 2023).
Further research is needed to establish whether generative AI embedded into electronic health records serves to increase efficiencies or creates new versions of the technology fatigue and techno-burnout that already add to physician burdens. Notably, psychiatrists' opinions on the usefulness of generative AI were wide-ranging. Although many were aware of the limitations associated with LLMs, including the problems of hallucinations and algorithmic discrimination, some respondents appeared to overestimate the readiness of these tools to outright replace clinicians in writing documentation and to assist with clinical tasks. Again, we suggest this finding indicates the need for more targeted training about the evolving evidence base and implementation challenges associated with these technologies.

Strengths and limitations
Administered a year after OpenAI launched ChatGPT, this is the first study to explore psychiatrists' experiences and opinions about generative AI in mental healthcare. However, the survey has several limitations. The convenience sample, restriction to participants enrolled in the "AI in Psychiatry: What APA Members Need to Know" course, and the low response rate of 18 % likely influenced results; in addition, the decision to complete the survey may have been influenced by responder biases such as prior enthusiasm or skepticism about the topic, which might have affected findings. Therefore, the use of non-probability sampling limits the generalizability of our findings. In addition, some survey items did not differentiate between participants' experiences and the anticipated effects of these tools, and may be challenged on grounds of vagueness. For example, items on the effects of generative AI on clinical practice embed interpretative ambiguity with respect to whether respondents believed LLMs will affect clinical practice currently or in the near or long-term future. Future research, including qualitative studies, could usefully aim to disambiguate these questions. Furthermore, although the qualitative aspect of the study supported and elaborated on the quantitative questions, comments were often brief, and because of the restriction to a time-limited online survey, it was not possible to probe participants' responses further.
We recommend that future surveys strive for stratified sampling techniques that permit correlative analyses of participants' experiences and opinions according to gender, age, and workplace environment. Further in-depth qualitative work could also obtain a richer understanding of psychiatrists' views, including how frequently they have used these tools in clinical practice. In addition, much remains to be understood about whether LLMs will improve workplace efficiencies or increase burdens, especially in mental health care, where clinicians must be cautious about how to write documentation that is respectful, understandable, and accurate (Blease et al., 2020b). We suggest that Delphi polls recruiting a variety of expert stakeholders in mental healthcare, including informaticians, psychiatrists, and patients, could be used to determine how best to facilitate and implement the workplace potential associated with LLMs while also ensuring the ethical, safe adoption of these tools.
Finally, psychiatrists in our study believed that some patients would use ChatGPT and other LLM-powered chatbots to understand their health, but respondents offered divergent views about whether chatbots would strengthen or strain clinical relationships. To probe this concern further, future research should also examine the experiences and perspectives of mental health service users with these tools, including their privacy concerns, their perceptions of the benefits of LLMs in mental healthcare, and their views about the potential for stigmatizing language or images embedded in LLM tools.

Conclusions
Our survey of psychiatrists revealed a variety of opinions on the benefits and harms of these tools and on their scope to change the profession of psychiatry. The foremost interest was in the potential of these tools to assist psychiatrists with documentation. With a rapidly changing regulatory landscape on AI (Biden, 2023), and in light of continued media hype and hope surrounding chatbots, we conclude that physicians need new support to advance their knowledge of, and address their concerns about, these tools. We also urge further research to understand how best to implement these tools, where feasible, for the benefit of patient care and clinician workflow.

Fig. 3. Opinions about effects on practice of generative AI.