Conversational AI and Vaccine Communication: Systematic Review of the Evidence

Background Since the mid-2010s, use of conversational artificial intelligence (AI; chatbots) in health care has expanded significantly, especially in the context of increased burdens on health systems and restrictions on in-person consultations with health care providers during the COVID-19 pandemic. One emerging use for conversational AI is to capture evolving questions and communicate information about vaccines and vaccination. Objective The objective of this systematic review was to examine documented uses and evidence on the effectiveness of conversational AI for vaccine communication. Methods This systematic review was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. PubMed, Web of Science, PsycINFO, MEDLINE, Scopus, CINAHL Complete, Cochrane Library, Embase, Epistemonikos, Global Health, Global Index Medicus, Academic Search Complete, and the University of London library database were searched for papers on the use of conversational AI for vaccine communication. The inclusion criteria were studies that included (1) documented instances of conversational AI being used for the purpose of vaccine communication and (2) evaluation data on the impact and effectiveness of the intervention. Results After duplicates were removed, the review identified 496 unique records, which were then screened by title and abstract, of which 38 were identified for full-text review. Eight fit the inclusion criteria and were assessed and summarized in the findings of this review. Overall, vaccine chatbots deployed to date have been relatively simple in their design and have mainly been used to provide factual information to users in response to their questions about vaccines. Additionally, chatbots have been used for vaccination scheduling, appointment reminders, debunking misinformation, and, in some cases, for vaccine counseling and persuasion.
Available evidence suggests that chatbots can have a positive effect on vaccine attitudes; however, studies were typically exploratory in nature, and some lacked a control group or had very small sample sizes. Conclusions The review found evidence of potential benefits from conversational AI for vaccine communication. Factors that may contribute to the effectiveness of vaccine chatbots include their ability to provide credible and personalized information in real time, the familiarity and accessibility of the chatbot platform, and the extent to which interactions with the chatbot feel “natural” to users. However, evaluations have focused on the short-term, direct effects of chatbots on their users. The potential longer-term and societal impacts of conversational AI have yet to be analyzed. In addition, existing studies do not adequately address how vaccination ethics apply in the field of conversational AI. In a context where further digitalization of vaccine communication can be anticipated, additional high-quality research will be required across all these areas.



Introduction
During the early 2020s, use of conversational artificial intelligence (AI; chatbots) in healthcare increased significantly, especially in the context of increased burdens on health systems and restrictions on in-person consultations with healthcare providers during the COVID-19 pandemic [1,2]. In response to these stresses on health systems, there has been growing interest in how conversational AI, and digital communication tools more generally, can improve health-related knowledge, attitudes, and behaviors. Chatbots were already being used in a health context prior to COVID-19, primarily to assist with treatment and monitoring, patient education, health system support, behavior change, and diagnosis [3,4]. Use cases during the COVID-19 pandemic included (but were not limited to) triaging users based on their COVID-19 symptoms and risk factors, gathering data on disease symptoms and prevalence, disseminating information to the public, screening recovered patients for activities such as blood plasma donation, and aiding coordination and communication between healthcare workers and health organizations [1].
The association between chatbots and health communication dates back to the mid-1960s, when Joseph Weizenbaum developed the first chatbot, ELIZA, which was used to simulate a consultation with a Rogerian psychotherapist. Early chatbots like ELIZA were rules-based, meaning they used a series of pre-programmed rules to match user input to pre-defined outputs. More recent chatbots, such as Apple's Siri or Amazon's Alexa, use Natural Language Processing (NLP) to parse user input and generate human-like responses. Relying on Machine Learning (ML), these chatbots do not require pre-defined answers for all possible user inputs, and they are capable of "learning" from user input rather than being limited to the knowledge base they were programmed with. In addition to the broad distinction between rules-based and natural language bots, chatbots differ along a number of other dimensions, including the knowledge domain in which they operate (e.g. healthcare, retail, or banking), the type of service they provide (e.g. access to information, assisting with a task, or offering a service), the type of interface they employ (e.g. voice or text), the delivery channel (e.g. website, smartphone app, social media channel, SMS), and the extent to which they require human supervision.
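The rules-based approach described above can be made concrete with a minimal sketch. The keyword rules and canned responses below are purely illustrative and are not drawn from any deployed system:

```python
# Minimal sketch of a rules-based chatbot: user input is matched against
# pre-programmed keyword rules and mapped to pre-defined outputs.
# The rules and responses here are hypothetical, for illustration only.

RULES = {
    "side effect": "Common side effects include soreness at the injection site and mild fever.",
    "appointment": "You can book a vaccination appointment through your local clinic.",
    "safety": "Vaccines are tested in clinical trials before approval.",
}

FALLBACK = "I'm sorry, I don't have an answer for that. Please consult a healthcare provider."

def respond(user_input: str) -> str:
    """Return the first pre-defined response whose keyword appears in the input."""
    text = user_input.lower()
    for keyword, response in RULES.items():
        if keyword in text:
            return response
    return FALLBACK
```

Because every anticipated input must be covered by a hand-written rule, such a system is confined to its pre-programmed knowledge base; this is the limitation that NLP- and ML-based chatbots address by parsing free-form input and learning from it.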
One emerging use case for conversational AI within the health field is to communicate information about vaccines and vaccination with the aim of building vaccine confidence [2]. In theory, a well-designed chatbot can disseminate accurate vaccine information in real time, assist users in finding available vaccination appointments, book appointments and issue appointment reminders, and address user concerns and questions about vaccines. The ability to provide timely and accurate information to the public at scale is particularly important in the context of what has come to be called an 'infodemic', characterized by the World Health Organization (WHO) as "too much information including false or misleading information in digital and physical environments during a disease outbreak" [5]. Information ecosystem disorder is one of many threats to vaccine confidence and uptake, resulting in a need for practical solutions that assist people in a context where information is abundant but not necessarily reliable. Vaccine chatbots are a potentially beneficial tool for this purpose, assuming they can provide real-time information from reliable and trustworthy sources on commonly used communication platforms.
However, given the relatively recent application of chatbots in the context of vaccine communication, the evidence base around their potential use cases and effectiveness in this field is still quite limited. In order to better understand the current state of knowledge in this area and identify ways forward, this systematic review aimed to 1) understand the current evidence base around the use of chatbots for vaccine communication, and 2) identify key gaps in the evidence in order to suggest directions for future research. In the following sections, we discuss the methodology for this review; key findings on vaccine chatbot design, usage, and effectiveness; gaps and limitations in the available evidence; and recommendations for future research.

Search strategy and database search
This methodology aims to identify and document recent vaccine-related chatbots and their impact on vaccine attitudes and behaviors. A keyword search strategy was applied across 13 databases (PubMed, Web of Science, PsycINFO, MEDLINE, Scopus, CINAHL Complete, Cochrane Library, Embase, Epistemonikos, Global Health, Global Index Medicus, Academic Search Complete, and a University of London library search). The search string was "(vaccin* OR (immuniz* OR immunis*)) AND (chatbot OR "chat bot" OR "chat-bot" OR "conversational AI" OR "conversational artificial intelligence" OR "conversational agent" OR "conversational interface")". Relevant articles were identified and exported into an Excel spreadsheet.
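For illustration, the Boolean and wildcard logic of this search string can be approximated as a simple text filter. The regular expressions below are a hypothetical rendering of the query, not the actual syntax accepted by any of the databases searched:

```python
import re

# Approximate the Boolean search string:
#   (vaccin* OR immuniz* OR immunis*) AND
#   (chatbot OR "chat bot" OR "chat-bot" OR "conversational AI" OR
#    "conversational artificial intelligence" OR "conversational agent" OR
#    "conversational interface")
# \w* stands in for the databases' trailing-wildcard operator (*).
VACCINE_TERMS = re.compile(r"\b(vaccin\w*|immuniz\w*|immunis\w*)", re.IGNORECASE)
CHATBOT_TERMS = re.compile(
    r"\b(chatbot|chat[ -]bot|conversational (AI|artificial intelligence|agent|interface))",
    re.IGNORECASE,
)

def matches_search(text: str) -> bool:
    """A record matches only if it contains a vaccine term AND a chatbot term."""
    return bool(VACCINE_TERMS.search(text)) and bool(CHATBOT_TERMS.search(text))
```

In practice, each database applies its own query syntax and indexing, so the searches were run through the databases' native interfaces rather than a post hoc filter of this kind.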

Screening and selection of articles
Two researchers (AP and EP) independently screened articles included in the Excel spreadsheet by title and abstract, and then by full text, according to the inclusion and exclusion criteria shown in Table 1. We decided to include three studies [6][7][8] that used a 'Wizard of Oz' protocol, in which participants interact with what they believe to be an autonomous AI system but which is actually an interface controlled by a concealed human operator (the 'wizard'). 'Wizard of Oz' experiments are often used in the early phases of system design and testing to address design and usability issues before time and resources are invested in software development. While the simulated conversational agents used in 'Wizard of Oz' experiments are not themselves autonomous AI systems, we nonetheless deemed them relevant because they yield data on how users perceive and interact with vaccine communications delivered by (what they perceive to be) autonomous AI systems.

Data extraction and analysis
We recorded the following data for the various studies: authors, publication year, title, citation, abstract, location of study, vaccine(s) studied, timeframe, aim, hypotheses, research design and key findings. We then analyzed the interventions and outcomes thematically.

Results
This systematic review identified 971 records across 13 databases published before August 2022. After 475 duplicates were excluded, the remaining 496 records were screened by title and abstract using the criteria listed above. Of these, 426 records were screened out by title, leaving 70 to be reviewed by abstract. During the abstract screening, an additional 32 articles were excluded, leaving 38 for full-text review. Of these, 30 were excluded because they did not discuss vaccine-related chatbots and/or did not evaluate the chatbot's impact on either attitudes or behaviors.
At the time of our search, other vaccine-related chatbots were in development but did not meet our inclusion criteria, either because they were still in the design phase or because they were evaluated for feasibility or message content rather than impact on attitudes or behaviors. Given that this is a new and rapidly emerging research area, we expect the relevant literature to grow quickly. At the time of this search, however, eight articles fit the inclusion criteria and were assessed and summarized in the findings of this review. Seven additional articles from the reference lists of the eight screened-in articles were also assessed as potentially relevant, but none met the inclusion criteria. Thus, eight articles are included in this review. All eight are peer-reviewed articles and none are gray literature, purely because none of the gray literature items identified in the search contained any evaluation data. Of the included publications, one was a global systematic review [1]; three studies were conducted in the United States [6][7][8], one each in France [9], South Korea [10], and Japan [11], and one study location was not explicitly stated but was inferred to be the UK [12]. Three of the studies investigated COVID-19 vaccines [9,11,12], three evaluated HPV vaccines [6][7][8], and one examined childhood immunizations (as recommended by the Republic of South Korea) [10]. However, there were only six unique chatbots because a) one of the eight included articles was a systematic literature review, and b) of the seven original research articles included, two discussed the same chatbot at different points in its development cycle.

Discussion
The use of conversational AI in healthcare generally, and for vaccine communication specifically, is still an emerging field, and the state of the literature reflects this. In this section we discuss the 1) design and uses of vaccine chatbots to date, 2) evidence on their effectiveness, and 3) key limitations and knowledge gaps.

Chatbot design and usage
Vaccine chatbots deployed to date have been relatively simple in terms of their design (see Table 2). Of the six unique chatbots identified by this review, two were NLP-based [10,12], a third was a hybrid with some NLP functionality integrated within a predominantly rules-based system [11], one was purely rules-based [9], and the remaining two were simulated agents [6][7][8]. Of the three chatbots with some NLP capability, only one could generate natural language responses [10], while the other two could process natural language inputs but not generate natural language responses [11,12]. Only five studies (covering four unique chatbots) specified the platforms and programming languages used to develop their chatbots: Apple's Software Development Kit (SDK) [6,7], Google Dialogflow [10], tawk.to [8], and Python [12]. In terms of delivery platforms, two chatbots were provided via instant messaging services [10,11], a further three were hosted on custom-built web pages [8,9,12], and the sixth was delivered through an iPad app [6,7]. In most cases, the knowledge base for the chatbots was constructed from governmental websites and scientific literature, typically with review and verification of the answers by medical experts. Chatbot development was not generally informed by systematic analysis of local information environments prior to deployment, for example by using social media and web search data to identify information-seeking behaviors or prevalent misinformation narratives among target populations. In addition, it was not always clear how far chatbot design took into account insights from behavioral and communication theories. The main use case for vaccine chatbots so far has been information dissemination. All chatbots in the studies included in this review provided basic factual information to their users, such as data on vaccine safety and effectiveness, and common side effects.
Other use cases included vaccination scheduling, appointment reminders, and infodemic management [10]. Some chatbots were also used for vaccine counseling or persuasion; that is, they proactively sought to persuade users to vaccinate themselves or their children, rather than simply providing factual information and leaving users to make their own choice. In one case, the chatbot was programmed with a strong normative stance in favor of COVID-19 vaccination for its (adult) users [12]. Using NLP, this chatbot automatically identified the user's concern(s) about COVID-19 vaccination from their input and then provided counterarguments to persuade the user to get vaccinated. Other forms of persuasion included a protocol for pursuing a recommendation in favor of HPV vaccination for a child in the event of (parental) user resistance or disengagement [6] and, in another case, a financial incentive for parents to get their children vaccinated in the form of a drinks coupon [10].

Effectiveness of vaccine chatbots
Vaccine chatbots have not always been subject to robust evaluation. In addition to the eight publications included in this review, a further three journal articles and four items of gray literature identified through our literature search were excluded because there was no documented attempt to evaluate the chatbots described. Of the six unique chatbots that did meet the criteria, one had been evaluated using a randomized controlled trial [9], three through other experimental or quasi-experimental methodologies [8,10,12], one through a cross-sectional survey [11], and one using a pre- and post-usage survey [6,7]. However, in many cases the sample sizes were very small (range 18-10,192; median 142). In all cases, evaluation was limited to the short-term, direct effects of chatbot usage on users' self-reported vaccine attitudes and behaviors, typically over a period of days or at most a few weeks.
Notwithstanding these limitations, all the studies that sought to measure the influence of chatbots on users' vaccine attitudes and behavioral intent found evidence of positive effects. None identified any 'backfire effects' (where some participants become more vaccine hesitant after the intervention), which have been reported in some previous studies [13][14][15]; however, one study did find some evidence that the relative benefits of chatbot use compared to non-use may decline over time [9]. The results of the studies included in this review are not strictly comparable due to the use of slightly different attitudinal and behavioral metrics between studies, and different operationalizations of these metrics within evaluation questionnaires. To enhance the comparability of future studies, consideration should be given to using standardized survey instruments in vaccine chatbot evaluation, such as the Chatbot Usability Questionnaire (CUQ) [16], the Speech User Interface Service Quality (SUISQ) survey [17], and the Vaccine Confidence Index (VCI) [18].
Several factors were identified in the studies we examined as having a positive influence on chatbot usability and effectiveness. Evidence suggests that providing credible, personalized information in real time through a familiar and accessible platform is key to chatbot success [10]. In addition, making chatbot interactions feel more "natural" by limiting the length of text responses, incorporating images and videos, and eliminating repetition can improve user experience and engagement [6]. There is also some evidence that anthropomorphic cues, such as the gender of the chatbot persona, can affect how users perceive and engage with chatbots [8]. Conversely, excessively lengthy or repetitious text-based responses, obvious gaps in the knowledge base, and a robotic or inhuman "feel" can all weigh negatively on chatbot user perceptions and effectiveness [6].

Gaps and limitations
Our review identified a number of gaps and limitations in the current literature on conversational AI and vaccine communication. Firstly, the range of vaccines covered and the range of study locations are both very limited, and this could potentially be a source of systemic bias in the evidence base on chatbot effectiveness. We found only one study [10] focusing on vaccines other than COVID-19 and HPV and no studies at all in the global South, which may reflect barriers to chatbot development and deployment in resource-constrained settings. All of the studies we examined focused on individual chatbots in single study locations. There were no comparative studies that assessed how the effectiveness of chatbots could differ depending on design features and delivery platforms, or between different demographic groups or country locations. In particular, the focus on COVID-19 vaccines as a paradigmatic case study for chatbot evaluation could skew the evidence base for effectiveness of vaccine chatbots more generally. In theory, chatbots should be most effective at influencing users' attitudes towards topics where they have little knowledge and few pre-formed opinions, which would not be the case for many users in relation to COVID-19 vaccines [9].
Secondly, because chatbot evaluation was largely limited to the short-term, direct effects of chatbot usage on users themselves, we know relatively little about indirect and system-wide effects of a shift towards conversational AI for vaccine communication in the longer term. While respondents in one study indicated a desire to share information they had received through the chatbot, none of the studies we examined tried to measure the indirect effects that chatbots might have on non-users via information sharing. Moreover, the potential longer-term impacts of conversational AI have yet to be analyzed for issues such as information literacy and public trust in health interventions. Some experts have expressed concerns that conversational AI, insofar as it delivers single answers to complex questions or conceals disagreement between information sources, may be less effective at promoting information literacy than more traditional information retrieval systems such as search engines [19]. This is an important question because, in the public health sphere, information literacy is widely viewed as an integral component of long-term strategies for building resilience against misinformation and future infodemics [20]. Similarly, the potential effects of conversational AI on public trust in healthcare providers and systems are also unclear, but will likely be influenced by public perceptions of chatbots' usability, reliability and any 'gatekeeping' role that chatbots are perceived to have in relation to healthcare access [21].
Thirdly, because this is an emerging field of study and many vaccine chatbots are still at the proof-of-concept phase, evaluation has tended to focus on the effectiveness of chatbots rather than their cost-effectiveness. None of the studies we examined provided any information about the costs associated with developing and maintaining the chatbots. Consequently, while there is some evidence that chatbots are more effective at improving vaccine attitudes than the same information provided through static text [9,10], the current literature provides no way of assessing whether the marginal benefit of a chatbot outweighs the additional time and resource costs compared to developing a static webpage. For the same reason it is also unclear how far chatbots could be a scalable or sustainable solution to various vaccine communication challenges in the longer term.
Finally, like the use of AI in healthcare more generally, vaccine chatbots raise a number of challenges from an ethical perspective that are not adequately addressed in the current literature [21,22]. For instance, we are already seeing the development of chatbots that go beyond simply providing users with accurate and up-to-date vaccine information (on the assumption that this will indirectly influence their vaccine willingness in a positive direction) and instead proactively seek to persuade their users to get a vaccine for themselves or their children. However, for data protection and privacy reasons, chatbots do not typically gather detailed "knowledge" of the individual user's medical history, religious and cultural beliefs, or the many other personal factors that may be relevant to their vaccine decision-making and would be needed to make a prudent recommendation. In any case, experts have raised doubts about whether conversational AI is, or ever will be, technologically mature enough to replace health professional assessments [21].

Conclusions
Available evidence suggests that conversational AI, properly designed and implemented, can potentially be an effective means of vaccine communication that can complement more traditional channels of health communication, such as consultations with healthcare providers. While the evidence base on the impact of different chatbot design features remains quite limited, the data in the studies we reviewed do suggest some basic principles that could help maximize the effectiveness of future vaccine chatbots. Specifically, future chatbots should aim to provide reliable, personalized information in real time through communication platforms that are familiar and accessible to target audiences. So far as possible, chatbot interactions should be designed to emulate the "natural" ebb and flow of human conversation, limit the length of text responses and incorporate different media such as images and videos. In addition, chatbots focused on childhood immunization need to have the technical capability to tailor the information they provide depending on the child's age [10].
To conclude, we offer four specific recommendations for future research, to build the evidence base around conversational AI for vaccine communication and to ensure that no unintended harms result from its use.
In the first place, there is a need for further high-quality research on the effectiveness of conversational AI for vaccine communication. There is a particular need for comparative studies that test how chatbot effectiveness may vary depending on design and implementation (e.g. anthropomorphic cues, voice or text interfaces), communication context (e.g. population-wide or community-specific vaccination campaigns), and across different demographic groups and country locations. Researchers should aim to recruit larger, more representative samples and include control groups. Because studies of this nature are costly, consideration should also be given to enhancing the comparability of studies conducted by research teams working independently of one another, through the use of standardized indices of chatbot usability and vaccine attitudes within evaluation questionnaires.
Secondly, there is a need to evaluate the longer-term and indirect effects of conversational AI as well as the short-term, direct effects on chatbot users. Since one study found that the relative benefits of chatbot use compared to non-use declined over time [9], which the authors speculate could be due to non-users receiving pro-vaccination messaging from other sources during the study period, there would be value in additional longitudinal studies incorporating follow-up surveys of chatbot users and control groups over longer time periods. Where possible, longitudinal surveys should also aim to assess trends in information sharing habits, information literacy and trust in healthcare among chatbot users and non-users over time.
Together, these data would help to build the evidence base around longer-term and indirect effects of conversational AI in this field.
Thirdly, more evidence and transparency around the costs of chatbot development and maintenance are needed, as evaluations currently focus on the communicative benefits of vaccine chatbots without addressing the cost side of the equation. As vaccine communication is still a relatively new application for conversational AI, and many chatbots are still at the proof-of-concept stage, it may be premature to expect detailed economic appraisals. However, if future studies could include at least some basic data on the time and resource costs associated with chatbots, this would begin to build an evidence base for the marginal cost-effectiveness of chatbots compared to other forms of vaccine communication, such as web-based FAQs, social media campaigns, webinars, or in-person consultations with healthcare providers.
Finally, greater consideration needs to be given to how vaccination ethics apply in the field of conversational AI. Future research should directly address the question of what may be appropriate or inappropriate tasks for vaccine chatbots to perform, based on analysis of the technical capabilities and limitations of current conversational AI systems. One interesting avenue of research could be around the technical feasibility and ethical desirability of incorporating relevant ethics frameworks and principles directly into a chatbot's knowledge base. For the foreseeable future, however, there will be a continuing need for the human designers and researchers of vaccine chatbots to exercise their own ethically-informed judgment about prudent and imprudent uses of conversational AI technology.