A mixed-methods evaluation of ChatGPT’s real-life implementation in Undergraduate Dental Education

Background: The recently introduced artificial intelligence tool ChatGPT appears to offer a range of benefits in academic education, while also raising concerns. The relevant literature revolves around issues of plagiarism and academic dishonesty, as well as pedagogy and educational affordances, yet to our knowledge no real-life implementation of ChatGPT in the educational process has been reported so far. Objective: The aim of this mixed-methods study was to evaluate ChatGPT’s implementation in the educational process, both quantitatively and qualitatively. Methods: In March 2023, seventy-seven 2nd-year dental students of the European University Cyprus were divided into two groups and asked to compose a learning assignment on ‘Radiation biology and Radiation protection in the dental office’, working collaboratively in small sub-groups, as part of the educational semester program of the Dentomaxillofacial Radiology module. Designing the research process was challenging for the authors, as this was an early attempt to actually implement ChatGPT in the teaching-learning process, and potential challenges had to be identified and resolved in advance so that the results would not be compromised. One group searched the


the language model, recognized its opportunities and limitations and used it efficiently.


Introduction

Background
The emergence of ChatGPT (OpenAI Inc, San Francisco, California, USA) in November 2022 represents the third significant technological breakthrough in information technology impacting education, following the introduction of Web 2.0 over a decade ago [1] and the rapid, widespread implementation of e-learning during the COVID-19 pandemic [2]. ChatGPT is an artificial intelligence (AI) tool that offers benefits and opportunities in higher education, including increased student engagement, collaboration, personalized feedback, and accessibility. However, it relies on a limited database (with a restricted ability to answer medical questions and the potential for inaccurate and/or biased responses), and there are also concerns regarding legal and ethical implications, plagiarism, and academic integrity [3][4][5].
Research on artificial intelligence and its implementation in academic education is a prominent subject; a Google Scholar search for "artificial intelligence and dental education" yielded 100,000 results, and a search for "ChatGPT and higher education" approximately 18,000 results (on 9 June 2023). AI technology is evolving to unprecedented levels, transforming professions, revolutionizing workflows, and reshaping human-machine interactions. ChatGPT, the most recent milestone in natural language processing (NLP) AI models, has enabled advanced conversational capabilities and expanded the boundaries of AI-powered communication. Interest in ChatGPT applications encompasses both clinical practice [6][7] and higher education [3,[8][9][10][11], with promising results.

Relevant prior research
Within the higher education landscape, it has been suggested that dental curricula at universities need to be updated in response to the AI paradigm shift [9,[12][13]. This involves defining a fundamental dental curriculum for both undergraduate and postgraduate levels and establishing learning outcomes related to dental AI [8]. Cotton et al [3] and Halaweh [14] proposed strategies to ensure the ethical and responsible use of AI tools in higher education. Fergus et al [10] evaluated academic answers generated using ChatGPT, and Bearman et al [15], in their review on AI in higher education, discussed the shifting dynamics of authority and the relationships among teachers, students, institutions, and technologies. Gimpel et al [16], in their extensive discussion paper, proposed guidelines and recommendations for students and lecturers and urged universities to engage in a multi-stakeholder dialogue to implement an efficient and responsible use of generative AI models in higher education. Roganovic et al [17] performed a cross-sectional online survey among experienced dentists and final-year undergraduate students from the School of Dental Medicine, University of Belgrade, Serbia, to investigate their current perspectives and readiness to accept AI into practice. Respondents, especially final-year students, showed a lack of knowledge regarding AI use in medicine and dentistry (only 7.9% of them were familiar with AI use) and were skeptical (only 34% believed that AI should be used in dental practice); the underlying reasons were fear of being replaced by AI and a lack of regulatory policies, since students and, to a lesser degree, dentists were concerned that using AI could legally complicate clinical practice [17]. Chan et al [11] reported different results in exploring students' perceptions of generative AI and ChatGPT in teaching and learning through an online questionnaire: the study revealed a generally positive attitude towards generative AI, with students demonstrating a good understanding of this technology, its benefits, and its limitations, despite its novel public appearance. Students recognized the potential for personalized feedback and learning support, brainstorming, writing assistance, and research capabilities, and stated they would integrate technologies like ChatGPT into their studies and future careers, but they were also concerned about becoming over-reliant on them. They moreover expressed concerns about data accuracy, privacy, ethical issues, and the impact on personal development [11]. Students' perceptions of the learning environment and the teaching strategies have a significant impact on their approach to learning and the learning outcomes (positive perceptions lead to a deep approach to learning), and are thus of pedagogical interest to educators and institutions [11,18]. Nazari et al [19] conducted a randomized controlled trial to examine the efficacy of an AI-powered writing tool (Grammarly) for postgraduate students and concluded that students in the intervention group demonstrated significant improvement in engagement (behavioral, emotional, and cognitive), self-efficacy, and academic emotions (positive and negative), domains that address learning behavior, lead to self-development, and underpin authentic pedagogy.

Aims of the study
Despite the large number of publications related to AI and large language models (LLMs), the majority are discussion papers, viewpoint articles, and position pieces [3,13,16,[20][21], with few exploratory, cross-sectional, or questionnaire-based studies [11,17,19]. To our knowledge, no experimental studies have been reported in which ChatGPT was implemented by students within a real-life teaching process and the outcomes were comprehensively evaluated.
Therefore, this study aimed to implement ChatGPT within the learning process and to evaluate the outcomes both quantitatively and qualitatively (a mixed-methods research study).

Study design: challenges
The study was conceptualized, organized, and refined in February 2023, and carried out in March 2023. Of note, ChatGPT appeared publicly on 30 November 2022; in March 2023 ChatGPT-3.5 was freely available (and was used by most students), whereas ChatGPT-4 had just emerged (a few students used it). The study was not a stand-alone research endeavor; instead, it constituted part of the students' educational activities, embedded within the semester's educational program. As this was the first attempt to implement ChatGPT in the educational process, with no existing research studies in the literature to refer to, and given the limited knowledge of ChatGPT's properties and limitations at the time, the authors encountered various challenges while organizing the research design. To anticipate potential issues that could affect student learning or compromise the study's outcomes, they conducted a systematic, forward-looking analysis of the research process, considering each step and taking proactive measures to mitigate any challenges or obstacles that might arise. For example, the authors realized that once the ChatGPT output was generated, students could not critically evaluate it, nor would they know whether it was scientifically correct or incorrect, comprehensive or incomplete, because they would have no exemplar scientific text to compare it with. The reason was that the subject of the assignment had not been taught before; it was a new subject that students had to discover and learn through engaging with the project. This would, in turn, affect the achievement of the learning objectives. This challenge, along with other foreseeable ones, was tackled in advance, so that a well-defined process with clear outcomes was communicated to the students.

Study design: implementation
The second-year dental students (77 students) of the School of Dentistry, European University Cyprus (EUC) were randomly divided into two large groups and asked to compose an assignment on "Radiation biology and Radiation protection in the dental office". The subject of Dentomaxillofacial Radiology is taught through theoretical lectures and practical labs/training over two semesters, and student learning assignments are embedded within the lecture program as an alternative to traditional lecturing. Replacing lectures with student learning assignments, followed by in-class presentation and discussion, is a methodology used within the "Dentomaxillofacial Radiology" module whenever the topic is suitable for such an approach. Students usually work collaboratively on the assignments by searching the internet for scientific, reliable sources and compiling the results into a PowerPoint slide presentation, including the references they used. Students of both groups were asked to work in small sub-groups to compose the assignments, with each sub-group comprising 3-7 students, decided among themselves. It is worth mentioning that the EUC School of Dentistry is an English-speaking program educating students from over 30 countries with different ethnic and cultural backgrounds; therefore, the study's sample can be considered diverse and representative.
One large group would compose the assignment through literature research (the traditional method for assignments), and the other group would use the ChatGPT tool for the assignment (posing prompts and recording the answers), also submitting a slide presentation. Students were given one month to deliver the assignment and were informed that they would give their presentations in class on a designated day. Moreover, students of the ChatGPT group were encouraged to experiment with the tool, asking different questions, asking for videos, images, and internet resources, and in general being creative, imaginative, and playful while using it. After finishing the assignment, they were asked to individually complete an open-ended questionnaire (AI Evaluation Questionnaire) (Multimedia Appendix 1), including questions about usability, problems, opinions, proposals, etc., which was emailed to them and which they would submit to the educator together with the assignment (i.e. the PowerPoint presentation).
The AI Evaluation Questionnaire included 12 questions and was developed by the authors by combining questions from two sources: essay evaluation questionnaires retrieved from the scientific literature [22][23][24] and the questionnaire ChatGPT produced in response to the prompt "Can you develop 10 questions for a user to evaluate your performance on writing an essay?" Questions were combined and modified, piloted within a small student group other than the research groups, and finally amended as necessary. Replies to the AI Questionnaire were grouped into main themes and discussed (subjective, qualitative evaluation).
After students completed and submitted their projects via email, on the designated day on which they would present their PowerPoint presentations in class, they all took an unannounced knowledge exam at the beginning of the session (answered individually and anonymously, indicating only the group they belonged to). The exam was developed by the authors and consisted of 10 multiple-choice questions addressing the learning objectives of the topic. Students were informed that the knowledge test was intended for the educator to identify whether the assignment had equipped them with the intended knowledge and whether there were any knowledge gaps to address. Results of the exam (exam grades) were compared between the two groups, i.e. the literature research group and the ChatGPT group. Statistically significant differences between the groups' grades were explored using the Mann-Whitney non-parametric test. Data analysis was conducted using SPSS (version 25.0; SPSS Inc, Chicago, IL), and statistical significance was set at p=0.05 (objective, quantitative evaluation).
The final study design is summarized as follows:
- Students were randomly divided into 2 large groups (the ChatGPT and the Literature research groups) and further into smaller sub-groups.
- Literature research group: would perform the assignment by searching the internet and deliver it in PowerPoint format, including the references used.
- ChatGPT group: (i) would ask the LLM relevant queries and develop a PowerPoint presentation; (ii) would register and report their interactions with ChatGPT, including the prompts and their modifications, the final outcome, and its evaluation; (iii) would answer the AI Evaluation Questionnaire on their experience with the LLM.
- All students would present their learning assignments in class. At the beginning of this session, they would take an unannounced knowledge exam of 10 questions.

Quantitative results
Out of the 77 students, 39 were assigned to the ChatGPT group, forming 9 sub-groups, and 38 to the Literature research group, forming 8 sub-groups. Seventy students took the MCQ exam (7 students were absent), and exam grades ranged from 5 to 10 on the 0-10 grading scale. Figure 1 presents the number of students (percentages within each group) against their exam grades. In the higher range of exam grades, i.e. 8-10, the ChatGPT students outperform the Literature research students, and the opposite holds in the lower range, i.e. 5-7.
To check for differences between the ChatGPT student group and the Literature research group, we performed the Mann-Whitney test, which showed that students of the ChatGPT group (n=39; mean=7.54, SD=1.18) performed significantly better (p=0.045) than students of the Literature research group (n=31; mean=6.94, SD=1.12).
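For readers who wish to reproduce this kind of between-group comparison without SPSS, the Mann-Whitney U test can be computed directly. The sketch below is a minimal pure-Python implementation using the normal approximation with a tie correction (appropriate for group sizes like those in this study); the grade lists at the bottom are hypothetical illustrations on the 0-10 scale, not the study's raw data.

```python
import math

def mann_whitney_u(sample1, sample2):
    """Two-sided Mann-Whitney U test using the normal approximation
    with a tie correction (reasonable for groups of roughly 20+ subjects)."""
    n1, n2 = len(sample1), len(sample2)
    n = n1 + n2
    # Pool the observations, remembering which group each came from.
    pooled = sorted([(v, 0) for v in sample1] + [(v, 1) for v in sample2])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1               # size of this tie group
        tie_term += t ** 3 - t
        i = j + 1
    # Rank sum of group 1 -> U statistic.
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    # Normal approximation with tie-corrected variance.
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * (n + 1 - tie_term / (n * (n - 1))))
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p

# Hypothetical grade lists (illustrative only, not the study's data).
chatgpt_grades = [8, 7, 9, 8, 7, 8, 9, 6, 8, 7, 8, 9, 7, 8, 6, 8, 9, 7, 8, 8]
literature_grades = [7, 6, 7, 5, 6, 7, 8, 6, 7, 5, 6, 7, 6, 8, 7, 6, 5, 7, 6, 7]
u_stat, p_value = mann_whitney_u(chatgpt_grades, literature_grades)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

For the group sizes reported here (39 and 31), the normal approximation is the standard choice; an equivalent result could be obtained with `scipy.stats.mannwhitneyu`.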
To foster inclusiveness and avoid discrimination, we deliberately chose not to perform statistical analyses of gender differences, as we believe that gender (gender diversity) is not associated with the educational process or the educational outcomes. Education is offered equally to all students, and any gender differences found would not warrant different educational approaches for one gender or another. Instead, we view this student cohort as representative of its generation (Generation Z), a characteristic that is directly related to this study's outcomes and could explain several findings. This approach is in line with the National Institutes of Health recommendations for gender-neutral language [25].

Qualitative results
Out of the 39 students of the ChatGPT group, 31 students (80%) answered the 12 questions of the AI Evaluation Questionnaire. At the beginning of the Questionnaire, students were asked to indicate their self-assessed level of knowledge and experience regarding computers and digital applications on a 5-point scale: 'not experienced at all', 'barely experienced', 'moderately experienced', 'experienced enough' and 'very experienced and skillful'. Only 1 student indicated 'barely experienced', whereas 10 students were 'moderately experienced', 15 students 'experienced enough' and 5 students 'very experienced' in IT technology. Replies to the questions were grouped into themes and discussed. Three main themes emerged:

Collaboration with ChatGPT and problems encountered
Although the majority of students were aware that ChatGPT had surfaced in the digital world only a couple of months earlier, and some of them had already used it, this was the first opportunity they had to actually work with it and 'officially' use it within their studies, and they enjoyed and appreciated this opportunity. They characterized it as a 'powerful and versatile tool', 'intuitive and intelligent', 'revolutionary', and 'enjoyable to work with', and they thought this experience was 'interesting and different from the regular assignments'. They stated that learning to use these AI tools would improve their future practice, but 'you have to learn how to properly use it'. They appreciated its human-like answers, as these 'do not make the user feel distanced from technology'. A student stated: 'In the beginning I was afraid it was going to be too difficult to work with but as I was discussing with it I understood its greatness. I think it really is the future as it can help both education and research. I really did enjoy its human-like answers like when something was wrong it persisted like a human being for its accuracy as well as when it did not answer the question as it should like a lazy student'. Another student commented: 'I enjoyed working with ChatGPT, because I got to learn and understand something that is going to be a part of the future'. Not unexpectedly, students identified all the problems and limitations of ChatGPT that were later described in detail in the literature. They identified the need to rephrase or refine the prompts to obtain a satisfactory output ('we learned quickly how to ask the questions to get a good answer') and realized that if the same question was asked slightly differently, the output was different ('by asking it 6 different questions, we wanted to get a better idea of what it changes on the text every time we put a new word, or phrase the question differently'). They confirmed that some information was outdated and important content was missing, part of the answer was occasionally incorrect, links to references were nonexistent, and the links to videos were not working, although the LLM provided detailed and seemingly reliable information on the links and references (thus unknowingly identifying the 'hallucination' effect of ChatGPT) (Figure 2). A student stated: 'Mostly it understood our questions but it was not giving us that detailed and satisfactory answers as we anticipated according to our book'. Another student correctly noticed that 'ChatGPT is not capable of having thoughts or opinions on its own, so it does not answer some questions that demand a critical-thinking answer' (Figure 3). Technical issues were also mentioned by some students, e.g. 'some days it was not opening and our conversation couldn't be saved on the cloud' ('experienced enough' student) and 'it "crushed" sometimes mid-working'.

Quality of the generated outputs
In general, students thought that the text generated by ChatGPT was correct and sufficient, yet the quality and depth of the information provided depended on the quality and wording of the questions asked. As a student observed, 'I would not say that it demonstrated a very deep understanding of the topic, but I think with even more questions being asked, then the text could essentially show a deep understanding of the topic'. Students quickly realized that with follow-up questions and rewording they could guide the LLM to produce more detailed and in-depth answers: 'it needed some guidance with follow up questions to further specify what we were asking for'. When comparing the output with a reference text, students reported that the answers were not detailed, sometimes included false data, and were brief, general, or superficial; nevertheless, the key points were evident. A student concluded that 'ChatGPT is more than enough in order to understand and have a general idea about the main points of the matter being discussed', and another student thought that 'I will find more details by going and searching online or in books'. They expect ChatGPT to improve in the future and be able to provide videos and images, because 'they are helpful in understanding a topic and provide a more effective way to retain information as well', and also to be able to browse external resources outside its stable database (Figure 4). They evaluated the language as appropriate for a scientific document, understandable and explanatory, and they indicated that when references were asked for, the language was even more formal and academic: 'It is fascinating how the AI provides understandable answers in a scientific manner'. However, they encountered problems with the references, as on some occasions ChatGPT declined to supply references, while in other instances the references were incorrect. A student described that 'The AI was continuously denying to give us relative references but after reforming our questions we eventually got our answer. The references it used were accurate scientific resources found on its stable database like the American Dental Association', whereas another student stated that 'We used chat GPT 4 so all our references were sufficient and up to date' (apparently overestimating ChatGPT-4's currency, as it has the same knowledge cut-off date as ChatGPT-3.5). The majority of students evaluated the references as relevant, sufficient, reliable, and up-to-date; however, they also recognized the limitations of the LLM, thinking that 'it is under construction so not all its answers are up to date and sufficient information is only provided up to a certain point in time'.

Exploring additional possibilities and predicting the future
Students experimented with ChatGPT, asking it to provide images and videos and to create multiple-choice questions, charts, bullet-point summaries, and presentation templates, e.g. 'we asked about multiple choice questions and the answers were actually impressive' (Figure 5). Students were imaginative and resourceful, and they were disappointed when their requests could not be fulfilled: 'I asked from it to provide me some explanatory images related to our topic, but it was not able to do so. I think this is a crucial disadvantage, as images give depth and context to a description and provide a much more immersive experience than writing alone'.

Figure 5. MCQs creation by ChatGPT
Two student groups comprising technologically very experienced students surprised the authors by skillfully bypassing ChatGPT's inability to produce PowerPoint presentations, asking it to write programming code instead: 'we used the AI for the generation of a PowerPoint. Since it cannot on its own generate PowerPoint Slides we asked it to generate a VBA code for the PowerPoint. That code was copied and then pasted to the 'Developer' section of the PowerPoint. As a result we got a beautiful but not so detailed presentation of our topic'. This process enabled the instant transfer of ChatGPT's output into a PowerPoint slide presentation created by ChatGPT! Among the future applications of ChatGPT, students included its use in dental education, for example for the creation of multiple-choice questions, summarizing a topic, lecture revision, helping students better understand a theory or concept, assignments and projects, lab reports, questions about law and ethics, communication with patients, and more. A student proposed: 'Virtual patient consultations: ChatGPT could be used to simulate patient consultations for dental students. Students could practice various scenarios, including patient history taking, explaining diagnoses, and treatment planning'.
Continuing education could also benefit from the opportunities ChatGPT and large language models offer: 'Education that never ends: ChatGPT may be utilized to give dental professionals continual education. For dental professionals to keep current in their field, faculty might create modules containing the material they need, and ChatGPT may offer engaging tasks and tests to reinforce the learning'.
Considering the dental practice, students proposed that ChatGPT could be used to educate and solve problems for the dentist, for example, when 'the dentist has a mind block' or when the dentist 'seeks information about new dental materials and techniques'; also for treatment plans, schedule creation, and oral hygiene info; and for patient education 'through integrating the model into a dental practice's website or patient portal'.
For research and scientific publications students thought it 'can be useful to use it synergistically with your own research', but 'you should always double-check the information' and 'keep in mind the plagiarism, using the information provided appropriately'.
In any case, students admitted that ChatGPT has drawbacks, such as a limited database, an inability to access external web resources and provide images and videos, inaccurate links, and the need to verify the information generated. They thought that 'it should be used with caution' and that 'AI still needs to evolve', so that it will become 'an incredibly smart, effective and powerful tool that can help the scientific community'. They realized that 'the power it holds is unpredictable and the work of doctors could be compromised', and feared that 'maybe we will live one day that AI robots could even replace dentists'. A student eloquently summarized ChatGPT's past, present, and future:

Discussion
In March 2023, 39 twenty-year-old dental students, through composing an educational assignment, identified the capabilities and limitations of the recently introduced ChatGPT, explored various possibilities, used it to write multiple-choice questions and programming code, proposed future applications in education, research, and dental practice…and outperformed their fellow students in the knowledge exam! Admittedly, an impressive performance!

Results explained and compared
The quantitative results, i.e. the exam grades, showed that all students performed above average and no students underperformed or failed, while ChatGPT group students outperformed their Literature research group peers. Since the exam occurred with no prior notice to the students, it directly reflects the knowledge acquired and retained through the projects' creation. Students' good performance on the exam could be related to the format of the project in connection with their generational traits: all students belong to the Generation Z cohort (born between 1995 and 2010), so they are the first true 'digital natives' [26], with extraordinary technological advancements -such as Web 2.0, smartphones, social networks, applications, and streaming content -being part of their everyday routine [27]. They are considered tech-savvy, mobile-driven, collaborative, and pragmatic [28-29], and possess a natural facility with digital tools and an interest in everything digital. Motivated by the opportunity to use the internet and work collaboratively, students immersed themselves in the project and explored it in depth, and this applies even more to the ChatGPT group students, who were excited and curious to test this new digital tool. The enhanced learning observed with the ChatGPT students can also be attributed to their increased 'time on task', as they had to spend more time asking and re-asking questions, evaluating the answers, and correcting and complementing them, in comparison to their peers who had clear and readily available results from the relevant scientific literature. Additionally, ChatGPT group students, more than their fellow students, had to work with the learning material at a higher cognitive level and constantly apply critical thinking while experimenting with various questions and answers, comparing and synthesizing them, an element that also enhances deep learning and results in enhanced performance [30].
The AI Evaluation Questionnaire revealed students' opinions and evaluation of ChatGPT, the problems encountered, and estimations of the future; all their remarks were consistent with the subsequently published literature. Students evaluated their learning experience with ChatGPT as interesting, enjoyable, and engaging [19], and appreciated its learner-friendly interface and the possibility to argue with it [16,31]. They assessed the generated content as overall correct and sufficient [7,32], although frequently providing a general overview of the subject [5], as well as not demonstrating a deep understanding of the context [33][34][35] nor thinking critically [10,36]. They identified first-hand the need for carefully crafted questions [37] and critical analysis of the answers [14,37], and they urged cautious and responsible use [4,6]. In agreement with Chan et al [11], they are ready to embrace this new technology, but in a collaboration where people maintain control and are not replaced by AI [17,20,[38][39]. Students proposed possible applications of ChatGPT in education for revisions, MCQ creation, personalized learning, writing essays [3,20,31,38,40], and continuing education [39], as well as in research and clinical practice [6,12,31]. Nevertheless, students thought that the LLM must evolve to provide images, videos, and accurate and relevant citations, and to browse the internet [32,[41][42]. Numerous publications have examined the LLM's limitations already identified by the students: incorrect answers and outdated content [10], possibly due to its limited dataset [38][39]43], the possibility of fabricated information and hallucination [44], false citations and links leading to nonexistent sources [39,[44][45], inability to browse the web [41], and risks of plagiarism [3,46]. The present research materialized Kung et al's [32] concluding remark that 'the utility of generative language AI for medical education must be studied in real-world learning scenarios with students, across the 
engagement and knowledge spectrum', since ChatGPT was embedded within the educational process, thus producing authentic and relevant results. The quantitative and qualitative outcomes of the study indicate that the current cohort of Gen Z students is capable of adapting quickly to new technologies and ready to use LLMs such as ChatGPT in the learning process -while acknowledging their limitations -particularly when these tools are integrated within a pedagogical framework that fosters creativity and autonomous learning. Educators, on the other hand, seem to have limited technological knowledge, skills, and pedagogical expertise to assess AI applications and successfully integrate them into education [12,47]; therefore, they should pursue professional development to build new skills related to AI understanding, possibilities, and implementation [15,40,[48][49].

Pedagogical aspects
All second-year students were asked to explore the topic of 'Radiation biology and Radiation protection in the dental office' and develop assignments to be presented in class as PowerPoint presentations. Questions and knowledge gaps were covered during the in-class presentations by the instructor and, not infrequently, by their peers. This approach is consistent with the "flipped classroom" concept, an educational methodology that research has shown to engage students in the learning process, promote autonomy and self-regulation, allow for higher-order thinking, improve student satisfaction, and increase academic performance [50][51]. Another element of pedagogical interest is the small-group collaborative work to develop the assignments. Collaborative learning has the potential to promote deep learning, which is essential for understanding complex concepts, particularly in science education, through students' meaningful interactions and constructive debates [52]. Scager et al [52] reported that effective collaboration is achieved when students undertake a challenging, complex task and succeed in creating a new and original output. Such tasks applied in higher education build a sense of responsibility and shared ownership of the output and the collaborative process, and this sense was indeed apparent in the students of the present study during their oral presentations. An additional pedagogical element is the use of learning assignments as a method for self-learning and knowledge acquisition. Learning through assignments has been reported to be preferred by students: in the study of Warren-Forward et al [53], 79% of the students reported that the assignment on MRI safety was both a positive learning experience and provided an understanding of the topic. Writing assignments enhances retention of knowledge; when assignments include reflective thinking, for example when students have to evaluate and synthesize information (as in the present study), then higher-order (critical) thinking is also enhanced, as students work at a higher cognitive level [30].
The innovative pedagogical aspects of the study (flipped classroom, learning assignments, group learning) constituted a supportive environment for students of both groups to demonstrate their skills, achieve the learning objectives, and produce valuable results.

Study design: tackling the challenges
It is worth communicating here the challenges faced when designing the research process, as the ChatGPT territory was largely unknown at the time, and obstacles and drawbacks had to be identified and resolved in advance through a step-by-step prospective analysis of the sequence of events. For example, as mentioned before, one concern that had to be addressed in advance was that the subject was unknown to the students: relying solely upon ChatGPT's answers, they would not know whether the output was accurate and comprehensive. To address this, they were advised to compare the output with the relevant content of a recommended textbook (or another reliable source of their choice), critically evaluate the quality of the AI output, and make the necessary amendments to complement or correct the AI results. The comparison was to be included either within their presentation or within the Evaluation Questionnaire. This process would additionally ensure the achievement of the learning objectives. In line with this process, and at a later time, Chung proposed in his article published in April 2023 that 'instructors should teach students to use other authoritative sources (e.g., reference books) to verify, evaluate, and corroborate the factual correctness of information provided by ChatGPT' [48].
Another concern arose about elucidating students' engagement with ChatGPT: since the output of ChatGPT would be texts in slide format (similar to those of the Literature research group), the educator (one of the authors) could evaluate these texts/slides for accuracy and comprehensiveness, but could not tell whether they were generated in a single attempt or over multiple attempts with differentiated or follow-up queries; therefore, neither the time and effort spent on the research process and the learning path could be assessed, nor would the capabilities and drawbacks of the LLM be revealed. To address this concern, the ChatGPT group students were asked to record and report all their interactions with the LLM (including the number of prompts, the modification of prompts, the queries about references, images, etc, and the underlying reasoning), so that the educator could evaluate the cognitive effort they put into the assignment and the critical thinking applied until a satisfactory result was achieved. Furthermore, this would provide valuable insights into the usability and operational characteristics of the LLM. In addition, the AI Evaluation Questionnaire was a useful means of drawing information on student-LLM interactions.
In accordance with the above procedure determined by the authors, and affirming their decisions, Halaweh's article [14], published in April 2023 (two months after the development of the present study's design and one month after its implementation), described precisely the same process when discussing strategies for the successful implementation of ChatGPT in education: 'Students must go through multiple queries to obtain full and relevant information, requiring them to develop the necessary skills to produce satisfactory outcomes…. students must document the cases and explain how they adjusted the parameters until the topic was refined. An audit trail must be included in the report, containing a record of the questions asked, the generated texts (answers), and the reflections on how the process was conducted; demonstrating the effort taken to produce useful and relevant results.' It seems that subsequent literature confirmed the authors' overall study design.

LLMs in Higher Education
In view of the study's results, and in agreement with the relevant literature, the authors would suggest that Higher Education Institutions and Dental Schools have no alternative but to update their curricula, policies, and teaching methods to prepare students for an AI-driven future by including education on and with AI tools and LLMs [8,45]. The introduction of LLMs into education will offer opportunities to improve its efficiency and quality: improved student performance, personalized learning, targeted and immediate feedback, increased accessibility, creativity and innovation, student engagement, lesson preparation, collaborative activities, and evaluation [4,40,54-56]. From the pedagogical perspective, students using LLMs have the potential to develop new competencies, including 21st-century soft skills such as self-reflection abilities, problem-solving skills, creative and critical thinking, and collaboration, thus becoming motivated and autonomous learners [3,4,16,34,49]. Moreover, as AI technology evolves and gradually integrates within the educational process, conventional pedagogical theories may be neither relevant nor sufficient to support the 'teacher-student-technology' relationship, as 'technology' profoundly alters the way students learn and engage with the content and the teacher; innovative pedagogies will be needed, such as the 'entangled pedagogy' proposed by Fawns [57], to contextualize students' learning in a world where AI is increasingly prevalent [15,16].
To respond to the AI paradigm shift, higher education institutions, educators, and students must engage in constructive dialogue to develop policies, guidelines, and training opportunities for the implementation of innovative technological tools in the teaching process [16,35,55]. Despite the current weaknesses that limit their implementation, LLMs will likely improve in the future in terms of performance, scalability, and quality of responses, as well as through fine-tuning for specific tasks, customized use cases, and search engine connection [16,31,32,58].

Limitations and strengths
The small number of students who participated in the present study (77 in total, 39 in the ChatGPT group), all from one Dental School, may limit the extrapolation of the results. Students' digital literacy is also relevant: the students who participated in this research were mostly tech-savvy, whereas students in other Schools/Universities may be less familiar with digital technologies, so the results may not apply to them [17]. Also, some findings (particularly the qualitative ones) may be outdated at the time of publication, as LLMs constantly evolve and new LLMs have been introduced since the research was conceptualized and implemented. For example, Google Bard and Microsoft Bing claim to have live access to the internet, a capability highly appreciated by the students, and ChatGPT has since evolved its algorithms, with results becoming more accurate and relevant. Some elements of the study design could have been explored further; for example, students' assignments could have been graded and compared, but since grading of the assignments was not included in the semester program of the module, this was not performed. In any case, the importance of this study lies in the fact that it was a very early attempt to implement a language model legitimately and in vivo in the teaching process as a partner in learning, in contrast to the large number of publications perceiving ChatGPT as a partner in cheating and academic dishonesty [12,59,60]. Another strength is that it revealed aspects of language model-student interactions during the learning process, which indicate that this emerging relationship is yet to be explored and that updated pedagogical frameworks are needed for this purpose.

Conclusions
ChatGPT was implemented and evaluated in real-life undergraduate dental education. Students using ChatGPT for their learning assignments performed significantly better in the knowledge exam than their fellow students who used the literature research methodology. The Questionnaire answered by the students revealed capabilities and weaknesses of the language model that were later identified in the scientific literature. Students enjoyed working with the tool and explored different options and possibilities, indicating that they are technologically knowledgeable and capable of adapting to new technologies, both in education and in future clinical practice. Large Language Models such as ChatGPT have the potential to play a role in education, underpinned by solid pedagogies.

Figure 1. Students' exam grades (% of students within each group)

Humanization of the LLM is worth noting: 'He always understood what we wanted'. Textbox 1 shows examples of students' prompts.

Textbox 1. Examples of students' prompts to ChatGPT (exact copies)

- How does radiation affect human health?
- What's the difference between deterministic & stochastic effects of radiation?
- Is radiation exposure carcinogenic?
- Which are the radiation doses from common dental radiographic exams?
- Which criteria are used to reduce unnecessary radiographic exposure in dentistry?
- Can a pregnant employee continue to work in the dental radiology department?
- What is the importance of radiation biology? With references used
- What are the effects of radiation on cells and tissues? With references used
- What are the effects of radiation on the oral cavity? ....
- Rewrite the previous answer in a more elaborate way
- Make a chart about effective dose from diagnostic x-ray examinations focusing on the oral cavity
- Radiation biology, include references
- Measurements of radiology safety, include references
- Radiology protection in dentistry, include references
- How can we minimize the radiation exposure on dental staff, including references
- Why are radiation safety precautions necessary for the dentist
- Tell me how radiation can affect the human body
- Write me an essay discussing radiology safety and protection procedures in dentistry
- Can you explain radiation biology for medicine and dentistry in 400 words, include references
- Radiation exposure in dental office word limit 200-250 words. Include references
- Radiation monitoring in the dental office in 230-270 words include references
- Write me an essay of 400 words about the biology of radiation and provide references.
- Write me a 300 words essay about radiation safety and protection in dentistry
- What are the risks associated with exposure to radiation?
- What are the modifying factors of irradiation?
- How does radiation exposure time and dose differentiate between adults and children in dental x-ray taking?

Figure 2. A student group's slide describing their attempt to obtain videos

Figure 3. A student group's slide evaluating ChatGPT's answer to queries posed