Spatial ability and 3D model colour-coding affect anatomy performance: a cross-sectional and randomized trial

Photorealistic 3D models (PR3DM) have great potential to supplement anatomy education; however, there is evidence that realism can increase cognitive load and negatively impact anatomy learning, particularly in students with decreased spatial ability. These differing viewpoints have made it difficult to incorporate PR3DM when designing anatomy courses. This study aimed to determine the effects of spatial ability on anatomy learning and reported intrinsic cognitive load using a drawing assessment, and of PR3DM versus an Artistic colour-coded 3D model (A3DM) on extraneous cognitive load and learning performance. First-year medical students participated in a cross-sectional study (Study 1) and a double-blind randomised control trial (Study 2). Pre-tests analysed participants' knowledge of anatomy of the heart (Study 1, N = 50) and liver (Study 2, N = 46). In Study 1, subjects were first divided equally using a mental rotations test (MRT) into low and high spatial ability groups. Participants memorised a 2D-labelled heart valve diagram and sketched it rotated 180°, before self-reporting their intrinsic cognitive load (ICL). For Study 2, participants studied a liver PR3DM or its corresponding A3DM with texture-homogenisation, followed by a liver anatomy post-test, and reported extraneous cognitive load (ECL). All participants reported no prior anatomy experience. Participants with low spatial ability (N = 25) had significantly lower heart drawing scores (p = 0.001) than those with high spatial ability (N = 25), despite no significant differences in reported ICL (p = 0.110). Males had significantly higher MRT scores than females (p = 0.011). Participants who studied the liver A3DM (N = 22) had significantly higher post-test scores than those who studied the liver PR3DM (N = 24) (p = 0.042), despite no significant differences in reported ECL (p = 0.720).
This investigation demonstrated that increased spatial ability and colour-coding of 3D models are associated with improved anatomy performance without significant increase in cognitive load. The findings are important and provide useful insight into the influence of spatial ability and photorealistic and artistic 3D models on anatomy education, and their applicability to instructional and assessment design in anatomy.

Objective of the current study. This study aimed to investigate how spatial ability affects anatomy learning using drawing assessment, and the effects of colour-coded PR3DMs on extraneous cognitive load and learning performance. We hypothesised that (1) learners with high spatial ability would have better anatomy performance in drawing assessment, (2) a colour-coded PR3DM (i.e., A3DM) of the liver would decrease ECL, and (3) learning with A3DM models would result in better anatomy performance. A deeper understanding of these effects should aid the design of tools and curricula for anatomy teaching and learning, including possible customisation based on individual needs and abilities.

Methods
Study design. A cross-sectional study (Study 1), followed 3 days later by a double-blinded randomised control trial (Study 2) with pre- and post-testing, was designed to compare students' learning performance and ICL/ECL. The experimental design and flow of activities of Studies 1 and 2 are detailed in Fig. 1.

Ethical approval. Ethical approval was obtained from the Nanyang Technological University Institutional Review Board (IRB-2022-476). Study methods were performed in accordance with approved guidelines and regulations. Participation in the study was voluntary and would neither replace the existing cardiac and liver anatomy curriculum nor affect formal assessment for the participants. Signed informed consent was obtained from participants prior to enrolment into the study. Confidentiality was ensured by de-identification of participants' names during data collection. Singapore Dollar $10 vouchers were provided as tokens of appreciation.
Participant recruitment. Participants were recruited from the first-year undergraduate medical student cohort of the Lee Kong Chian School of Medicine in Singapore. Fifty participants were recruited based on a similar power calculation to that used in the study by Skulmowski and Rey 13 . The study preceded formal teaching on cardiac and liver anatomy. The study was only publicised in the week leading up to the study through class announcement and email. Study 43 . Responses were scored out of 48. Participants were sorted into equal groups of high or low SA by a median split depending on their relative scores upon completion of the MRT 45 .
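The median split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the scores are hypothetical, and the rule that a score equal to the median counts as low SA is an assumption.

```python
from statistics import median

def median_split(scores):
    """Label each score 'low' if it is at or below the median, else 'high'."""
    cut = median(scores)
    labels = ["low" if s <= cut else "high" for s in scores]
    return cut, labels

# Hypothetical MRT scores out of 48 (illustrative only, not study data)
mrt_scores = [12, 18, 22, 26, 26, 27, 30, 35, 41, 44]
cut, groups = median_split(mrt_scores)
print(cut, groups.count("low"), groups.count("high"))
```

Note that when several participants score exactly at the median, a rule of this kind can produce groups of unequal size, so a tie-handling convention has to be chosen in advance.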
Pre-tests. Pre-tests (Appendix 1.1) were designed to assess participants' baseline knowledge of cardiac (Study 1) and liver (Study 2) anatomy, utilising a closed-book format. Face validity of the question items was ensured by expert discussion, where questions were assessed to be appropriate to address the research aims.
www.nature.com/scientificreports/
Learning phases and post-tests. Study 1: Participants had three minutes to memorise a 2D-labelled diagram of the superior view of heart valves (Fig. 2A), and were told to memorise the names, shapes, positions and orientation of the heart valves and cusps. Participants then had 10 min to sketch the same diagram and label the valves and their components in the opposite orientation to the diagram in the learning phase. This involved the participants mentally rotating the diagram 180° to draw the valves. The paper provided to complete the drawing was fixed to the desk surface, such that participants were not able to rotate the paper or change their seats while drawing. The drawings were collected, marked, and analysed according to a standardised scheme. Participants also answered additional questions on the difficulty of orienting and sketching the heart valves (Appendix 1.2).
Study 2: Participants were randomly assigned to one of two groups via block randomisation. One group was shown a photorealistic 3D model of the liver, while the other group was shown the artistic version of the photogrammetric model (A3DM). The resources were developed in-house via photogrammetry of plastinated liver specimens. The photographs of plastinated liver specimens (Von Hagens Plastination, Gubener Plastinate GmbH, Guben, Germany) were taken at every 5 degrees of rotation using an automated turntable. This was performed 4-5 times at different angles to capture all the specimen's details. 
The best 100 photos (the limit under Autodesk's educational license) were stitched together using Autodesk Recap Photo (Autodesk Inc., 2015, California, USA) to generate a 3D model. In Autodesk Maya (Autodesk Inc., San Rafael, CA), the 3D model was refined, and a   46 according to anatomy conventions 47 . The geometry, anatomical details, and labelling were constant between both models (Fig. 2B-E). An example of such labelling is indicated in Fig. 2B,D, for "Falciform ligament", and in Fig. 2C,E, for "Common hepatic duct". Lines were also drawn to differentiate the borders of different structures in the A3DM. Models were uploaded onto Sketchfab (V2.22.0, 2022 Sketchfab Inc, New York) for 3D visualisation and user-interaction. Participants were instructed to memorise the names and positions of 15 labelled components of the liver, and their relationships to each other. In the study by Skulmowski and Rey, participants were given 90 s to study a 2D knee model with 16 labels 13 . Since the model in this study required rotation and touch to reveal labels, three minutes (180 s) were given to study the model. Participants then answered a 10-question (open-ended, different from pre-test) post-test, with diagrams from their respective models (Appendix 1.3). These questions assessed the participants' ability to identify the liver structures. Face validity of both the drawing assessment for Study 1 and the liver post-test questions for Study 2 was ensured by expert discussion, where questions were assessed to be appropriate to address the research aims. Study 2 questions were also designed similarly to the school's practical anatomy spot tests, which the researchers have extensive experience in writing.
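The block randomisation used to assign Study 2 participants to the two models could look roughly like the sketch below. The block size of 4, the arm labels as strings, and the seed are assumptions made for illustration; the source does not report how the blocks were constructed.

```python
import random

def block_randomise(n_participants, arms=("PR3DM", "A3DM"), block_size=4, seed=None):
    """Assign participants to arms in shuffled blocks so arm sizes stay balanced."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(arms)
    allocation = []
    while len(allocation) < n_participants:
        # Each block contains every arm equally often, in random order
        block = list(arms) * per_arm
        rng.shuffle(block)
        allocation.extend(block)
    # The final block may be cut short, so arms can differ by up to per_arm
    return allocation[:n_participants]

# Hypothetical allocation for 46 participants (block size and seed are assumptions)
allocation = block_randomise(46, seed=1)
print(allocation.count("PR3DM"), allocation.count("A3DM"))
```

Because the last block is truncated when the sample size is not a multiple of the block size, a scheme like this can yield slightly unequal arms, which is compatible with group sizes such as 24 and 22.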
Cognitive load questions. Following the learning phase of Study 1, participants answered ICL questions from the survey instrument by Klepsch et al. (2017) on a 7-point Likert scale; in Study 2, they answered the corresponding ECL questions 44 (Appendix 1.4). This cognitive load instrument has excellent internal consistency and is appropriate for learning using simple visualisations 48 . However, "task" was replaced with "visualisation" for clarity 13 .

Statistical analysis.
Means and standard deviations were reported for normally distributed continuous variables, median and interquartile range for non-normal variables, and frequencies and percentages for categorical variables. Student's t-test was used to compare normally distributed continuous variables, Mann-Whitney U test for non-normal variables, and chi-squared/Fisher's exact test for categorical variables as appropriate. Pearson correlation was conducted for the relationship between MRT scores and drawing assessment scores. Cronbach α was used to determine survey internal consistency. Cohen's D and r = |z|
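The two effect-size measures named above can be illustrated with a short sketch. The sentence in the source is cut off after "r = |z|", so the form r = |z| / sqrt(N) used here is an assumption based on the common convention for rank-based tests. The pooled-standard-deviation version of Cohen's d and all data values are likewise hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

def r_from_z(z, n_total):
    """Effect size r = |z| / sqrt(N) for a standardised test statistic z (assumed form)."""
    return abs(z) / sqrt(n_total)

# Hypothetical scores for two groups (illustrative only, not study data)
high = [8, 9, 7, 10, 9]
low = [6, 7, 5, 8, 6]
d = cohens_d(high, low)
r = r_from_z(-2.0, 50)
print(round(d, 3), round(r, 3))
```

Here `z` would come from the normal approximation of the Mann-Whitney U test, and `n_total` is the combined number of observations in both groups.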

Results
Participant demographics. Fifty participants were recruited, and demographics are detailed in Table 1.
No participant reported prior anatomy experience. The participants were grouped into low (MRT score ≤ 26) and high (MRT score > 26) SA based on the median score. Forty-six participants continued voluntarily with Study 2, with 24 (50.0%) shown the PR3DM, and 22 (45.8%) shown the A3DM. There was no significant difference in cardiac and liver pre-test scores between groups divided based on SA and 3D model. The ICL/ECL measures had relatively good internal consistency (Cronbach α: ICL 0.798, ECL 0.854).
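For readers unfamiliar with the internal consistency measure reported here, Cronbach's α can be computed from item-level responses as in the following sketch. The responses are hypothetical 7-point Likert ratings, not study data, and the function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of per-item response lists,
    each ordered by the same respondents."""
    k = len(items)
    # Total score per respondent across all items
    totals = [sum(responses) for responses in zip(*items)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical ratings from five respondents on three items
likert_items = [
    [5, 6, 3, 4, 7],
    [4, 6, 2, 5, 6],
    [5, 7, 3, 4, 6],
]
alpha = cronbach_alpha(likert_items)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together across respondents, which is the sense in which the ICL and ECL scales are described as internally consistent.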
Post-test performance and cognitive load. Study 1. The results from Study 1 are detailed in Table 2.
There was no significant difference in overall ICL (p = 0.110) and difficulty of heart drawing (p = 0.187) between low and high SA groups. However, low SA participants reported significantly higher difficulty in remembering names of valves and cusps (p = 0.038). The low SA group scored significantly lower for overall heart drawing (p < 0.001, ES = 0.752), and for individual valves. MRT scores were moderately correlated with heart drawing scores (r = 0.407, p = 0.001).
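The correlation between MRT scores and heart drawing scores is a Pearson coefficient, which can be computed as sketched below. The paired values are hypothetical and chosen only to show the calculation; they do not reproduce the reported r.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired scores (illustrative only, not study data)
mrt = [14, 20, 26, 31, 38, 44]
drawing = [5, 9, 6, 11, 8, 12]
print(round(pearson_r(mrt, drawing), 3))
```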

Study 2.
The results from Study 2 are detailed in Table 3. There was no significant difference in ECL (p = 0.720) between PR3DM and A3DM groups. The A3DM group had significantly higher liver post-test scores than the PR3DM group (p = 0.042, ES = 0.523).

Mental rotation test and sex differences.
A stratified random sample of 19 males was obtained for comparison between sexes, to account for the large difference in numbers of female and male participants. There were significantly more males in the high SA group in both the original population and stratified sample, detailed in Table 4.
Details of comparison between females and stratified sample of males are indicated in Table 5. There were significantly more males in the high SA group (p = 0.023). Males also had significantly higher MRT scores than females (p = 0.011, ES = 0.779). Otherwise, there were no significant differences in heart drawing scores (p = 0.254) and ICL (p = 0.065) for Study 1, and ECL (p = 0.099) and liver post-test scores (p = 0.263) for Study 2 between sexes.

Discussion
Through this study, it was found that (1) participants with low SA had significantly lower heart drawing scores than those with high SA despite no significant differences in reported ICL, (2) males had significantly higher SA than females, (3) there were no significant differences in reported ECL when signalling in the form of colour-coding was added to the PR3DM, and (4) participants who studied the liver A3DM had significantly higher post-test scores than those who studied the liver PR3DM. The findings are important and provide useful insights on the impact that spatial ability and photorealistic and artistic 3D models have on anatomical education and their application to the research of instructional and assessment design in anatomy.
The finding that participants with a low SA had significantly lower heart drawing scores supported our first hypothesis that learners with a high SA have superior anatomy performance in drawing assessment. The correlation between MRT and heart drawing scores was moderately positive. This is consistent with the meta-analysis by Roach et al. reporting a significant positive pooled correlation between SA and anatomy performance when drawing tasks were used to assess anatomical knowledge 50 . In our study, males had significantly higher SA than females (p = 0.011), and there were significantly more males in the high SA group (p = 0.023). As modalities like drawing rely more on spatial reasoning, they can exacerbate effects favouring students with higher SA, such as males. This emphasises that anatomy assessment should be designed from both spatial and non-spatial perspectives, especially for anatomy beginners where SA has a greater effect on anatomy    In several studies, SA improved with repetition and practice 53-55 , and mentored sketching (in engineering fields) 56 . For example, Provo et al. (2002) found that males had superior spatial ability, but that at 8 months, spatial abilities were comparable 42 . 
Our findings suggest that drawing can be encouraged during anatomy teaching and learning sessions to promote SA, increasing experiences of knowledge construction. However, this should be further investigated. A possible explanation for the low SA group's performance in heart drawing is that they were unable to mentally imagine, orient, and map the heart valves with their components from an unfamiliar perspective. This impacted their ability to recall and present the image correctly. Previous research found that participants who could not draw objects from an imagined viewpoint could do so from their actual viewpoint 57 . This indicates that a deficiency in drawing abilities should not be a problem in recalling the image. Therefore, using drawing in this study as an assessment tool to evaluate spatial knowledge is reasonable. Interestingly, ICL was not significantly different between the low and high SA groups. The low SA group's self-reported scores on the difficulty of drawing in a different orientation and remembering valve and cusp positions were comparable to those of the high SA group. This phenomenon could possibly be explained by the Dunning-Kruger effect, where students who perform poorly overestimate their performance in self-ratings 58 , which has been observed in the medical field 59,60 . However, low SA participants reported significantly greater difficulty remembering the names of the cardiac valves and their cusps. Engaging in active learning techniques such as sketching and labelling of the cardiac valves and cusps can help students' memory and retention of information.
Our second hypothesis was that a colour-coded 3D model would decrease ECL. Previous literature showed that PR3DMs can provide accurate details 10 but can increase ECL due to anatomical and textural details that may not be essential to learning 12,61 . Significant ECL reduction through colour-coding has also been reported 14 . In this study, non-additive signalling (colour-coding) was added, highlighting structural differentiation 28 . However, the current study observed no significant difference in ECL when the PR3DM was colour-coded. One possible reason is that the distinct colours for each structure in the liver A3DM, coupled with outlines of structures in close proximity to each other, reduced ambiguity in determining the borders of anatomical structures, mitigating the increased cognitive load from photorealistic geometric and structural details even though textural and colour realism had to be sacrificed. Another possible reason for the similar ECL is that other factors contributing to ECL were kept constant between groups in the current study. For example, the transient information effect, describing provision of free control of dynamic visual aids, also interferes with learning spatially-complex information 62 . This effect was kept constant as participants studying both models could independently control the rotation, size, and revelation of labels. Another example would be the redundancy effect, elicited as both 3D models were viewable from all directions, where presenting extra materials like multiple spatially-challenging views increases ECL and decreases working memory 63 , affecting anatomy learning. It was also reported that non-essential information should be discarded from visualisations, with only key structural views presented, especially in initial learning phases 64,65 . 
Further studies can be conducted to investigate how presenting a series of static key views with sacrificed user-interactivity compares with freely rotatable PR3DMs that offer good interactivity but present all views (including non-essential ones).
Perhaps what is most important is the impact of instructional design on learning performance, as per our third hypothesis. Our results demonstrated that participants who studied the A3DM performed significantly better in post-testing than the PR3DM group, suggesting that colour-coding translated to better recognition and distinction of structures and better information retention. Photorealism is useful because it preserves true geometry, giving accurate representations of subject material 10 . Our findings suggest that PR3DM enhancement through signalling may increase the learner's attention to and interest in the 3D anatomy models. This may have implications for the creation of learning tools and instructional design, in that realistic and colour-coded 3D anatomy models can be displayed in tandem when guiding students in learning human anatomical regions that are more difficult and complicated to understand. However, our findings must be taken with the caveat that in the post-test, question diagrams corresponded to the model the participants studied. The question thus arises whether visual aids inadvertently increased the reliance of learners on these cues, potentially lowering their performance when tested without cues.

Limitations
We acknowledge several study limitations. Firstly, Study 1 was limited to the heart and Study 2 was limited to the liver. Future larger studies should evaluate whether similar effects can be replicated with other human anatomical regions and organs. Secondly, learning strategies between the high and low SA groups could not be delineated and may have been different. Given the short learning phase, low SA participants may simply have lacked time to generate effective visualisation strategies to retain knowledge. Eye motion analyses would be helpful in identifying differences in viewing patterns, providing clearer explanations of how the high SA group performed better. Thirdly, self-selection bias may be another limitation, in that students interested in learning anatomy before the formal course may have been more likely to participate than those who were uninterested. Lastly, this investigation was undertaken prior to the formal anatomy course, thus limiting its validity to untrained individuals. It is possible that experienced learners may develop mechanisms to compensate for decreased spatial ability, and further studies in this cohort may be useful.

Conclusions
This investigation demonstrated the positive effects of SA and colour-coding on anatomy performance. The participants in the photorealistic and artistic 3D groups indicated similar ECL. These findings may inform instructional and assessment design based on PR3DMs and anatomy drawing techniques.

Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.