Simulation-based teaching is an integral feature of medical education and, following the emergence of virtual simulation, educators now have an array of possibilities to choose between. However, evidence informing their use is scarce, particularly regarding outcomes assessing user experience and knowledge acquisition, and experimental studies comparing different approaches to virtual simulation. Therefore, this study compared immersive virtual reality (VR) simulation to computerized virtual patient (VP) simulation, measuring their effect on knowledge acquisition and retention, as well as user experience, in fourth- and fifth-year medical students.
This pilot study, of a randomized crossover design, comprised 18 participants independently completing an immersive VR simulation and a computerized VP simulation. All participants completed the same two scenarios and received an induction to both modalities. Multiple-choice questions were employed to assess knowledge acquisition and retention, with participants completing the questions immediately before and after the simulation and following a 12-week interval. User experience questionnaires were completed after the simulation, utilizing both Likert scale and open-ended questions. Statistical analysis comprised a Student’s t-test, whilst free-form responses were thematically analysed.
Both interventions achieved statistically significant levels of knowledge acquisition and retention. However, VR simulation achieved higher levels of acquisition (2.11; 95% CI = 0.89, 3.32, p = 0.0019) and retention (1.22; 95% CI = 0.16, 2.28, p = 0.026), when compared to VP simulation. Participants reacted positively to both interventions, though VR simulation was significantly preferred compared to VP simulation. Thematic analysis of free-form responses revealed themes of ‘education’ and ‘technology’, divided into subthemes of ‘application’, ‘knowledge and skills’, ‘value’, ‘software’ and ‘fidelity’.
The findings indicated that both interventions are effective and acceptable educational tools. However, learning does not appear to be uniform across different virtual simulators, with participants achieving higher levels of learning following immersive VR simulation. Moreover, participants reacted significantly more positively to VR simulation, though potential applications were identified for both interventions. This study highlights the importance of an evidence-based approach to the implementation of novel simulation technologies. The findings contribute to an underexplored area of the literature and offer a step towards enabling medical educators to make an informed decision regarding the application of virtual simulation in their context.
Simulation-based training has become an integral feature of medical education, though the approaches to simulation have evolved significantly over time [1,2]. Currently available approaches, and their educational applications, are vast, particularly following the emergence of extended reality (XR) technologies. XR is an umbrella term encompassing virtual reality (VR), augmented reality (AR) and mixed reality (MR), all of which are increasingly being applied to medical simulation. These technologies provide a potential approach to bridge the gap between theory and practice, by providing an engaging, interactive and flexible educational tool [5,6]. Furthermore, their increasing availability and reduced associated costs provide a step towards achieving standardized learning opportunities for trainees across the world, with the potential to reduce global inequalities within medical education [4,7]. New developments, such as integrated self-debriefing tools, also offer novel applications, such as self-directed learning, that empower students to personalize their learning and independently develop their clinical competence.
A particular subset of XR technologies that has grown markedly is VR simulation. VR simulation describes a three-dimensional experience, whereby the user is immersed within a virtual world, typically using a head-mounted device (HMD). It is important to distinguish VR simulation from other forms of virtual simulation, as a multitude of terms have been applied within this context. For instance, virtual simulation is a generic term used to describe any form of computerized simulation, whereas virtual patient (VP) is a term specifically used to describe digital representations of patients, visualized through an electronic screen. As such, an overwhelming array of virtual simulators, and possible applications, is now available to medical educators, and until now, there has been little focus on guiding educators in the best use of such technologies.
Much of the early literature focused on the validation of virtual simulation, comprising single-intervention studies assessing the validity of newly developed technologies amongst novices and experts. Many studies demonstrated the face, content and construct validity of these educational tools, and this set the groundwork for the widespread interest in virtual simulation [11,12]. As such, recent studies have focused on comparing the educational impact of virtual simulation to traditional teaching methods, such as lectures, textbooks, 2D images and 3D models [13,14]. These studies concluded that virtual simulation is at least as effective as, and potentially more effective than, traditional teaching methods. When taking these findings into consideration with the advantages offered by XR technologies, the potential value of these educational tools is significant.
However, there are multiple gaps within the literature, which require further exploration. First and foremost, there is a predominance of studies examining the utility of virtual simulation for skill acquisition, with few studies examining its impact on knowledge acquisition [4,9]. Furthermore, studies have typically overlooked the experience of the user within the evaluation of interventions. Although this information represents the most rudimentary level of evaluation, it provides essential information regarding learners’ acceptance of, and willingness to use, new approaches, and hence, the ability to integrate technology into the curriculum. Lastly, there is a lack of studies comparing different approaches to virtual simulation, to explore whether different technological designs play a role in the learning potential offered. With the expanding availability of new technologies on the market, this lack of evidence makes it challenging for educators to make informed decisions as to which technologies to implement.
Therefore, the aim of this pilot study was to compare the knowledge acquisition and retention, as well as user experience, among medical students during an immersive VR simulation and a computerized VP simulation. The results of this study provide an indication of whether different approaches to virtual simulation result in different degrees of learning and acceptance amongst users and can be used to inform a larger-scale study. This information offers a step towards enabling medical educators to make an informed decision regarding the selection of which virtual simulator to apply in their context.
This randomized controlled trial assessed and compared knowledge acquisition and retention and user experience for VR simulation and VP simulation. This report has been written in accordance with the adapted CONSORT guidelines extended for both randomized crossover trials and simulation-based research [15,16].
1. What impact do different forms of virtual simulation have on knowledge acquisition and retention in medical students?
2. What impact do different forms of virtual simulation have on user experience in medical students?
This pilot study was conducted at Hull York Medical School (HYMS), United Kingdom, between January 2020 and December 2021. All fourth- and fifth-year medical students were eligible for inclusion and invited to participate by e-mail. The HYMS Ethics Committee approved this study (19-43), and informed consent was obtained from all participants.
A two-intervention, randomized, crossover design was adopted, allowing participants to act as their own control group. A washout period was not deemed necessary, as the interventions addressed distinct clinical conditions, which were measured independently of each other. As such, the risk of carryover effect was deemed to be minimal. As well as the clinical justification, this decision was also supported on statistical grounds.
Participants were allocated to sequence 1, VR simulation followed by VP simulation, or sequence 2, VP simulation followed by VR simulation (Figure 1). A computer-generated random code determined participant sequence allocation in a 1:1 ratio, which was concealed until after recruitment. However, one participant swapped sequence allocation, due to initial technical difficulties with the VR simulator, experienced at the time of data collection. Recruitment and sequence allocation were completed by the investigators. Blinding was not feasible in this study, given the nature of the intervention. However, as outcomes were objectively measured, the impact of this is likely to have been marginal.
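The 1:1 sequence allocation described above can be illustrated with a minimal sketch. It is not the investigators' actual procedure: the function name `allocate_sequences`, the sequence labels and the seed are all hypothetical, and the sketch simply shuffles a balanced list of labels.

```python
import random

def allocate_sequences(participant_ids, seed=None):
    """Randomly assign participants to two crossover sequences in a 1:1 ratio.

    Hypothetical helper for illustration only; the labels below stand for
    sequence 1 (VR then VP) and sequence 2 (VP then VR).
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    n = len(ids)
    # Build a balanced list of sequence labels, then shuffle it.
    labels = ["VR->VP"] * (n // 2) + ["VP->VR"] * (n - n // 2)
    rng.shuffle(labels)
    return dict(zip(ids, labels))

# 18 participants, as in this study; the seed is arbitrary.
allocation = allocate_sequences(range(1, 19), seed=2020)
```

Shuffling a pre-balanced list guarantees equal group sizes for an even number of participants, which simple coin-flip allocation would not.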
Participants completed two simulation scenarios: diabetic ketoacidosis (S1) and sepsis secondary to pneumonia (S2). S1 was completed using VR simulation, developed by Oxford Medical Simulation © (London, United Kingdom), and consisted of an HMD (Oculus Rift) and a hand-held controller. S2 was completed using VP simulation, developed by BodyInteract™ (Coimbra, Portugal), and consisted of a large interactive touch-screen table. Both scenarios were developed by the respective software developers and the investigators were not involved in the development of the scenarios. All participants received a standardized introduction to operating both interventions, including the opportunity to run practice scenarios. Simulation scenarios were situated in the emergency department and participants were tasked with reviewing an acutely unwell patient using the ABCDE approach. Participants were allocated 15 minutes per scenario, though they could stop the scenario if they felt they had completed all necessary tasks. Both scenarios were independently completed by participants and immediately followed by a 10-minute standardized individual debrief with a facilitator. The in-built self-debriefing tools offered by both interventions were not utilized, as they provided different approaches to debriefing, which could have impacted the learning experience. As these are optional features, not employed by all educators, the impact of the core features of the two interventions on learning was explored in this study. Different facilitators were used across participants, though each participant was allocated the same facilitator for both interventions. The detailed lesson plan provided to facilitators is available in Supplementary File 1.
Data regarding knowledge acquisition and retention were collected using 30 peer-reviewed and piloted multiple-choice questions (MCQs), comprising 15 questions per intervention. MCQs only covered clinical knowledge that could be acquired during the virtual simulations (Supplementary File 2). Assessments were completed in a pre- and post-test format, as well as after 12 weeks, to measure baseline knowledge, knowledge acquisition and knowledge retention. The same questions, presented in a different order using a computer-generated random code, were used in each instance of testing. The pre- and post-tests were completed under exam conditions. However, due to the COVID-19 pandemic, the retention test was undertaken remotely, though participants were encouraged to comply with exam conditions. Data regarding user experience were collected using questionnaires comprising 10 five-point Likert-scale questions and four open-ended questions (Supplementary File 3). Identical questionnaires were provided for both interventions and completed alongside the post-test MCQs.
An a priori sample size calculation was performed. A minimum of 18 students was necessary to detect a between-group minimal detectable difference of 1.1, with a 0.05 two-tailed significance level and 80% power. The crossover design was considered in the sample size calculation.
Scores regarding knowledge acquisition and retention were assessed using Student’s paired t-test. Knowledge acquisition was assessed through comparison of pre- and post-test scores, whereas knowledge retention was assessed through comparison of pre- and retention test scores. Within-intervention and between-intervention testing was performed to establish the learning effect offered by each intervention, as well as a comparison between interventions. The adoption of parametric methods is justified by the small sample size of this study. However, non-parametric methods, specifically the Wilcoxon signed-rank test, were used to verify the results. No interim analyses were completed. The analysis was conducted using the statistical program Stata 16 (2019; StataCorp LP, College Station, TX).
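For readers unfamiliar with this kind of calculation, the sketch below shows one common normal-approximation formula for a paired (within-participant) comparison. It is not necessarily the method used in this study: the function name `paired_sample_size` is hypothetical, and the standard deviation of the paired differences (`sd_diff`) is an assumed value, as the source does not report it. The result should therefore not be expected to reproduce the reported minimum of 18.

```python
import math
from statistics import NormalDist

def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.80):
    """Approximate number of participants for a two-sided paired comparison.

    Uses the normal approximation n = ((z_alpha + z_beta) * sd_diff / delta)^2,
    which slightly underestimates the t-based requirement for small samples.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return math.ceil(((z_alpha + z_beta) * sd_diff / delta) ** 2)

# delta = 1.1 comes from the text; sd_diff = 1.6 is an ASSUMED, illustrative value.
n = paired_sample_size(delta=1.1, sd_diff=1.6)
```

The required sample size grows with the square of `sd_diff / delta`, so the assumed variability of the differences dominates the result.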
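The paired comparison described above can be sketched as follows. The analysis in this study was run in Stata; this standard-library Python version is only an illustration. The function name `paired_t` is hypothetical, the scores are made-up values rather than study data, and the critical value 2.110 is the two-sided 95% value of Student's t for 17 degrees of freedom taken from standard tables.

```python
import math
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 17 degrees of freedom
# (18 paired observations), from standard t tables.
T_CRIT_DF17 = 2.110

def paired_t(pre, post, t_crit):
    """Return the mean difference, t statistic and confidence interval for paired scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    t_stat = d_bar / se
    ci = (d_bar - t_crit * se, d_bar + t_crit * se)
    return d_bar, t_stat, ci

# Made-up pre- and post-test scores out of 15 for 18 participants (NOT study data).
pre_scores = [7, 9, 5, 8, 10, 6, 11, 7, 8, 4, 9, 12, 6, 8, 10, 7, 5, 9]
post_scores = [10, 11, 9, 11, 12, 10, 12, 11, 10, 9, 12, 13, 10, 11, 12, 10, 9, 12]

d_bar, t_stat, ci = paired_t(pre_scores, post_scores, T_CRIT_DF17)
```

A between-intervention comparison follows the same pattern, with each participant's score change under one intervention paired against their change under the other.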
Participants’ responses to the Likert scale were converted to numerical values with strongly disagree equating to 1 and strongly agree equating to 5. Values were subsequently totalled and analysed using means and Student’s paired t-test. An inductive thematic analysis of the four open-ended questions was undertaken using the approach described by Braun and Clarke. The process was completed independently by two investigators, using the following six steps: 1. Becoming familiar with the data, 2. Generating codes, 3. Generating themes, 4. Reviewing themes, 5. Defining and naming themes and 6. Producing the report. The investigators met after each step to compare notes and align their approach, before continuing to the next step.
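The Likert conversion can be illustrated with a short sketch. Only the endpoints (strongly disagree = 1, strongly agree = 5) are stated in the text; the intermediate labels and the helper name `score_questionnaire` are assumptions made for illustration.

```python
from statistics import mean

# Endpoint values follow the text; the three intermediate labels are assumed.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def score_questionnaire(responses):
    """Convert one participant's Likert responses to numbers; return (total, mean)."""
    values = [LIKERT[response.strip().lower()] for response in responses]
    return sum(values), mean(values)

# Hypothetical responses to a 10-question questionnaire (NOT study data).
example = ["Strongly agree", "Agree", "Agree", "Strongly agree", "Agree",
           "Strongly agree", "Neutral", "Agree", "Strongly agree", "Agree"]
total, average = score_questionnaire(example)
```

Per-participant scores for the two modalities can then be compared with a paired t-test, as described for the knowledge outcomes.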
A total of 18 students participated in this study. Ten participants were male, whilst eight were female. Moreover, eight participants were fourth-year medical students, and ten participants were fifth-year medical students. All 18 participants completed the pre-, post- and retention tests, with no loss to follow-up. Figure 2 shows the flow of the participants throughout the study.
The mean pre-test, post-test and retention scores for the two interventions are outlined in Figure 3. An increase in knowledge from pre- to post-test, with minimal decline at retention testing, was observed for both interventions. A wider range of test scores was observed for VR simulation, compared to VP simulation. Moreover, participant scores were higher for VP simulation, though the maximum score of 15 was not achieved.
The mean pre-test score for VR simulation was 7.9 (2.6) and the mean post-test score was 10.9 (1.7), with a mean score difference of 3.05 (95% CI = 1.93, 4.16, p = 0.001). The mean pre-test score for VP simulation was 11.4 (1.3) and the mean post-test score was 12.3 (1.2), with a mean score difference of 0.94 (95% CI = 0.36, 1.52, p = 0.0031). A comparison of score differences for knowledge acquisition between interventions demonstrated a mean difference of 2.11 (95% CI = 0.89, 3.32, p = 0.0019), in favour of VR simulation.
The mean retention test score for VR simulation was 9.9 (2.6), with a score difference between the pre-test and retention test scores of 2.05 (95% CI = 1.19, 2.19, p = 0.001). The mean retention test score for VP simulation was 12.2 (1.5), with a score difference between the pre-test and retention test scores of 0.83 (95% CI = 0.12, 1.54, p = 0.024). A comparison of score differences for knowledge retention between interventions revealed a mean difference of 1.22 (95% CI = 0.16, 2.28, p = 0.026), in favour of VR simulation.
Non-parametric methods (Wilcoxon signed-rank test) were used to verify the above results, and no differences between results were identified.
Participants generally responded positively to both modalities on the Likert-scale questions (Table 1), though mean scores were higher for VR simulation across all 10 questions. This reached statistical significance in questions 1, 2 and 4. The mean score for VR simulation was 4.5 (0.3), whereas the mean score for VP simulation was 4.0 (0.3). The mean difference in scores was 0.5 (95% CI = 0.4, 0.6, p < 0.001), indicating that participants significantly preferred VR simulation.
| Question | VR Sim | VP Sim | p-value |
|---|---|---|---|
| 1. The session was enjoyable. | 4.7 (0.4) | 4.3 (0.6) | 0.015* |
| 2. The scenario was realistic to clinical practice. | 4.4 (0.6) | 3.7 (1.0) | 0.018* |
| 3. The equipment was easy to use. | 4.2 (0.8) | 3.8 (1.1) | 0.235 |
| 4. The scenario was clear and easy to follow. | 4.7 (0.5) | 4.1 (1.0) | 0.026* |
| 5. The simulation modality is a valuable teaching tool at my stage in training. | 4.6 (0.7) | 4.1 (0.7) | 0.077 |
| 6. The session increased my knowledge of assessing and managing patients with the condition covered. | 4.7 (0.5) | 4.2 (1.1) | 0.098 |
| 7. This form of simulation could be used as an assessment tool. | 3.8 (1.3) | 3.4 (1.4) | 0.348 |
| 8. I would like scheduled teaching sessions using this modality. | 4.4 (0.7) | 4.1 (1.0) | 0.365 |
| 9. I would use this simulation modality within my self-directed learning time. | 4.7 (0.5) | 4.3 (1.1) | 0.204 |
| 10. I would recommend this simulation modality to my peers. | 4.6 (0.8) | 4.0 (1.2) | 0.107 |
*Statistically significant p-values.
Question 1 (The session was enjoyable) was the highest scoring question for both interventions, whereas question 7 (This form of simulation could be used as an assessment tool) was the lowest scoring question. Question 2 (The scenario was realistic to clinical practice) demonstrated the largest difference in mean scores between interventions.
Thematic analysis of the open-ended questions revealed two themes and five subthemes. The two main themes were ‘education’ and ‘technology’. Education referred to the pedagogical aspects of the virtual simulations and was further divided into subthemes of ‘application’, ‘knowledge and skills’ and ‘value’. Technology focused on the technical specifications of the simulations and was further divided into subthemes of ‘software’ and ‘fidelity’. The themes, subthemes and codes summarizing participant comments for the two simulators are presented in Table 2.
|Themes||Subthemes||Codes VR||Codes VP|
-Applicable across course
-Later year groups
-Applicable across course
-Support clinical transition
|Knowledge and skills||-A to E assessment
-Lacking clinical reasoning
|-A to E assessment
-Needs time to adapt
-Adverse effects of VR
-Quick response times
-Lacking haptic feedback
Identical codes across both modalities are highlighted in bold.
This pilot study compared the learning effect of two forms of virtual simulation: immersive VR simulation and computerized VP simulation. Overall, the results indicate that both interventions are effective educational tools for enhancing medical students’ knowledge of the assessment and management of acutely unwell patients. Participants demonstrated a statistically significant increase in knowledge acquisition and retention, suggesting a sustained learning effect was achieved. These results corroborate the findings of previous studies evaluating the educational effectiveness of virtual simulation [21,22]. Although knowledge remained significantly higher compared to the pre-test, suggesting an overall knowledge gain, participants did demonstrate a degree of knowledge decay in the retention test for both interventions. This reflects the learning curve observed within previous studies exploring knowledge-based outcomes associated with simulation training. Hence, this study supports the need for curriculum sequencing of simulation training, whereby simulation scenarios build upon, and further develop, previously taught content, instead of presenting new and isolated concepts.
Comparison of the learning effect between the two interventions revealed a statistically significant difference in knowledge acquisition and retention between immersive VR simulation and computerized VP simulation. This indicates that different approaches to virtual simulation can result in different levels of learning amongst medical students. A potential explanation for this could be the technological differences between the two virtual simulators. VR simulation offers participants an immersive experience, using an HMD to transport the learner to a virtual world, and thus shields them from reality. In contrast, VP simulation offers participants a computerized experience, whereby the scenario is accessed through a screen, with the learner remaining in the real world. Previous studies have identified immersion as an important element for learning through virtual simulation, which may provide an explanation of the findings presented here. Immersion is argued to generate a higher level of presence within participants, whereby they perceive themselves to be located within the virtual world. Higher levels of presence are believed to enhance participants’ engagement with the content of the simulation and result in increased levels of learning. As such, the immersive experience offered by VR simulation may have generated a higher level of presence and engagement within participants, leading to the observed difference in knowledge acquisition and retention. This is further supported by the findings of the user experience questionnaire, whereby participants found VR simulation more engaging, realistic and enjoyable.
Participants appeared to respond positively to both interventions, though there was a significant preference for the VR simulation, as demonstrated by the Likert-scale responses. Moreover, there was an observed difference in perceived applications of the two interventions. Immersive VR simulation was deemed to be more applicable to senior medical students, with a potential use as a revision tool, whilst computerized VP simulation was deemed more applicable to the middle years of undergraduate medicine, as a tool to support the transition from preclinical to clinical years. This could reflect the different levels of expertise amongst medical students, and the increased cognitive load immersive VR simulation places upon the user. Given the wider experience of fifth-year medical students, they have a higher level of expertise to draw upon during simulation training. As a result, they could be more able to cope with the increased cognitive demands immersive VR simulation places upon them, and perhaps use the realistic experience to further enhance their learning. In contrast, less experienced medical students may find the process stressful and overwhelming, which can be detrimental to their learning. Instead, the computerized VP simulation may provide a better framework for learners developing their ability to apply theoretical knowledge to clinical practice. As such, further research measuring participants’ cognitive load across different forms of virtual simulation and comparing the knowledge acquisition across junior and senior medical students is warranted.
There were also similarities within the possible applications of both interventions. Participants perceived both interventions to be a suitable educational tool for self-directed learning. Such use could readily be supported by the self-debriefing tools offered by both simulators, allowing users to receive personalized feedback, automatically generated from their performance. This creates the possibility of a new application of simulation, whereby learners book the virtual simulation equipment to independently practice clinical scenarios, widening the accessibility of simulation training. Furthermore, as equipment capable of accessing virtual environments becomes more widespread, the remote delivery of virtual simulation training could become a possibility. Conversely, participants were less responsive to the possibility of using either intervention as an assessment tool. This could represent the novelty of virtual simulation, and hence, the uncertainty of how these tools could be utilized as a form of assessment. However, virtual simulation as a self-directed learning resource or an assessment tool are novel applications of these educational tools, and further investigation is warranted.
Importantly, participants identified several shortcomings for both interventions. These findings are important as they can inform the development and implementation of different approaches to virtual simulation. Immersive VR simulation was deemed to be lacking the ability to train clinical reasoning skills, whereby choices were predetermined and inflexible and participants were not given the opportunity to select or interpret clinical findings. This resulted in a more passive learning experience. Additionally, some participants experienced adverse effects of the VR technology, a previously identified issue with VR simulation. For computerized VP simulation, there was a recurring issue with the use of American terminology and units, which participants struggled to convert to UK equivalents. Moreover, the simulator was thought to be too prompting and the interface was deemed to be overly complicated. Such shortcomings highlight key limitations of the user experience, which have the potential to negatively affect participants’ willingness to use these technologies, as well as their learning experience. Hence, these issues should be considered in the future development of virtual simulators. Furthermore, a lack of communication skills training was highlighted as an issue for both interventions, with both technologies unable to incorporate this into the simulations. This limits the utility of these forms of virtual simulation for the training of soft skills, such as communication, leadership and teamwork, which are widely accepted as essential skills for clinicians. As such, the potential learning objectives that can be attained through virtual simulation appear to be more restricted than those of alternative forms of simulation. Such insights should be used to inform educators in the implementation of novel simulation technologies.
The small sample size, recruited from a single institution, limits the generalizability of the results. Moreover, although a pre-test/post-test design was adopted to provide baseline data, differences in participants’ familiarity with the clinical conditions selected for the simulations could have impacted the learning process, affecting the change in test scores from pre- to post-tests. This difference in familiarity may also have influenced participants’ differential preference across the two modalities. A further limitation was the self-developed data collection tools employed within this study and the lack of psychometric data supporting their use. Finally, the same MCQs, presented in a different order, were used at all three instances of testing, increasing the risk of participant preparation between the post-test and retention test.
This pilot study indicates that both immersive VR simulation and computerized VP simulation are effective educational tools that are well received and accepted by participants. However, immersive VR simulation appeared to be more effective for both short-term and long-term learning and was significantly preferred by participants. These findings could support the important role immersion plays in learning, through its creation of a more engaging experience. This was supported by participants’ qualitative responses, where the immersive VR simulator was described as more enjoyable, realistic and engaging.
The difference in learning effect between these two interventions is pertinent. It suggests learning is not uniform across virtual simulation and the proven efficacy of one device cannot be equally applied to all. This study provides a step towards enabling medical educators to make an informed decision regarding the implementation of virtual simulation in their context, as well as guiding developers in the important design features underpinning user experience. The findings of this study can be used to inform a larger study, to verify the results across diverse contexts and evaluate the ability of virtual simulation to meet the varied learning needs present within medical education.
Supplementary data are available at The International Journal of Healthcare Simulation online.
The authors would like to pay special thanks to all of the participants and the following: Dr Christa Brew, Dr Samuel Chumbley, Miss Bethany Cosway, Dr Rosa Maeve McGing, Dr Azhar Merchant, Mr Andrew Murphy-Pittock, Dr Emily Ratford, Dr Harriet Van Den Tooren, Mr Stuart Wall, Ms Fiona Ware, Dr Zoe Wellbelove and Dr Joshua York.
KB and LB, as project co-leads, were equally involved in all aspects of this study, from study design and data collection to data analysis and preparation of the manuscript. TS, ASR and DH supported the undertaking of this project and provided significant contribution to all steps. All authors approved the final version of this research report submitted for publication and accept full accountability for its contents.
No funding was received for this project.
Raw data and materials underpinning the results of this study are available upon request from the corresponding author.
The HYMS Ethics Committee approved this study (19-43), and informed consent was obtained from all participants.
The authors declare that there is no conflict of interest.
A study protocol was prepared prior to the undertaking of this pilot study and is available upon request from the corresponding author.