Virtual reality suturing task as an objective test for robotic experience assessment



We performed a pilot study using a single virtual-simulation suturing module as an objective measurement to determine functional use of the robotic system. This study will assist in designing a study for an objective, adjunctive test for use by a surgical proctor.


After IRB approval, subjects were recruited at a robotic renal surgery course to perform two attempts of the “Tubes” module without warm-up using the Da Vinci® Surgical Skills Simulator™. The overall MScore (%) from the simulator was compared among various skill levels to provide construct validity. Correlation with MScore and number of robotic cases was performed and pre-determined skill groups were tested. Nine metrics that make up the overall score were also tested via paired t test and subsequent logistic regression to determine which skills differed among experienced and novice robotic surgeons.


We enrolled 38 subjects with experience ranging from 0 to >200 robotic cases. Median time to complete both tasks was less than 10 min. The MScore on the first attempt was correlated with the number of previous robotic cases (Spearman's rho = 0.465; p = 0.003). MScore differed between novice and more experienced robotic surgeons on the first (44.7 vs. 63.9; p = 0.005) and second attempt (56.0 vs. 69.9; p = 0.037).


A single virtual simulator exercise can provide objective information in determining proficient use of the robotic surgical system.


Surgical training has met immense challenges from the rapid growth of minimally invasive surgery (MIS) across all surgical specialties, compounded by limitations in training hours [1, 2]. In particular, robotic surgery has expanding indications in many surgical specialties [3]. As with diffusion of any new technology, early adoption of robotic surgery was associated with adverse patient outcomes [4, 5]. Robotic simulation may improve the learning curve and may also improve the operative characteristics of surgeons with simulation training [6–9].

Robotic surgery encompasses new challenges in assessing skill, proctoring, and credentialing of surgeons [10]. Proctoring is an essential patient safety component of surgeon privileging for specific operations, including the use of new technology. However, the proctor has limited guidelines, assessment parameters, or objective tools to assess other surgeons and their comfort level with new technology [11].

Objective testing of the adequate use of a robotic surgical system may provide added information for institutions and proctors regarding specific surgeons’ comfort level and competency with the robotic equipment prior to live patient operative experience. We investigated whether the overall score calculated from one advanced module (“Tubes”) on the Da Vinci® Surgical Skills Simulator™ (DVSSS, Intuitive Surgical, Inc., Sunnyvale, CA, USA) is associated with the number of previous robotic cases and assumed comfort with using the robotic system safely.


Participants and setting

University of California San Diego Institutional Review Board reviewed and approved the ethical conduct of the study (IRB#: 130298). After IRB approval and informed consent, participants enrolled in an American Urological Association-sponsored “Hands on Robotic Renal Surgery” course on May 3, 2013 were asked to perform the “Tubes” training module on the DVSSS twice at one sitting. Each participant was given a number for confidentiality in analysis and was asked to record the number of previous robotic cases performed as primary surgeon. This identification number was entered into the simulator as their user name. After watching the instructional video at the beginning of the module, the participant was not given any other direction. The results from the individual components and the overall score were recorded within the software and retrieved after all participants completed the task.


We used the DVSSS, which incorporates the Mimic software program (Mimic Technologies, Inc. Seattle, WA, USA) to provide an MScore™ developed from individual skill metrics (Table 1, Fig. 1a and b). The MScore™ and metric percentages were developed from the mean and standard deviation of 100 robotic surgeons who have completed at least 75 robotic cases (similar to the Fundamentals of Laparoscopic Surgery FLS™ protocol) to facilitate credentialing and privileging. The DVSSS uses the actual Da Vinci Si® surgical robotic console (Intuitive Surgical, Inc., Sunnyvale, CA, USA) with the video cable to the simulation pack that fits on the posterior aspect of the console (Fig. 1a). The “Tubes” module is a virtual simulation of a suturing task mimicking an anastomosis of two tubular structures and has been validated in previous studies [12]. The simulator and software have been evaluated favorably for face, content, construct, and concurrent validity, though they were found to be limited in predictive validity [6, 13–15].

Table 1 Metric definitions
Fig. 1 a: Da Vinci® Surgical Skills Simulator™. b: Tubes module. c: MScore evaluation score sheet

Study design and data collection

We performed an observational construct validity study of urologic surgeons and residents at the AUA course described above. The one-day course consisted of didactic learning regarding technique, simulation of robotic tasks, and hands-on robotic porcine laboratory experience. Data were collected from the DVSSS in Microsoft Excel format, including all metrics for the first and second attempts at the Tubes module. The database was expanded to include the number of previous robotic cases. We categorized participants in two ways. The primary grouping separated novice robotic surgeons (0–10 cases) from intermediate/experienced robotic surgeons (>10 cases), based on previous studies [16]. The second grouping, used for sub-analysis, comprised residents (0 cases), novice robotic surgeons (0 cases), intermediate (1–49 cases), experienced (50–200 cases), and expert (>200 cases) surgeons [17, 18].
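The two groupings described above can be written down as simple rules. The following Python sketch is only an illustration under stated assumptions: the function names and the `is_resident` flag are hypothetical, and because residents and novice robotic surgeons both have 0 cases, training status is assumed to be recorded separately from the case count.

```python
def primary_group(cases: int) -> str:
    """Two-level grouping: novice (0-10 cases) vs. intermediate/experienced (>10)."""
    if cases < 0:
        raise ValueError("case count cannot be negative")
    return "novice" if cases <= 10 else "intermediate/experienced"


def sub_group(cases: int, is_resident: bool) -> str:
    """Five-level grouping used for the sub-analysis.

    Assumption: is_resident only matters at 0 cases, since the text lists
    residents and novice robotic surgeons both at 0 cases.
    """
    if cases < 0:
        raise ValueError("case count cannot be negative")
    if cases == 0:
        return "resident" if is_resident else "novice"
    if cases <= 49:
        return "intermediate"
    if cases <= 200:
        return "experienced"
    return "expert"


if __name__ == "__main__":
    # Hypothetical participants, not study data: (cases, is_resident)
    for cases, resident in [(0, True), (0, False), (12, False), (250, False)]:
        print(cases, primary_group(cases), sub_group(cases, resident))
```

Note that a surgeon with 1–10 cases is "novice" in the primary grouping but "intermediate" in the sub-analysis grouping, which follows from the ranges as stated.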


We hypothesized that the overall score on the virtual simulation module “Tubes” without warm up is able to distinguish novice surgeons from intermediate and experienced surgeons. Our primary outcome was the overall score on the advanced suturing virtual simulation “Tubes” module as a continuous variable. A secondary outcome was the difference in the scores between the first and second attempt, which we hypothesized would be more distinct in novice surgeons. We then examined the individual metrics to determine any trends that categorize common mistakes between novice and experienced robotic surgeons.

Statistical analysis

We investigated the correlation between the number of robotic cases and the overall MScore on the first and second attempts at the “Tubes” task on the simulator using Spearman's rho. Subsequently, we compared the mean overall MScore on the first attempt between novice robotic surgeons (0–10 cases) and more experienced robotic surgeons (>10 cases) using the t-test. Each sub-group’s (resident, novice, intermediate, and experienced) overall MScore was compared to the “expert” group scores using the Wilcoxon rank test with Benjamini and Hochberg adjustment. To investigate the amount of improvement from the first attempt to the second, a paired t-test was used, and the results are displayed in a barbell plot for each group. Secondarily, we compared each individual component metric that makes up the MScore between the first and second attempts using Student's t-test to determine differences in specific areas. Univariate and multivariate analyses were performed using logistic regression for the individual components of the overall score to determine in which areas the novice robotic surgeons perform poorly. Subsequently, we performed bidirectional stepwise multiple logistic regression to find the most significant of these metrics compared to the overall score. Finally, we attempted to determine a cut point between 50 % and 80 % that could be used as the passing MScore percent, which would maintain a difference between novice and more experienced robotic surgeons while maximizing the number of surgeons who would qualify as proficient to use the robot. All p values <0.05 were considered significant. Statistical analysis was performed using the R statistical package.
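As a minimal sketch of two of the procedures named above, the following Python code computes Spearman's rho (Pearson correlation of average ranks, so ties are handled) and applies the Benjamini and Hochberg adjustment to a list of p values. The study itself used the R statistical package; the function names and the example numbers here are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for pos, idx in enumerate(order):
        rank = m - pos  # 1-based rank of this p value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical case counts and scores, for illustration only.
    cases = [0, 0, 5, 12, 60, 250]
    scores = [30, 45, 40, 60, 70, 85]
    print(spearman_rho(cases, scores))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A rank-based correlation is a natural choice when case counts are heavily skewed, since a few very high-volume surgeons would otherwise dominate a linear fit.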

Results and discussion

We enrolled 38 subjects in the study with previous experience ranging from 0–2,000 robotic cases. All participants completed the “Tubes” module twice; for each attempt, raw values and software-calculated percent (%) scores were obtained for the overall score and the 9 individual metrics. The median time to complete the task was 4.5 (2.5–13.6) min on the first attempt and 3.9 (1.9–14.2) min on the second attempt. In addition to the overall score, the individual metrics that improved from the first attempt to the second were: use of the workspace, economy of motion, missed targets, and time to complete the task (Table 2).

Table 2 MScores

Educational experience consisted of residents (n = 9), novice robotic surgeons (n = 7), intermediate (n = 9), experienced (n = 7), and expert robotic surgeons (n = 6). Compared to expert surgeons, residents in training and novice robotic surgeons showed significantly lower overall MScores on the Tubes module on both attempts (first attempt p = 0.039 and second attempt p = 0.023) (Table 3). The median overall MScore on the first attempt was 47.5 (range 0–95) and on the second attempt was 62.5 (range 2–98). The MScore on the first attempt was correlated with the number of previous robotic cases (Spearman correlation 0.465; p = 0.003) (Fig. 2a). On the second attempt the correlation was no longer significant (Spearman 0.200; p = 0.228) (Fig. 2b). Significant differences in the mean MScores were noted comparing the novice robotic surgeons and surgeons with some robotic experience on the first (44.7 vs. 63.9; p = 0.005) and second attempt (56.0 vs. 69.9; p = 0.037). The overall scores improved on the second attempt by 8.82 %, and the novice group improved to a greater extent than the experienced group (11.4 % vs. 6 %; p = 0.012). We graphed each individual subject’s first and second attempts and connected each score with a line to display trends of improvement within each sub-group of experience level (Fig. 3). In this figure, novices and residents without attending-level experience on the robot nearly unanimously improved on the second attempt. The other levels of robotic experience seem to be less predictable, possibly due to their particular robotic expertise or experience with this particular simulator.

Table 3 Overall MScores based on experience level
Fig. 2 a and b: Comparison of MScore and number of robotic cases in the first (a) and second (b) attempt. Scatter plot and regression line comparing the first attempt overall MScore on the Tubes module to the number of previous robotic cases performed as the attending surgeon. The dark grey shaded area represents the 95 % confidence interval of the linear regression line. The statistical analysis uses Spearman's rho

Fig. 3 Barbell plots comparing the first and second attempts on the simulator per subject in each category of experience. The first dot corresponds to the individual subject’s first attempt at the “Tubes” task on the robotic simulator. The second dot to the left is the second attempt for that individual subject. Lines of varying color connect the two dots for each subject

In order to identify metrics most influential to the overall MScore, we performed logistic regression and identified that experienced surgeons had more out of view penalties if adjusting for missed targets (p = 0.014) on the first attempt despite having higher overall scores. On the second attempt, time (p = 0.014) and missed targets (p = 0.004) were the most significant factors between the novice and experienced groups (Table 4).

Table 4 Experience and individual metrics of the overall MScore

Therefore, the “Tubes” simulator module within the DVSSS does have construct validity to determine if the subject has performed more than 10 cases previously. The virtual simulation task, therefore, may be useful as an objective assessment of proficient use of the robotic console, defined as basic functional use of the robot system (not surgical proficiency). Limiting the test to only one difficult virtual reality simulation may limit the amount of information obtained; however, the test can be performed quickly (approximately 5 min) and efficiently.

A previous study has suggested that the use of virtual reality robotic simulation may serve as an assessment tool in a variety of settings [19]. We tested a wide range of robotic surgical experience to determine if one task or module (“Tubes”) could have the ability to provide assessment value in proctoring in a future study. Proctoring requires another surgeon to assess the new surgeon’s ability to perform a particular surgery and report to the credentialing authority [20]. The proctor’s prior training, experience and ability to judge competency may be highly variable; however, the proctor does have the authority and responsibility to recommend further training prior to a surgeon being given unrestricted privileges to perform robotic surgery [10, 20]. Therefore, introducing an objective measure to assess the surgeon’s comfort with the robotic system may be helpful to identify those surgeons who may need to take part in a standardized robotic curriculum prior to robotic privileging.

The simulator can identify particular tasks on which the user may need additional practice or training, making the test a learning opportunity (Fig. 1). Metrics such as workspace utilization, economy of motion, missed targets and time are all important factors in the overall MScore and can provide an opportunity for self-reflection and improvement. Not to be understated, the more experienced group of robotic surgeons had more “out of view” errors. More experienced surgeons may be moving robotic arms out of the field, potentially causing safety concerns to which a common reaction would be: “I know where the arm is.” Even experienced surgeons should use the simulation as an opportunity for improvement. Another scenario would be if a novice scored exceptionally well on the MScore tasks. The surgeon would still go through the usual proctoring and prove the ability to troubleshoot the robotic system, but may not need additional mentored robotic console training.

The simulator can provide these metrics; however, in order to incorporate them into proctoring, a benchmark needs to be set to provide assessment. Many proficiency based surgical curricula and training programs are based on pass rates of 80 %–91 % to progress to the next level [9, 21]. We then identified that an overall MScore of 75 % would serve as the lowest possible score that could still distinguish surgeon experience even if the surgeon is granted a second attempt. On first attempt only 1 of 20 (5 %) novice surgeons were able to pass compared to 8 of 18 (44 %) surgeons with >10 robotic cases (p = 0.001). When given a second attempt, 2 additional novices (15 %) and 1 additional surgeon with some robotic experience (50 %) were able to pass with maintenance of statistically significant difference between the groups (p = 0.020). Overall, 23 failed both attempts and 6 passed both attempts, 3 passed on the first but not second and 6 failed on first and passed on second attempt. We emphasize the contrast in distinguishing familiarity with the robotic console and basic operation, not surgical proficiency. The cutoff values are arbitrary but should be consistent to compare surgeons to their peers and provide baseline proficiency.
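The pass rates quoted above follow directly from the reported counts, and the four-way tally can be checked against the 38 enrolled subjects. The Python sketch below does that arithmetic and, as an assumption, also runs a two-sided Fisher exact test on the first-attempt counts. The text does not state which test produced the reported p values, so the result need not match them. All function names are hypothetical.

```python
from math import comb


def pass_rate(passed: int, total: int) -> float:
    """Percentage of subjects at or above the cut point."""
    return 100.0 * passed / total


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one, with the margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Counts reported in the text for the first attempt at the 75 % cut point.
novice_pass, novice_total = 1, 20
experienced_pass, experienced_total = 8, 18

# Four-way tally reported in the text.
failed_both, passed_both, first_only, second_only = 23, 6, 3, 6

if __name__ == "__main__":
    print(round(pass_rate(novice_pass, novice_total)))            # novice pass rate, %
    print(round(pass_rate(experienced_pass, experienced_total)))  # experienced pass rate, %
    print(failed_both + passed_both + first_only + second_only)   # total subjects
    # Assumed test, not necessarily the one used in the study:
    print(fisher_exact_two_sided(novice_pass, novice_total - novice_pass,
                                 experienced_pass, experienced_total - experienced_pass))
```

The first-attempt passers in the tally (passed both plus passed first only) equal the sum of the two group counts, which is a quick internal consistency check on the reported numbers.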

If a surgeon “fails” their first attempt, the natural inclination is to try again. Therefore, we had the subjects perform the task a second time to determine improvement levels, as described above. We found that both the novice and experienced groups were able to improve their overall MScore on the second attempt, with the novice group able to improve to a greater extent. The improvement suggests that the simulator does have a learning curve and that repeated attempts may improve the virtual reality score, which may in turn provide improved operative efficiency [9]. This turns the proctoring experience into an opportunity for improvement with specific recommendations on areas of focus. Based on the simulator results, the learner can be directed to a surgeon-specific training curriculum if needed.

The culture regarding operative safety has drastically improved in the last few decades with the use of safety checklists, pathways, and guidelines [22]. One component of safety that has not been investigated sufficiently is the incorporation of new technology and the surgeon. Previous studies suggest a reduction in errors may be achieved with the use of surgical simulation similar to flight simulation in the aviation industry [23]. Currently, simulators for surgical training, practice and warm-up are not widely utilized [24]. Barriers of cost, validation, and optimal specific simulators and tasks have hampered widespread adoption.

Our study is limited by the sample size and number of repeated measures. In addition, we relied on the surgeons to remember the number of robotic cases they have performed, which may be subject to recall bias. Of note, we did not ask about their previous experience with simulators. The study was developed in the context of a robotic training course, which introduces some bias regarding the participants. The novice surgeons and residents are not yet at the point of requiring robotic surgical privileges, although a range of skill was needed for the purposes of the study. Additionally, the use of surgical simulators has limitations in that the DVSSS is attached to the actual console and can only be used when the console is not in use for patient care. Similar software has been used in the Da Vinci Trainer (Mimic Technologies, Seattle, WA, USA) as a tabletop simulator that may be more mobile though may have less working space [12]. The simulator’s virtual reality environment is improving but continues to have low fidelity compared to actual human surgery. Therefore, simulator-based testing can only offer an assessment regarding operation of the robotic equipment and not surgical decision-making. We stress that this single simulator test is helpful, but may not be ready for widespread use and standardization. Subjects may not be familiar with the simulator and may need to perform practice sessions first; due to time constraints of the study, we selected one tool and performed it twice. The use of simulators should be rigorously studied and tested prior to incorporating them into credentialing, as was done for the Fundamentals of Laparoscopic Surgery examination for general surgeons [25, 26]. Proctors with robotic experience would be needed to evaluate actual robotic surgical proficiency. With the help of telemedicine, these opportunities may be offered to hospitals that do not have experienced robotic surgeons available [27].


A single virtual simulator exercise can provide objective information to assist surgical proctors in assessing the use of the surgical robot in addition to the usual proctoring process. This study supports further research regarding the proctoring process for robotic privileges and further incorporation of simulation into robotic skills testing. The results from the virtual simulation process may be used as a learning tool and guideline for individualized robotic curriculum to improve the surgeon’s efficiency on the robotic console.


  1. Chung RS, Ahmed N. The impact of minimally invasive surgery on residents' open operative experience: analysis of two decades of national data. Ann Surg. 2010;251:205–12.

  2. Antiel RM, Reed DA, Van Arendonk KJ, Wightman SC, Hall DE, Porterfield JR, Horvath KD, Terhune KP, Tarpley JL, Farley DR. Effects of duty hour restrictions on core competencies, education, quality of life, and burnout among general surgery interns. JAMA Surg. 2013;148:448–55.

  3. Anderson JE, Chang DC, Parsons JK, Talamini MA. The first national examination of outcomes and trends in robotic surgery in the United States. J Am Coll Surg. 2012;215:107–14. discussion 114-106.

  4. Mirheydar HS, Parsons JK. Diffusion of robotics into clinical practice in the United States: process, patient safety, learning curves, and the public health. World J Urology. 2013;31:455–61.

  5. Ellison EC, Carey LC. Lessons learned from the evolution of the laparoscopic revolution. Surg Clin North Am. 2008;88:927–41.

  6. Lendvay TS, Brand TC, White L, Kowalewski T, Jonnadula S, Mercer LD, Khorsand D, Andros J, Hannaford B, Satava RM. Virtual reality robotic surgery warm-up improves task performance in a dry laboratory environment: a prospective randomized controlled study. J Am Coll Surg. 2013;216:1181–92.

  7. Seymour NE, Gallagher AG, Roman SA, O'Brien MK, Bansal VK, Andersen DK, Satava RM. Virtual reality training improves operating room performance: results of a randomized, double-blinded study. Ann Surg. 2002;236:458–63. discussion 463-454.

  8. Lerner MA, Ayalew M, Peine WJ, Sundaram CP. Does training on a virtual reality robotic simulator improve performance on the da Vinci surgical system? J Endourol. 2010;24:467–72.

  9. Ahlberg G, Enochsson L, Gallagher AG, Hedman L, Hogman C, McClusky 3rd DA, Ramel S, Smith CD, Arvidsson D. Proficiency-based virtual reality training significantly reduces the error rate for residents during their first 10 laparoscopic cholecystectomies. Am J Surg. 2007;193:797–804.

  10. Zorn KC, Gautam G, Shalhav AL, Clayman RV, Ahlering TE, Albala DM, Lee DI, Sundaram CP, Matin SF, Castle EP, et al. Training, credentialing, proctoring and medicolegal risks of robotic urological surgery: recommendations of the society of urologic robotic surgeons. J Urol. 2009;182:1126–32.

  11. Sachdeva AK, Russell TR. Safe introduction of new procedures and emerging technologies in surgery: education, credentialing, and privileging. Surg Clin North Am. 2007;87:853–66.

  12. Liss MA, Abdelshehid C, Quach S, Lusch A, Graversen J, Landman J, McDougall EM. Validation, correlation, and comparison of the da Vinci trainer and the daVinci surgical skills simulator using the Mimic software for urologic robotic surgical education. J Endourol. 2012;26:1629–34.

  13. Finnegan KT, Meraney AM, Staff I, Shichman SJ. da Vinci Skills Simulator construct validation study: correlation of prior robotic experience with overall score and time score simulator performance. Urology. 2012;80:330–5.

  14. Kenney PA, Wszolek MF, Gould JJ, Libertino JA, Moinzadeh A. Face, content, and construct validity of dV-trainer, a novel virtual reality simulator for robotic surgery. Urology. 2009;73:1288–92.

  15. Lendvay TS, Casale P, Sweet R, Peters C. VR robotic surgery: randomized blinded study of the dV-Trainer robotic simulator. Stud Health Technol Inform. 2008;132:242–4.

  16. Perrenot C, Perez M, Tran N, Jehl JP, Felblinger J, Bresler L, Hubert J. The virtual reality simulator dV-Trainer® is a valid assessment tool for robotic surgical skills. Surg Endosc. 2012;26:2587–93.

  17. Seixas-Mikelus SA, Kesavadas T, Srimathveeravalli G, Chandrasekhar R, Wilding GE, Guru KA. Face validation of a novel robotic surgical simulator. Urology. 2010;76:357–60.

  18. van der Meijden OA, Broeders IA, Schijven MP. The SEP “robot”: a valid virtual reality robotic simulator for the Da Vinci Surgical System? Surg Technol Int. 2010;19:51–8.

  19. Lee JY, Mucksavage P, Kerbl DC, Huynh VB, Etafy M, McDougall EM. Validation study of a virtual reality robotic simulator–role as an assessment tool? J Urology. 2012;187:998–1002.

  20. Livingston EH, Harwell JD. The medicolegal aspects of proctoring. Am J Surg. 2002;184:26–30.

  21. Zhang N, Sumer BD. Transoral robotic surgery: simulation-based standardized training. JAMA Otolaryngol. 2013;139:1111–7.

  22. de Vries EN, Prins HA, Crolla RM, den Outer AJ, van Andel G, van Helden SH, Schlack WS, van Putten MA. Effect of a comprehensive surgical safety system on patient outcomes. New England J Med. 2010;363:1928–37.

  23. Henriksen K, Battles JB, Marks ES, et al. Advances in Patient Safety: From Research to Implementation (Volume 4: Programs, Tools, and Products). Rockville (MD): Agency for Healthcare Research and Quality (US); 2005.

  24. Liss MA, McDougall EM. Robotic surgical simulation. Cancer. 2013;19:124–9.

  25. Derossis AM, Fried GM, Abrahamowicz M, Sigman HH, Barkun JS, Meakins JL. Development of a model for training and evaluation of laparoscopic skills. Am J Surg. 1998;175:482–7.

  26. Vassiliou MC, Dunkin BJ, Marks JM, Fried GM. FLS and FES: comprehensive models of training and assessment. Surg Clin North Am. 2010;90:535–58.

  27. Ereso AQ, Garcia P, Tseng E, Dua MM, Victorino GP, Guy LT. Usability of robotic platforms for remote surgical teleproctoring. Telemed J and E-health. 2009;15:445–53.



We would like to acknowledge Intuitive Surgical, Inc. for assistance in obtaining the data from the surgical simulators that were provided for the robotic surgical course.

Author information


Corresponding author

Correspondence to Ithaar H. Derweesh.

Additional information

Competing interests

Dr. Christopher J. Kane is a consultant for Intuitive Surgical, Inc. Dr. Ithaar H. Derweesh is an investigator in a study sponsored by GlaxoSmithKline. Michael A. Liss, Tony Chen and Joel Baumgartner have no conflicts of interest or financial ties to disclose.

Authors’ contributions

ML conceived and designed the study along with data collection, statistical analysis, and writing the manuscript. CK provided support for the study, assisted with data collection, and edited the manuscript. TC performed data collection and formation of tables. JB performed statistical analysis. ID performed oversight of the project and manuscript editing. All authors read and approved the final manuscript.

Rights and permissions

Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit

The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Liss, M.A., Kane, C.J., Chen, T. et al. Virtual reality suturing task as an objective test for robotic experience assessment. BMC Urol 15, 63 (2015).
