Measuring the quality of MDT working: an observational approach

Background: Cancer multidisciplinary teams (MDTs) are established in many countries but little is known about how well they function. A core activity is regular MDT meetings (MDMs) where treatment recommendations are agreed. A mixed methods descriptive study was conducted to develop and test quality criteria for observational assessment of MDM performance calibrated against consensus from over 2000 MDT members about the “characteristics of an effective MDT”.
Methods: Eighteen of the 86 ‘Characteristics of Effective MDTs’ were considered relevant and feasible to observe. They collated to 15 aspects of MDT working covering four domains: the team (e.g. attendance, chairing, teamworking); infrastructure for meetings (venue, equipment); meeting organisation and logistics; and patient-centred clinical decision-making (patient-centredness, clarity of recommendations). Criteria for rating each characteristic from ‘very poor’ to ‘very good’ were derived from literature review, observing MDMs and expert input. Criteria were applied to 10 bowel cancer MDTs to assess acceptability and measure variation between and within teams. Feasibility and inter-rater reliability were assessed by comparing three observers.
Results: Observational assessment was acceptable to teams and feasible to implement. Total scores from 29 to 50 (out of 58) highlighted wide diversity in quality between teams. Eight teams were rated either ‘very good/good’ or ‘very poor/poor’ for at least three domains, demonstrating some internal consistency. ‘Very good’ ratings were most likely for attendance and administrative preparation, and least likely for patient-centredness of decision-making and prioritisation of complex cases. All except two characteristics had intra-class correlations of ≥0.50.
Conclusions: This observational tool (MDT-OARS) may contribute to the assessment of MDT performance. Further testing to confirm validity and reliability is required.


Background
Cancer multidisciplinary teams (MDTs) are well established in the UK and other countries as a core mechanism for improving patient outcomes. A core activity is regular MDT meetings (MDMs), held weekly for most tumour types, which bring together all relevant health professionals to discuss and agree patient treatment plans.
The benefits of MDTs in relation to improved adherence to evidence-based guidelines, better treatment decisions, and association with better clinical outcomes including survival have been well documented [1,2]. Nevertheless, MDTs are a very expensive resource and we know little about how well individual teams function. The UK national annual cancer peer review programme provides a mandatory mechanism for assessment of MDTs. MDTs have to show their compliance with standards that are derived from tumour-specific improving outcomes guidance (which is in turn based on evidence and/or clinical consensus). Data from the peer-review programme demonstrate wide variation between teams in adherence to standards [3,4]. The standards encompass structural features of MDTs such as team composition (having the required expertise) and having protocols for referral and treatment but omit other aspects of MDT functioning such as the quality of leadership and chairing; teamwork (e.g. inclusiveness and mutual respect); and the clinical decision-making process.
Evidence is accruing regarding the association between process and outcomes in cancer MDTs [2]. The quality of diagnostic and treatment decisions has been related to MDT discussion in a range of tumour types [5][6][7]. The quality of MDT recommendations relies upon consideration of all relevant information. Failure to consider patient-based information has been shown to be a major reason for non-implementation of recommendations, because the recommendations were either unacceptable to patients or clinically inappropriate [8][9][10][11]. Moreover, non-implementation of MDT meeting recommendations can have both clinical and financial consequences if further discussion is required and treatment is delayed. Considering all relevant information is likely to require inclusive discussions in MDT meetings, but research has shown that cancer MDT meetings may prioritise the medical model and place less value on the contributions of non-medical members of the team [12,13].
Robust assessment of the complex behaviours and activities in MDMs is likely to require mixed methods including independent observation [14,15]. Observational measures have been developed to assess teamwork in other healthcare teams, such as surgical teams [16,17]. To apply such methods to MDTs first requires agreement about the optimal MDT characteristics. Determining causal relationships between teamworking and outcome is riddled with methodological challenges, which may partly account for the lack of empirical research regarding predictors of effectiveness [1]. Nevertheless, consensus of opinion in a recent UK national survey completed by over 2000 MDT members [18] resulted in recommendations for MDT working: 'The characteristics of an effective MDT' [19]. Eighty-six 'characteristics of effectiveness' are organised within five domains: The Team; Infrastructure for meetings; Meeting organisation and logistics; Clinical decision-making; and Clinical governance. These characteristics provide a framework against which to develop objective criteria for assessing the quality of teamworking in MDMs.

Aims
To develop quality criteria for assessment of characteristics observable in MDMs
To conduct preliminary tests of their acceptability with 10 bowel cancer teams, to include describing:
∘ the variation within and between teams
∘ the characteristics most/least likely to receive high quality ratings
To test inter-rater reliability and feasibility

Development of MDT-OARS (Observational Assessment Rating Scale)
Each of the 86 'characteristics of effective MDTs' [19] was considered in turn for its relevance and feasibility for independent observational measurement in MDT meetings. Additional consideration was given to whether the characteristics were already measured in national peer review assessments, and priority was given to including characteristics that did not overlap with peer review. In total 18 of the 86 characteristics were felt to be observable. Due to overlap between characteristics these aggregated into 15 aspects of teamworking (Table 1).
Quality criteria for each characteristic were initially informed by relevant literature, observation of MDMs in a range of tumour types, and review of the survey data that had informed the Characteristics [18]. These prototype quality criteria were subsequently reviewed by an independent panel consisting of a consultant radiologist, oncologist and nurse with bowel cancer expertise as well as a senior NHS cancer manager. The panel discussed the potential variation in quality for each characteristic. Optimal ratings were calibrated against the recommendations in the Characteristics. Thresholds for lower quality performance were based on consensual agreement about the added value or detriment to MDT-working of variations to the optimal rating. It was agreed that the variation in quality for most characteristics would best be represented on a four-point scale: very poor (1), poor (2), good (3) or very good (4). Very poor/poor were merged for three characteristics where it was felt that further categorisation would lose value (Table 1).
Video-taped observation was proposed for rating behavioural characteristics such as chairing, teamwork and decision-making, to enable multiple observers to rate identical footage and allow replay. To enhance the reliability of behavioural ratings a case discussion proforma was developed for rating characteristics on a case-by-case basis prior to being aggregated to an overall rating (Additional file 1). Aggregation was a qualitative process for most characteristics: forming global judgments on the rating that best reflected performance based on notes taken for individual cases. The only exception was patient-centred care where the rating was based on quantification of objective content. This was because it would be necessary to know the patient in order to value-judge this characteristic of effectiveness. Rating is instead based on the frequency that patient-based information is considered (Table 1). Aspects of team functioning that could not easily be observed on a video-taped recording and could be objectively described were rated on the basis of notes and/or information collected by the researcher at the time of the videotaped meeting. This included attendance, the meeting agenda, meeting venue and technology. The researcher who was present at the meeting verified attendance information from the MDT lead and/or MDT coordinator, obtained an anonymised hard copy of the meeting agenda, and recorded information about the meeting venue and technology on a study-specific proforma (Additional file 2) to ensure systematic recording of the detail required for assessing these criteria.
Availability of multiple screens scores 1 extra point. The score out of a possible 9 is then standardised onto a 1-4 scale to give the overall rating.

Piloting MDT-OARS: acceptability and variation in performance
Ten bowel cancer teams volunteered to participate (ascertained via their participation in another study [20]). One MDM per team was filmed and observed in-vivo. Quality was subsequently assessed using completed case discussion proformas, the completed in-vivo proforma and anonymised agenda (Table 1, Additional files 1 and 2). Short structured interviews were conducted with a range of members from each team (including at minimum the MDT lead, MDT coordinator and clinical nurse specialist) to determine the impact of the observational method on the meeting, and confirm attendance information. Team members provided written informed consent prior to the meeting being video-recorded and all patient discussions and agendas were anonymised. Ethics approval was granted by the South East Multi-Centre Research Ethics Committee and R&D approval was obtained from the relevant NHS Trust for each MDT.

Analysis
Quality ratings for each characteristic are presented in their unstandardized form and are summed to provide an overall score out of 58 (Table 2). The ratings for all characteristics except teamworking: presence of tension/conflict were standardised to the same scale and either presented on a scale of 1-4 or dichotomised (very poor/poor vs. good/very good) to enable visual comparison of variability in quality within and between teams. Rating of the presence of tension/conflict is not presented graphically as the scale was not compatible.
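As an illustration only, the summing and dichotomising steps described above might be sketched as follows. The characteristic names and ratings are hypothetical placeholders rather than study data, and the cut-point of 3 assumes the four-point scale (1 = very poor to 4 = very good); how the characteristics with merged categories were standardised is not reproduced here.

```python
# Hypothetical ratings for one team on the four-point scale
# (1 = very poor, 2 = poor, 3 = good, 4 = very good). Not study data.
example_ratings = {
    "chairing": 3,
    "inclusiveness": 2,
    "agenda": 4,
    "patient_centredness": 1,
}


def total_score(ratings):
    """Sum the ratings across characteristics to give an overall score."""
    return sum(ratings.values())


def dichotomise(rating):
    """Collapse a 1-4 rating into the two bands used for visual comparison."""
    if not 1 <= rating <= 4:
        raise ValueError("rating must lie on the 1-4 scale")
    return "good/very good" if rating >= 3 else "very poor/poor"


bands = {name: dichotomise(r) for name, r in example_ratings.items()}
print(total_score(example_ratings), bands)
```

Keeping the summed score and the dichotomised bands as separate steps mirrors the distinction drawn above between the unstandardised total and the standardised ratings used for comparison.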

Feasibility and inter-rater reliability
Feasibility was estimated by collecting data on the time taken to complete observational assessments. Inter-rater reliability was assessed by comparing the ratings of the research team with those of two independent observers: one oncology specialist registrar and one post-doctoral research psychologist. Intraclass correlation coefficients were calculated (two-way mixed models with measures of absolute agreement). 95% confidence intervals are

Results
Criteria for assessing MDM quality
The quality criteria and methods for assessing each characteristic are summarized in Table 1.

Acceptability and team meeting characteristics
Presence of an observer and video camera in MDT meetings was acceptable to team members and feasible to implement: it caused no delay or interruption to the usual flow of meetings. Interviews with team members confirmed all meetings had been typical: the presence of the researcher and camera had negligible impact on team member behaviour.
Meetings ranged in size from 10 to 45 core and extended members present (median 16 members). The largest MDT meeting was a network-wide meeting comprising three local MDTs; all others comprised one local MDT. The average meeting length was 53 minutes (range: 25-86 minutes) and teams discussed an average of 13 patients (range: 6-21 patients). Each patient was discussed for an average of 4.5 minutes (range 3-9 minutes).

Variation in team performance between teams
There was wide diversity in ratings between teams across all characteristics. Total scores (out of a possible 58) ranged from 29 (Team 3) to 50 (Teams 2 and 9; Table 2). Only one team (Team 3) failed to achieve 'very good' criteria for at least one characteristic. The quality of teamwork in this team was particularly poor in relation to chairing (the chair allowed long pauses in discussions and did not explicitly confirm treatment plans before moving onto the next case); prioritisation of the agenda (9/14 cases discussed were rated as having too much time spent on them as key information or results were missing, or because relevant team members, for example the oncologist, were absent); and presentation of clinical information (case presentations were unprepared and unstructured, and time was spent searching for relevant information for most cases). In contrast, Teams 2 and 9 met 'very good' criteria for 9 out of the 15 characteristics. These teams demonstrated: optimal chairing (by ensuring discussions remained focused and clearly articulating treatment plans); inclusion of all relevant team members in discussions (most other teams, although not always dominated by one or two members, particularly lacked contribution from the clinical nurse specialist or any other nurses); explicit use of the meeting for professional development (a well prepared, informative discussion about a waiting times breach resulting in clear actions); optimal venue and technology (a spacious purpose-built room; core team seated in horseshoe formation enabling sight of imaging/pathology and each other; Picture Archiving and Communications System (PACS) and multiple screens facilitating viewing of current and retrospective pathology and radiological imaging); and circulation of a comprehensive agenda prior to meetings (including patient ID and demographics, all previous relevant history, tests undertaken/results, reason for MDT discussion, and space for MDT recommendation to be inserted).

Variation in team performance within teams
Some internal consistency within teams was evident: MDTs performing well in one domain tended to perform well in other domains (and vice versa, Figure 1). Eight out of the 10 teams had the same rating (either 'very good/good' or 'very poor/poor') for at least three of the four domains of teamworking. The internal consistency was most evident in teams performing at either end of the spectrum: Team 3 received consistently poor ratings across all characteristics with three 'very poor' ratings and no 'very good' ratings, and Teams 2 and 9 received 9 'very good' ratings and had no 'very poor' ratings of quality for any characteristic (Table 2).
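The "at least three of the four domains" check can be made concrete with a short sketch. This is a hypothetical illustration: the team labels and domain bands below are invented, and it assumes each domain has already been assigned a single dichotomised band, a step whose exact rule is not reproduced here.

```python
from collections import Counter

GOOD = "very good/good"
POOR = "very poor/poor"


def is_internally_consistent(domain_bands, min_matching=3):
    """Return True if at least `min_matching` domains share the same band."""
    most_common_count = max(Counter(domain_bands).values())
    return most_common_count >= min_matching


# Invented bands for the four domains (team, infrastructure,
# organisation/logistics, decision-making). Not study data.
example_teams = {
    "Team A": [GOOD, GOOD, GOOD, POOR],
    "Team B": [GOOD, POOR, GOOD, POOR],
    "Team C": [POOR, POOR, POOR, POOR],
}

consistent_teams = [
    name for name, bands in example_teams.items()
    if is_internally_consistent(bands)
]
print(consistent_teams)
```

With two bands and four domains, a team fails this check only when its domains split two and two, which is why the threshold of three is the lowest informative cut-off.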

Characteristics of effectiveness most and least likely to be achieved
Teams were most likely to achieve 'very good' ratings for: administrative preparation (a comprehensive agenda), membership/attendance (all core team members or deputies present for whole meeting), and tension/conflict (no tension or conflict in meetings) (Table 2, Figure 2). Teams were least likely to achieve 'very good' ratings for: patient-centredness of case discussions and prioritisation of cases on the agenda. No teams were observed to explicitly consider patient-centred factors in all patient discussions: most teams considered patient-based factors in less than half of cases, mostly comprising demographic information rather than holistic needs or preferences. There was no attempt by any team to use the agenda to prioritise the order of case discussions to ensure appropriate time was spent discussing complex cases. In all teams at least some cases were judged as being discussed for too little or too much time based on their complexity.

Feasibility and inter-rater reliability
Completion of assessments required 93 minutes on average (range: 45-160 minutes). There was no consistent difference in time taken by different observers; instead the time taken was associated with meeting length, averaging between 1.25 and 2 times the meeting length. There was a trend for assessments to take longer initially (average 2 times length of meeting) compared to the final few teams observed (average 1.25 times length of meeting).
Acceptable intra-class correlation coefficients of at least 0.50 (and up to 0.92) were achieved for all except two characteristics of effective teamworking: the absence of tension/conflict (ICC = 0.10) and evidence of treatment planning (ICC = 0.32; Table 3).
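For readers wishing to compute this kind of agreement statistic, the following is a minimal sketch of a two-way, absolute-agreement intraclass correlation derived from the ANOVA mean squares. It assumes the single-measures form, often written ICC(A,1), since the text does not state whether single or average measures were used. The ratings matrix is invented for illustration and the function name is hypothetical.

```python
def icc_absolute_agreement(ratings):
    """Two-way, absolute-agreement, single-measures ICC.

    `ratings` is a list of rows, one per rated team, each row holding
    one rating per observer (same observers in the same order).
    """
    n = len(ratings)        # number of teams (rows)
    k = len(ratings[0])     # number of observers (columns)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement penalises systematic differences between observers
    # through the column (observer) mean square in the denominator.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Invented ratings: 5 teams rated by 3 observers on the 1-4 scale.
example = [
    [4, 4, 3],
    [2, 2, 2],
    [3, 4, 3],
    [1, 2, 1],
    [3, 3, 3],
]
print(round(icc_absolute_agreement(example), 2))
```

A characteristic on which nearly every team receives the same rating has very little between-team variance, so its coefficient can be low even when observers rarely disagree; this is worth bearing in mind when interpreting low values.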

Discussion
We have developed a novel, observational measure for assessing quality in cancer MDT meetings. The quality criteria are calibrated against 'effectiveness' as defined by consensus from over 2000 MDT members, and measure quality in relation to 15 aspects of observable activity in MDT meetings. Our preliminary study has shown that independent observation of MDT meetings is acceptable to teams and feasible to conduct. Whilst only a small number of teams participated in this pilot study, the application of the measure highlighted wide diversity in the quality of teamworking across the range of characteristics measured.
Teams typically performed well in terms of the administrative preparation for meetings and having the appropriate team members in attendance. They performed less well in relation to spending adequate time discussing cases (requiring case prioritisation on the agenda), and having patient-centred case discussions. Both of these aspects of MDT function require additional preparation time, either in relation to compiling and assessing case information to determine their complexity, and/or additional time with patients to comprehensively assess their history, needs and wishes. The importance of undertaking regular assessment of patients' needs and preferences (now referred to as holistic needs assessment [21]) was highlighted in the NICE guidance for improving supportive and palliative care [22] and generally it is expected that Clinical Nurse Specialists (CNS) should lead this. Ensuring patient-led decision making is fostered in the MDT meetings, where patients are not present, is challenging due to the time pressures of meetings and requires further attention. Nevertheless, if the opportunity is not grasped for information to be obtained and shared with the wider team at the earliest opportunity it may fail to impact on decision making, or may cause avoidable delays to treatment [9,10]. In this study, few nurses contributed to case discussions, which may at least in part explain the lack of patient-based information discussed. Similar findings have been reported elsewhere [23]. Although responsibility for ensuring that recommendations are patient-centred rests with the whole team, training and support to enhance nurses' involvement in MDT discussions may be warranted.
This pilot study was aimed at determining 'proof of concept' and has demonstrated that it is acceptable and feasible to measure complex aspects of team behaviour and activities such as leadership, teamworking and decision-making. The calibration of quality criteria against characteristics of effectiveness agreed by a large sample of MDT members, in addition to using available evidence and expert input, ensured content validity. Furthermore, most characteristics were measured reliably in the hands of different observers. There was low agreement in ratings for the presence of tension/conflict and the clarity of treatment recommendations which may be related to level of clinical experience. The quality criteria require refinement to increase their reliability or it may be necessary for observers to have relevant clinical experience for the ratings of these aspects of teamworking to be valid.
Further testing is required with more teams and other tumour types. Only one meeting was assessed per team, which may not have adequately represented their teamworking. Team members confirmed that the filmed meetings were typical of their usual meetings in all cases, but some aspects of teamworking may have been more reliably rated longitudinally. This may be particularly important where behaviour or performance receives a poor rating. Furthermore, the teams volunteered through their participation in another study. It is necessary to test this method of observational assessment with other teams to further confirm its acceptability. Validation of ratings against other subjective and objective outcomes (such as team member assessments of their own performance, clinical outcomes, peer review data and national patient experience data) will be important to further define the characteristics of effective teamworking. The current design of the tool, based upon rating case-by-case, enhanced objectivity of ratings but was time consuming and is likely to require simplification to have clinical as well as research utility. Together with further validation, it may be desirable to develop quality criteria for other characteristics of MDM effectiveness. In particular this could include other aspects of case discussions such as presentation of nationally agreed minimum datasets for radiology, pathology and clinical data; and adherence to relevant nationally and locally agreed protocols. Such assessment may require observers with clinical expertise, at least for assessment of these aspects of team functioning.
The variation in quality of teamworking we report reinforces the need to provide teams with appropriate assessment tools, resources and training to optimise their performance. Indeed 85% of MDT members that responded to the UK national survey agreed MDTs need performance measures [18].

Table 3 Inter-rater reliability of ratings across three independent observers
Rating Scale domains | Intraclass correlation coefficient (95% confidence interval) | Significance (p)
THE TEAM
Attendance | n/a: rating based on in-vivo factual information