For each assessment, we calculated the average score to analyse the reliability of the individual WBA tools, as well as the composite reliability of the tools as a group. Studies of the reliability of WBA instruments typically focus on a single instrument but, in practice, assessment information is pooled across methods. The facet (ie, source of variation) of average assessment scores (i) is nested within the facet of IMGs (p), leading to the generalisability design i:p. For each WBA tool, we estimated variance components using analysis of variance with type I sums of squares (ANOVA SS1). A composite reliability coefficient of 0.8 could be achieved with a combination of 10 CBD assessments, 12 mini-CEX assessments, and 18 assessors per MSF, provided the weighting of the MSF assessments was much greater (0.72) than that for the other assessment types (each 0.14) (data not shown). Over the 5-year study period, more than half the assessors attended follow-up re-calibration and feedback sessions. We used the overall score of the mini-CEX and CBD assessments and the average scores of all scored items in the MSF assessments. Results: The composite reliability of our WBA toolbox program was good: the composite reliability coefficient for five CBD and 12 mini-CEX assessments was 0.895 (standard error of measurement, 0.138). Workplace-based assessment (WBA) of the performance of doctors has gained increasing attention. The composite reliability we found is as good as or even better than that of most standardised assessments.23 Our previous studies have found the WBA program has good acceptability, educational impact, and validity.10 Taken together, our program therefore satisfies the criteria for a "good assessment" program.9 Most countries have systems for assessing IMGs.
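The univariate i:p analysis described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a one-way random-effects ANOVA with unequal numbers of assessments per IMG, and the function names, the toy scores, and the use of the standard effective group size `n0` are assumptions made for the example.

```python
from math import sqrt


def variance_components(scores_by_person):
    """Estimate variance components for the nested design i:p.

    scores_by_person maps each person (IMG) to a list of assessment scores.
    Returns (var_p, var_ip): the person variance and the variance of
    assessments nested within persons. Needs at least two persons and more
    scores than persons.
    """
    groups = [g for g in scores_by_person.values() if g]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Sums of squares between persons and within persons.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective number of assessments per person for unbalanced data
    # (an assumption of this sketch; it equals n when the data are balanced).
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    var_p = max((ms_between - ms_within) / n0, 0.0)  # truncate negative estimates
    return var_p, ms_within


def g_coefficient(var_p, var_ip, n_assessments):
    """Generalisability coefficient for the mean of n_assessments scores."""
    return var_p / (var_p + var_ip / n_assessments)


def standard_error_of_measurement(var_ip, n_assessments):
    """SEM for the mean of n_assessments scores in the i:p design."""
    return sqrt(var_ip / n_assessments)


# Hypothetical scores on the 1-9 scale for three IMGs.
toy_scores = {"img_a": [3, 5], "img_b": [7, 9], "img_c": [5, 7]}
var_p, var_ip = variance_components(toy_scores)
print(var_p, var_ip, g_coefficient(var_p, var_ip, 6))
```

The coefficient rises as more assessments are averaged, because the error term `var_ip / n_assessments` shrinks while the person variance stays fixed.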
Conclusions: WBA is a reliable method for assessing IMGs when multiple tools and assessors are used over a period of time. The mini-CEX, originally developed in the US to guide learning, is used to assess clinical performance in authentic clinical situations.16 The IMG was assessed in six disciplines and various competencies, and scored on a scale of 1 to 9; 1–3 corresponds to unsatisfactory performance, 4–6 to satisfactory performance, and 7–9 to superior performance. All candidates attended similar calibration sessions of about 3 hours each. Candidates who passed our assessment were eligible for AMC certification. Many IMGs are accorded temporary registration that allows them to work in areas where there is a workforce shortage while waiting for the AMC clinical examination. Several different assessors assessed each IMG during the 6-month period. We examined the performance of IMGs in Australia, but the 5-year study period and the large number of assessments in the dataset make the study sufficiently rigorous that the results can probably be extrapolated to other programs. Assessment fatigue is a major problem in clinical assessment, and any program should aim to optimise the use of the assessors' time.24,25 With fewer assessments, more people are likely to implement such a program. All results were recorded on the assessment forms and sent directly to the central office. CBD = case-based discussion. An assessment instrument for evaluating performance in a high stakes setting should have a reliability coefficient of at least 0.8.
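The 1 to 9 mini-CEX scale and its three performance bands can be written down as a small lookup. The function name and the decision to reject out-of-range scores are assumptions made for this sketch; the band boundaries are the ones stated in the text.

```python
def mini_cex_band(score):
    """Map a mini-CEX score (1-9) to its performance band."""
    if not 1 <= score <= 9:
        raise ValueError("mini-CEX scores range from 1 to 9")
    if score <= 3:
        return "unsatisfactory"
    if score <= 6:
        return "satisfactory"
    return "superior"


print(mini_cex_band(5))
```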
Box 1 summarises the number of assessments and the number of IMGs tested during the study period, with mean scores (on a 1–9 scale), standard deviations, and harmonic means (average number of assessments) for each of the assessment types. Further, when the components were used as part of a WBA toolbox, we achieved good reliability with fewer individual assessments.12 This may lead to changes in the procedure, reducing the workload for IMGs and assessors. MSF = multisource feedback. The assessment level was appropriate for the first postgraduate (intern) year. A key problem is achieving an acceptable balance between reliability and validity. We hypothesise that WBA has the potential to provide more relevant assessment of IMGs. The reliability threshold of 0.8 could be attained by a combination of five CBD and five mini-CEX assessments, but also with three CBD and six mini-CEX assessments (Box 3). A reliability coefficient for the composite score was calculated using a separate composite reliability formula (Feldt & Brennan, 1989). All IMG candidates and assessors provided consent to use their de-identified data. Despite having been validated,3 however, they do not assess proficiency in actual practice. Most postgraduate training programs are adopting WBA components. The assessment consisted of 12 mini-CEX examinations, five CBD examinations and one set of MSF data, and each candidate was assessed by at least six assessors. Our study found that our WBA program meets this criterion. * Calculated by dividing the covariance by the harmonic mean, summed for all instruments, divided by the number of different instruments.
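How different numbers of assessments trade off against a reliability threshold can be illustrated with a small D-study sketch. It assumes the usual multivariate generalisability form for independent assessments nested within persons: the composite universe-score variance is the weighted sum of person variances and covariances, and the composite error variance is the sum of squared weights times each instrument's error variance divided by its number of assessments. All variance components and weights below are hypothetical, the function names are invented for the example, and the weights are fixed rather than optimised as in the study.

```python
from math import sqrt


def composite_reliability(weights, person_cov, error_var, n_assessments):
    """Return (reliability, SEM) of a weighted composite of instruments.

    weights       -- weight per instrument
    person_cov    -- matrix of person variances (diagonal) and covariances
    error_var     -- assessment-within-person variance per instrument
    n_assessments -- number of assessments per instrument
    """
    k = len(weights)
    universe = sum(
        weights[a] * weights[b] * person_cov[a][b]
        for a in range(k)
        for b in range(k)
    )
    error = sum(
        weights[a] ** 2 * error_var[a] / n_assessments[a] for a in range(k)
    )
    return universe / (universe + error), sqrt(error)


def combinations_meeting(threshold, weights, person_cov, error_var, max_n):
    """List (n_first, n_second) pairs whose composite reliability >= threshold."""
    hits = []
    for n1 in range(1, max_n + 1):
        for n2 in range(1, max_n + 1):
            rel, _ = composite_reliability(weights, person_cov, error_var, [n1, n2])
            if rel >= threshold:
                hits.append((n1, n2))
    return hits


# Hypothetical components for two instruments (not the study's estimates).
W = [0.5, 0.5]
PERSON_COV = [[0.4, 0.3], [0.3, 0.4]]
ERROR_VAR = [1.0, 1.0]

print(composite_reliability(W, PERSON_COV, ERROR_VAR, [5, 5]))
print(combinations_meeting(0.8, W, PERSON_COV, ERROR_VAR, 8)[:5])
```

With real estimates in place of the toy values, the same grid search shows which combinations of assessments reach a threshold such as 0.8.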
The reliability of the individual workplace-based assessment instruments. Box 3 – Composite reliability when combining different numbers of Mini-Clinical Evaluation Exercises and case-based discussion assessments, with optimised weights. WBA can also be used for performance assessment in other settings. Ongoing review of the quality of the program was undertaken by an independent group consisting of clinical academics, educationalists and administrators who oversaw the governance of the program. The harmonic mean was preferred to the arithmetic mean because the number of assessment scores differed between IMGs, and because the harmonic mean tends to reduce the effect of large outliers (ie, a single IMG with many assessments).22 Distinct from the separate univariate reliability of each WBA instrument, the composite reliability of all instruments as a toolbox is calculated using a D-study in multivariate generalisability theory.22 Each assessment score (i) is a score for exactly one assessment instrument, and the corresponding multivariate model is therefore i○:p•; ie, the facet of IMGs (p) is crossed with the fixed multivariate variables (assessment instruments), and the independent facet of assessment scores (i) is nested within both the IMGs and the instruments. Data were collected from June 2010 to April 2015. Case complexity and global rating were marked during the constructive feedback.
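The effect of preferring the harmonic mean when a single IMG has many assessments can be seen in a short example. The counts are hypothetical; `statistics.harmonic_mean` and `statistics.mean` are from the Python standard library.

```python
from statistics import harmonic_mean, mean

# Hypothetical numbers of assessments per IMG; one IMG has far more than the rest.
assessments_per_img = [5, 5, 5, 20]

arithmetic = mean(assessments_per_img)
harmonic = harmonic_mean(assessments_per_img)

# The harmonic mean stays closer to the typical count of 5.
print(arithmetic, harmonic)
```

Because the harmonic mean averages reciprocals, the IMG with 20 assessments pulls the result up much less than it does for the arithmetic mean.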
WBA has become more prominent in medical education. Accepted 3 June 2016.

