Intra-rater and inter-rater reliability of ultrasonographic measurements of acromion-greater tuberosity distance in patients with post-stroke hemiplegia

Background: Glenohumeral subluxation (GHS) is reported in up to 81% of patients with stroke. Ultrasonographic measurements of GHS, obtained by measuring the acromion-greater tuberosity (AGT) distance, have been found to be reliable for experienced raters. Objectives: The primary aim was to assess the intra-rater reliability of measurements of AGT distance in people with stroke following a short course of rater training. A secondary aim was to compare the inter-rater reliability of these measurements between novice and experienced raters. Methods: Patients with stroke (n = 16; 5 men, 11 women; 74 ± 10 years) with one-sided weakness who gave informed consent were recruited. Ultrasonographic measurements were recorded at the bedside by two physiotherapists with patients seated upright in a hospital chair. Reliability was assessed by intra-class correlation coefficients (ICCs) and the standard error of measurement (SEM). Minimum detectable change (MDC90) scores were used to estimate the magnitude of change that is likely to exceed measurement error. Results: Mean ± SD AGT distances on the affected and unaffected sides for rater 1 were 2.2 ± 0.7 and 1.7 ± 0.4 cm, respectively. Corresponding values for rater 2 were 2.5 ± 0.6 and 2.0 ± 0.4 cm. Intra-class correlation coefficient values for the affected and unaffected shoulders for rater 1 were 0.96 and 0.91, respectively. Corresponding values for rater 2 were 0.95 and 0.90. SEM and MDC90 for both affected and unaffected shoulders were ≤ 0.2 cm. Inter-rater reliability coefficients were 0.86 and 0.76 for the affected and unaffected shoulders, respectively. Conclusion: Ultrasonographic measurement of AGT distance demonstrates excellent intra-rater reliability for a novice rater. Inter-rater reliability of ultrasonographic measurement of AGT distance between novice and experienced raters is also good.

Glenohumeral subluxation (GHS) is one of the most common musculoskeletal problems in people with post-stroke hemiplegia, with a reported incidence of up to 81%. 1,2 Severe loss of motor function and apparent absence of supraspinatus contraction are potential risk factors. 3 GHS presents considerable challenges to the rehabilitation of the upper limb, such as impaired normal shoulder function, prolonged hospital stay, and depression as a result of increased disability. 4 The association between GHS and other post-stroke complications such as pain and poor motor recovery is uncertain. 5 When present in combination, however, these could have a significant impact on overall upper limb function. 8-10
To evaluate the effectiveness of treatment interventions, accurate, reliable and valid outcome measures are required. Current clinical measurements include the fingerbreadth palpation method 11 and plain radiographs. 12 The fingerbreadth palpation method lacks the sensitivity to detect early signs and/or minor subluxations. 9 There is a concern that, without treatment, subluxation can progress to an uncorrectable level over time. 7 Early GHS can contribute to irreversible partial or complete tears of the non-elastic shoulder capsule. 6,7,13 Radiographs are considered to be objective, and have high reliability and validity, 14 but problems relating to cost, the time involved and the risks inherent to radiation exposure 14,15 limit their utility in the clinical setting.
More recently, diagnostic ultrasound has also been used for the assessment of GHS in people with post-stroke hemiplegia by measuring the acromion-greater tuberosity (AGT) distance between the lateral border of the acromion and the apex of the greater tuberosity of the humerus. 16,17 Using a large, static ultrasound machine, Park et al 16 reported high intra-rater reliability (ICC = 0.979) of ultrasound measurements of GHS undertaken with patients seated upright with the arms dependent in a neutral position, without arm support.
Although radiographic measurements were also reported in this study, a comparison was not possible because different landmarks were used for the radiographic and ultrasonographic measurements of GHS. More recently, Kumar et al 17 recruited 26 patients with stroke and, using a new standardized position with the forearm supported, found that bedside assessment of AGT distance, undertaken by a physiotherapist trained in shoulder ultrasound, demonstrated good intra-rater reliability (ICC = 0.98) and discriminant validity. Another study compared ultrasound with the fingerbreadth palpation method and reported good agreement between the two methods, highlighting the potential clinical utility of the ultrasound method. 18 Although there were differences in the measurement procedure, the high reliability coefficients (ICC = 0.979 to 0.98) reported in these studies 16,17 suggest that ultrasonographic measurements are reliable when measured by the same rater.
However, the raters in these studies had good experience in musculoskeletal ultrasound. To maximise clinical usefulness, it is critical to be able to produce reliable results with limited training in the use of the ultrasound technique for the measurement of AGT. Furthermore, none of the previous studies assessed inter-rater reliability. Without good inter-rater reliability, the usefulness of the ultrasound technique as an assessment tool is limited in the clinical setting. The primary aim of this study was to assess the intra-rater reliability of measurements of AGT in people with stroke following a short course of physiotherapist rater training. A secondary aim was to compare inter-rater reliability of these measurements when undertaken by novice and experienced raters.

Patients
The study used a test-retest design and received approval from Frenchay Research Ethics Committee, North Bristol NHS Trust, UK. Patients aged over 50 years, with stroke resulting in one-sided weakness and who were able to sit upright, were eligible to participate.
Diagnosis/presence of GHS was not a requirement for participation in the study. Patients were excluded if they had other neurologic conditions, traumatic brain injury, brain tumours or other serious co-morbidities, shoulder pathology ('adhesive capsulitis'), or recent surgery to the neck, arm, or shoulder, or if they were unavailable for testing or unable to volunteer for any reason. Patients were recruited from two local hospital trusts in the South West of England. Each patient gave informed written consent to take part. For those who lacked mental capacity, appropriate procedures were followed, which involved a family member signing a 'personal consultee agreement form' in the presence of the patient.

Apparatus
A portable diagnostic ultrasound machine (TITAN model, M-Mode, Depth 3.9, L38/10-5 MHz broadband 38 mm linear array transducer, Sonosite Limited, Hitchin, UK) a was used for scanning the shoulder and for recording the AGT distance. The equipment was tested and calibrated according to the manufacturer's guidelines prior to commencement of the data collection process. The precision of linear measures based on manufacturer specifications is ± 2%.

Raters
Two raters (both physiotherapists) were involved in the assessment procedure.
For the experienced rater, the training protocol consisted of a one-day manufacturer's course, supervised training from a consultant radiologist (14 hours), pilot work on 6 healthy volunteers, and reliability studies on healthy volunteers (n = 32) 19 and patients with stroke (n = 26). 18 The novice rater received training in shoulder ultrasound, which included (1) one hour of formal training on the portable ultrasound technique for AGT measurements and (2) practice on five healthy volunteers (2-3 hours) to become familiar with the protocol and measurement procedure.

Procedure
Baseline demographic data including age and gender, date of onset, type of stroke, site of stroke, and side affected were collected from patients' medical records by the chief researcher (PK). The general neurological examination included assessment of muscle strength in the shoulder muscles (Medical Research Council Scale) 20 and muscle tone 21,22 on both affected and unaffected sides. Muscle tone was classified as low (grade 0), normal (grade 1) and high (grades 2-5) as described by Culham et al. 22 For both muscle strength and tone, the shoulder flexors, abductors, and internal and external rotators were assessed.
For ultrasound measurements of AGT distance, each patient was placed in the standardized position (Fig 1). 19 Patients were seated upright in a chair and all measurements were recorded at the bedside. The shoulder was in neutral rotation and adduction, with the elbow at 90° of flexion and the forearm in pronation. The forearms rested on a pillow placed on the patient's lap, with the elbow joint itself remaining unsupported. Assistance was provided by the researcher if the patient was unable to move the arm. The ultrasound transducer was then placed over the lateral border of the acromion along the vertical/longitudinal axis of the humerus to scan the shoulder. AGT distance was recorded on the frozen image using an on-screen calliper that automatically calculates distances (Fig 2). AGT distance was defined as the relative lateral distance between the lateral edge of the acromion process of the scapula and the nearest margin of the superior part of the greater tuberosity of the humerus. 19 A dark linear acoustic shadow beneath the acromion helped to identify the lateral edge of the acromion. The tendon of supraspinatus was clearly visible as a thick band (acoustic hyperechoic appearance) at its point of insertion, which facilitated identification of the greater tuberosity (Fig 2).
To assess intra-rater reliability, three ultrasound images of the right shoulder were obtained and AGT distance was measured on each image (set 1) by rater one. This was repeated on the left shoulder. A ten-minute interval was then provided, during which patients were encouraged to move both shoulders out of the standardised position. If necessary, assistance was provided. Patients were then repositioned, a further three ultrasound images of each shoulder were obtained, and AGT distance was measured on each image (set 2). The same procedure was then repeated by rater two. Thus, a total of six measurements were recorded on each shoulder for each participant by each rater.
To ensure that the rater was blind to the measurements, the values displayed were obscured by placing a sticker on the ultrasound screen. The experienced rater always performed the ultrasound measurements first. When one rater was undertaking ultrasonographic measurements, the other rater was not present at the bedside, and vice versa; both raters were therefore blind to each other's measurements. The total time spent with each participant was approximately 45 minutes; however, the actual time for scanning the shoulder and for recording individual measurements was just over 1 minute.

Data Analysis
Data were analyzed using the Statistical Package for Social Sciences (SPSS version 21.0). b Descriptive statistics such as mean and standard deviation of AGT distance measurements for both affected and unaffected shoulders for both raters were calculated.
The intra-rater and inter-rater reliability of ultrasonographic measurements of AGT distance were assessed using intra-class correlation coefficients (ICC 2,1 and ICC 3,3 respectively) with 95% confidence intervals. For both raters, test-retest reliability was assessed using the mean of the three measurements in Set 1 (M1, M2, M3) and Set 2 (M4, M5, M6). For the calculation of inter-rater reliability, the mean of the three measurements in Set 1 (M1, M2, M3) recorded by Rater 1 was compared with the mean of the three measurements in Set 1 (M1, M2, M3) recorded by Rater 2. Reliability was considered excellent if the ICC value was greater than or equal to 0.75, fair to good if the value was 0.40 to 0.74, and poor if the ICC value was less than 0.40. 23 The standard error of measurement was used to define 95% confidence limits around individual measurements. Minimum detectable change (MDC), a distribution-based approach, was used to quantify the magnitude of change that was not likely to be a result of measurement error. 24 For MDC, a confidence interval of 90% is commonly recommended in the literature (MDC90), and it is calculated using the formula MDC90 = 1.65 × SEM × √2, where SEM indicates the standard error of measurement. 24,25 Repeated measures analysis of variance (ANOVA) was used to analyse test-retest (set one versus set two) variability of repeated ultrasonographic measurements of AGT distance on each shoulder for both raters and between raters (set one of rater one versus set one of rater two).
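The calculations described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study: the function names (`icc_2_1`, `classify_icc`, `mdc90`) and the example values are hypothetical, and the ICC shown is the standard Shrout and Fleiss two-way random-effects, absolute-agreement, single-measurement form, assumed here to correspond to the ICC 2,1 named in the text. The interpretation thresholds and the MDC90 formula are taken directly from the paragraph above.

```python
import math


def icc_2_1(scores):
    """ICC(2,1) in the Shrout and Fleiss form (two-way random effects,
    absolute agreement, single measurement).

    scores: one row per subject, one column per session or rater.
    Assumption: this matches the 'ICC 2,1' named in the text.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def classify_icc(icc):
    """Interpretation bands as stated in the Data Analysis section."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.40:
        return "fair to good"
    return "poor"


def mdc90(sem):
    """MDC90 = 1.65 x SEM x sqrt(2), as given in the text."""
    return 1.65 * sem * math.sqrt(2)


if __name__ == "__main__":
    # Hypothetical set means in cm (NOT study data): set 1 versus set 2.
    set1 = [2.1, 1.8, 2.6, 3.0, 1.5]
    set2 = [2.2, 1.8, 2.5, 3.1, 1.6]
    icc = icc_2_1(list(zip(set1, set2)))
    print(f"ICC(2,1) = {icc:.2f} ({classify_icc(icc)})")
    print(f"MDC90 for SEM of 0.1 cm = {mdc90(0.1):.2f} cm")
```

For a SEM of 0.1 cm, the formula gives 1.65 × 0.1 × √2 ≈ 0.23 cm. Because this form of the ICC measures absolute agreement, a constant offset between two sets of measurements lowers the coefficient even when the ranking of subjects is identical.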

RESULTS
Over a four-month period, 18 patients with stroke were approached to participate in the study. Two patients were medically unstable and were excluded from the study. Therefore, 16 patients (11 men, 5 women) with a mean age ± SD of 74 ± 10 years were recruited into the study. Fourteen patients had a stroke because of infarction, and two had a stroke because of haemorrhage. Eleven patients had right-sided weakness and five patients had left-sided weakness. Seven patients had low tone, four had high tone, and five had normal tone. Ten patients had a motor power score of less than or equal to 2, and six had a motor power score of greater than or equal to 3. The mean time from the onset of stroke to data collection was 28 days.
A summary of descriptive data for repeated measurements of AGT distance for both raters is presented in Table 1.
However, repeated measures analysis of variance (ANOVA) showed a significant difference of 0.3 cm in mean AGT distance measurements between raters in both affected (F (5, 75) = 12.861, p = 0.001) and unaffected (F (5, 75) = 48.073, p = 0.001) shoulders. ICC, standard error of measurement and MDC90 values for both affected and unaffected shoulders for intra-rater and inter-rater reliability are presented in Table 2.

DISCUSSION
The primary aim of this study was to assess the intra-rater reliability of measurements of AGT in people with stroke following a short course of rater training. A secondary aim was to compare inter-rater reliability of these measurements when undertaken by novice and experienced raters. Two physiotherapists acted as experienced and novice raters and recorded AGT measurements at the bedside using portable ultrasound equipment.
This study found excellent intra-rater (test-retest) reliability of ultrasonographic measurements of AGT distance for both affected (ICC, 0.95) and unaffected (ICC, 0.90) shoulders in patients with post-stroke hemiplegia for the novice rater.
Corresponding reliability values for the experienced rater were ICC, 0.96 and 0.91, respectively. The inter-rater reliability between novice and experienced raters was also found to be excellent (ICC 0.76 unaffected; 0.86 affected).
These findings are in agreement with previous studies on people with stroke. 15,16 Park et al 15 reported excellent within-day intra-rater reliability (ICC 0.97 unaffected; 0.95 affected) for AGT distance measurements taken in a younger stroke population (mean age 56 ± 11 years). Similarly, Kumar et al 16 found excellent within-day (ICC 0.95 unaffected; 0.98 affected) and between-day (ICC 0.94 affected and 0.76 unaffected) reliability for AGT distance measurements taken in an older stroke population (mean age 71 ± 10 years). In these studies, the raters undertaking ultrasound measurements were experienced in shoulder ultrasound. In the study by Park et al, 15 the rater had 5 years of experience in musculoskeletal ultrasound. Similarly, in the study by Kumar et al, 16 the rater (a physiotherapist) had specific training in AGT measurements. In contrast, our study assessed both intra-rater and inter-rater reliability involving a novice rater (a physiotherapist) with minimal training in AGT measurements. The experienced rater in this study was involved in previous reliability studies on healthy and stroke participants and had experience of taking measurements on 128 shoulders and 94 shoulders, respectively. To our knowledge, this is the first report of intra-rater and inter-rater reliability of AGT distance measurements taken by a physiotherapist after a short period of training, using portable ultrasound on patients with stroke older than 50 years. The excellent reliability of measurements suggests that a physiotherapist with minimal training (4 hours) in diagnostic ultrasound is capable of undertaking reliable ultrasound measurements of AGT distance. These results are very encouraging for clinical applications, with a potential for immediate feedback for therapeutic choices.
Evidence from the literature suggests that intra-rater reliability is generally superior to inter-rater reliability, and the latter tends to be lower due to error and variation in decision-making between therapists. 26 In this reliability study, inter-rater reliability for the ultrasonographic measurements of AGT distance obtained by the two raters was lower than intra-rater reliability but remained excellent (ICC 0.76 unaffected; 0.86 affected). This is because physiotherapists are generally considered to have a good basic knowledge of anatomy and are therefore able to produce reliable results with minimal training. These findings are supported by a study that provided minimal training to three physiotherapy students and reported excellent inter-rater reliability (ICC 0.79) for ultrasonographic measurements of AGT distance undertaken in young healthy participants. 27 Despite excellent inter-rater reliability, significant differences in the mean AGT measurements between the two raters were noted. For the experienced rater, the mean AGT distance measurements for unaffected and affected shoulders were 1.7 ± 0.4 cm and 2.2 ± 0.6 cm, respectively; the corresponding values for the novice rater were 2.0 ± 0.4 cm and 2.5 ± 0.6 cm, suggesting a mean difference of 0.3 cm between the two raters for both the affected and unaffected shoulder measurements. Some individual variation in identification of the bony point on the acromion process used for measurement resulted in increased AGT distance measurements for rater 2.
Interestingly, the standard error of measurement and MDC90 for both raters were 0.1 and 0.2 cm, respectively, which is in agreement with the previous study. 16 These values suggest that the mean differences between the two raters did not have an effect on the reliability coefficients; however, such differences are of more concern in the consideration of validity. For the purpose of standardization, it is critical that all raters measure the AGT distance using the same bony reference points. It is worth noting that rater two practiced ultrasound on only five healthy people, not on patients with stroke, and also lacked recent experience of handling patients with stroke.

Study Limitations
The current study has several limitations. First, this work was part of a bigger study, and rater 1 always undertook measurements first; an order effect therefore cannot be entirely ruled out. Second, although an inclusion criterion was to incorporate patients as soon as they were medically stable for rehabilitation, this was not always possible. The mean time from onset of stroke to first measurement was 28 days; it is therefore difficult to confirm that the proposed technique would be feasible for patients in the first few days after stroke. This is important because early treatment of subluxation may help prevent further secondary complications. Reliable and objective measurements are required to monitor the effectiveness of interventions in the early stage of rehabilitation.
These limitations need to be addressed in future studies.

CONCLUSION
In conclusion, intra-rater and inter-rater ultrasonographic measurements of AGT are very reliable in people with stroke when assessed by a physiotherapist rater following a short period of training. Portable ultrasound offers a quick bedside assessment tool with the potential to assess shoulder subluxation in post-stroke hemiplegia. Further work to establish the reliability of AGT distance measurements in patients early after stroke is required.

Figure 2: Longitudinal ultrasonographic image showing measurement of the distance between the lateral tip of the acromion process and the nearest medial margin of the greater tuberosity (GT). The tendon of supraspinatus (Sup) is visible above the GT.

Figure 1: Patient's standardized position for ultrasonographic measurements of AGT distance.