Appendix - Body of evidence assessment
Overall, the evidence base informing the recommendations was poor, as the majority of studies of potential biomarkers were Level III-3 diagnostic case control studies. These studies are often performed to establish proof of concept, laying the basis for a more rigorous study design to follow. This study design inherently has a high risk of bias because patients are selected based on their known diagnosis, rather than reflecting a representative sample of patients presenting for testing. Diagnostic case control studies that were designed to minimise this problem were judged to have a low risk of bias, hence the evidence base for these studies was deemed to be satisfactory. For example, to examine the accuracy of mucin expression, McIntire et al randomly selected patients with and without biopsy findings of goblet cells (consistent with a BO diagnosis) from a high-risk group of chronic GORD sufferers, in contrast to the stringent selection criteria applied by other Level III-3 studies to identify cases and controls. The McIntire et al study was further designed to minimise bias by evaluating both the index test and the reference standard in a blinded manner, with results confirmed independently.
Where estimates of test sensitivity and specificity varied widely between studies assessing the same biomarker or diagnostic technology, the evidence was classified as inconsistent (grade D). For example, estimates of cytokeratin staining sensitivity for detecting BO ranged from 10% to 100% (median 65%), and estimates of specificity for distinguishing between BO and gastric (cardiac) intestinal metaplasia (IM) ranged from 34% to 100% (median 78%). Differences in both the histopathological criteria used to select eligible BO specimens and the method of interpretation may have contributed to the wide differences in sensitivity observed across these studies. In contrast, two studies investigating Trefoil Factor 3 showed much less variation in reported sensitivity for detection of BO (78% for >3 cm BO, 73% for >1 cm BO and 90% for >2 cm BO) and similar specificity (94% for >3 cm BO, 94% for >1 cm BO and 94% for >2 cm BO), resulting in a grade B (good) consistency rating being applied to these studies.
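As a reminder of how the accuracy figures discussed above are derived, the following minimal sketch computes sensitivity and specificity from the counts of a 2x2 diagnostic table and summarises the spread of a set of estimates by range and median. The function names and all input numbers are hypothetical and chosen only for illustration; they are not data from the included studies, and the guideline does not state a numeric cut-off for "inconsistent" evidence.

```python
from statistics import median


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Proportion of diseased patients (reference standard positive) that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Proportion of non-diseased patients (reference standard negative) that the test clears."""
    return true_neg / (true_neg + false_pos)


def summarise(estimates: list[float]) -> dict[str, float]:
    """Describe between-study variation by minimum, median, maximum and spread."""
    low, high = min(estimates), max(estimates)
    return {"min": low, "median": median(estimates), "max": high, "spread": high - low}


# Hypothetical 2x2 counts for a single study (illustrative only).
print(sensitivity(true_pos=45, false_neg=5))
print(specificity(true_neg=47, false_pos=3))

# Hypothetical sensitivity estimates from several studies of one biomarker.
print(summarise([0.10, 0.65, 1.00]))
```

A wide spread between the minimum and maximum estimate corresponds to the situation described for cytokeratin staining, whereas a narrow spread corresponds to the Trefoil Factor 3 studies.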
None of the included studies were designed to provide direct evidence about the impact of adopting the study biomarker on patient-relevant outcomes. The clinical impact of the body of evidence was determined by defining the proposed purpose and role of the biomarker, assessing test sensitivity and specificity compared with standard practice, and considering the clinical consequences of the test results and other potential benefits of the biomarker compared with standard endoscopy and biopsy methods.
Clinical impact was deemed to be slight or restricted (grade D) if: the evidence base was assessed as poor quality or inconsistent; or the accuracy of the biomarker was demonstrated to be inferior to, or was not compared to, standard testing methods for the same purpose; or the biomarker did not meet a minimally acceptable threshold for sensitivity and specificity based on consideration of the clinical consequences of the test results. For biomarkers that met the pre-specified minimal clinical performance criteria, a moderate clinical impact was assigned when endoscopic biopsy was still required but the proposed test provided other advantages that improved the accuracy of BO detection (such as diagnosis of BO from non-goblet epithelium, to reduce the problem of biopsy sampling error; or use of magnifying endoscopy to more accurately target BO for biopsy). A biomarker was deemed to have substantial clinical impact if it demonstrated acceptable accuracy and avoided invasive endoscopic biopsy for the diagnosis of BO (such as the use of a non-endoscopic capsule device to obtain oesophageal cell samples).
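The clinical impact criteria above can be read as a simple decision procedure. The sketch below is one illustrative reading only, not the assessors' actual tool: the class, field and function names are hypothetical, each input is a judgement made by the assessor rather than something computed, and the final fallback covers a combination that the text does not explicitly grade.

```python
from dataclasses import dataclass


@dataclass
class BiomarkerEvidence:
    """Assessor judgements about one biomarker (hypothetical field names)."""
    evidence_poor_or_inconsistent: bool
    compared_to_standard_testing: bool
    inferior_to_standard_testing: bool
    meets_minimum_accuracy: bool
    avoids_endoscopic_biopsy: bool
    other_accuracy_advantage: bool


def clinical_impact(e: BiomarkerEvidence) -> str:
    """Return the clinical impact rating implied by the criteria in the text."""
    # Any one of these conditions is enough for a slight or restricted rating.
    if (
        e.evidence_poor_or_inconsistent
        or not e.compared_to_standard_testing
        or e.inferior_to_standard_testing
        or not e.meets_minimum_accuracy
    ):
        return "slight or restricted"
    # Acceptable accuracy without invasive endoscopic biopsy.
    if e.avoids_endoscopic_biopsy:
        return "substantial"
    # Biopsy still required, but the test improves detection accuracy.
    if e.other_accuracy_advantage:
        return "moderate"
    # Assumption: this combination is not explicitly graded in the text.
    return "not specified"


# Hypothetical example: an accurate non-endoscopic test.
example = BiomarkerEvidence(
    evidence_poor_or_inconsistent=False,
    compared_to_standard_testing=True,
    inferior_to_standard_testing=False,
    meets_minimum_accuracy=True,
    avoids_endoscopic_biopsy=True,
    other_accuracy_advantage=False,
)
print(clinical_impact(example))
```

Checking the slight or restricted conditions first mirrors the order of the text, in which the moderate and substantial ratings apply only to biomarkers that met the minimal performance criteria.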
To determine generalisability, the extent to which the study population and setting represented patients at risk of BO who would be eligible for testing was assessed. Most of the studies were conducted in highly selected patient populations (for example, case control studies that used stringent inclusion criteria to define BO cases) and therefore studied only a subset of the target population. However, it is clinically sensible to apply the evidence from these studies to the target population, albeit with appropriate qualification for the risk of bias associated with this study design, hence the satisfactory (grade C) rating for the studies in this category. In contrast, the prospective cohort study performed by Kadri et al was a multicentre study analysing 501 patients from 12 UK general practices, thus improving the generalisability of the associated recommendation to good (grade B).
To determine applicability, the extent to which the biomarker and technology could be adopted in the Australian healthcare system was assessed. In most cases, the tests (such as basic immunohistochemistry, serum testing and magnifying endoscopy) could be implemented in the Australian healthcare system with few caveats (recommendation grade B). In cases where highly trained pathologists would be required for analysis, or where more complex testing is needed beyond that already established in hospital laboratories (such as analysis of proteins extracted from tissue biopsies), applicability was downgraded (recommendation grade C).
- Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JH, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999 Sep 15;282(11):1061-6 Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/10493205.
- McIntire MG, Soucy G, Vaughan TL, Shahsafaei A, Odze RD. MUC2 is a highly specific marker of goblet cell metaplasia in the distal esophagus and gastroesophageal junction. Am J Surg Pathol 2011 Jul;35(7):1007-13 Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/21602660.
- Kurtkaya-Yapicier O, Gencosmanoglu R, Avsar E, Bakirci N, Tozun N, Sav A. The utility of cytokeratins 7 and 20 (CK7/20) immunohistochemistry in the distinction of short-segment Barrett esophagus from gastric intestinal metaplasia: Is it reliable? BMC Clin Pathol 2003 Dec 2;3(1):5 Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/14651756.
- White NM, Gabril M, Ejeckam G, Mathews M, Fardy J, Kamel F, et al. Barrett's esophagus and cardiac intestinal metaplasia: two conditions within the same spectrum. Can J Gastroenterol 2008 Apr;22(4):369-75 Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/18414711.
- Ormsby AH, Vaezi MF, Richter JE, Goldblum JR, Rice TW, Falk GW, et al. Cytokeratin immunoreactivity patterns in the diagnosis of short-segment Barrett's esophagus. Gastroenterology 2000 Sep;119(3):683-90 Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/10982762.
- Lao-Sirieix P, Boussioutas A, Kadri SR, O'Donovan M, Debiram I, Das M, et al. Non-endoscopic screening biomarkers for Barrett's oesophagus: from microarray analysis to the clinic. Gut 2009 Nov;58(11):1451-9 Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/19651633.
- Kadri SR, Lao-Sirieix P, O'Donovan M, Debiram I, Das M, Blazeby JM, et al. Acceptability and accuracy of a non-endoscopic screening test for Barrett's oesophagus in primary care: cohort study. BMJ 2010 Sep 10;341:c4372 Abstract available at http://www.ncbi.nlm.nih.gov/pubmed/20833740.