Irritable bowel syndrome and active inflammatory bowel disease diagnosed by faecal gas analysis

Inflammatory bowel disease and irritable bowel syndrome may present in a similar manner. Measuring faecal calprotectin concentration is often recommended to rule out inflammatory bowel disease; however, there are no tests to positively diagnose irritable bowel syndrome, and invasive tests are still used to rule out other pathologies.


INTRODUCTION
Irritable bowel syndrome is a chronic relapsing gastrointestinal disorder characterised by abdominal pain, bloating and a change in bowel habit. 1 The disorder can be diagnosed on the symptoms alone, especially in younger patients and those with a long history. At present, the preferred means of diagnosing irritable bowel syndrome is the application of the Rome criteria. 2 However, despite the recommendations of the American College of Gastroenterology 3 and the British Society of Gastroenterology, 1 many clinicians still view irritable bowel syndrome as a diagnosis of exclusion and perform numerous investigations to rule out organic diseases. 4 Inflammatory bowel disease, ulcerative colitis and Crohn's disease, are also chronic relapsing gastrointestinal disorders and their symptoms may resemble irritable bowel syndrome. The use of faecal calprotectin in primary care is being promoted to aid referral to secondary care for patients suspected to have inflammatory bowel disease. In essence, the calprotectin test is being used to rule out organic diseases in the hope that the primary care physicians will manage patients with irritable bowel syndrome. Irritable bowel syndrome is more common than inflammatory bowel disease and it is not surprising that some patients develop irritable bowel syndrome before their inflammatory bowel disease is discovered, 5 or that some patients with inflammatory bowel disease clearly have a component of irritable bowel syndrome to account for their symptoms when the inflammatory bowel disease is in remission. 6 Irritable bowel syndrome and inflammatory bowel disease may be associated with dysbiosis 7,8 which may account for the abnormal odour emitted from the faeces of patients with both irritable bowel syndrome and inflammatory bowel disease. The volatile chemicals contributing to faecal odour are mostly products of digestion and fermentation performed by the microbiota and cells shed into the intestine. 
9,10 Traditionally, volatile chemicals are characterised by gas chromatography-mass spectrometry. Several publications based on gas chromatography-mass spectrometry have shown changes in volatile chemicals found in faeces, 11 urine 12 and breath 13 during relapse of inflammatory bowel disease and in faeces of patients with diarrhoea-predominant irritable bowel syndrome. 14 These studies give an indication of the potential use of volatile chemicals as biomarkers for inflammatory bowel disease and irritable bowel syndrome; however, gas chromatography-mass spectrometry technology is not yet suitable for high-throughput applications in clinical practice, which has limited the utility of these observations. We have designed and built a prototype based on gas chromatography-sensor technology for the point-of-care analysis of volatile chemical profiles from biological samples. We have reported the preliminary analysis of faecal samples from patients with irritable bowel syndrome and inflammatory bowel disease using the gas chromatography-sensor system and an in-house developed artificial neural network (ANN). 15 However, some important comparisons from the medical point of view were not performed (e.g. active Crohn's disease vs. irritable bowel syndrome and active Crohn's disease vs. inactive Crohn's disease). In addition, the use of ANNs for diagnostic methods has been questioned by regulatory institutions such as the Food and Drug Administration (FDA). 16,17 Here, we report the use of a gas chromatography-sensor-pipeline 18 (a data processing procedure) to analyse faecal samples from patients with irritable bowel syndrome, inflammatory bowel disease and healthy donors.
After rigorous validation schemes, the results reported by the gas chromatography-sensor-pipeline indicate a successful discrimination of faecal samples from patients with irritable bowel syndrome, active inflammatory bowel disease, inactive inflammatory bowel disease, active Crohn's disease, inactive Crohn's disease, active ulcerative colitis, inactive ulcerative colitis and healthy donors. These results support the development of a point of care device not only for the positive diagnosis of irritable bowel syndrome, but also to assist in the diagnosis of both Crohn's disease and ulcerative colitis.

Patient recruitment
Patients were recruited as described by Shepherd et al., 15 although several patients were excluded from the present work as the diagnosis of inflammatory bowel disease was subsequently questioned. In summary, patients attending the gastroenterology clinic at the Bristol Royal Infirmary were invited to participate in this study and to bring a faecal sample to the clinic. Prospective demographic data and faecal samples were obtained from 152 different participants between October 2010 and October 2011.
Irritable bowel syndrome samples include samples from patients with diarrhoea or constipation and patients alternating between diarrhoea and constipation. Most patients had diarrhoea-predominant irritable bowel syndrome; however, two patients reported constipation as the predominant symptom. The diagnosis was based on the Rome II criteria. 19 The inflammatory bowel disease samples were collected from patients with active and non-active ulcerative colitis and Crohn's disease. Inflammatory bowel disease was diagnosed by a physician based on endoscopy and histology, or by radiology in the case of small intestinal disease. The activity of the disease in patients with ulcerative colitis was calculated by their colitis simple clinical activity index score, 20 where a score of 3 or more indicated active ulcerative colitis. Patients with Crohn's disease were assessed using the Harvey Bradshaw index score, 21 where a score of 4 or more indicated active Crohn's disease. Simple clinical activity index has been compared to other tools and found to be 'valid, reliable and responsive'. 22 It has the advantage over most tools of not requiring an assessment of the mucosa by sigmoidoscopy/colonoscopy. The use of Harvey Bradshaw index is supported by National Institute for Health and Care Excellence (NICE) in the assessment of Crohn's disease patients for anti-TNF therapies: it does not require a diary to be kept by patients for several days, invasive investigations or blood tests. Faecal calprotectin was not measured because it was not a routinely available test in 2010/11. We are not able to perform the test now because the samples collected for this study were disposed of after the sensor work had been completed, in accordance with the Human Tissue Act.
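The activity cut-offs described above can be written down as a small rule. The following is a minimal Python sketch, not the authors' code (their analysis was carried out in R); the thresholds come from the text, while the function name and the "UC"/"CD" diagnosis codes are hypothetical labels chosen for illustration.

```python
# Thresholds are taken from the text; all names here are illustrative assumptions.
SCAI_ACTIVE_THRESHOLD = 3  # ulcerative colitis: a score of 3 or more indicates active disease
HBI_ACTIVE_THRESHOLD = 4   # Crohn's disease: a score of 4 or more indicates active disease


def is_active(diagnosis: str, score: int) -> bool:
    """Classify disease activity from a clinical index score.

    diagnosis: "UC" (scored with the simple clinical activity index) or
               "CD" (scored with the Harvey Bradshaw index).
    """
    if diagnosis == "UC":
        return score >= SCAI_ACTIVE_THRESHOLD
    if diagnosis == "CD":
        return score >= HBI_ACTIVE_THRESHOLD
    raise ValueError(f"unknown diagnosis: {diagnosis!r}")
```

Because each index is dichotomised at a single cut point, a patient scoring just below the threshold and one scoring just above it are placed in different groups.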
Healthy control samples (Control) (n = 41) were collected from partners or healthy relatives of patients visiting the clinic and from healthy patients referred for early endoscopy/colonoscopy due to a family history of upper gastrointestinal or colon cancer (mean age 53.6 years; 24 women, 17 men). Because of the similarity of their diet and lifestyle to that of patients, partners were recruited where possible in an attempt to reduce bias resulting from such factors. The patients who agreed to participate in the study gave verbal consent to the physician during the clinic appointment as stipulated in the participant information sheet and the ethics approval, as granted by the Wiltshire Research and Ethics Committee (NRES 06/Q2008/6). All patients were on an ad lib diet before sample collection to maximise recruitment and to give 'real-world' data.

Sample processing
All the samples were analysed by the gas chromatography-sensor system in 2012, the device not being available prior to 2012. Faecal samples were processed following the method proposed by Ahmed et al. 23 In summary, 1-g aliquots of faecal samples were stored in 10 mL glass headspace vials (Supelco; Sigma Aldrich, Dorset, UK) within 6 h of sample production and frozen at −20°C. In 2012, samples were processed by the system. Previous studies showed no loss of volatile chemicals from faecal samples stored at −20°C. 23 Each frozen sample was heated for 10 min at 50°C. After this, 2 cm³ of its headspace were collected and injected into the GC column of the gas chromatography-sensor system. 15 Detailed descriptions of the hardware and software 18 are reported elsewhere. In summary, the gas chromatography-sensor system is composed of a gas chromatography column coupled to a metal oxide gas sensor. The sensor is controlled via an electronic circuit monitored by computer software, which records the electrical resistance of the sensor at 0.5 s intervals during each 40 min machine run. The resistance profile of each sample generated by the gas chromatography-sensor system was stored in individual text files.

Statistical analysis
The gas chromatography-sensor data generated in 2012 were analysed by a new pipeline in 2015/6. A thorough description of the pipeline used here for statistical analysis is given in Aggio et al. 18 In summary, the gas chromatography-sensor characterises the volatile chemicals present in biological samples. It produces a profile of the sensor resistance vs. time, which describes how the abundances of volatile compounds change with time. Figure S1A is an illustrative plot of the average normalised resistance for each of the irritable bowel syndrome, inflammatory bowel disease and Control (healthy patients) samples (data normalised to be between 0 and 1). Note that this graphic indicates similarity in average profiles in the initial stages, but with noticeably characteristic differences in the later stages (Figure S1B).
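As an illustration of the normalisation to the 0-1 range and of selecting a time window such as the 240-540 s segment shown in Figure S1B, a minimal Python sketch follows. It is not the authors' pipeline (which was written in R and also performs alignment and transformation). It assumes that a profile is a plain list of resistance readings, that reading i was taken at i × 0.5 s, and that min-max scaling is applied per profile; the function names are hypothetical.

```python
SAMPLE_INTERVAL_S = 0.5  # from the text: resistance is recorded every 0.5 s


def min_max_normalise(resistance):
    """Rescale a resistance profile linearly so that it lies between 0 and 1."""
    lo, hi = min(resistance), max(resistance)
    if hi == lo:
        # A flat profile carries no shape information; map it to zeros.
        return [0.0 for _ in resistance]
    return [(r - lo) / (hi - lo) for r in resistance]


def window(profile, start_s, end_s, interval_s=SAMPLE_INTERVAL_S):
    """Return the readings taken between start_s and end_s (inclusive).

    Assumes reading i was recorded at time i * interval_s.
    """
    first = int(round(start_s / interval_s))
    last = int(round(end_s / interval_s))
    return profile[first:last + 1]
```

For example, `window(min_max_normalise(profile), 240, 540)` would return the 601 readings that fall in the later-stage segment, assuming the profile covers at least 540 s.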
Our in-house-developed pipeline performs chromatogram alignment and data transformation techniques for highlighting volatile chemical patterns specific to different medical conditions. The features or resistance levels that best describe the differences between medical conditions are selected by two random forest-based algorithms. 24,25 Partial least squares (PLS) 26 and support vector machine (SVM) with polynomial kernel 25 were applied as statistical modelling techniques to classify unknown samples using the derived features. The results reported by the gas chromatography-sensor-pipeline were validated using leave-one-out cross-validation, 10-fold cross-validation repeated 30 times, 27 threefold double cross-validation repeated 30 times with an inner loop of twofold cross-validation repeated five times, 28 and their Monte Carlo variation with random class labels permutation. An additional validation scheme was applied, where the feature selection stage of our pipeline was included as part of the double cross-validation. In this case, a fivefold cross-validation repeated 30 times with an inner loop of three folds repeated 15 times was applied (Figure S2). Principal component analysis (PCA) on the transformed resistance values was also performed. Receiver operating characteristic (ROC) curves were generated, based on the double cross-validation results, to visualise the performance of the gas chromatography-sensor-pipeline. Statistical analyses were performed solely on the resistance profiles processed by the gas chromatography-sensor-pipeline. No other demographic or clinical features were considered for statistical modelling. Confidence intervals (CI) were calculated using bootstrapping. Data analysis was carried out using R software. 29 This study is based on data from n = 152 different patient samples comprising data from Controls (n = 41), irritable bowel syndrome (n = 28) and inflammatory bowel disease (n = 83).
Pairwise comparisons were performed between these three groups. For detailed comparisons, inflammatory bowel disease is further considered as active (n = 33) or inactive (n = 50) and compared with Controls and irritable bowel syndrome. The inflammatory bowel disease data comprises n = 47 ulcerative colitis (active n = 14; inactive n = 33) and n = 36 Crohn's disease (active n = 19; inactive n = 17) and these four subgroups of inflammatory bowel disease are compared with the data from Controls and the irritable bowel syndrome donors. A listing of the comparisons is given in Table S1.
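To make the repeated k-fold scheme concrete, the following Python sketch generates train/test index splits for a cross-validation repeated several times. It is an illustration only and is not the authors' R implementation: whether their folds were stratified by class is not stated, so stratification here is an assumption, and the function names are hypothetical. A double cross-validation would apply the same splitting again inside each training set to tune the model.

```python
import random


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the members of each class round-robin across the folds,
        # continuing where the previous class stopped to balance fold sizes.
        for pos, idx in enumerate(members):
            folds[(pos + offset) % k].append(idx)
        offset += len(members)
    return folds


def repeated_cv_splits(labels, k, repeats, seed=0):
    """Yield (train_indices, test_indices) for k-fold CV repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            yield train_idx, sorted(test_idx)
```

With the group sizes reported above for irritable bowel syndrome (n = 28) vs. Controls (n = 41), `repeated_cv_splits(labels, k=10, repeats=30)` yields 300 splits, and within each repeat every sample is held out exactly once.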

RESULTS
We have applied an in-house-developed gas chromatography-sensor-pipeline to analyse 152 faecal samples from patients with irritable bowel syndrome, active inflammatory bowel disease, inactive inflammatory bowel disease, active Crohn's disease, inactive Crohn's disease, active ulcerative colitis, inactive ulcerative colitis and healthy donors (Control). Table 1 shows the demographics for the patient groups studied with their respective diagnosis, site of disease, Harvey Bradshaw index scores and simple clinical activity index (SCAI) score, when applied, smoking status, diet, medication and routine laboratory data. Table S2 contains a summary of the results reported by the double cross-validation for each comparison performed and Table S3 contains the results of their associated ROC analysis. The results reported by the leave-one-out cross-validation, 10-fold cross-validation and Monte Carlo are available as Tables S4-S8. For example, Figures 1 and 2 show the features selected for the comparisons active Crohn's disease | irritable bowel syndrome and active inflammatory bowel disease | inactive inflammatory bowel disease, respectively, in addition to their associated plot of principal components and ROC curves. The results indicate that the platform is able to successfully differentiate most of the conditions studied here, with active Crohn's disease | irritable bowel syndrome being an example of near perfect sample classification and active inflammatory bowel disease | inactive inflammatory bowel disease being an example of a scenario where the platform has difficulty in classifying samples.

DISCUSSION
The prototype device we have built is able to distinguish faecal samples from healthy donors, patients with irritable bowel syndrome and patients with inflammatory bowel disease; the sensitivity and specificity for each is shown in Table S2.
The pattern recognition software we have developed is based on wavelet transformation and does not rely on a neural network. In contrast to neural networks, which have been described as a 'black box' approach, 16,17 the wavelet transformation underpins the technology used to interpret electrocardiograms, a well-known methodology accepted by the scientific community. We have used repeated double cross-validation to validate results. Furthermore, we have undertaken Monte Carlo randomisation to ensure the model is not over-fitted to the data. This is the first time these stringent methods have been used to report faecal volatile compound profiles. These results are supported by Figure S1B, which is a plot of average resistance normalised between 0 and 1 over the time 240-540 s. Figure S1B shows distinctive signature differences in average profiles between irritable bowel syndrome, inflammatory bowel disease and Controls over a sustained period.
Making the diagnosis of irritable bowel syndrome, the second most prevalent gastrointestinal disease of westernised populations, is problematic: despite the introduction of the Manning Criteria in 1978 30 and the numerous updates of the Rome Criteria, many clinicians still feel that irritable bowel syndrome is a diagnosis of exclusion. 4 The introduction of faecal calprotectin has helped to 'rule out' disorders such as inflammatory bowel disease, but still treats irritable bowel syndrome as a diagnosis of exclusion. The approach we have presented is the best method to date for making a positive diagnosis of irritable bowel syndrome based on an investigation.
The new pipeline is an improvement on the previously reported neural network. For active inflammatory bowel disease (Crohn's disease and ulcerative colitis) vs. irritable bowel syndrome, the pipeline has a mean sensitivity and specificity of 93% and 90%, respectively, whereas the neural network had mean values of 76% and 88%. For irritable bowel syndrome vs. controls, the mean accuracies of the pipeline and neural network were 91% and 54%, respectively, while for inflammatory bowel disease (combined Crohn's disease and ulcerative colitis) vs. controls the mean accuracies were 78% and 79% for the pipeline and neural network, respectively. When Crohn's disease and ulcerative colitis were analysed separately, the pipeline had an accuracy of 89%. This assessment was not undertaken when using the neural network.
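For readers less familiar with these metrics, the Python sketch below shows how sensitivity, specificity and accuracy are obtained from true and predicted class labels. It is a generic illustration with hypothetical function names and toy labels, not the authors' R code, and it assumes that both classes are present in the true labels.

```python
def confusion_counts(truth, predicted, positive):
    """Count true/false positives and negatives for a two-class comparison."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp, fn, tn, fp


def sensitivity_specificity_accuracy(truth, predicted, positive):
    """Return (sensitivity, specificity, accuracy) as fractions between 0 and 1."""
    tp, fn, tn, fp = confusion_counts(truth, predicted, positive)
    sensitivity = tp / (tp + fn)  # proportion of positive samples correctly detected
    specificity = tn / (tn + fp)  # proportion of negative samples correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy
```

In a repeated cross-validation, these values would be computed on each held-out fold and then averaged, which is what a "mean" sensitivity or accuracy refers to.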
The new analysis also compared patients with active Crohn's disease and ulcerative colitis for the first time. The traditional Partial Least Squares (PLS) approach gave a mean accuracy of 94% with an area under the ROC of 99%; SVM gave 96% and 99%, respectively. There are no faecal markers with an ability to distinguish Crohn's disease and ulcerative colitis, although some serology panels show promise. 31 We do not expect faecal volatile compounds to replace standard diagnostic tools such as colonoscopy and MRI or capsular endoscopy, but they could be used to help direct the choice of investigation. Importantly, the technique appears to provide a tool for diagnosing irritable bowel syndrome in a positive way, which will be of reassurance to patients, while saving them from unnecessary tests to rule out other conditions, saving time, money and risk to the patients.
Clinically, the most challenging comparison is between irritable bowel syndrome and active Crohn's disease since both cause abdominal pain and a change in bowel habit. The gas chromatography-sensor pipeline performed well for this comparison with area under the ROC of 91% and 94% using PLS and SVM, respectively, after double cross-validation. The same performance was observed when classifying irritable bowel syndrome and active inflammatory bowel disease, inactive inflammatory bowel disease, inactive Crohn's disease, or inactive ulcerative colitis samples (Tables S2 and S3). The assessment of active inflammatory bowel disease/inactive inflammatory bowel disease is rarely useful, as all patients ought to have a clear diagnosis of Crohn's disease or ulcerative colitis; the models were relatively poor and reflect the mixed nature of ulcerative colitis and Crohn's disease patients in the inflammatory bowel disease group (Figure 2). We have chosen to use this figure to emphasise that the profiles for inactive and active inflammatory bowel disease do overlap. More useful were the comparisons of inactive/active ulcerative colitis or Crohn's disease; here the models were better, especially for ulcerative colitis (Figure S2), in which all but one sample from patients with active ulcerative colitis lay to the right of the vertical line on the principal component plot.
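The area under the ROC curve quoted above can be read as the probability that a randomly chosen sample from one class receives a higher classifier score than a randomly chosen sample from the other class. The Python sketch below computes it in that pairwise form; it is a generic illustration with a hypothetical function name and made-up scores, not the authors' R analysis.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison of classifier scores.

    Each (positive, negative) pair counts 1 if the positive sample scores
    higher, 0.5 if the scores tie, and 0 otherwise.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))
```

A value of 1.0 corresponds to perfect separation of the two groups, while 0.5 is what would be expected from scores unrelated to the diagnosis.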
The clinical assessment of disease activity in ulcerative colitis is more accurate than that of Crohn's disease, because the colon is more readily assessed than the small bowel and the Crohn's disease activity index, a commonly used scoring index in clinical trials, is very subjective. The samples were collected in 2010-2011. Faecal calprotectin testing was not routinely available at the research centre, which means we had no robust measure of disease activity. In addition, faecal calprotectin has its limitations in the assessment of small bowel Crohn's disease. Consequently, we chose two patient-friendly but reliable clinical tools, the simple clinical activity index and Harvey Bradshaw index. Future work will need to assess the performance compared with a robust gold standard such as colonoscopy for ulcerative colitis, or colonoscopy with MRI for Crohn's disease.
The results reported here were validated using the following different validation schemes: leave-one-out cross-validation, 10-fold cross-validation, double cross-validation and Monte Carlo randomisation. Extant statistical literature holds all of these methods and approaches in good standing. Among them, the double cross-validation is certainly the most stringent method. This stringent approach performed least well when comparing inactive/active Crohn's disease, or ulcerative colitis. This is not unexpected for two reasons in addition to the sample size: (i) the change in the volatile compounds is a continuous variable that was compared to an arbitrary cut point in a second continuous variable (Harvey Bradshaw index or simple clinical activity index), and comparing upper and lower quartiles may have been more discriminating, but the data set was too small for this; (ii) patients may have had other reasons for their symptoms (such as bile salt diarrhoea, bacterial overgrowth or irritable bowel syndrome) which meant the clinical scores over-estimated disease activity.
The Monte Carlo technique was applied to check for potential over-fitting of the developed classification models. The Monte Carlo method was applied as described for the validation methods tested (i.e. leave-one-out cross-validation, cross-validation and double cross-validation); however, in this case, sample labels were randomly permuted before model construction. This procedure simulates what would have happened if samples were to be classified simply by chance. The results (Tables S4, S6, and S8) suggest the data were not over-fitted.
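The label-permutation idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' R code: `evaluate` is a hypothetical callable that takes a list of class labels, runs whichever validation scheme is being checked, and returns an accuracy; the number of permutations is arbitrary.

```python
import random


def permutation_accuracies(labels, evaluate, n_permutations=100, seed=0):
    """Re-run a validation scheme on randomly permuted class labels.

    labels:   true class labels, one per sample (left unmodified).
    evaluate: hypothetical callable mapping a list of labels to an accuracy.
    Returns the accuracies obtained when labels carry no real information.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_permutations):
        shuffled = list(labels)  # copy so the true labels stay intact
        rng.shuffle(shuffled)    # break the link between profile and diagnosis
        results.append(evaluate(shuffled))
    return results
```

If the accuracy obtained with the true labels is clearly higher than the accuracies returned here, the model is unlikely to be fitting noise; if the two are similar, over-fitting should be suspected.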
We have developed a gas chromatography-sensor pipeline for the diagnosis and assessment of inflammatory bowel disease and irritable bowel syndrome. The separation of active disease groups is excellent. The potential to use the pipeline to determine disease activity will require more work. Although the sample sizes were too small to provide separate validation sets, we have used stringent statistical tools to double cross-validate our models. We are planning further large cohort studies with gold-standard assessments of disease activity and validation sets. If confirmed, these findings could mean that irritable bowel syndrome can be diagnosed positively, and they offer the potential to develop new tools to diagnose and assess inflammatory bowel disease and distinguish ulcerative colitis and Crohn's disease.

SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this article:
Figure S1. Examples of sensor outputs generated from faecal samples from patients with IBD, IBS and Controls. (A) Resistance normalised output from 0-1800 seconds; (B) output from 240 to 540 seconds.
Figure S2. Principal component analysis based on the features selected for the double cross-validation (DoubleCV). The numbers represent the simple clinical colitis activity index (SCCAI).
Table S1. Comparisons performed.
Table S2. Results of the repeated double cross-validation based on support vector machine polynomial and partial least squares.
Table S3. ROC analysis based on the results of the repeated double cross-validation based on support vector machine polynomial and partial least squares.
Table S4. Double cross-validation (DoubleCV) Monte Carlo based on support vector machine polynomial and partial least squares.
Table S5. Leave-one-out cross-validation (LOOCV) based on support vector machine polynomial and partial least squares.
Table S6. Leave-one-out cross-validation (LOOCV) Monte Carlo based on support vector machine polynomial and partial least squares.
Table S7. Results of the repeated 10-fold cross-validation (10FoldCV) based on support vector machine polynomial and partial least squares.
Table S8. Results of the repeated 10-fold cross-validation Monte Carlo based on support vector machine polynomial and partial least squares.