Department of Urology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Department of Urology, Center Hospital of the National Center for Global Health and Medicine, Tokyo, Japan; Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan
Corresponding author. Department of Urology, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-8655, Japan. Tel. +81 35 800 8753; Fax: +81 35 800 8917.
Background
Accurate cystoscopic recognition of Hunner lesions (HLs) is indispensable for better treatment prognosis in patients with Hunner-type interstitial cystitis (HIC), but is frequently challenging because the appearance of HLs varies.
Objective
To develop a deep learning (DL) system for cystoscopic recognition of a HL using artificial intelligence (AI).
Design, setting, and participants
A total of 626 cystoscopic images collected from January 8, 2019 to December 24, 2020, consisting of 360 images of HLs from 41 patients with HIC and 266 images of flat reddish mucosal lesions resembling HLs from 41 control patients, including those with bladder cancer and other chronic cystitis, were split 8:2 into training images for transfer learning and test images for external validation. Five DL models were constructed from a pretrained convolutional neural network that was retrained to output 1 for a HL and 0 for a control lesion. Five-fold cross-validation was applied for internal validation.
Outcome measurements and statistical analysis
True- and false-positive rates were plotted as a receiver operating characteristic curve as the threshold varied from 0 to 1. Accuracy, sensitivity, and specificity were evaluated at a threshold of 0.5. Diagnostic performance of the models was compared with that of urologists in a reader study.
Results and limitations
The mean area under the curve of the models reached 0.919, with mean sensitivity of 81.9% and specificity of 85.2% in the test dataset. In the reader study, the mean accuracy, sensitivity, and specificity were, respectively, 83.0%, 80.4%, and 85.6% for the models, and 62.4%, 79.6%, and 45.2% for expert urologists. Limitations include the diagnostic nature of a HL as warranted assertibility.
Conclusions
We constructed the first DL system that recognizes HLs with accuracy exceeding that of humans. This AI-driven system assists physicians with proper cystoscopic recognition of a HL.
Patient summary
In this diagnostic study, we developed a deep learning system for cystoscopic recognition of Hunner lesions in patients with interstitial cystitis. The mean area under the curve of the constructed system reached 0.919 with mean sensitivity of 81.9% and specificity of 85.2%, demonstrating diagnostic accuracy exceeding that of human expert urologists in detecting Hunner lesions. This deep learning system assists physicians with proper diagnosis of a Hunner lesion.
1. Introduction
]. IC/BPS can be divided into two subtypes based on cystoscopic findings: Hunner-type IC (HIC, having Hunner lesions), which corresponds to the International Society for the Study of IC/BPS (ESSIC) BPS type 3, and BPS (lacking Hunner lesions), corresponding to ESSIC BPS types 1 and 2 [
]. Growing evidence has revealed that these two subtypes are different in terms of clinical characteristics, bladder pathology, and gene expression profiles, suggesting distinct causes of pathogenesis [
]. Hence, treatment strategies should be devised separately in a subtype-directed manner, and proper recognition of a Hunner lesion is of great importance [
]. However, there have been no objective and standardized diagnostic criteria for a Hunner lesion, and thus diagnosis of HIC is made subjectively by physicians based on cystoscopic findings and other clinical information including patient’s characteristics and demographics. In addition, Hunner lesions vary in appearance, which can make recognition challenging [
]. Previous research has demonstrated that deep learning models can exceed the abilities of humans to detect several diseases, including bladder cancer [
]. Herein, we developed a computer-aided diagnosis (CAD) system for Hunner lesions by applying a pretrained convolutional neural network (CNN), the most frequently used and established deep learning algorithm for image-data classification, which distinguished Hunner lesions from other confusable flat reddish mucosal lesions with higher accuracy than IC/BPS-proficient physicians.
2. Patients and methods
2.1 Ethics statement
This study was approved by the Institutional Review Board of the University of Tokyo Hospital (no. 2019114NI), Kyorin University Hospital (no. H30-182), and National Institute of Advanced Industrial Science and Technology (no. Hi2019-304). All participants were informed about this study using generally accessible contact information, and written informed consent was obtained from patients who chose to participate. All procedures followed appropriate guidelines.
2.2 Participants and cystoscopic image preparation
A total of 82 participants were enrolled in this study, including 41 patients with HIC who had undergone and responded to endoscopic surgery (electrocautery of Hunner lesions with bladder hydrodistension), and 41 control patients who had flat reddish mucosal lesions in their bladders and underwent transurethral resection/biopsy of the lesions at the University of Tokyo Hospital from January 8, 2019 to December 24, 2020. All surgeries were performed under general or spinal anesthesia on an inpatient basis. The flat reddish mucosal lesions were carefully searched and cystoscopically visualized, with the bladder minimally filled with normal saline. Diagnosis of HIC was made by two urologists with expertise in managing IC/BPS, both board members of the East Asian IC/BPS Clinical Guidelines Committee (Y.A. and Y.H.) [
Demographics of patients with HIC retrieved from medical records included the following: O’Leary and Sant’s Symptom Index and Problem Index; an 11-point numerical rating of pain intensity, with 0 indicating no pain and 10 indicating maximum pain; a 7-grade quality of life scale derived from the International Prostate Symptom Score, with 0 indicating excellent and 6 indicating terrible; daytime and nocturnal urinary frequency; average and maximum voided volume; and bladder capacity measured during bladder hydrodistension at a pressure of 80 cmH2O under general/spinal anesthesia. Paired cold cup biopsies of the Hunner lesion and nonlesion background mucosa were obtained and sent to the Department of Pathology for histological analysis. Diagnoses of control patients were made based on histology at surgery: 23 patients were diagnosed with non–muscle-invasive bladder cancer, including 20 with carcinoma in situ (CIS) and three with papillary urothelial carcinoma (pTa; one with high grade and two with low grade); eight patients exhibited histological evidence of subepithelial chronic inflammatory changes accompanied by granulomas, epithelial denudation and reactive atypia, and stromal edema in bladders that had undergone intravesical mycobacteria bacillus Calmette-Guérin (BCG) injection for previous bladder cancers, and were diagnosed with BCG-related cystitis; and ten patients were diagnosed with chronic cystitis unrelated to HIC or BCG, including one with malakoplakia and three with radiation cystitis having a previous history of radiation therapy for prostate cancer (two) or cervical cancer (one) [
]. Of the 20 patients with CIS, two had previously undergone intravesical BCG therapy. All control patients underwent transurethral resection/biopsy of the bladder tumors and/or the flat reddish mucosal lesions if suspected of bladder cancer, suggested by urine cytology class III or worse, and/or associated with asymptomatic macrohematuria. All surgeries were performed using white-light rigid endoscopes (Olympus Medical System, Tokyo, Japan, or Karl Storz, Tuttlingen, Germany). Still cystoscopic images were obtained from the operative video records of each surgery. The flat reddish mucosal lesions resembling Hunner lesions in control patients served as control images for Hunner lesions, regardless of the presence or absence of malignancy.
2.3 A CAD model for Hunner lesions
We first processed cystoscopic images to highlight and correct differences between images obtained from the Olympus and Karl Storz cystoscopes. In each image, a region of interest (ROI) was outlined by a circle, and the color tone and brightness of the bladder mucosa within the ROI were corrected according to the color balance of that area. In addition, the area outside the ROI was replaced by snow noise that was adjusted to the corrected color tone of the ROI (Supplementary Fig. 1). Then, the processed cystoscopic images were randomly assigned to the training set (80%) for transfer learning of a CNN model and the test set (20%) for external validation.
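The 80%/20% assignment described above can be illustrated with a minimal sketch. It assumes the split is made separately within each class and that the test share is rounded up; the source states only that assignment was random, so both choices, along with the function name `stratified_split` and the seed, are illustrative assumptions rather than the authors' procedure.

```python
import math
import random


def stratified_split(labels, n_parts=5, seed=0):
    """Split indices into train/test, sending one part in `n_parts` of each class to test.

    Rounding the test share up is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = math.ceil(len(idx) / n_parts)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 360 Hunner lesion images (label 1) and 266 control images (label 0).
labels = [1] * 360 + [0] * 266
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Under these assumptions the sketch yields 500 training and 126 test images, with 72 Hunner lesion images in the test set.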
We used InceptionResNetv2, a CNN model pretrained on >1 million natural images from the ImageNet database (http://www.image-net.org), to construct our CAD models by transfer learning, which compensates for the relatively small volume of training images (Supplementary Fig. 2) [
]. We employed a five-fold cross-validation method to evaluate our CAD models. Briefly, the training dataset was further randomly divided into five stratified subsets of equal size, each with the same proportion of Hunner lesion images. In each round, images from four subsets (ie, 80% of the training data) were used to retrain the pretrained CNN model, and images from the remaining subset (20% of the training data) were used to validate the retrained model. The network parameters of the pretrained CNN model served as the initial network parameters for learning the cystoscopic images. All network parameters were then retrained in a supervised manner on the four subsets using the stochastic gradient descent algorithm to discern Hunner and control lesions, and the retrained model was validated on the remaining subset to guard against overfitting. These steps were repeated five times, with a different subset held out for validation each time, yielding five CAD models. Subsequently, the performance of the five constructed CAD models was evaluated using the test dataset for external validation (Supplementary Fig. 2).
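The fold construction can be sketched as follows. This is a minimal illustration of stratified five-fold cross-validation in plain Python, not the authors' code: the helper names, the seed, and the round-robin assignment are assumptions, and the model retraining step itself is omitted.

```python
import random


def stratified_k_folds(labels, k=5, seed=0):
    """Deal the shuffled indices of each class round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_validation_rounds(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) once per fold; each fold is held out exactly once."""
    folds = stratified_k_folds(labels, k, seed)
    for held_out in range(k):
        val_idx = sorted(folds[held_out])
        train_idx = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        yield train_idx, val_idx


# Training set: 288 Hunner lesion images (1) and 212 control images (0).
train_labels = [1] * 288 + [0] * 212
rounds = list(cross_validation_rounds(train_labels))
for train_idx, val_idx in rounds:
    print(len(train_idx), len(val_idx))
```

Because 288 and 212 are not divisible by five, the folds in this sketch differ in size by at most two images. Each round would retrain one model on `train_idx` and validate it on `val_idx`, giving five models in total.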
2.4 Reader study of diagnostic performance to compare CAD models with urologists
Next, we assessed the potential clinical utility of the constructed CAD models by comparing their diagnostic performance with that of urologists in a reader study. A 100-image dataset was created by randomly selecting 50 images each of Hunner lesions and control lesions from the test dataset (Supplementary Fig. 2). Five IC/BPS experts (defined as those who had performed ≥100 endoscopic surgeries for patients with HIC), 11 Japanese Urological Association board–certified urologists (those who had ≥6 yr of experience in urology), and eight urology residents (≤5 yr of experience in urology) classified each image of the selected 100-image dataset in a blinded manner.
2.5 Statistical analysis
The performance of the CAD models for Hunner lesions was evaluated by creating a receiver operating characteristic (ROC) curve. The CNN was retrained to output 1 if the image was of a Hunner lesion and 0 if of a control lesion. True- and false-positive rates were plotted on the ROC curve when the threshold changed from 0 to 1. The area under the curve (AUC) was calculated from the ROC curve. Accuracy, sensitivity, and specificity were evaluated at a threshold of 0.5.
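The evaluation described above can be sketched in a few lines. The example assumes that an image is called a Hunner lesion when the model output is at or above the threshold (the source does not say whether the comparison is inclusive) and that the AUC is obtained by the trapezoidal rule. The scores are toy values, not study data, and all function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs as the threshold is lowered."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def metrics_at(scores, labels, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (tp + tn) / len(labels), tp / n_pos, tn / n_neg


# Toy outputs: 1 = Hunner lesion, 0 = control lesion.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_auc = auc(roc_points(scores, labels))
accuracy, sensitivity, specificity = metrics_at(scores, labels)
print(toy_auc, accuracy, sensitivity, specificity)
```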
3. Results
3.1 Participants and cystoscopic image preparation
The demographics of the patients are shown in Table 1. All patients with HIC favorably responded to electrocautery of Hunner lesions and manifested the histological characteristics of HIC, such as lymphoplasmacytic infiltration, epithelial denudation, stromal fibrosis, and edema in bladder pathology [
Variable | Patients with HIC | Control patients
QOL score (c) | 5.6 ± 0.9 (2–6) | NA
Daytime frequency | 13.5 ± 5.7 (5–30) | NA
Nocturia frequency | 4.7 ± 2.6 (0–12) | NA
Average voided volume (ml) | 101.5 ± 47.8 (30–227) | NA
Maximum voided volume (ml) | 164.6 ± 80.6 (50–350) | NA
Maximum bladder capacity at hydrodistension (ml) | 426.9 ± 181.0 (150–1000) | NA
HIC = Hunner-type interstitial cystitis; NA = not analyzed; OSPI = O’Leary and Sant’s Problem Index; OSSI = O’Leary and Sant’s Symptom Index; QOL = quality of life; SD = standard deviation.
a Mean ± SD (range).
b Assessed using an 11-point pain intensity numerical rating scale ranging from 0, indicating no pain, to 10, indicating maximum pain.
c Assessed on a 7-grade QOL scale derived from the International Prostate Symptom Score, with 0 indicating excellent and 6 indicating terrible.
A total of 626 images of 233 lesions in 82 surgeries, including 360 images of 129 Hunner lesions and 266 images of 104 control lesions, were obtained (Table 2 and Fig. 1). A total of 338 images, including 236 images of Hunner lesions and 102 images of control lesions, were obtained using the Olympus rigid endoscope, and 288 images, including 124 images of Hunner lesions and 164 images of control lesions, were obtained using the Karl Storz rigid endoscope. Of the 266 control images, 136 were of CIS, 14 of urothelial carcinoma, 78 of BCG cystitis, 11 of radiation cystitis, two of malakoplakia, and 25 of other chronic cystitis (Supplementary Table 1). The training dataset contained 500 images, including 288 of Hunner lesions and 212 of control lesions, and the test dataset contained 126 images, including 72 of Hunner lesions and 54 of control lesions.
3.2 Diagnostic performance of the constructed models
The mean AUC of the five constructed CAD models was 0.919 in the test image dataset for external validation, with mean sensitivity of 81.9% and specificity of 85.2% at a threshold of 0.5 (Fig. 2A). In the reader study, the mean accuracy, sensitivity, and specificity (at a threshold of 0.5) were, respectively, 83.0%, 80.4%, and 85.6% for the five models; 62.4%, 79.6%, and 45.2% for the IC/BPS expert physicians; 51.0%, 40.0%, and 62.0% for the Japanese Urological Association board–certified urologists; and 46.8%, 36.0%, and 57.8% for the urology residents. The diagnostic accuracy of the five models for Hunner lesions (mean AUC of 0.912) exceeded that of the IC/BPS expert physicians (Fig. 2B). The prediction for each image by the CAD models and humans is depicted as a heatmap and box plot (Supplementary Fig. 3 and 4). The results suggest that humans are more likely to mistake control lesions for Hunner lesions than vice versa.
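Because the reader-study dataset contains 50 Hunner lesion images and 50 control images, accuracy should equal the mean of sensitivity and specificity. The short check below recomputes accuracy from the reported rates; the function name and the 0.1-point rounding tolerance are assumptions of this sketch.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos=50, n_neg=50):
    """Accuracy (%) implied by sensitivity and specificity (%) for given class sizes."""
    correct = sensitivity / 100 * n_pos + specificity / 100 * n_neg
    return 100 * correct / (n_pos + n_neg)


# Reported (accuracy, sensitivity, specificity) in percent for each group.
reported = {
    "CAD models": (83.0, 80.4, 85.6),
    "IC/BPS experts": (62.4, 79.6, 45.2),
    "Board-certified urologists": (51.0, 40.0, 62.0),
    "Urology residents": (46.8, 36.0, 57.8),
}

recomputed = {
    group: accuracy_from_rates(sens, spec)
    for group, (_, sens, spec) in reported.items()
}
for group, value in recomputed.items():
    print(group, round(value, 1), reported[group][0])
```

The recomputed values agree with the reported accuracies to within 0.1 percentage point. The resident group differs by that margin, which is consistent with rounding of the reported means.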
Fig. 2. ROC curves for the five constructed deep learning models. (A) ROC curves for the five constructed deep learning models using the test dataset. (B) ROC curves for the five constructed deep learning models and the operating points of human urologists in the reader study. AUC = area under the curve; BPS = bladder pain syndrome; IC = interstitial cystitis; ROC = receiver operating characteristic.
Examples of Hunner and control lesions correctly recognized by the CAD models are shown as heatmap visualization in Figure 3, in which areas that were important for diagnosis processing are highlighted. The CAD models seemed to preferentially assess the region of a reddened mucosal area accompanied by radiating/surrounding small vessels in the vicinity for differentiation between images of Hunner and control lesions. Images of control lesions that all IC/BPS expert physicians unanimously misrecognized as Hunner lesions but all five CAD models correctly predicted as control lesions are shown in Supplementary Figure 5. The CAD models seemed to discriminate the images by focusing on features including specific capillary structures that IC/BPS expert physicians were not likely to fully notice during cystoscopy.
Fig. 3. Heatmap visualization of Hunner and control images correctly recognized by the deep learning models. The heatmaps, created using Gradient-weighted Class Activation Mapping software, highlight the important regions in cystoscopic images for correctly predicting Hunner and control lesions. Vessels that cluster radially toward or surround the lesions were highlighted. These features might be responsible for image recognition by the deep learning models. The upper number of each heatmap image indicates the value predicted by the models (0, control, and 1, Hunner lesion).
4. Discussion
In the present study, we developed an AI-driven CAD system for supporting cystoscopic recognition of a Hunner lesion based on a deep learning algorithm. The constructed models achieved a mean AUC of 0.912, sensitivity of 80.4%, and specificity of 85.6% for the detection of Hunner lesions, which outperformed the diagnostic accuracy of IC/BPS expert physicians.
A Hunner lesion, a characteristic reddish mucosal lesion frequently accompanied by abnormal capillary structures, is a hallmark of HIC. Although the etiology of a Hunner lesion remains elusive, it has been suggested that locally intensified inflammatory responses, in conjunction with ischemia, may be associated with the pathogenesis of a Hunner lesion [
]. This characteristic bladder lesion has crucial implications for diagnosis and treatment prognosis in HIC. In clinical management, Hunner lesion–targeted therapies such as local electrocautery or steroid injection provide more favorable outcomes than other treatment options in patients with HIC [
]. Such lack of objectivity and variability in cystoscopic appearance, in addition to the extremely low prevalence of the Hunner lesion subtype, make recognition of a Hunner lesion challenging for the majority of urologists who do not have as much expertise as expert physicians in managing IC/BPS.
AI and deep learning techniques have successfully been applied in medical image diagnosis. Examples include dermoscopic diagnosis of melanoma [
] developed an AI diagnostic platform using blue-light cystoscopic images that not only detected bladder cancer with high sensitivity of 95.77% and modest specificity of 87.84%, but also classified tumor invasiveness with sensitivity of 88% and specificity of 96.56%. Tokuyama et al [
] developed AI models that predicted early recurrence of non–muscle-invasive bladder cancer with a probability of up to 90% based on machine learning of nuclear features in histological images. Yamamoto et al [
] developed a deep learning algorithm based on the assessment of histological images that accurately predicted recurrence of prostate cancer. These studies consistently demonstrated that deep learning models combined with human readers achieve higher diagnostic performance than either alone. Collectively, AI and deep learning techniques have the potential to overcome the limitations of conventional image diagnosis performed by humans alone.
There are several limitations to this study that relate to the methodology, first among which is the opaque black box nature of AI-based deep learning techniques. Second, this study was performed using images that were acquired only by rigid cystoscopes. The appearance of cystoscopic images may vary depending on the light source and type of cystoscope. The versatility of our CAD models is to be validated using images obtained by other types of light sources or cystoscopes, including flexible cystoscopes. Third, the retrospective study design and the diagnostic nature of a Hunner lesion as warranted assertibility might bias cystoscopic image collection and thereby affect the results. Diagnosis of a Hunner lesion was made by our two urologists in a subjective manner, and thereby it might act as a working hypothesis in the present study. With regard to this, we used images of Hunner lesions that were obtained from cases that favorably responded to electrocautery of the lesions and showed characteristic histological features consistent with HIC [
]. Conversely, this might exclude some HIC cases that did not show those clinical and histological features, and could be another limitation of the present study. Objective, reliable, and reproducible diagnostic markers for a Hunner lesion are urgently needed to standardize the diagnosis of a Hunner lesion. Further multicenter, international prospective studies are warranted to verify the clinical utility of our CAD models in real-world settings.
5. Conclusions
We developed the first deep learning system that recognizes Hunner lesions in cystoscopic images with accuracy (mean AUC up to 0.912) greater than that of IC/BPS expert physicians. Our models provide a platform for developing a system that supports the accurate diagnosis of a Hunner lesion and that can lead to improved treatment outcomes in managing patients with HIC.
Author contributions: Yoshiyuki Akiyama had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Iwaki, Akiyama, Nosato, Fukuhara.
Acquisition of data: Iwaki, Akiyama.
Analysis and interpretation of data: Iwaki, Nosato.
Drafting of the manuscript: Iwaki, Akiyama, Nosato, Homma.
Critical revision of the manuscript for important intellectual content: Nosato, Kinjo, Niimi, Taguchi, Y. Yamada, Sato, Kawai, D. Yamada, Sakanashi, Kume, Homma, Fukuhara.
Statistical analysis: Iwaki, Nosato.
Obtaining funding: Akiyama.
Administrative, technical, or material support: Akiyama, Nosato.
Supervision: Akiyama, Homma, Fukuhara.
Other: None.
Financial disclosures: Yoshiyuki Akiyama certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: This study was financially supported by a KAKENHI Grants-in-Aid from the Japanese Society for the Promotion of Science (JSPS; grant number 22K16788, to Yoshiyuki Akiyama).
Acknowledgments: This paper is partly based on the results obtained from a project, JPNP20006, commissioned by New Energy and Industrial Technology Development Organization (NEDO).
Appendix A. Supplementary data
The following are the Supplementary data to this article:
References
Hunner-type (classic) interstitial cystitis: a distinct inflammatory disorder characterized by pancystitis, with frequent expansion of clonal B-cells and epithelial denudation.
Clinical characterization of interstitial cystitis/bladder pain syndrome in women based on the presence or absence of Hunner lesions and glomerulations.
AI outperformed every dermatologist in dermoscopic melanoma diagnosis, using an optimized deep-CNN architecture with custom mini-batch logic and loss function.
Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists.
Szegedy C, Ioffe S, Vanhoucke V, Alemi AA. Inception-v4, Inception-ResNet and the impact of residual connections on learning. Thirty-first AAAI Conference on Artificial Intelligence; 2017.