REGISTRO DOI: 10.69849/revistaft/ni10202510051127
Mayana Almeida Araújo dos Santos1
João Batista Pacheco Junior2
Halinna Larissa Cruz Correia de Carvalho Buonocore3
Lorena Lúcia Costa Ladeira1
ABSTRACT
Background: Ameloblastoma (AM) and odontogenic keratocyst (OKC) are benign intraosseous lesions with similar radiographic features but distinct clinical behaviors and treatment approaches. Their differentiation through imaging remains challenging due to overlapping anatomical presentations and the absence of well-defined diagnostic criteria. In this context, artificial intelligence—particularly Convolutional Neural Networks (CNNs)—emerges as a promising tool for enhancing diagnostic accuracy.
Methods: An integrative literature review was conducted using the PubMed/Medline and Scopus databases, covering studies from 2018 to 2023. The search combined the descriptors “Ameloblastoma,” “Odontogenic Keratocyst,” “Odontogenic Cyst,” “Convolutional Neural Network,” “Artificial Intelligence,” “Machine Learning,” and “Deep Learning.”
Results: Eleven studies met the inclusion criteria. CNNs achieved diagnostic accuracy ranging from 70% to 99.25%, with sensitivity from 71% to 98.08% and specificity up to 100%. Studies using CT imaging generally outperformed those using panoramic radiographs. However, most models relied exclusively on imaging data, and lesion-specific performance metrics were often underreported. Methodological inconsistencies—such as limited dataset diversity and overreliance on accuracy alone—were also observed.
Conclusion: CNNs show strong potential as complementary tools in the differential diagnosis of AM and OKC, particularly when applied to high-resolution imaging modalities. Nonetheless, their clinical applicability remains limited by the lack of stratified performance data, inconsistent reporting standards, and the absence of multimodal integration. Future research should emphasize external validation, multicenter datasets, and the combination of imaging with clinical and histopathological features to improve real-world diagnostic performance.
KEYWORDS: Ameloblastoma. Odontogenic Cysts. Artificial Intelligence. Image Diagnosis.
1 INTRODUCTION
Ameloblastoma (AM) and odontogenic keratocyst (OKC) are benign intraosseous lesions with significant destructive potential and high recurrence rates, potentially reaching large dimensions before becoming symptomatic, reducing the quality of life for affected patients.1,2 AM is the most common asymptomatic odontogenic epithelial tumor, slow-growing, and most frequently affecting individuals between the fourth and sixth decades of life.3 On the other hand, OKC is the third most common odontogenic cyst, often affecting younger individuals, with a peak incidence between the second and third decades of life.4,5 These lesions are usually detected incidentally during routine radiographic examination, as symptoms are uncommon in the early stages. Therefore, their similar imaging findings and locations make differentiation a challenge.6
Radiographically, AM presents as a unilocular or multilocular radiolucent lesion associated with impacted teeth, dental displacement, resorption of teeth adjacent to the tumor, root resorption, cortical bone expansion and/or perforation, and the presence of septa, giving it a “soap bubble” or “honeycomb” appearance.7,8 Given the high risk of recurrence, surgical resection with a safety margin is usually the preferred treatment for this tumor. OKC, however, appears as a well-defined unilocular or multilocular radiolucency with minimal vestibulolingual expansion.5,7,9 It frequently involves the posterior mandible and tends to grow toward the mandibular ramus. The treatment generally involves enucleation combined with other adjuvant therapies to reduce the risk of recurrence.10,11
Because these pathologies share radiographic features, such as internal septations, yet require different treatment approaches, correct diagnosis is uncertain. The subjective interpretation required to identify distinguishing particularities is time-consuming and depends on the examiner’s experience, as there are no well-defined criteria to distinguish the lesions.12 Therefore, more accurate complementary exams, such as computed tomography and histopathological examination, are necessary. The latter is considered the gold standard, as it involves microscopic analysis of tissues and determines the type of lesion, ultimately guiding the patient to the correct treatment.13
However, due to the aggressiveness of these lesions and the high cost of complementary exams, early diagnosis via imaging exams (panoramic radiography and computed tomography) can facilitate and expedite the management of patients with these pathologies. In this context, Artificial Intelligence (AI)—which encompasses algorithms and computational models that simulate human learning and decision-making processes—has been increasingly explored due to its high diagnostic accuracy, often surpassing human performance.14 A prominent subdivision of AI in medical imaging is Convolutional Neural Networks (CNNs), algorithms inspired by the visual cortex of the human brain, with architecture designed to detect complex features in visual data.15,16
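The feature-detection step that CNNs perform on visual data can be illustrated with a minimal 2-D convolution in plain Python. The image and kernel below are illustrative toy values, not data from any of the reviewed studies; deep-learning libraries implement the same operation (without kernel flipping, i.e., cross-correlation) at scale over many learned kernels.

```python
def conv2d(image, kernel):
    """Valid-mode 2-D convolution (cross-correlation, as used by CNN libraries)
    of a grayscale image, given as a list of rows, with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A vertical-edge kernel responds strongly where intensity jumps left-to-right,
# e.g., at a lesion boundary in a radiograph (toy values)
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
kernel = [[-1, 1],
          [-1, 1]]
conv2d(image, kernel)  # -> [[0, 18, 0], [0, 18, 0]]
```

A trained CNN stacks many such convolutions, learning the kernel values themselves from labeled examples rather than hand-specifying them.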
Considering the similar imaging characteristics of AM and OKC and the importance of accurate diagnosis to direct appropriate therapy, objective tools can significantly aid the diagnostic process. CNNs have emerged as the state-of-the-art approach in image-based lesion classification, capable of learning hierarchical visual patterns and minimizing subjective interpretation. Given the diagnostic challenge posed by the radiographic similarity between AM and OKC, this study focuses specifically on the application of CNNs in distinguishing these lesions through imaging analysis.
Thus, the objective of this study is to describe the accuracy of AI application—specifically through CNNs—in the differential diagnosis of AM and OKC.
2 MATERIAL AND METHODS
This study followed the integrative literature review methodology, which allows the inclusion of studies with varied methodologies to generate a comprehensive understanding of a phenomenon.17 In this context, the aim was to synthesize existing evidence regarding the diagnostic accuracy of convolutional neural networks in distinguishing ameloblastoma and odontogenic keratocyst through imaging analyses. Accordingly, the systematic plan for conducting the integrative review was structured into three stages.
In the first stage, a bibliographic survey was conducted using the PubMed/Medline and Scopus databases. Cross-referencing with Boolean operators (“AND/OR”) was performed using the following English descriptors: (1) “Ameloblastoma”; (2) “Odontogenic Keratocyst”; (3) “Odontogenic Cyst”; (4) “Convolutional Neural Network”; (5) “Artificial Intelligence”; (6) “Machine Learning”; and (7) “Deep Learning”. No restrictions regarding publication status or journal indexing were applied. The full logical structure was: (“Ameloblastoma” OR “Odontogenic Keratocyst” OR “Odontogenic Cyst”) AND (“Convolutional Neural Network” OR “Artificial Intelligence” OR “Machine Learning” OR “Deep Learning”).
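The full logical structure above can be assembled programmatically, which is convenient when re-running the identical query across databases; this sketch uses only the descriptors listed in the text.

```python
# Descriptors from the search strategy, grouped by concept
lesion_terms = ["Ameloblastoma", "Odontogenic Keratocyst", "Odontogenic Cyst"]
ai_terms = ["Convolutional Neural Network", "Artificial Intelligence",
            "Machine Learning", "Deep Learning"]

# Combine each group with OR, then join the groups with AND
query = "({}) AND ({})".format(
    " OR ".join('"{}"'.format(t) for t in lesion_terms),
    " OR ".join('"{}"'.format(t) for t in ai_terms),
)
print(query)
```

The resulting string matches the Boolean structure stated above and can be pasted directly into the PubMed/Medline or Scopus search field.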
In the second stage, the inclusion criteria were defined. They comprised studies published in English between 2018 and 2023, involving human subjects, using diagnostic imaging (panoramic radiographs or tomographic exams), and addressing the identification or classification of ameloblastoma and odontogenic keratocyst. Studies had to apply CNNs or similar AI-based methods and provide histopathological confirmation. Literature reviews were also considered if they focused on diagnostic performance.
In the third stage, title and abstract screening was carried out based on the inclusion criteria. Duplicates were removed, and eligible studies were read in full. A data extraction table was developed to synthesize the following information: study identification, methodological design, sample characteristics, lesion types, imaging modality, CNN architecture, and diagnostic performance metrics (accuracy, sensitivity, specificity, F1-score, AUC, etc.).
3 RESULTS
3.1 Evaluation of the accuracy of CNN usage
This integrative review selected 11 studies published between 2018 and 2023, including 9 original research articles and 2 literature reviews (Table 1). The primary aim was to evaluate the diagnostic performance of CNNs in distinguishing AMs and OKCs, two radiographically similar yet clinically distinct lesions. Although some studies focused exclusively on these two entities, others included AM and OKC among a broader set of maxillofacial pathologies.
Studies by Bispo et al.1 and Chai et al.14 specifically addressed the classification of AM and OKC using computed tomography (CT) images, while Liu et al.18 applied panoramic radiographs for the same task. Other investigations, such as those by Ariji et al.19, Kwon et al.20, Yang et al.21, and Lee et al.22, included AM and OKC in multiclass classification settings alongside other radiolucent jaw lesions like dentigerous cysts, radicular cysts, Stafne bone cavities, and periapical cysts. Rašić et al.23 and Watanabe et al.24 evaluated general radiolucent or cyst-like lesions, with AM and/or OKC present in their datasets but not analyzed separately. Additionally, the literature reviews by Hung et al.25 and Mureșanu et al.26 provided broader insights into CNN applications across dental imaging modalities, covering both AM and OKC.
Although several studies included AM and OKC in their classification tasks, most reported only global performance metrics—such as overall accuracy, sensitivity, and F1-score—without specifying results per lesion type. As a result, it was not always possible to extract distinct performance values for AM or OKC individually. Nonetheless, these studies were retained in this review because AM and/or OKC were explicitly part of the classification targets, allowing for a broader yet relevant assessment of CNN performance in differential diagnosis. In contrast, a subset of studies—such as those by Bispo et al.1, Chai et al.14, and Liu et al.18—did provide detailed, lesion-specific metrics, which were used to support more targeted comparisons. Although Bispo et al.1 did not directly report all standard metrics, the confusion matrix included in their results allowed us to derive key values such as sensitivity, specificity, precision, and F1-score for both AM and OKC.
Table 1. Summary of information from articles that evaluated the accuracy of CNN use in the differential diagnosis of AM and OKC.
| Author/Year of Publication | Design | Objectives | Sample Evaluated | Main Results | Conclusion |
| Bispo, M. S., et al.20211 | Retrospective diagnostic accuracy study – CNN classifier | To evaluate the performance of automatic classification by Google Inception v3 CNN, in CT images of AM and OKC. | CT images of 22 AM and 18 OKC cases (n = 350, augmented to n = 2500). | 90% accuracy in differentiation tests, with 83% for AM and 91.2% for OKC. Based on the confusion matrix provided by the authors, sensitivity, specificity, precision, and F1-score averaged 83%, 91.2%, 90.4%, and 86.5% for AM, and 91.2%, 83%, 84.3%, and 87.6% for OKC, respectively. | The study showed high accuracy in results, with a higher error rate in identifying AM images. It is suggested to include other lesions for analysis in future studies. Based on the confusion matrix, the model also demonstrated balanced precision and recall for both classes, confirming its robustness across multiple training iterations. |
| Ariji, Y., et al.201919 | Retrospective diagnostic accuracy study – CNN object detection | To propose and evaluate the DetectNet CNN for automatic classification of mandibular radiolucent lesions. | 210 images with histopathological diagnosis, with 31 AM, 33 OKC, 66 DC, 68 RC, and 12 COS. | 88% accuracy in detection tests. Detection and classification sensitivity was 71% and 60% for AM, 100% and 13% for OKC, 88% and 82% for DC, and 81% and 77% for RC, respectively. | The DIGITS and DetectNet CNNs showed high capacity for detection and differentiation of radiolucent lesions in panoramic radiographs, demonstrating potential for clinical diagnosis. |
| Kwon, O., et al. 202020 | Retrospective diagnostic accuracy study – CNN classifier | To evaluate the performance of YOLOv3 CNN to automatically diagnose jaw cysts and tumors. | 1282 images with histopathological diagnosis (350 DC, 302 CP, 300 OKC, 230 AM, and 100 without lesions) in the first training set, and another set resulting from data augmentation on these same images in the second training set. | Classification performance without data augmentation was 78.2% sensitivity, 93.9% specificity, 91.3% accuracy, 86% AUC, compared to classification performance with data augmentation resulting in 88.9% sensitivity, 97.2% specificity, 95.6% accuracy, and 94% AUC. | The performance of CNN in lesion classification was positive, showing even better results after applying data augmentation in images, supporting its use in clinical practice. |
| Yang, H., et al. 202021 | Retrospective diagnostic accuracy study – CNN classifier | To evaluate the diagnostic performance of the YOLO v2 CNN in the classification of jaw lesions. | 1602 images with histopathological diagnosis of OKC, AM, DC, and no lesion. Data augmentation was applied, resulting in 16,224 images. | YOLO v2 achieved a higher real-time detection rate (accuracy: 70%; recall: 68%), while human evaluation took an average of 33.8 minutes. Quantifying CNN performance by F1-score, only one WHO surgeon managed to surpass it. False positives were comparable to or fewer than those of human evaluation. | The study emphasized the usefulness of CNNs in lesion detection in imaging exams as an auxiliary tool for preliminary diagnostics, double-checking, and support for inexperienced professionals. |
| Lee, A., et al. 202122 | Retrospective diagnostic accuracy study – CNN classifier | To analyse the classification performance between Stafne Bone Cavity (SBC) and other lesions (AM and OKC included) using DenseNet CNN through panoramic radiographs. | 458 images with histopathological diagnosis (176 SBC, 98 DC, 91 OKC, 93 AM). | It achieved 99.25% accuracy, 98.08% sensitivity, and 100% specificity, with only one SBC case misclassified. | The CNN showed high performance, being a useful tool for health professionals providing support in determining diagnoses. |
| Chai, Z., et al.202214 | Retrospective diagnostic accuracy study – Customized CNN Classifier | To propose a CNN to make preoperative differential diagnoses between AM and OKC on CBCT. | 350 images with histopathological diagnosis (178 AM and 172 OKC). | It achieved 87.2% sensitivity, 82.1% specificity, 84.6% accuracy, and F1 of 85%. | The CNN achieved satisfactory accuracy in results and in a short differentiation time between AM and OKC when compared to human analysis. |
| Liu, Z. et al. 202118 | Retrospective diagnostic accuracy study – ML Classifiers. | To compare the performance of three CNNs — a proposed model, VGG-19, and ResNet-50 — in the classification of AM and OKC. | 420 mandibular radiographs (209 AM and 211 OKC). | The proposed CNN achieved an accuracy of 90.36%, sensitivity of 92.88%, and specificity of 87.80%. The VGG-19 and ResNet-50 networks achieved accuracies of 80.72% and 78.31%, respectively. | The proposed CNN showed better results when compared to VGG-19 and ResNet-50 CNNs. The study indicates that using CNN can offer reliable support to specialists for diagnostics. |
| Hung, K.F. et al. 202325 | Narrative review | To provide an updated overview of deep learning and radiomics on CT and CBCT for maxillofacial diagnosis and management. | 39 studies across multiple disease types, including AM and OKC. | The best-performing AI models achieved high diagnostic performance, with accuracy, sensitivity, and specificity often exceeding 90%. Notably, some models attained sensitivity values above 95%, and AUCs approaching or exceeding 98% for specific tasks, with several achieving diagnostic accuracy equal to or surpassing that of human experts. | AI models using CT and CBCT hold strong promise for enhancing diagnostic workflows and clinical decision-making in oral and maxillofacial radiology. Nonetheless, current barriers—such as lack of reproducibility, low explainability, and methodological inconsistencies— need to be addressed before routine clinical adoption is feasible. |
| Mureșanu, S. et al. 202326 | Systematic review | To review current applications of AI (deep learning and radiomics) using CT and CBCT for maxillofacial diseases. | 59 studies, published between 2012 and 2022, involving CT/CBCT images of maxillofacial diseases. 42 of these studies employed deep learning models, while 17 employed conventional machine learning techniques. | Many models demonstrated high diagnostic performance, with accuracies reaching up to 97%, sensitivities up to 96%, and specificities up to 98%. A particularly relevant study by Fukuda et al. combined CBCT images with clinical data and achieved an area under the curve (AUC) of 92.6% for detecting lesions. | AI models, especially CNNs applied to CBCT, show strong clinical promise with performance comparable to expert clinicians. However, their routine use is limited by the lack of large annotated datasets, external validation, and model interpretability. Combining imaging with clinical data may enhance reliability, but broader adoption requires multicenter studies focused on diverse cases, including rare lesions like ameloblastomas and OKCs. |
| Rašić et al., 202323 | Retrospective diagnostic accuracy study – CNN object detection | To develop and evaluate a YOLOv8‑based CNN for the automated detection and segmentation of radiolucent lesions in the lower jaw using panoramic radiographs | 226 lesion instances collected from 200 panoramic images, each corresponding to an individual patient, gathered between 2013 and 2023, including radicular cysts, ameloblastomas (AM), odontogenic keratocysts (OKC), dentigerous cysts, and residual cysts. The set was augmented and five-fold cross-validated. | In the detection task, the model achieved precision up to 95.2%, recall up to 94.4%, mAP@50 of 97.5%, and mAP@50‑95 of 68.7% after augmentation. In segmentation, performance improved with augmentation, reaching precision of 100%, recall of 94.5%, mAP@50 of 96.6%, and mAP@50‑95 of 72.2% | CNN demonstrated robust capability to detect and segment radiolucent lower‑jaw lesions, including AM and OKC, indicating strong potential for clinical diagnostic assistance. |
| Watanabe et al., 202024 | Retrospective diagnostic accuracy study – CNN classifier | To evaluate the performance of a DetectNet-based CNN for the automatic detection of maxillary cyst-like lesions on panoramic radiographs. | 412 patients, each contributing one panoramic image (412 images in total). The dataset comprised various cystic lesions, including odontogenic keratocyst (OKC), but did not include ameloblastoma (AM). | The model achieved recall values of 74.6% and 77.1%, precision of 89.8% and 90.0%, and F1-scores of 81.5% and 83.1% across two test sets. | CNN-based object detection showed good performance in detecting cystic lesions, particularly radicular cysts, and highlighted its potential as a diagnostic aid in clinical practice. |
The experimental studies evaluated included a range of intraosseous pathologies, such as odontogenic keratocyst, ameloblastoma, dentigerous cyst, simple bone cyst, radicular cyst, Stafne bone cavity, nasopalatine duct cysts, and periapical cyst.1,14,18–24 All lesions were histopathologically confirmed for diagnostic validation.
These studies examined the performance of convolutional neural networks (CNNs) in classifying and distinguishing such lesions using metrics including sensitivity, specificity, F1-score, and accuracy. Only two studies21,25 compared CNN diagnostic ability with that of human experts, and in one of them only a single specialist surpassed the CNN’s F1-score.21
The evaluated CNN architectures included Inception V31,14, DetectNet19, DenseNet22, VGG-1918, ResNet-5018, and various versions of YOLO (YOLOv221, YOLOv320, YOLOv823), along with an unidentified modified model. Inception V3 and the YOLO family were the most frequently employed, each appearing in at least two independent studies.
Panoramic radiograph (PR)18–24 and computed tomography (CT)1,14 images were used for CNN training. Only two studies included non-lesion panoramic images in their test datasets.20,21 Although panoramic radiographs are more accessible, some studies noted that they present limitations in evaluating maxillary lesions due to anatomical structure overlap and image distortion, which may reduce diagnostic reliability.14,20,25 To define lesion boundaries, all images were manually segmented by specialists, establishing the region of interest (ROI) prior to training.
Following segmentation, image datasets were divided into training and testing sets for supervised CNN learning. Some studies additionally applied data augmentation to enhance the training set and improve model performance.1,20,21 One of these studies used data augmentation specifically to compare performance between original and expanded datasets, reporting significant accuracy gains when augmented data were used.20
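None of the reviewed studies publish their augmentation code; a minimal sketch of the kind of geometric augmentation they describe, applied to a grayscale image stored as a list of rows, might look like the following (real pipelines typically also vary brightness, zoom, and small rotation angles).

```python
def augment(image):
    """Return simple lossless geometric variants of a 2-D grayscale image
    (a list of rows): the original, two mirror flips, and a 180-degree rotation."""
    h_flip = [row[::-1] for row in image]        # mirror left-right
    v_flip = image[::-1]                         # mirror top-bottom
    rot180 = [row[::-1] for row in image[::-1]]  # 180-degree rotation
    return [image, h_flip, v_flip, rot180]

# A 2x2 toy "image": one annotated original yields four training samples
variants = augment([[1, 2],
                    [3, 4]])
```

Note that augmentation is applied to the training set only; the test set is left untouched so that evaluation is not inflated by near-duplicate images.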
Performance metrics reported in the selected studies showed considerable variability. Accuracy values ranged from 70% to 99.25%, sensitivity from 71% to 98.08%, and specificity from 82.1% to 100%. When available, precision, F1-scores, and negative predictive values also reflected high classification performance. Studies using CT tended to report higher numerical results than those using PRs, which may reflect differences in image resolution and anatomical detail.
Finally, the literature reviews included in this study emphasize the growing potential of AI in dentistry.25,26 CNNs have demonstrated applicability across various dental specialties, supporting diagnosis and surgical planning in endodontics, orthodontics, implantology, temporomandibular joint disorders, and the evaluation of cysts and tumours in the head and neck region. Such tools can expedite diagnostic workflows and contribute to more accurate and timely treatment decisions.
3.2 Differential diagnosis of AM and OKC
All of the analyzed studies included ameloblastoma (AM) and/or odontogenic keratocyst (OKC), either as specific diagnostic targets or as part of a broader set of intraosseous lesions. However, only three studies1,14,18 were explicitly designed to differentiate AM from OKC. These focused exclusively on the two pathologies and reported performance metrics specific to each, allowing a more precise analysis.
Bispo et al.1 reported 90% accuracy in their classification task using CT images, noting a higher error rate for AM, which may be attributed to its variable appearance and anatomical overlap with other lesions. Although not all standard metrics were provided, the confusion matrix allowed us to calculate balanced performance values, including 83% sensitivity for AM and 91.2% for OKC. This lower sensitivity further supports the observed difficulty in correctly identifying AM cases.
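Deriving these metrics from a confusion matrix is mechanical; the sketch below computes sensitivity, specificity, precision, and F1 from binary confusion-matrix counts. The counts in the usage line are hypothetical round numbers for illustration, not Bispo et al.’s actual figures.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, precision, and F1-score from the four
    counts of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)  # recall: true positives among actual positives
    specificity = tn / (tn + fp)  # true negatives among actual negatives
    precision = tp / (tp + fp)    # true positives among predicted positives
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, precision, f1

# Hypothetical counts for an AM-vs-OKC test set, with AM as the positive class
sens, spec, prec, f1 = binary_metrics(tp=9, fn=1, fp=1, tn=9)
```

With the classes swapped (OKC as positive), tp and tn exchange roles, which is why a two-class confusion matrix yields a full metric set for each lesion.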
Chai et al.14 also employed CT imaging to evaluate CNN performance in distinguishing AM from OKC. They found that diagnostic accuracy decreased when the lesion’s ROI was located in the maxilla. This was attributed to the lower incidence of these lesions in that region, resulting in insufficient training samples and higher morphological overlap with other entities. While the study did not include a confusion matrix, it reported sensitivity and specificity of 80.6% and 85.3%, respectively, for the model’s differential classification, reinforcing its clinical relevance despite contextual challenges.
Liu et al.18 used panoramic radiographs to evaluate the classification performance of three CNN architectures in distinguishing AM and OKC. Their proposed model achieved 90.36% accuracy, outperforming VGG-19 (80.72%) and ResNet-50 (78.31%). Although no confusion matrix was provided, the overall performance suggests that CNNs can reach consistent diagnostic accuracy even when applied to lower-resolution imaging modalities such as panoramic radiographs. However, the limited granularity of the reported data constrains further clinical interpretation, especially because panoramic radiographs are more susceptible to overlapping anatomical structures and subtle lesion variations, factors particularly relevant to the differentiation of AM and OKC.
Other studies, such as those by Ariji et al.19 and Kwon et al.20, included AM and OKC among several other maxillary lesions but did not report performance metrics specific to each lesion. Instead, these studies presented general diagnostic performance across all lesion classes, limiting the ability to isolate CNN effectiveness for AM and OKC specifically. Despite this, the inclusion of AM and OKC in multiclass settings provides insight into how CNNs perform in more realistic diagnostic environments.
Altogether, this highlights the need for more granular performance reporting and methodologically rigorous designs when evaluating CNN-based differential diagnosis—especially for radiographically overlapping lesions such as AM and OKC.
4 DISCUSSION
4.1 CNN usage limitations
Although CNNs have shown promising performance in the differential diagnosis of odontogenic lesions, several limitations affect their applicability and reliability in clinical settings. The primary challenges identified across the reviewed studies include the limited size and diversity of imaging datasets1,14,25, particularly for rare lesions such as AM and OKC. This scarcity restricts generalization capacity and increases the risk of overfitting27, especially when models are trained on imbalanced or homogeneous data.
Some studies used data augmentation techniques to mitigate this issue1,20,21,23, improving classification accuracy by synthetically expanding the dataset. However, augmentation does not fully compensate for the lack of real, high-quality images, particularly from underrepresented anatomical regions like the maxilla.
Another critical limitation concerns the CNNs’ exclusive reliance on imaging data14,25. While effective at analyzing radiographic and tomographic images, CNNs cannot interpret histopathological or cellular-level features—essential elements for a definitive diagnosis. As a result, CNNs function best as complementary tools rather than replacements for conventional diagnostic methods.
Furthermore, the evaluated models were trained in supervised learning settings, where annotated data is required. This process is time-consuming and depends on expert labeling, which may introduce subjectivity or inconsistencies. The variability in training protocols and CNN architectures also contributes to performance fluctuations across studies.
4.2 Methodological issues
The integration of CNNs into the differential diagnosis of AMs and OKCs represents a compelling intersection between medical imaging and artificial intelligence. However, this integrative review reveals that the potential of these technologies is still hampered by methodological inconsistencies, theoretical gaps, and a lack of clinical alignment.
While reported accuracy rates reached as high as 99.25%22, these values must be interpreted cautiously. The variation across studies—stemming from different CNN architectures, imaging modalities, and sample sizes—is not inherently problematic, but the reliance on accuracy as a standalone metric is. Especially in class-imbalanced scenarios like AM and OKC, such dependence may mask diagnostic fragilities. Theoretical literature on diagnostic AI consistently emphasizes the importance of sensitivity, specificity, and F1-score in evaluating model robustness28, yet not all of the analyzed studies adhered to this standard. This suggests a disconnect between empirical enthusiasm and methodological rigor.
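The hazard of accuracy as a standalone metric can be made concrete with hypothetical counts: a degenerate classifier that always predicts the majority class scores high overall accuracy while missing every rare-class case.

```python
def accuracy(y_true, y_pred):
    """Fraction of correct predictions overall."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def class_recall(y_true, y_pred, cls):
    """Recall (sensitivity) for a single class."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t == cls]
    return sum(t == p for t, p in pairs) / len(pairs)

# Hypothetical imbalanced test set: 90 OKC cases versus 10 AM cases
y_true = ["OKC"] * 90 + ["AM"] * 10
y_pred = ["OKC"] * 100  # a model that only ever predicts OKC

acc = accuracy(y_true, y_pred)                   # 0.9, looks strong in isolation
am_recall = class_recall(y_true, y_pred, "AM")   # 0.0, every AM case is missed
```

Per-class recall (equivalently, sensitivity) and F1-score expose exactly the failure that the 90% headline figure conceals, which is why stratified reporting matters for lesions as clinically consequential as AM.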
The underreporting of lesion-specific metrics was another critical limitation. Although AM and OKC were often included in multiclass classification tasks, few studies provided stratified performance data. This limits interpretability and reflects a broader issue in medical AI: the tendency to optimize for overall performance at the expense of rare but clinically significant cases. In contrast, Bispo et al.1 offered a more granular view, allowing us to infer lower sensitivity for AM—likely due to its histological variability and radiographic overlap with other lesions. This not only illustrates the diagnostic challenge posed by AM but also exemplifies how detailed error analysis can yield clinically meaningful insights.
The choice of imaging modality also shaped outcomes significantly. Studies using CT outperformed those using panoramic radiographs, aligning with broader radiological principles about image resolution and anatomical clarity. Yet, most models remained constrained by their exclusive focus on imaging data, disregarding other diagnostic layers such as histopathology, patient history, or clinical signs. This reveals a theoretical limitation: CNNs excel at pattern recognition but lack contextual reasoning—a feature crucial to medical diagnosis.
4.3 Toward clinical integration
Addressing the limitations discussed in the previous sections—ranging from restricted dataset diversity to underreporting of key performance metrics—requires a shift toward more robust and clinically aligned methodologies. Future research should prioritize the development of larger, well-annotated, and heterogeneous datasets, ideally through multicenter collaborations. It is equally essential to explore integrative approaches that combine imaging data with clinical and histopathological information, thereby improving diagnostic precision and model generalizability. Only by confronting these technical and methodological constraints can CNN-based tools advance from experimental prototypes to reliable assets in clinical dentistry.
Successful clinical translation will also depend on the adoption of a more comprehensive framework. This includes transparent reporting of all key metrics, rigorous external validation, and stratified analysis—especially for underrepresented lesion types. Beyond methodological rigor, CNNs must be integrated with complementary data sources to reflect the multifactorial nature of medical decision-making. Ethical and operational considerations such as data privacy, algorithm accountability, and professional trust must likewise be addressed to support sustainable implementation.
Rather than viewing CNNs as diagnostic replacements, a more productive perspective is to see them as augmentative tools—capable of highlighting patterns and reducing cognitive load, yet ultimately dependent on human oversight. The reviewed literature supports this potential but also warns against overconfidence rooted in uncritical metrics and limited datasets. In this sense, the differential diagnosis of AM and OKC becomes a microcosm for broader debates in AI-driven healthcare: not just about what machines can recognize, but about how they should be deployed, interpreted, and held accountable.
5 CONCLUSIONS
The use of convolutional neural networks (CNNs) in the differential diagnosis of ameloblastoma (AM) and odontogenic keratocyst (OKC) demonstrates notable promise, particularly when high-resolution imaging modalities and optimized architectures are employed. Yet, the findings of this review reveal that technical success alone is not sufficient for clinical integration. High accuracy rates, while impressive, must be interpreted within a broader diagnostic framework—one that acknowledges the risks of class imbalance, underreported metrics, and limited generalizability.
CNNs have shown the capacity to match or surpass human experts in specific tasks, but their reliance on imaging data alone constrains their diagnostic depth. As AM and OKC may exhibit overlapping radiographic features, particularly in anatomically complex regions like the maxilla, sensitivity and lesion-specific performance become critical indicators of model reliability. Unfortunately, most studies did not provide such granularity, limiting the interpretability of results and reinforcing the need for more transparent and disaggregated reporting.
Beyond technical refinements, the clinical adoption of CNNs depends on integrating them into the complex, multimodal landscape of medical decision-making. Future studies should therefore pursue not only improved algorithmic performance but also the fusion of imaging data with histopathological, molecular, and clinical information. Ethical considerations—such as transparency, reproducibility, and patient safety—must also be central in this evolution.
In sum, CNNs should not be viewed as replacements for clinical reasoning, but as complementary instruments capable of enhancing diagnostic workflows. Their effective integration into practice will require interdisciplinary collaboration, rigorous validation, and sustained theoretical engagement with the limitations and affordances of AI in healthcare. When these conditions are met, CNNs may play a transformative role in the early and accurate diagnosis of odontogenic lesions.
Ethics Approval
Not applicable.
Author contribution
All authors have made substantial contributions to the conception, design and review of the final version of the work. The authors agree to be accountable for all aspects of the work and ensure that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Data availability statement
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Funding
Not applicable.
REFERENCES
1. Bispo MS, Pierre Júnior MLG de Q, Apolinário Jr AL, et al. Computer tomographic differential diagnosis of ameloblastoma and odontogenic keratocyst: classification using a convolutional neural network. Dentomaxillofacial Radiology. 2021;50(7):20210002. doi:10.1259/dmfr.20210002
2. Ledesma‐Montes C, Mosqueda‐Taylor A, Carlos‐Bregni R, et al. Ameloblastomas: a regional Latin‐American multicentric study. Oral Dis. 2007;13(3):303-307. doi:10.1111/j.1601-0825.2006.01284.x
3. Siriwardena BSMS, Crane H, O’Neill N, et al. Odontogenic tumors and lesions treated in a single specialist oral and maxillofacial pathology unit in the United Kingdom in 1992–2016. Oral Surg Oral Med Oral Pathol Oral Radiol. 2019;127(2):151-166. doi:10.1016/j.oooo.2018.09.011
4. Johnson NR, Gannon OM, Savage NW, Batstone MD. Frequency of odontogenic cysts and tumors: a systematic review. J Investig Clin Dent. 2014;5(1):9-14. doi:10.1111/jicd.12044
5. Alves DBM, Tuji FM, Alves FA, et al. Evaluation of mandibular odontogenic keratocyst and ameloblastoma by panoramic radiograph and computed tomography. Dentomaxillofacial Radiology. 2018;47(7):20170288. doi:10.1259/dmfr.20170288
6. Kitisubkanchana J, Reduwan NH, Poomsawat S, Pornprasertsuk-Damrongsri S, Wongchuensoontorn C. Odontogenic keratocyst and ameloblastoma: radiographic evaluation. Oral Radiol. 2021;37(1):55-65. doi:10.1007/s11282-020-00425-2
7. WHO Classification of Tumours Editorial Board. Head and Neck Tumours. Lyon (France): International Agency for Research on Cancer; 2022. (WHO Classification of Tumours Series, 5th Ed.; Vol. 9) https://publications.iarc.fr/.
8. Scarfe WC, Toghyani S, Azevedo B. Imaging of Benign Odontogenic Lesions. Radiol Clin North Am. 2018;56(1):45-62. doi:10.1016/j.rcl.2017.08.004
9. Ringer E, Kolokythas A. Bone Margin Analysis for Benign Odontogenic Tumors. Oral Maxillofac Surg Clin North Am. 2017;29(3):293-300. doi:10.1016/j.coms.2017.03.006
10. Rajendra Santosh AB. Odontogenic Cysts. Dent Clin North Am. 2020;64(1):105-119. doi:10.1016/j.cden.2019.08.002
11. Chrcanovic BR, Gomez RS. Recurrence probability for keratocystic odontogenic tumors: An analysis of 6427 cases. Journal of Cranio-Maxillofacial Surgery. 2017;45(2):244-251. doi:10.1016/j.jcms.2016.11.010
12. Crusoé-Rebello I, Oliveira C, Campos PSF, Azevedo RA, dos Santos JN. Assessment of computerized tomography density patterns of ameloblastomas and keratocystic odontogenic tumors. Oral Surgery, Oral Medicine, Oral Pathology, Oral Radiology, and Endodontology. 2009;108(4):604-608. doi:10.1016/j.tripleo.2009.03.008
13. Neville B, Damm D, Allen C, Chi A. Oral and Maxillofacial Pathology. 5th ed. Elsevier; 2023.
14. Chai ZK, Mao L, Chen H, et al. Improved Diagnostic Accuracy of Ameloblastoma and Odontogenic Keratocyst on Cone-Beam CT by Artificial Intelligence. Front Oncol. 2022;11. doi:10.3389/fonc.2021.793417
15. Eickenberg M, Gramfort A, Varoquaux G, Thirion B. Seeing it all: Convolutional network layers map the function of the human visual system. Neuroimage. 2017;152:184-194. doi:10.1016/j.neuroimage.2016.10.001
16. Mupparapu M, Wu CW, Chen YC. Artificial intelligence, machine learning, neural networks, and deep learning: Futuristic concepts for new dental diagnosis. Quintessence Int. 2018;49(9):687-688. doi:10.3290/j.qi.a41107
17. Whittemore R, Knafl K. The integrative review: updated methodology. J Adv Nurs. 2005;52(5):546-553. doi:10.1111/j.1365-2648.2005.03621.x
18. Liu Z, Liu J, Zhou Z, et al. Differential diagnosis of ameloblastoma and odontogenic keratocyst by machine learning of panoramic radiographs. Int J Comput Assist Radiol Surg. 2021;16(3):415-422. doi:10.1007/s11548-021-02309-0
19. Ariji Y, Yanashita Y, Kutsuna S, et al. Automatic detection and classification of radiolucent lesions in the mandible on panoramic radiographs using a deep learning object detection technique. Oral Surg Oral Med Oral Pathol Oral Radiol. 2019;128(4):424-430. doi:10.1016/j.oooo.2019.05.014
20. Kwon O, Yong TH, Kang SR, et al. Automatic diagnosis for cysts and tumors of both jaws on panoramic radiographs using a deep convolution neural network. Dentomaxillofacial Radiology. 2020;49(8):20200185. doi:10.1259/dmfr.20200185
21. Yang H, Jo E, Kim HJ, et al. Deep Learning for Automated Detection of Cyst and Tumors of the Jaw in Panoramic Radiographs. J Clin Med. 2020;9(6):1839. doi:10.3390/jcm9061839
22. Lee A, Kim MS, Han SS, Park P, Lee C, Yun JP. Deep learning neural networks to differentiate Stafne’s bone cavity from pathological radiolucent lesions of the mandible in heterogeneous panoramic radiography. Xie H, ed. PLoS One. 2021;16(7):e0254997. doi:10.1371/journal.pone.0254997
23. Rašić M, Tropčić M, Karlović P, Gabrić D, Subašić M, Knežević P. Detection and Segmentation of Radiolucent Lesions in the Lower Jaw on Panoramic Radiographs Using Deep Neural Networks. Published online 2023. doi:10.3390/medicina
24. Watanabe H, Ariji Y, Fukuda M, et al. Deep learning object detection of maxillary cystlike lesions on panoramic radiographs: preliminary study. Oral Radiol. 2021;37(3):487-493. doi:10.1007/s11282-020-00485-4
25. Hung KF, Ai QYH, Wong LM, Yeung AWK, Li DTS, Leung YY. Current Applications of Deep Learning and Radiomics on CT and CBCT for Maxillofacial Diseases. Diagnostics. 2022;13(1):110. doi:10.3390/diagnostics13010110
26. Mureșanu S, Almășan O, Hedeșiu M, Dioșan L, Dinu C, Jacobs R. Artificial intelligence models for clinical usage in dentistry with a focus on dentomaxillofacial CBCT: a systematic review. Oral Radiol. 2023;39(1):18-40. doi:10.1007/s11282-022-00660-9
27. Siriwardena BSMS, Crane H, O’Neill N, et al. Odontogenic tumors and lesions treated in a single specialist oral and maxillofacial pathology unit in the United Kingdom in 1992–2016. Oral Surg Oral Med Oral Pathol Oral Radiol. 2019;127(2):151-166. doi:10.1016/j.oooo.2018.09.011
28. Rainio O, Teuho J, Klén R. Evaluation metrics and statistical tests for machine learning. Sci Rep. 2024;14(1). doi:10.1038/s41598-024-56706-x
1Florence University Center, São Luís, Maranhão, Brazil.
2State University of Maranhão, São Luís, Maranhão, Brazil.
3Federal University of Maranhão, São Luís, Maranhão, Brazil
Corresponding Author:
Mayana Almeida Araújo dos Santos, DDS
Dental Surgery Program
Florence University Center
e-mail: maya.almeidaa20@gmail.com.
Conflict of Interest: None to declare
