Abstract
Cancer, characterized by the uncontrolled growth and spread of abnormal cells, remains one of the
leading causes of mortality worldwide. Effective management of cancer, including brain tumors and breast
cancer, relies on precise detection, classification, and grading to inform treatment strategies. Traditional
histopathological methods, although effective, are labor-intensive and subject to inter-pathologist variability. The advent of digital and computational pathology, using whole slide imaging (WSI) and deep
learning algorithms, has revolutionized cancer diagnosis and prognosis by enabling more accurate, consistent, and efficient analysis of histopathological images.
Computational pathology integrates advanced computational techniques with digitized histopathology slides, providing detailed quantitative assessments that significantly enhance diagnostic precision.
Deep learning has shown remarkable performance in image classification, segmentation, and detection,
making it particularly effective for analyzing complex medical images. By automating routine pathology
tasks, deep learning improves diagnostic accuracy and reduces the workload on pathologists, facilitating faster and more reliable diagnoses. This thesis aims to utilize these advancements to improve the
automated diagnosis and classification of brain gliomas and breast cancer, thus enhancing patient care
through more personalized and timely treatment interventions.
In the first part of the thesis, we focus on brain gliomas, a diverse group of tumors arising from glial
cells in the central nervous system. Accurate typing, subtyping, and grading of gliomas are critical for
determining prognosis and guiding treatment. Using a multiple-instance-learning (MIL) framework,
we employ self-supervised pre-trained feature extractors and feature aggregators to classify glioma
subtypes, determine tumor grades, and predict key immunohistochemistry (IHC) biomarkers such as
IDH, ATRX, TP53, and Ki-67 from H&E images. Our study establishes new performance benchmarks in glioma subtype classification across multiple datasets, including the in-house Indian
Pathology Brain Dataset (IPD-Brain), which provides a valuable resource for the research community. Using
a ResNet-50 pre-trained on histopathology datasets for feature extraction, combined with the Double-Tier Feature Distillation (DTFD) feature aggregator, our approach achieves state-of-the-art AUCs of
88.08 ± 3.98 on IPD-Brain and 95.81 ± 1.78 on the TCGA-Brain dataset for three-way glioma subtype
classification. This work also highlights a significant correlation between the model’s decision-making
processes and the diagnostic reasoning of pathologists, underscoring its capability to mimic professional
diagnostic procedures.
The second part of the thesis addresses breast cancer, the most common malignancy among women.
Effective treatment of breast cancer requires precise detection and classification, particularly of IHC biomarkers like HER2, ER, and PR, which are critical for identifying breast cancer subtypes and informing treatment decisions. Traditional IHC classification relies on pathologist expertise, making it
labor-intensive and prone to high inter-pathologist variability. We curate the Indian Pathology Breast
Dataset (IPD-Breast) and develop automated deep learning models for accurate IHC subtype classification, specifically focusing on an evolving three-way HER2 classification scheme for better prognosis
while also extending to binary and four-way classifications. Our proposed end-to-end ConvNeXt approach, which uses low-resolution whole slide images, outperforms existing methods. For the three-way
HER2 classification, it achieves an AUC of 91.79 ± 2.14, an F1 score of 83.52 ± 2.69, and an accuracy of 83.56 ± 2.80, outperforming existing approaches by at least 5.35% in F1 score. This study
demonstrates that simple yet effective deep learning approaches can significantly improve accuracy and
reproducibility in breast cancer classification, supporting their integration into clinical workflows for
better patient outcomes.
In summary, this thesis makes a significant contribution to the field of computational pathology by
bridging the gap between computational techniques and clinical applications. Our contributions include
extensive open-source datasets for brain and breast cancer, robust
diagnostic models, and practical guidance on effective computational approaches for
medical image analysis. By enhancing diagnostic precision and efficiency, these advancements support
personalized treatment planning and improve patient outcomes. The integration of these computational
tools into routine clinical practice holds promise for significantly advancing the field of pathology.