Published · 4th IEEE World Conference on Applied Intelligence and Computing (AIC 2025)
Shubham Gajjar, Om Rathod, Deep Joshi, Harshal Joshi, Vishal Barot
Hybrid architecture combining a frozen ResNet50 feature extractor with Vision Transformer blocks (four-head self-attention) for global dependency modeling, plus Global Average Pooling, for seven-class skin lesion classification (melanoma, nevus, basal cell carcinoma, benign keratosis, dermatofibroma, actinic keratosis, vascular lesions). Achieves 96.3% test accuracy, macro F1 0.961, and AUC ~1.00 across all seven classes on HAM10000. Applies stratified data augmentation (rotation ±20°, horizontal/vertical flips, brightness ±25 / contrast ±10%) to address class imbalance, expanding the 10,015 source images to ~74,353. Trained with the Nadam optimizer (lr 0.001) and sparse categorical crossentropy on a 70/15/15 stratified split (7,010 / 1,503 / 1,502).
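The 70/15/15 stratified split mentioned above can be illustrated with a minimal sketch using only the Python standard library. The function name `stratified_split`, the seed, and the toy label counts are assumptions for illustration; they are not the paper's code or the real HAM10000 class distribution.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split sample indices into train/val/test while keeping class ratios."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Toy, deliberately imbalanced labels (hypothetical counts).
labels = ["nevus"] * 670 + ["melanoma"] * 111 + ["dermatofibroma"] * 12
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting per class rather than over the whole pool keeps rare classes represented in all three subsets, which matters when one class dominates the dataset.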
Computer Vision · Deep Learning · Medical AI · Skin Cancer · ResNet · Vision Transformer
IEEE Xplore · GitHub · DOI: 10.1109/AIC66080.2025.11212073
Under Review · Manuscript · July 2025
Shubham Gajjar, Deep Joshi, Avi Poptani, Vishal Barot
Hybrid segmentation framework integrating a pretrained VGG16 encoder with a Multi-Channel Attention decoder for brain tumor segmentation in FLAIR MRI. Applies Focal Tversky Loss to address severe class imbalance (tumor regions cover only ~2–5% of total image area), with ensemble learning over multiple checkpoints and architectural configurations. Skip connections, batch normalization, and dropout regularize against overfitting. Preprocessing pipeline: skull stripping, intensity normalization, 256×256 resize. Trained with Adam (lr 0.05), ReduceLROnPlateau, and EarlyStopping. Achieves 99.59% accuracy and 99.71% specificity on the LGG Brain MRI Segmentation dataset (110 low-grade glioma patients, TCGA). 35 systematic experiments compare against standard UNet, Attention UNet, and Scaler Attention UNet, showing improved Dice and IoU with sharper boundary delineation.
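A small sketch shows why a Focal Tversky Loss is a reasonable fit when tumor pixels are only a few percent of the image: a model that predicts no tumor at all still scores high pixel accuracy, while the loss stays near its maximum. The values `alpha=0.7`, `beta=0.3`, and `gamma=0.75` are illustrative assumptions; the manuscript's actual settings are not stated here.

```python
def focal_tversky_loss(y_true, y_pred, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    """Focal Tversky loss over flat lists of ground-truth and predicted values in [0, 1].

    alpha weights false negatives, beta weights false positives (assumed values).
    """
    tp = sum(t * p for t, p in zip(y_true, y_pred))
    fn = sum(t * (1 - p) for t, p in zip(y_true, y_pred))
    fp = sum((1 - t) * p for t, p in zip(y_true, y_pred))
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return (1 - tversky) ** gamma


def pixel_accuracy(y_true, y_pred):
    """Fraction of pixels whose thresholded prediction matches the ground truth."""
    hits = sum(int(t == int(p >= 0.5)) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Toy mask: 3 tumor pixels out of 100 (hypothetical, about 3% foreground).
y_true = [1] * 3 + [0] * 97
all_background = [0] * 100  # a degenerate model that never predicts tumor

acc_empty = pixel_accuracy(y_true, all_background)
loss_empty = focal_tversky_loss(y_true, all_background)
loss_perfect = focal_tversky_loss(y_true, y_true)
print(acc_empty, loss_empty, loss_perfect)
```

Setting `alpha` above `beta` penalizes missed tumor pixels more than false alarms, which is the usual motivation for this loss on sparse foreground masks. The same toy case also suggests why accuracy and specificity alone say little about segmentation quality under this imbalance.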
Computer Vision · Deep Learning · Medical AI · Brain Tumor · UNet · FLAIR MRI
Under Review · Manuscript · 2025
Shubham Gajjar, Om Rathod, Deep Joshi, Harshal Joshi, Vishal Barot
Two-stage pipeline integrating a U-Net++ hair segmentation model with an Extended ResNet50 classifier featuring a novel Inverse Soft Mask Attention mechanism. Dense residual blocks and Squeeze-and-Excitation modules with learnable weighted feature aggregation combine hair-occluded and unoccluded image regions. Achieves 97.89% test, 99.67% training, and 97.74% validation accuracy at epoch 22 on HAM10000 (10,015 dermoscopic images, seven classes). Trained with Nadam + Cosine Decay Restarts and sparse categorical crossentropy. Covers 21 systematic architectural trials spanning Vision Transformers, hybrid models, and custom attention mechanisms. Outperforms SCCNet (95.20%), VCCINet (93.18%), and SPCB-Net (97.10%).
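The Cosine Decay Restarts schedule named above can be sketched in plain Python. This follows the widely used warm-restart formula (the same parameterization as Keras's `CosineDecayRestarts`); the initial learning rate, first cycle length, and multipliers below are placeholder assumptions, since the manuscript's values are not given here.

```python
import math


def cosine_decay_restarts(step, initial_lr, first_decay_steps,
                          t_mul=2.0, m_mul=1.0, alpha=0.0):
    """Learning rate at `step` under cosine decay with warm restarts.

    t_mul stretches each successive cycle, m_mul scales each restart's peak,
    and alpha is the floor expressed as a fraction of the current peak.
    """
    cycle_start = 0.0
    cycle_len = float(first_decay_steps)
    peak = initial_lr

    # Advance to the cycle that contains `step`.
    while step >= cycle_start + cycle_len:
        cycle_start += cycle_len
        cycle_len *= t_mul
        peak *= m_mul

    progress = (step - cycle_start) / cycle_len  # in [0, 1)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return peak * ((1.0 - alpha) * cosine + alpha)


# Hypothetical settings: peak 1e-3, first cycle of 100 steps.
schedule = {s: cosine_decay_restarts(s, 1e-3, 100) for s in (0, 50, 99, 100)}
for step, lr in schedule.items():
    print(step, lr)
```

The rate falls along a half cosine within each cycle and jumps back to the peak at each restart (step 100 in this toy run), which is intended to help the optimizer leave poor minima late in training.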
Computer Vision · Medical AI · Skin Cancer · Attention Mechanisms · ResNet · Dermatoscopic