RViT-FusionNet: A Local Cross-Attention Feature Fusion-based Hybrid Framework for Brain Tumor Classification
Published in Neural Computing and Applications (Under Revision; Springer-Q1), 2025
This manuscript presents RViT-FusionNet, a hybrid deep learning architecture that combines convolutional neural networks (CNNs) with a refined Vision Transformer (RViT) for brain tumor classification from MRI scans. The key contribution is a local cross-attention feature fusion mechanism, designed to capture both local spatial features and global contextual dependencies across multiple MRI modalities. Extensive experiments on benchmark datasets demonstrate that the proposed framework achieves higher accuracy and robustness than standard CNN, ViT, and hybrid transformer models.
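To illustrate the general idea of local cross-attention fusion, here is a minimal NumPy sketch. It is not the manuscript's implementation. It assumes that the CNN and transformer branches each produce the same number of tokens with the same width, that queries come from the CNN tokens and keys/values from the transformer tokens, that "local" means non-overlapping windows along the token sequence, and that fusion is done by concatenation. The function name `local_cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all sizes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_cross_attention(cnn_tokens, vit_tokens, window, w_q, w_k, w_v):
    """Windowed cross-attention (illustrative assumption, not the paper's code).

    cnn_tokens, vit_tokens: arrays of shape (N, D).
    Each CNN token attends only to transformer tokens in its own window.
    Returns the fused tokens (N, D + D_out) and the attention weights.
    """
    n, d = cnn_tokens.shape
    assert vit_tokens.shape == (n, d), "branches must have matching token shapes"
    assert n % window == 0, "token count must be divisible by the window size"
    groups = n // window

    # Project, then split the token axis into (groups, window).
    q = (cnn_tokens @ w_q).reshape(groups, window, -1)
    k = (vit_tokens @ w_k).reshape(groups, window, -1)
    v = (vit_tokens @ w_v).reshape(groups, window, -1)

    # Scaled dot-product attention computed independently inside each window.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    attended = (attn @ v).reshape(n, -1)

    # Fuse by concatenating CNN features with the attended transformer features.
    fused = np.concatenate([cnn_tokens, attended], axis=-1)
    return fused, attn


rng = np.random.default_rng(0)
n_tokens, dim, window = 16, 8, 4  # hypothetical sizes
cnn_tokens = rng.standard_normal((n_tokens, dim))
vit_tokens = rng.standard_normal((n_tokens, dim))
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))

fused, attn = local_cross_attention(cnn_tokens, vit_tokens, window, w_q, w_k, w_v)
print(fused.shape, attn.shape)  # (16, 16) (4, 4, 4)
```

Restricting attention to windows keeps the cost linear in the number of tokens for a fixed window size. In this sketch, any global context has to come from the transformer branch that produced `vit_tokens`.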
*Equal contribution with first author
Recommended citation: N. Islam, S. K. Ray, M. A. Hossain et al., “RViT-FusionNet: A Local Cross-Attention Feature Fusion-based Hybrid Framework for Brain Tumor Classification,” Neural Computing and Applications.
