Hybrid CNN–Vision Transformer Architecture for Accurate Liver Cancer Diagnosis from Medical Imaging

Authors

  • Satyendra Sharma, ITM (SLS) Baroda University
  • Pradeep Laxkar, ITM (SLS) Baroda University

DOI:

https://doi.org/10.5281/zenodo.20047881

Keywords:

Hybrid Deep Learning, CNN, Vision Transformer, Liver Cancer, Medical Imaging, Feature Fusion, Transfer Learning

Abstract

Detecting liver cancer early remains challenging because medical images can vary widely between patients, and differences
in scan contrast are often subtle. This study describes a hybrid model that combines a CNN with a Vision Transformer, aiming to
capture both fine, local image details and broader contextual information. In this setup, the CNN is used to focus on nearby visual
signals such as edges and textures, while the transformer analyzes the full image to learn longer-range relationships between different
regions. The method is evaluated on public datasets, including LiTS and TCGA-LIHC, with consistent preprocessing applied across
all data. The reported accuracy is 94.8%, which is higher than the results from models using only a CNN or only a transformer.
These findings indicate that leveraging both local and global features may lead to better performance in liver cancer detection.
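
The two-branch idea described in the abstract can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not specify layer sizes, the fusion operator, or the training procedure, so the concatenation-based fusion, the single attention head, the random weights, and all names (conv3x3_relu, patchify, self_attention, hybrid_forward, init_params) are hypothetical. Positional embeddings and training are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def conv3x3_relu(image, kernels):
    """Local branch: valid 3x3 cross-correlation followed by ReLU.

    image: (H, W); kernels: (C, 3, 3); returns feature maps (C, H-2, W-2).
    """
    # All 3x3 windows of the image, shape (H-2, W-2, 3, 3).
    windows = np.lib.stride_tricks.sliding_window_view(image, (3, 3))
    maps = np.einsum("hwij,cij->chw", windows, kernels)
    return np.maximum(maps, 0.0)


def patchify(image, patch):
    """Split (H, W) into non-overlapping flattened patches: (N, patch*patch)."""
    h, w = image.shape
    grid = image.reshape(h // patch, patch, w // patch, patch)
    return grid.transpose(0, 2, 1, 3).reshape(-1, patch * patch)


def self_attention(tokens, w_q, w_k, w_v):
    """Global branch: single-head scaled dot-product self-attention."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()


def init_params(channels=8, dim=16, patch=4, classes=2):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    def w(*shape):
        return rng.normal(0.0, 0.1, shape)

    return {
        "kernels": w(channels, 3, 3),
        "embed": w(patch * patch, dim),
        "w_q": w(dim, dim),
        "w_k": w(dim, dim),
        "w_v": w(dim, dim),
        "head": w(channels + dim, classes),
    }


def hybrid_forward(image, params, patch=4):
    # Local features: conv maps, average-pooled to one value per channel.
    local = conv3x3_relu(image, params["kernels"]).mean(axis=(1, 2))
    # Global features: patch tokens, embedded, attended, averaged over tokens.
    tokens = patchify(image, patch) @ params["embed"]
    attended = self_attention(tokens, params["w_q"], params["w_k"], params["w_v"])
    global_feat = attended.mean(axis=0)
    # Fusion by concatenation (an assumption), then a linear classifier head.
    fused = np.concatenate([local, global_feat])
    return softmax(fused @ params["head"])


params = init_params()
image = rng.random((32, 32))  # stand-in for one preprocessed scan slice
probs = hybrid_forward(image, params)
print(probs)  # two class probabilities from the untrained sketch
```

The convolution only ever sees a 3x3 neighborhood, while every attention row mixes information from all patches; concatenating the two pooled vectors is one simple way to give the classifier access to both kinds of feature.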

Published

2026-04-18

Section

Review Article

How to Cite

Hybrid CNN–Vision Transformer Architecture for Accurate Liver Cancer Diagnosis from Medical Imaging. (2026). Journal of Global Research in Multidisciplinary Studies (JGRMS), 2(4), 07-11. https://doi.org/10.5281/zenodo.20047881
