A Robust Cross-Platform Deepfake Detection Framework Using Multi-Modal Deep Learning and Explainable AI
DOI:
https://doi.org/10.5281/zenodo.20265382
Keywords:
deepfake detection, multi-modal learning, explainable AI, cross-dataset generalization, CNN-Transformer, biological signals
Abstract
Deepfake technology presents unprecedented challenges to the authenticity of digital media. This paper presents a top-down detection system that combines lightweight CNN-Transformer hybrids, multi-modal fusion of visual, audio, and biological evidence, and universal forensic clues. Our systematic review of 21 state-of-the-art methods reveals key barriers to deployment, and the proposed solutions achieve 96.8% accuracy with 88.5% cross-dataset generalization, a 25–30% improvement over existing state-of-the-art methods. The framework includes explainable AI elements that produce transparent decisions suitable for forensic applications, with an inference latency of 200 ms.
License
Copyright (c) 2026 Sweta Sharma, Pradeep Chouksey, Aastha Rathi, Parveen Sadotra, Mayank Chopra (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
