Hybrid Convolutional-Recurrent Neural Network (CNN-RNN) Model with Temporal Attention and Particle Swarm Optimization for Deepfake Video Detection
Keywords:
Deepfake video detection, hybrid machine learning, Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), temporal attention, Particle Swarm Optimization (PSO)
Abstract
The rapid advancement of deepfake technology poses a growing threat to information integrity and online security. To address this, this research proposes an efficient deepfake video detection framework that combines a Convolutional Neural Network (CNN) for spatial feature extraction, a Recurrent Neural Network (RNN) with a temporal attention mechanism for modeling sequential dependencies, and Particle Swarm Optimization (PSO) for hyperparameter tuning. The pipeline consists of frame extraction, face alignment, and per-frame feature extraction with a pre-trained CNN, followed by an RNN whose attention mechanism gives more weight to the frames that carry critical temporal artifacts. PSO tunes key hyperparameters such as the learning rate and hidden dimensions. To evaluate its effectiveness, the proposed model is compared against existing deepfake detection methods: XceptionNet, an LSTM on frame-level features, and a CNN-GRU without attention. The proposed CNN-RNN model with temporal attention and PSO outperforms these baselines, most notably by reducing false negatives. This indicates improved generalization and reliability and makes the model a robust option for real-world media forensics and platform integrity.
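As an illustration of two components named in the abstract, the following is a minimal sketch, not the implementation used in this work. It assumes dot-product attention scoring with a fixed vector `v` (learned in a real model), a standard global-best PSO with inertia `w` and acceleration coefficients `c1` and `c2`, and a hypothetical `toy_validation_loss` that stands in for training the detector and measuring its validation loss. All function names, parameter values, and search bounds are assumptions chosen for the example.

```python
import math
import random


def temporal_attention(frame_feats, v):
    """Pool T per-frame feature vectors into one video-level vector.

    frame_feats: list of T vectors (e.g. RNN hidden states, one per frame).
    v: scoring vector of the same length (assumed fixed here for illustration).
    """
    scores = [sum(vi * hi for vi, hi in zip(v, h)) for h in frame_feats]
    m = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # one attention weight per frame
    dim = len(frame_feats[0])
    context = [sum(a * h[d] for a, h in zip(weights, frame_feats))
               for d in range(dim)]
    return weights, context


def pso(objective, bounds, n_particles=20, n_iters=60,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO minimizing `objective` inside box `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_loss(params):
    """Hypothetical stand-in for 'train the model, return validation loss'."""
    log_lr, hidden = params  # log10(learning rate), hidden dimension
    return (log_lr + 3.0) ** 2 + ((hidden - 128.0) / 64.0) ** 2


if __name__ == "__main__":
    weights, context = temporal_attention(
        [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.5, 0.5])
    print("attention weights:", weights)
    best, loss = pso(toy_validation_loss, [(-5.0, -1.0), (32.0, 512.0)])
    print("best (log10 lr, hidden):", best, "loss:", loss)
```

In a real setting the objective passed to `pso` would train the CNN-RNN model with the candidate hyperparameters and return a validation metric, and the hidden dimension would be rounded to an integer before use. Searching the learning rate on a log scale is a common design choice, assumed here rather than taken from the paper.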