DEEP FAKE AUDIO DETECTION USING DEEP LEARNING
DOI: https://doi.org/10.64751/29zw3c60

Abstract
The rapid advancement of artificial intelligence has enabled the creation of highly realistic synthetic speech, commonly known as deep fake audio, which poses significant risks to security, privacy, and digital trust. Deep fake audio can be exploited for misinformation, identity fraud, and unauthorized voice cloning, making reliable detection methods essential. This research proposes a deep learning-based framework for detecting fake audio by analyzing subtle acoustic patterns and inconsistencies that are difficult for humans to perceive. The proposed system employs advanced neural network architectures, such as Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Transformer-based models, to learn discriminative features from audio spectrograms and raw waveform signals. Preprocessing techniques, including noise reduction, feature extraction (MFCC, Mel-spectrogram), and normalization, are applied to enhance model performance. The model is trained and evaluated on benchmark datasets containing both real and synthetic speech samples. Experimental results demonstrate that deep learning models can differentiate between genuine and manipulated audio with high accuracy, precision, and recall. The system shows strong potential for real-time applications in cybersecurity, digital forensics, and social media monitoring. This work highlights the importance of combining robust feature engineering with advanced deep learning architectures to combat the growing threat of AI-generated audio manipulation.
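The abstract names the log-Mel spectrogram as one of the extracted features. The paper itself gives no implementation details, so the following is only an illustrative sketch of how such a feature could be computed from a raw waveform using NumPy alone; frame size, hop length, and filter count are assumed values, not parameters reported by the authors.

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale mapping
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        if center > left:
            fb[i - 1, left:center] = (np.arange(left, center) - left) / (center - left)
        if right > center:
            fb[i - 1, center:right] = (right - np.arange(center, right)) / (right - center)
    return fb

def log_mel_spectrogram(wave, sr=16000, n_fft=512, hop=256, n_mels=40):
    # Frame the signal, apply a Hann window, take the power spectrum,
    # project onto the mel filterbank, then log-compress
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack([wave[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)

# Example: one second of a 440 Hz tone at 16 kHz (synthetic stand-in for speech)
sr = 16000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440.0 * t)
feats = log_mel_spectrogram(wave, sr=sr)
print(feats.shape)  # (frames, mel bands)
```

In a pipeline like the one described, the resulting 2-D feature matrix would be fed to a CNN, RNN, or Transformer classifier; MFCCs can be obtained from the same representation by applying a discrete cosine transform across the mel bands.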
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.