An Interpretable Audio Intelligence Framework for Driver Vigilance Using WavLM-Based Deep Representations

Authors

  • Kola Rishika
  • Dayakar Thalla
  • Kiran Gadapaka
  • Emmadi Swathi
  • Lellela Vishnu
  • Jannu Sushman

DOI:

https://doi.org/10.64751/tkhpt493

Keywords:

acoustic signals, audio classification, deep learning, driver alertness, intelligent transportation, waveform modelling

Abstract

The in-vehicle environment produces a wide range of acoustic signals originating from driver actions, engine dynamics, road interactions, and surrounding traffic. These audio cues carry information about driver behavior and vehicle safety. As intelligent transportation systems advance, automated, real-time monitoring of driver alertness from in-vehicle audio has become increasingly important. To meet this need, this study proposes a deep learning-based framework for in-vehicle audio event detection aimed at improving driver alertness monitoring. The system employs WavLM, a transformer-based model, to extract deep acoustic feature representations from raw audio signals. These features capture temporal and contextual dependencies, enabling robust modeling of complex sound patterns within the vehicle. The extracted feature vectors are used to train multiple classifiers: CatBoost, Histogram Gradient Boosting, Extra Trees, and a proposed Tree-Based Generalized Additive Model (TGAM). The framework supports both main-class and sub-class classification to identify driver-related events such as distractions, alerts, and environmental sounds. Evaluation using accuracy, precision, recall, and F1-score shows that the TGAM model achieves the best results among these classifiers, with 99.92% accuracy for main-class classification and 99.58% for sub-class classification. The proposed approach improves the reliability and efficiency of in-vehicle audio analysis, contributing to driver safety and intelligent monitoring systems.
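The four evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not state how per-class scores are averaged, so macro-averaging is an assumption here. The function name `macro_metrics` and the class labels and predictions are hypothetical, chosen only to echo the event types mentioned above. They are not the paper's data or code.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (an unweighted mean over classes) is an assumption;
    the abstract does not say which averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    def safe_div(num, den):
        # Score a class as 0.0 when its denominator is empty.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical main-class labels and predictions, for illustration only.
y_true = ["alert", "alert", "distraction", "distraction", "environment", "environment"]
y_pred = ["alert", "distraction", "distraction", "distraction", "environment", "environment"]

metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The same function applies unchanged to sub-class labels, since it only depends on the set of labels present in the inputs.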

Published

2026-04-09

How to Cite

An Interpretable Audio Intelligence Framework for Driver Vigilance Using WavLM-Based Deep Representations. (2026). International Journal of AI Electronics and Nexus Energy, 2(2), 102-114. https://doi.org/10.64751/tkhpt493
