A Hyper-Contextualized Audio Semantics Framework for Driver Attentiveness Disambiguation and Risk-Aware Safety Preemption

Authors

  • A. Hareesha
  • Rayalla Madhukar
  • G. Manideep
  • Naga Akshaya Kumari Kota

DOI:

https://doi.org/10.64751/3n8g2y12

Keywords:

Vehicle audio analysis, acoustic signal processing, intelligent transportation systems, automotive acoustics, vehicle monitoring, sound event classification, acoustic feature extraction, engine noise analysis

Abstract

Vehicle environments generate diverse acoustic signals from engines, braking systems, road interactions, and surrounding traffic. Analysing these sounds provides valuable insight into vehicle behaviour and operational status. With the rapid growth of intelligent transportation systems and data-driven automotive technologies, automated vehicle audio analysis has emerged as a key research area. Traditional monitoring systems relied on manual inspection or basic signal processing techniques, which were limited in handling large-scale audio data and complex acoustic environments. These approaches also struggled with the high volume of data generated by modern sensor-equipped vehicles, highlighting the need for intelligent, data-driven solutions. This research presents a machine learning-based vehicle audio event classification framework that integrates advanced feature extraction and classification techniques. The system processes raw audio signals and extracts deep acoustic representations using the Waveform Language Model (WavLM), a transformer-based feature extraction model. The extracted feature vectors are used to train multiple classifiers, including the CatBoost Classifier (CBC), Histogram Gradient Boosting Classifier (HGBC), Extra Trees Classifier (ETC), and the proposed Tree-Based Generalized Additive Model (TGAM). The models are evaluated using standard performance metrics: accuracy, precision, recall, and F1-score. Experimental results show that the TGAM model significantly outperforms the other classifiers, achieving an accuracy of 99.92% for main-class classification and 99.58% for sub-class classification, demonstrating its effectiveness in recognising complex vehicle audio events. This framework enhances intelligent vehicle monitoring systems by enabling accurate and efficient acoustic signal analysis.

Published

2026-04-23

How to Cite

A Hyper-Contextualized Audio Semantics Framework for Driver Attentiveness Disambiguation and Risk-Aware Safety Preemption. (2026). International Journal of AI Electronics and Nexus Energy, 2(2), 550-561. https://doi.org/10.64751/3n8g2y12
