Smart Detection of Automated Social Media Accounts Using Feature Optimization and Ensemble Learning Techniques
DOI:
https://doi.org/10.64751/5jg4ec95
Keywords:
Interpretable AI, social network, bot detection, fake followers, spambots
Abstract
Social networking services such as Twitter facilitate extensive human interaction but are increasingly infiltrated by automated accounts that emulate human behavior, disseminate misinformation, and influence public opinion. Identifying such spambots is essential for preserving information integrity; however, many traditional methods depend on opaque, black-box models, which limits transparency and interpretability. This research uses the Cresci-15 and Cresci-17 datasets to examine interpretable machine learning methods for detecting spambots and fake followers. Both feature-based and text-based data are employed, with preprocessing steps such as normalization, tokenization, and the removal of extraneous content. Recursive Feature Elimination (RFE) reduces feature dimensionality, while resampling techniques such as SMOTE and SMOTEENN address class imbalance. Several machine learning methods are assessed: Decision Tree, Random Forest, SVM, XGBoost, AdaBoost, Stacking Classifier, and Voting Classifier. The results indicate that the Stacking Classifier performs exceptionally well, achieving 99.9% accuracy on the Cresci-15 dataset and 99.5% accuracy on the Cresci-17 dataset. Moreover, explainable AI techniques such as LIME and SHAP offer explicit insights into feature importance, thereby improving model transparency and supporting informed decision-making. These findings underscore the efficacy of combining feature selection, resampling, and ensemble learning with interpretable techniques for the reliable identification of automated accounts in social networks.
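The abstract names SMOTE as one of the resampling techniques but gives no implementation details. As an illustration only, the following is a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for this example; this is not the authors' code, it is not the imbalanced-learn implementation, and it picks base samples at random rather than cycling through them. The cleaning step that SMOTEENN adds is not covered.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Simplified sketch: base samples are drawn at random with replacement.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (base itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two already-scaled numeric account features.
humans = [[0.1 * i, 0.05 * i] for i in range(20)]          # majority class
bots = [[5.0, 5.0], [5.5, 4.8], [4.9, 5.6], [5.2, 5.1]]   # minority class

# Oversample the minority class until both classes have the same size.
new_bots = smote(bots, n_new=len(humans) - len(bots), k=3)
balanced_bots = bots + new_bots
print(len(humans), len(balanced_bots))
```

Because every synthetic point lies between two existing minority samples, the oversampled class stays inside the region the real minority samples already occupy. In a pipeline like the one described above, resampling of this kind would normally be applied to the training split only, so that evaluation data contains no synthetic accounts.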
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.







