Significant API Calls in Android Malware Detection (Using Feature Selection Techniques and Correlation Based Feature Elimination)

Published in Proceedings of the 32nd International Conference on Software Engineering and Knowledge Engineering (SEKE 2020), 2020

Recommended citation: Galib, A. H., & Hossain, B. M. (2020, July). Significant API Calls in Android Malware Detection (Using Feature Selection Techniques and Correlation Based Feature Elimination). In Proceedings of the 32nd International Conference on Software Engineering and Knowledge Engineering (pp. 566-571).

[PDF]

Abstract

Android API calls are an important factor in differentiating malware from benign applications. Because the number of API calls keeps growing and each additional feature adds computational cost, the set of API calls used in Android malware detection should be assessed and reduced without degrading detection performance. This study proposes a feature reduction approach for identifying significant API calls in Android malware detection. It incrementally analyzes various feature selection techniques to find the minimal feature set and the most suitable technique. It also incorporates a correlation-based feature elimination strategy to reduce the number of API calls further. Experiments on two benchmark datasets show that Recursive Feature Elimination with a Random Forest classifier yields the smallest set of API calls. Evaluation results indicate that the reduced set of significant API calls (SigAPI) performs close to the full feature set in terms of accuracy, precision, recall, F1 score, AUC, and execution time. The study also compares SigAPI with existing malware detection work, most of which it outperforms in malware detection rate. Furthermore, it reports the top significant API calls for malware detection. Finally, this work suggests that a reduced feature set of significant API calls would be useful for classifying Android malware effectively.
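As a rough illustration of the correlation-based elimination idea mentioned in the abstract, the following minimal sketch drops any feature that is strongly correlated with a feature already kept. It is not the paper's implementation: the function names `pearson` and `drop_correlated`, the greedy keep-first strategy, the 0.9 threshold, and the toy data are all assumptions made for this example, and the API names are illustrative rather than the paper's reported top calls. The Recursive Feature Elimination step is not shown here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    columns: dict mapping feature name -> list of 0/1 values (one per app).
    Features are visited in dict order, so higher-ranked ones should come first.
    The threshold is an assumed value, not one taken from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: each list holds one 0/1 indicator per app.
toy = {
    "sendTextMessage": [1, 1, 0, 0, 1, 0],
    "getDeviceId":     [1, 1, 0, 0, 1, 0],  # identical to the first column
    "getSubscriberId": [0, 0, 1, 1, 0, 1],  # exact complement (r = -1)
    "openConnection":  [1, 0, 1, 0, 1, 1],  # uncorrelated with the first
}
selected = drop_correlated(toy, threshold=0.9)
print(selected)
```

In this sketch the order of the input dictionary decides which member of a correlated pair survives, so one plausible design is to sort features by the importance ranking from the selection step before calling `drop_correlated`.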

Keywords

Significant API Calls, Android Malware Detection, Feature Selection