Skip to content

Commit

Permalink
* update 2025-01-11 06:20:27
Browse files Browse the repository at this point in the history
  • Loading branch information
actions-user committed Jan 10, 2025
1 parent af86fc6 commit 45a051e
Show file tree
Hide file tree
Showing 2 changed files with 25 additions and 1 deletion.
24 changes: 24 additions & 0 deletions arXiv_db/Malware/2025.md
Original file line number Diff line number Diff line change
Expand Up @@ -102,3 +102,27 @@

</details>

<details>

<summary>2025-01-08 21:22:45 - Exploring Large Language Models for Semantic Analysis and Categorization of Android Malware</summary>

- *Brandon J Walton, Mst Eshita Khatun, James M Ghawaly, Aisha Ali-Gombe*

- `2501.04848v1` - [abs](http://arxiv.org/abs/2501.04848v1) - [pdf](http://arxiv.org/pdf/2501.04848v1)

> Malware analysis is a complex process of examining and evaluating malicious software's functionality, origin, and potential impact. This arduous process typically involves dissecting the software to understand its components, infection vector, propagation mechanism, and payload. Over the years, deep reverse engineering of malware has become increasingly tedious, mainly due to modern malicious codebases' fast evolution and sophistication. Essentially, analysts are tasked with identifying the elusive needle in the haystack within the complexities of zero-day malware, all while under tight time constraints. Thus, in this paper, we explore leveraging Large Language Models (LLMs) for semantic malware analysis to expedite the analysis of known and novel samples. Built on GPT-4o-mini model, \msp is designed to augment malware analysis for Android through a hierarchical-tiered summarization chain and strategic prompt engineering. Additionally, \msp performs malware categorization, distinguishing potential malware from benign applications, thereby saving time during the malware reverse engineering process. Despite not being fine-tuned for Android malware analysis, we demonstrate that through optimized and advanced prompt engineering \msp can achieve up to 77% classification accuracy while providing highly robust summaries at functional, class, and package levels. In addition, leveraging the backward tracing of the summaries from package to function levels allowed us to pinpoint the precise code snippets responsible for malicious behavior.

</details>

<details>

<summary>2025-01-09 17:21:00 - Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic</summary>

- *Sileshi Nibret Zeleke, Amsalu Fentie Jember, Mario Bochicchio*

- `2501.05387v1` - [abs](http://arxiv.org/abs/2501.05387v1) - [pdf](http://arxiv.org/pdf/2501.05387v1)

> Encrypted network communication ensures confidentiality, integrity, and privacy between endpoints. However, attackers are increasingly exploiting encryption to conceal malicious behavior. Detecting unknown encrypted malicious traffic without decrypting the payloads remains a significant challenge. In this study, we investigate the integration of explainable artificial intelligence (XAI) techniques to detect malicious network traffic. We employ ensemble learning models to identify malicious activity using multi-view features extracted from various aspects of encrypted communication. To effectively represent malicious communication, we compiled a robust dataset with 1,127 unique connections, more than any other available open-source dataset, and spanning 54 malware families. Our models were benchmarked against the CTU-13 dataset, achieving performance of over 99% accuracy, precision, and F1-score. Additionally, the eXtreme Gradient Boosting (XGB) model demonstrated 99.32% accuracy, 99.53% precision, and 99.43% F1-score on our custom dataset. By leveraging Shapley Additive Explanations (SHAP), we identified that the maximum packet size, mean inter-arrival time of packets, and transport layer security version used are the most critical features for the global model explanation. Furthermore, key features were identified as important for local explanations across both datasets for individual traffic samples. These insights provide a deeper understanding of the model decision-making process, enhancing the transparency and reliability of detecting malicious encrypted traffic.

</details>

Loading

0 comments on commit 45a051e

Please sign in to comment.