* update 2024-11-06 06:20:45

yuriufo · Nov 5, 2024 · e851b60 · e851b60
1 parent ea407b1
commit e851b60
Show file tree

Hide file tree

Showing 2 changed files with 37 additions and 1 deletion.
diff --git a/arXiv_db/Malware/2024.md b/arXiv_db/Malware/2024.md
@@ -3334,3 +3334,39 @@
 
 </details>
 
+<details>
+
+<summary>2024-11-01 20:51:17 - AI-Enabled System for Efficient and Effective Cyber Incident Detection and Response in Cloud Environments</summary>
+
+- *Mohammed Ashfaaq M. Farzaan, Mohamed Chahine Ghanem, Ayman El-Hajjar, Deepthi N. Ratnayake*
+
+- `2404.05602v3` - [abs](http://arxiv.org/abs/2404.05602v3) - [pdf](http://arxiv.org/pdf/2404.05602v3)
+
+> The escalating sophistication and volume of cyber threats in cloud environments necessitate a paradigm shift in strategies. Recognising the need for an automated and precise response to cyber threats, this research explores the application of AI and ML and proposes an AI-powered cyber incident response system for cloud environments. This system, encompassing Network Traffic Classification, Web Intrusion Detection, and post-incident Malware Analysis (built as a Flask application), achieves seamless integration across platforms like Google Cloud and Microsoft Azure. The findings from this research highlight the effectiveness of the Random Forest model, achieving an accuracy of 90% for the Network Traffic Classifier and 96% for the Malware Analysis Dual Model application. Our research highlights the strengths of AI-powered cyber security. The Random Forest model excels at classifying cyber threats, offering an efficient and robust solution. Deep learning models significantly improve accuracy, and their resource demands can be managed using cloud-based TPUs and GPUs. Cloud environments themselves provide a perfect platform for hosting these AI/ML systems, while container technology ensures both efficiency and scalability. These findings demonstrate the contribution of the AI-led system in guaranteeing a robust and scalable cyber incident response solution in the cloud.
+
+</details>
+
+<details>
+
+<summary>2024-11-02 14:52:04 - PARIS: A Practical, Adaptive Trace-Fetching and Real-Time Malicious Behavior Detection System</summary>
+
+- *Jian Wang, Lingzhi Wang, Husheng Yu, Xiangmin Shen, Yan Chen*
+
+- `2411.01273v1` - [abs](http://arxiv.org/abs/2411.01273v1) - [pdf](http://arxiv.org/pdf/2411.01273v1)
+
+> The escalating sophistication of cyber-attacks and the widespread utilization of stealth tactics have led to significant security threats globally. Nevertheless, the existing static detection methods exhibit limited coverage, and traditional dynamic monitoring approaches encounter challenges in bypassing evasion techniques. Thus, it has become imperative to implement nuanced and dynamic analysis to achieve precise behavior detection in real time. There are two pressing concerns associated with current dynamic malware behavior detection solutions. Firstly, the collection and processing of data entail a significant amount of overhead, making it challenging to be employed for real-time detection on the end host. Secondly, these approaches tend to treat malware as a singular entity, thereby overlooking varied behaviors within one instance. To fill these gaps, we propose PARIS, an adaptive trace fetching, lightweight, real-time malicious behavior detection system. Specifically, we monitor malicious behavior with Event Tracing for Windows (ETW) and learn to selectively collect maliciousness-related APIs or call stacks, significantly reducing the data collection overhead. As a result, we can monitor a wider range of APIs and detect more intricate attack behavior. We implemented a prototype of PARIS and evaluated the system overhead, the accuracy of comparative behavior recognition, and the impact of different models and parameters. The result demonstrates that PARIS can reduce over 98.8% of data compared to the raw ETW trace and hence decreases the overhead on the host in terms of memory, bandwidth, and CPU usage with a similar detection accuracy to the baselines that suffer from the high overhead. Furthermore, a breakdown evaluation shows that 80% of the memory and bandwidth savings and a complete reduction in CPU usage can be attributed to our adaptive trace-fetching collector.
+
+</details>
+
+<details>
+
+<summary>2024-11-02 21:13:59 - Assemblage: Automatic Binary Dataset Construction for Machine Learning</summary>
+
+- *Chang Liu, Rebecca Saul, Yihao Sun, Edward Raff, Maya Fuchs, Townsend Southard Pantano, James Holt, Kristopher Micinski*
+
+- `2405.03991v2` - [abs](http://arxiv.org/abs/2405.03991v2) - [pdf](http://arxiv.org/pdf/2405.03991v2)
+
+> Binary code is pervasive, and binary analysis is a key task in reverse engineering, malware classification, and vulnerability discovery. Unfortunately, while there exist large corpora of malicious binaries, obtaining high-quality corpora of benign binaries for modern systems has proven challenging (e.g., due to licensing issues). Consequently, machine learning based pipelines for binary analysis utilize either costly commercial corpora (e.g., VirusTotal) or open-source binaries (e.g., coreutils) available in limited quantities. To address these issues, we present Assemblage: an extensible cloud-based distributed system that crawls, configures, and builds Windows PE binaries to obtain high-quality binary corpuses suitable for training state-of-the-art models in binary analysis. We have run Assemblage on AWS over the past year, producing 890k Windows PE and 428k Linux ELF binaries across 29 configurations. Assemblage is designed to be both reproducible and extensible, enabling users to publish "recipes" for their datasets, and facilitating the extraction of a wide array of features. We evaluated Assemblage by using its data to train modern learning-based pipelines for compiler provenance and binary function similarity. Our results illustrate the practical need for robust corpora of high-quality Windows PE binaries in training modern learning-based binary analyses. Assemblage code is open sourced under the MIT license, and the dataset can be downloaded from https://assemblage-dataset.net
+
+</details>
+