update 2024-12-13 06:22:29
actions-user committed Dec 12, 2024
1 parent 9354839 commit f66a308
Showing 2 changed files with 25 additions and 1 deletion.
arXiv_db/Malware/2024.md: 24 additions & 0 deletions

@@ -3818,3 +3818,27 @@

</details>

<details>

<summary>2024-12-10 20:17:49 - PBP: Post-training Backdoor Purification for Malware Classifiers</summary>

- *Dung Thuy Nguyen, Ngoc N. Tran, Taylor T. Johnson, Kevin Leach*

- `2412.03441v3` - [abs](http://arxiv.org/abs/2412.03441v3) - [pdf](http://arxiv.org/pdf/2412.03441v3)

> In recent years, the rise of machine learning (ML) in cybersecurity has brought new challenges, including the increasing threat of backdoor poisoning attacks on ML malware classifiers. For instance, adversaries could inject malicious samples into public malware repositories, contaminating the training data and potentially causing the ML model to misclassify malware. Current countermeasures predominantly focus on detecting poisoned samples by leveraging disagreements within the outputs of a diverse set of ensemble models on training data points. However, these methods are not suitable for scenarios where Machine Learning-as-a-Service (MLaaS) is used or when users aim to remove backdoors from a model after it has been trained. Addressing this scenario, we introduce PBP, a post-training defense for malware classifiers that mitigates various types of backdoor embeddings without assuming any specific backdoor embedding mechanism. Our method exploits the influence of backdoor attacks on the activation distribution of neural networks, independent of the trigger-embedding method. In the presence of a backdoor attack, the activation distribution of each layer is distorted into a mixture of distributions. By regulating the statistics of the batch normalization layers, we can guide a backdoored model to perform similarly to a clean one. Our method demonstrates substantial advantages over several state-of-the-art methods, as evidenced by experiments on two datasets, two types of backdoor methods, and various attack configurations. Notably, our approach requires only a small portion of the training data -- just 1% -- to purify the backdoor and reduce the attack success rate from 100% to almost 0%, a 100-fold improvement over the baseline methods. Our code is available at https://github.com/judydnguyen/pbp-backdoor-purification-official.
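
A minimal sketch of the batch-norm recalibration idea described above, assuming a PyTorch classifier `model` and a `clean_loader` over the small (~1%) clean subset; this illustrates the general technique, not the authors' released implementation (see their repository for that):

```python
# Illustrative sketch: re-estimate batch-norm statistics on clean data so each
# layer's activation distribution matches a clean model's. Not the official
# PBP code (https://github.com/judydnguyen/pbp-backdoor-purification-official).
import torch
import torch.nn as nn

def recalibrate_bn_stats(model: nn.Module, clean_loader, device: str = "cpu") -> nn.Module:
    """Refresh BN running mean/var on a small clean subset (no weight updates)."""
    model.to(device)
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.reset_running_stats()  # drop statistics accumulated on poisoned data
            m.momentum = None        # use a cumulative moving average over all batches
    model.train()                    # BN updates running stats only in train mode
    with torch.no_grad():            # forward passes only; weights stay fixed
        for x, _ in clean_loader:
            model(x.to(device))
    return model.eval()
```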

</details>

<details>

<summary>2024-12-11 16:25:06 - Image-Based Malware Classification Using QR and Aztec Codes</summary>

- *Atharva Khadilkar, Mark Stamp*

- `2412.08514v1` - [abs](http://arxiv.org/abs/2412.08514v1) - [pdf](http://arxiv.org/pdf/2412.08514v1)

> In recent years, the use of image-based techniques for malware detection has gained prominence, with numerous studies demonstrating the efficacy of deep learning approaches such as Convolutional Neural Networks (CNN) in classifying images derived from executable files. In this paper, we consider an innovative method that converts features extracted from executable files into QR and Aztec codes. These codes capture structural patterns in a format that may enhance the learning capabilities of CNNs. We design and implement CNN architectures tailored to the unique properties of these codes and apply them to a comprehensive analysis involving two extensive malware datasets, both of which include a significant corpus of benign samples. Our results yield a split decision, with CNNs trained on QR and Aztec codes outperforming the state of the art on one of the datasets, but underperforming more typical techniques on the other. These results indicate that the use of QR and Aztec codes as a form of feature engineering holds considerable promise in the malware domain, and that additional research is needed to better understand the relative strengths and weaknesses of such an approach.
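
As a rough illustration of the feature-to-code conversion step (an assumption-laden sketch, not the paper's pipeline), the `qrcode` Python package can render bytes extracted from an executable into a fixed-size grayscale array that a CNN can consume; an Aztec encoder could be substituted analogously. `feature_bytes` below is a hypothetical stand-in for whatever features are actually extracted:

```python
# Sketch only: encode extracted features as a QR image for CNN input.
# Requires `pip install qrcode pillow numpy`.
import numpy as np
import qrcode

def features_to_qr_array(feature_bytes: bytes, size: int = 128) -> np.ndarray:
    """Render feature bytes as a QR code, returning a (size, size) float array in [0, 1]."""
    qr = qrcode.QRCode(
        error_correction=qrcode.constants.ERROR_CORRECT_L,  # maximize data capacity
        box_size=1,  # one pixel per module; resized to a fixed shape below
        border=4,
    )
    qr.add_data(feature_bytes)
    qr.make(fit=True)  # choose the smallest QR version that fits the data
    img = qr.make_image(fill_color="black", back_color="white").convert("L")
    img = img.resize((size, size))  # uniform input size for the CNN
    return np.asarray(img, dtype=np.float32) / 255.0

# Hypothetical usage: 256 bytes of extracted features -> 128x128 input image.
x = features_to_qr_array(bytes(range(256)))
print(x.shape)  # (128, 128)
```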

</details>
