HyperGLLM: An Efficient Framework for Endpoint Threat Detection via Hypergraph-Enhanced Large Language Models

An efficient framework that introduces hypergraph reasoning into LLMs for malicious behavior detection in EDR logs.

Overview | Features | Changelog | Quick-Start | Dataset | FAQs

Overview

HyperGLLM is a detection framework that introduces hypergraph reasoning into LLMs. It first constructs an attribute-value level relation-aware graph to model low-order structural semantics while reducing textual redundancy. It then applies a differential hypergraph module with multi-granularity clustering to capture the high-order behavioral dependencies embedded in interleaved events and to reinforce threat semantics. Finally, it aligns the hypergraph representations with an LLM for efficient contextual reasoning over potentially malicious behaviors. We also curate EDR3.6B-63F, a large-scale EDR dataset containing 3.6 billion events across 63 distinct behavior families. In our experiments, HyperGLLM outperforms state-of-the-art methods: it reduces the false alarm rate to 1.67%, achieves 94.65% accuracy across the 63 behavior families, and improves the modeling efficiency of LLMs on long EDR logs. Our framework and dataset provide a solid foundation for future research and support the development of advanced detection solutions in endpoint security.
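To make the first stage more concrete, here is a minimal sketch of what an attribute-value level graph could look like. It is not the implementation in this repository: the event fields (`process`, `action`, `target`) and the function name are hypothetical, and plain co-occurrence counts stand in for the relation-aware edges described above. The point it illustrates is the redundancy reduction: a value that repeats across many events becomes a single node.

```python
from collections import defaultdict


def build_attribute_value_graph(events):
    """Build a co-occurrence graph over (attribute, value) pairs.

    Returns:
        nodes: dict mapping (attribute, value) -> integer node id,
               one node per distinct pair across all events.
        edges: dict mapping (id_a, id_b) with id_a < id_b -> number of
               events in which both pairs appear together.
    """
    nodes = {}
    edges = defaultdict(int)
    for event in events:
        ids = set()
        for attr, value in event.items():
            key = (attr, value)
            if key not in nodes:
                nodes[key] = len(nodes)  # repeated values reuse their node
            ids.add(nodes[key])
        ordered = sorted(ids)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                edges[(a, b)] += 1
    return nodes, dict(edges)


# Hypothetical EDR events; real logs have many more fields.
events = [
    {"process": "powershell.exe", "action": "file_write", "target": "a.tmp"},
    {"process": "powershell.exe", "action": "net_connect", "target": "10.0.0.5"},
    {"process": "powershell.exe", "action": "file_write", "target": "b.tmp"},
]
nodes, edges = build_attribute_value_graph(events)
print(len(nodes), "nodes for", sum(len(e) for e in events), "attribute values")
```

Here nine attribute values collapse into six nodes, because `powershell.exe` and `file_write` are each stored once.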

✨ Features

Framework: We propose HyperGLLM, an efficient framework that introduces hypergraph reasoning into LLMs for malicious behavior detection in EDR logs, capturing both structural semantics and long-range temporal dependencies.
Structural Semantics & Temporal Dependencies: We design an attribute-value level relation-aware graph and a differential hypergraph module with multi-granularity clustering to jointly model low- and high-order behavior semantics, thereby enhancing the semantic representation of threat behaviors.
EDR3.6B-63F Dataset: We construct EDR3.6B-63F, a large-scale EDR dataset that serves as a high-quality benchmark for advancing AI-driven research in endpoint security, offering diverse behavior types and detailed event records.
Effective and Efficient: Extensive experiments demonstrate that HyperGLLM consistently outperforms state-of-the-art baselines across multiple metrics while maintaining high inference efficiency.
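One way to picture multi-granularity hyperedges is as an incidence matrix in which every cluster, at every granularity, becomes one hyperedge. The sketch below assumes the cluster labels already exist (from any clustering algorithm) and only shows how they could be stacked. The function name and the toy labels are hypothetical, and the differential part of the module is not modeled here.

```python
def incidence_from_labels(label_sets):
    """Stack one hyperedge per cluster across all granularities.

    label_sets: list of label sequences, one per granularity; each holds
                one cluster label per event, in the same event order.
    Returns H as a list of rows, where H[i][e] == 1 if event i belongs
    to hyperedge e.
    """
    num_events = len(label_sets[0])
    columns = []
    for labels in label_sets:
        if len(labels) != num_events:
            raise ValueError("every granularity must label every event")
        for cluster in sorted(set(labels)):
            columns.append([1 if label == cluster else 0 for label in labels])
    # Transpose hyperedge columns into per-event rows.
    return [list(row) for row in zip(*columns)]


# Toy labels for six events at two granularities (assumed, not from the paper).
coarse = [0, 0, 0, 1, 1, 1]
fine = [0, 0, 1, 1, 2, 2]
H = incidence_from_labels([coarse, fine])

node_degree = [sum(row) for row in H]        # hyperedges per event
edge_degree = [sum(col) for col in zip(*H)]  # events per hyperedge
print(node_degree, edge_degree)
```

Each event ends up in one hyperedge per granularity, so an event can be related to a broad group and a narrow group at the same time.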

📃 Changelog

[23/10/25] The project launched!

🚀 Quick Start

We conduct training on eight NVIDIA H100 (80GB) GPUs and perform evaluation on a single GPU.

Directory Overview

requirements.txt — Lists all dependencies required to reproduce this project.
datasets/ — Contains all datasets used in both the main experiments and the ablation study described in the paper.
experiments_main/ — Source code for reproducing the main experiments.
experiments_ablation/ — Source code for reproducing the ablation study experiments in the paper.
experiments_appendix/ — Code used for the appendix experiments included in the paper.

Install Dependencies

Reproducing our work requires Python and the libraries listed in requirements.txt. Install the dependencies with:

pip install -r requirements.txt
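After installing, it can be useful to confirm that the listed packages are actually present in the active environment. The following is a small sketch using only the standard library; the helper names are hypothetical, the sample requirement lines are made up for illustration, and the parser is deliberately simple (it ignores option lines and does not handle URL-style requirements).

```python
import re
from importlib import metadata


def parse_requirement_names(text):
    """Extract distribution names from requirements-file text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Made-up sample; for the real check, read the file instead:
#   text = open("requirements.txt", encoding="utf-8").read()
sample = "torch>=2.1\ntransformers==4.40.0  # pinned\n\n# a comment\n-r other.txt\n"
print(parse_requirement_names(sample))
```

Calling `missing_packages(parse_requirement_names(text))` on the real file contents returns an empty list when everything is installed.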

Reproducing the Main Experiment

To reproduce the main experimental results:

Change to the main experiment directory:

cd experiments_main

Run the training and inference script:

sh run.sh

Obtain evaluation metrics:

python get_metrics.py
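For readers who want to sanity-check reported numbers, the two headline metrics can be sketched as follows. This is an assumption about common definitions, not a description of what get_metrics.py computes: accuracy is the fraction of samples whose predicted family matches the true one, and the false alarm rate is the fraction of benign samples flagged as any malicious family. The `benign` label and the family names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_alarm_rate(y_true, y_pred, benign="benign"):
    """Fraction of truly benign samples predicted as non-benign."""
    benign_preds = [p for t, p in zip(y_true, y_pred) if t == benign]
    if not benign_preds:
        return 0.0  # no benign samples, so no false alarms are possible
    return sum(p != benign for p in benign_preds) / len(benign_preds)


# Hypothetical labels for six samples.
y_true = ["benign", "benign", "benign", "benign", "ransomware", "trojan"]
y_pred = ["benign", "benign", "benign", "trojan", "ransomware", "benign"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"false alarm rate = {false_alarm_rate(y_true, y_pred):.4f}")
```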

You can also obtain runtime efficiency metrics (GPU memory usage and Time-to-First-Token) by running:

python get_gpumu_tps.py

The script reports these metrics for an input of 1,024K tokens.
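Time-to-First-Token is the delay between submitting an input and receiving the first generated token. The sketch below shows the general measurement idea on any token iterator; it is not the logic of the script above, and `fake_stream` is a hypothetical stand-in for a model's streaming output.

```python
import time


def time_to_first_token(token_stream):
    """Return (first_token, seconds_until_first_token, remaining_iterator)."""
    start = time.perf_counter()
    iterator = iter(token_stream)
    first = next(iterator)  # blocks until the first token is produced
    ttft = time.perf_counter() - start
    return first, ttft, iterator


def fake_stream():
    """Stand-in generator: simulates prefill latency, then yields tokens."""
    time.sleep(0.05)  # pretend the model is processing the prompt
    yield "tok0"
    yield "tok1"
    yield "tok2"


first, ttft, rest = time_to_first_token(fake_stream())
print(f"first token {first!r} after {ttft * 1000:.1f} ms")
```

If the model runs under PyTorch, peak GPU memory can be read separately with `torch.cuda.max_memory_allocated()`; that call is omitted here to keep the sketch dependency-free.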

To reproduce other experiments (e.g., the ablation study), change to the corresponding directory (e.g., cd experiments_ablation/Analysis_DHGNN), run sh run.sh for training and inference, then run python get_metrics.py to obtain the evaluation metrics.
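Since every experiment directory follows the same two commands, the sequence can be wrapped in a small driver. This is a convenience sketch, not part of the repository: the function names are hypothetical, and it defaults to a dry run that only prints what it would execute.

```python
import subprocess
from pathlib import Path


def experiment_steps(experiment_dir):
    """Return the (working_directory, argv) pairs for one experiment."""
    cwd = Path(experiment_dir)
    return [
        (cwd, ["sh", "run.sh"]),             # training and inference
        (cwd, ["python", "get_metrics.py"]),  # evaluation metrics
    ]


def run_experiment(experiment_dir, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, argv in experiment_steps(experiment_dir):
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops the sequence if a step fails.
            subprocess.run(argv, cwd=cwd, check=True)


# Dry run: prints the commands without executing them.
run_experiment("experiments_ablation/Analysis_DHGNN")
```

Pass `dry_run=False` from the repository root to actually run the steps.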

💾 Dataset

EDR3.6B-63F Dataset: this repo.

🙌 FAQs

🔖 License

Our project is licensed under the MIT License.

Citation

If you use the EDR3.6B-63F dataset in your research or find our method HyperGLLM inspiring, please consider citing our paper:

@inproceedings{
  title     = {HyperGLLM: An Efficient Framework for Endpoint Threat Detection via Hypergraph-Enhanced Large Language Models},
  year      = {2025},
}

Repository: Qihoo360/HyperGLLM
