Building on ARTIFACT 1, this artifact analyzes how syscall sequences deviate from normal execution under different behavioral scenarios, such as privileged commands and environment-probing commands, and quantifies those deviations using mismatch and coverage metrics.
By tracing syscalls and transforming the traces into structured datasets, it aims to identify patterns that characterize normal system activity and potentially anomalous behavior.
The collected syscalls focus on scenarios designed to simulate privilege-impacting and exploratory behavior, in both single-command and multi-command execution modes.
Behavior is analyzed with a sliding-window comparison of syscall sequences: each sequence is scored with mismatch and coverage metrics relative to baseline behavior.
- Capture syscall traces under predefined behavioral scenarios (normal, recon, priv).
- Extract structured telemetry from raw syscall logs.
- Compare the scenario traces against baseline syscall traces from normal execution.
- Identify syscall distribution differences between normal and modified program behavior (priv and recon scenarios under single-command and multi-command execution modes).
- Visualize behavioral differences through graphical analysis.
Environment
- Linux (Ubuntu)
- Python 3
- strace syscall tracing tool
Tools Used
- strace – intercepts system calls made by running processes.
- Python scripts – parse syscall logs and generate structured datasets.
- Matplotlib – used for visualizing syscall behavior.
System calls were collected by executing programs under the strace monitoring environment:
strace -o trace.log <command>
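The capture step can also be scripted. The following is a minimal sketch, not the artifact's actual script: the helper names `build_strace_argv` and `capture`, the output file name, and the example command are hypothetical. It uses the standard strace options `-o` (output file), `-tt` (wall-clock timestamps with microseconds), and `-f` (follow child processes); whether the original traces were collected with `-tt` or `-f` is an assumption.

```python
import shlex
import subprocess


def build_strace_argv(command, output_path, timestamps=True, follow_forks=False):
    """Return the argv list that traces `command` (a shell-style string) with strace."""
    argv = ["strace", "-o", output_path]
    if timestamps:
        argv.append("-tt")  # assumption: timestamps are wanted on every line
    if follow_forks:
        argv.append("-f")   # also trace child processes
    return argv + shlex.split(command)


def capture(command, output_path, **options):
    """Run `command` under strace and return the exit status reported by strace."""
    # strace normally exits with the same status as the traced command.
    result = subprocess.run(build_strace_argv(command, output_path, **options), check=False)
    return result.returncode
```

For example, `capture("uname -a", "recon_single.log")` would write one trace file for a single-command run. A multi-command run could call `capture` once per command, or trace a wrapper script with `follow_forks=True`.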
Each trace contains detailed telemetry including:
- syscall name
- timestamp
- return values
- arguments
These traces represent low-level behavioral telemetry generated by program execution.
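To illustrate how these fields can be pulled out of a raw trace line, here is a hedged sketch of a parser. The regular expression and the `parse_line` helper are assumptions, not the provided scripts. The sketch expects the common strace line shape `name(args) = ret`, optionally preceded by a PID (as written with `-f` and `-o`) and a `-t` or `-tt` timestamp. Lines of any other shape, such as signal notices, exit markers, and unfinished or resumed calls, yield `None`.

```python
import re
from typing import Optional

# Assumed line shape: [pid] [HH:MM:SS[.micro]] name(args) = ret [errno text]
LINE_RE = re.compile(
    r"^(?:(?P<pid>\d+)\s+)?"                               # optional PID prefix
    r"(?:(?P<timestamp>\d{2}:\d{2}:\d{2}(?:\.\d+)?)\s+)?"  # optional -t / -tt timestamp
    r"(?P<name>[A-Za-z_]\w*)\("                            # syscall name
    r"(?P<args>.*)\)\s+=\s+"                               # raw argument string
    r"(?P<ret>0x[0-9a-fA-F]+|-?\d+|\?)"                    # return value (hex, decimal, or ?)
    r"\s*(?P<detail>.*)$"                                  # errno name and message, if any
)


def parse_line(line: str) -> Optional[dict]:
    """Parse one strace line into a dict of string fields, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()
```

The arguments are kept as one raw string because splitting them reliably requires handling quoted strings and nested structures, which is out of scope for this sketch.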
Data Processing Pipeline
Raw syscall logs were processed through the following pipeline:
- Collect raw syscall traces.
- Parse syscall entries using Python scripts.
- Extract relevant behavioral attributes.
- Convert extracted data into CSV format.
- Generate behavioral datasets for visualization.
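The CSV conversion step might look like the sketch below. The column set in `FIELDS` and the helper name `records_to_csv` are assumptions chosen to mirror the telemetry fields listed earlier, and the artifact's own scripts may use a different layout. Records are assumed to be dictionaries keyed by field name.

```python
import csv

# Hypothetical column layout for the structured dataset.
FIELDS = ["timestamp", "name", "args", "ret"]


def records_to_csv(records, handle):
    """Write parsed syscall records to an open text handle and return the row count."""
    # Keys outside FIELDS are ignored; missing keys are written as empty cells.
    writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)
        count += 1
    return count
```

When writing to disk, open the file with `open(path, "w", newline="")` so that the `csv` module controls line endings.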
The collected telemetry was analyzed to observe:
- syscall frequency distribution
- behavioral variations between processes
- differences in system resource interaction patterns
This analysis helps reveal how program behavior can be represented through syscall telemetry patterns.
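As a rough illustration of the frequency-distribution view, the sketch below normalizes syscall counts into relative frequencies and computes the per-syscall difference between a baseline trace and an observed trace. The function names are hypothetical, and the traces are assumed to be plain lists of syscall names.

```python
from collections import Counter


def frequency_distribution(names):
    """Map each syscall name to its share of all calls in the trace."""
    counts = Counter(names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


def distribution_delta(baseline, observed):
    """Per-syscall difference in relative frequency (observed minus baseline)."""
    names = sorted(set(baseline) | set(observed))
    return {name: observed.get(name, 0.0) - baseline.get(name, 0.0) for name in names}
```

A positive delta marks a syscall that is more frequent in the observed trace than in the baseline, and a syscall that appears only in one trace shows up with its full relative frequency.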
These observations are derived from sequence-level comparisons using sliding-window mismatch and coverage metrics.
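The exact formulas are not spelled out here, so the following sketch rests on two assumed definitions. The mismatch rate is taken as the fraction of observed windows that never occur in the baseline. Coverage is taken as the fraction of distinct baseline windows that reappear in the observed trace. Coverage could equally be computed over the observed windows instead. The window size of 3 is also an assumption.

```python
def windows(trace, size):
    """All contiguous windows of `size` syscall names, in trace order."""
    return [tuple(trace[i:i + size]) for i in range(len(trace) - size + 1)]


def mismatch_rate(baseline, observed, size=3):
    """Fraction of observed windows that are absent from the baseline (assumed definition)."""
    known = set(windows(baseline, size))
    seen = windows(observed, size)
    if not seen:
        return 0.0
    misses = sum(1 for window in seen if window not in known)
    return misses / len(seen)


def coverage(baseline, observed, size=3):
    """Fraction of distinct baseline windows that also occur in the observed trace (assumed definition)."""
    known = set(windows(baseline, size))
    if not known:
        return 0.0
    seen = set(windows(observed, size))
    return len(known & seen) / len(known)
```

Under these definitions a trace identical to the baseline scores a mismatch rate of 0 and a coverage of 1. A trace that introduces unseen call sequences raises the mismatch rate, and a trace that skips parts of the baseline lowers coverage.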
Behavioral patterns were visualized using two plots:
- Coverage Plot – proportion of sequences matching baseline behavior.
- Mismatch Plot – highlights deviations between expected and observed behavior patterns.
These visualizations provide an interpretable view of behavioral differences in system execution.
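A possible way to draw the two plots with Matplotlib is sketched below. The input layout (a dictionary mapping each scenario label to a `(coverage, mismatch)` pair), the function names, and the choice of bar charts are all assumptions, and the original figures may be laid out differently. Matplotlib is imported inside the plotting function so that the non-interactive `Agg` backend can be selected before `pyplot` is loaded.

```python
def split_metrics(results):
    """Turn {scenario: (coverage, mismatch)} into parallel lists for plotting."""
    labels = list(results)
    coverage_values = [results[label][0] for label in labels]
    mismatch_values = [results[label][1] for label in labels]
    return labels, coverage_values, mismatch_values


def plot_metrics(results, path):
    """Save side-by-side coverage and mismatch bar charts to `path`."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    labels, coverage_values, mismatch_values = split_metrics(results)
    fig, (ax_cov, ax_mis) = plt.subplots(1, 2, figsize=(9, 3.5))
    ax_cov.bar(labels, coverage_values)
    ax_cov.set_title("Coverage")
    ax_cov.set_ylim(0, 1)
    ax_mis.bar(labels, mismatch_values)
    ax_mis.set_title("Mismatch")
    ax_mis.set_ylim(0, 1)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

A call such as `plot_metrics(results, "metrics.png")` would write both panels to one image, where `results` holds the metric values computed for each scenario and execution mode.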
- System call telemetry provides a consistent behavioral fingerprint of running processes.
- Behavioral patterns vary depending on program execution context.
- Structured telemetry analysis can reveal subtle differences in process behavior.
To reproduce this experiment:
- Run the target program under "strace".
- Parse syscall logs using the provided Python scripts.
- Generate structured datasets.
- Produce behavioral plots.
The derived behavioral interpretation is as follows:
- High priv + low recon -> may indicate stronger anomalous behavior
- Low priv + low recon -> may indicate benign or low-signal behavior
- High priv + high recon -> may indicate controlled or expected privileged behavior
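The rules above can be written as a small lookup, sketched here under stated assumptions. The `interpret` helper is hypothetical and takes two booleans, because the text does not define a threshold for what counts as "high". The combination of low priv and high recon is not listed above, so it is reported as not covered instead of being guessed.

```python
# Keys are (priv_is_high, recon_is_high); values restate the listed rules.
INTERPRETATION = {
    (True, False): "may indicate stronger anomalous behavior",
    (False, False): "may indicate benign or low-signal behavior",
    (True, True): "may indicate controlled or expected privileged behavior",
}


def interpret(priv_is_high, recon_is_high):
    """Return the listed interpretation for a priv/recon combination."""
    key = (bool(priv_is_high), bool(recon_is_high))
    return INTERPRETATION.get(key, "not covered by the listed rules")
```

How a raw score becomes "high" or "low" is left to the caller, for example by comparing it against a cutoff chosen for the dataset.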