PySATL-CPD

PySATL Change point detection subproject (abbreviated pysatl-cpd) is a module, designed for detecting anomalies in time series data, which refer to significant deviations from expected patterns or trends. Anomalies can indicate unusual events or changes in a system, making them crucial for monitoring and analysis in various fields such as finance, healthcare, and network security.

At the moment, the module implements the following CPD algorithms:

Bayesian algorithm (scrubbing, online and linear heuristic online versions)
Density based algorithms:
- KLIEP
- RuLSIF
Graph algorithm
k-NN based algorithm
Algorithms, based on classifiers:
- SVM
- KNN
- Decision Tree
- Logistic Regression
- Random Forest

Requirements

Python 3.10+
Poetry 2.1.0+

Installation

Clone repository:

git clone https://github.com/PySATL/pysatl-cpd.git

Linux

Go to repository directory and run:

cd pysatl-cpd
poetry install

Windows

Check if python and poetry in PATH (Optional)

py --version
poetry --version

If python not in the PATH, reinstall it and make sure that you checked the box "Add Python to PATH"

If poetry not in the PATH, add it by running:

setx PATH "%PATH%;%APPDATA%\Python\Scripts"

After that, do not forget to restart the terminal.

Go to repository folder and run:

Set-Location pysatl-cpd
poetry install

Change point detection example:

from pathlib import Path

from pysatl_cpd.labeled_data import LabeledCpdData

# import change point detection solver
from pysatl_cpd.online_cpd_solver import OnlineCpdSolver
from pysatl_cpd.core.problem import CpdProblem

# import algorithm
from pysatl_cpd.core.algorithms.bayesian_online_algorithm import BayesianOnline
from pysatl_cpd.core.algorithms.bayesian.likelihoods.gaussian_conjugate import GaussianConjugate
from pysatl_cpd.core.algorithms.bayesian.hazards.constant import ConstantHazard
from pysatl_cpd.core.algorithms.bayesian.detectors.threshold import ThresholdDetector
from pysatl_cpd.core.algorithms.bayesian.localizers.argmax import ArgmaxLocalizer


labeled_data = LabeledCpdData.generate_cp_datasets(Path("examples/configs/test_config_exp.yml"))["example"]

# specify CPD algorithm with parameters
algorithm = BayesianOnline(
    learning_sample_size=5,
    likelihood=GaussianConjugate(),
    hazard=ConstantHazard(rate=1.0 / (1.0 - 0.5 ** (1.0 / 500))),
    detector=ThresholdDetector(threshold=0.005),
    localizer=ArgmaxLocalizer(),
)
# make a solver object
solver = OnlineCpdSolver(CpdProblem(True), algorithm, labeled_data)


# then run algorithm
cpd_results = solver.run()

# print the results
print(cpd_results)
# output:
# Located change points: (200;400)
# Expected change point: (200;400)
# Difference: ()
# Computation time (sec): 0.2

# visualize data with located changepoints
cpd_results.visualize()

Experiment pipeline example:

from pathlib import Path

from benchmarking.pipeline.pipeline import Pipeline
from benchmarking.steps.data_generation_step.data_generation_step import DataGenerationStep
from benchmarking.steps.data_generation_step.data_handlers.generators.cpd_generator import CpdGenerator
from benchmarking.steps.experiment_execution_step.experiment_execution_step import ExperimentExecutionStep
from benchmarking.steps.experiment_execution_step.workers.run_complete_algorithm_worker import (
  RunCompleteAlgorithmWorker,
)
from benchmarking.steps.report_generation_step.report_builders.change_point_builder import CpBuilder
from benchmarking.steps.report_generation_step.report_generation_step import ReportGenerationStep
from benchmarking.steps.report_generation_step.report_visualizers.change_point_text_visualizer import CpTextVisualizer
from benchmarking.steps.report_generation_step.reporters.reporter import Reporter
from pysatl_cpd.core.algorithms.bayesian.detectors.threshold import ThresholdDetector
from pysatl_cpd.core.algorithms.bayesian.hazards.constant import ConstantHazard
from pysatl_cpd.core.algorithms.bayesian.likelihoods.heuristic_gaussian_vs_exponential import (
  HeuristicGaussianVsExponential,
)
from pysatl_cpd.core.algorithms.bayesian.localizers.argmax import ArgmaxLocalizer
from pysatl_cpd.core.algorithms.bayesian_algorithm import BayesianAlgorithm

# Generate data with example config and save as my_experiment_dataset
generator = CpdGenerator(
  name="cpd_generator", output_storage_names={"example"}, config=Path("examples/configs/test_config_exp.yml")
)
step_1 = DataGenerationStep(
  data_handler=generator,
  name="cpd_generation_test_config_exp_step",
  output_storage_names={"example": "my_experiment_dataset"},
)

# Initialize BayesianAlgorithm and run with generated data
algorithm = BayesianAlgorithm(
  learning_steps=5,
  likelihood=HeuristicGaussianVsExponential(),
  hazard=ConstantHazard(rate=1.0 / (1.0 - 0.5 ** (1.0 / 500))),
  detector=ThresholdDetector(threshold=0.005),
  localizer=ArgmaxLocalizer(),
)
algo_worker = RunCompleteAlgorithmWorker(algorithm=algorithm, name="run_bayesian_algorithm_worker")
step_2 = ExperimentExecutionStep(
  worker=algo_worker, name="run_bayesian_algorithm_step", input_storage_names={"my_experiment_dataset": "dataset"}
)

# Generate text report with change points from Result Storage
builder = CpBuilder()
visualizer = CpTextVisualizer(file_name="my_experiment_change_points_report")
reporter = Reporter(builder, visualizer, name="text_reporter")
step_3 = ReportGenerationStep(reporter, name="ReportGeneration", input_storage_names={"change_points"})

# configure pipeline and start the experiment
steps = [step_1, step_2, step_3]
pipeline = Pipeline(steps)
pipeline.run()

output in results/my_experiment_change_points_report.txt:

Located change points: [25, 201, 396]

Development

If you want to contribute, you should create development environment as follows:

poetry install --with-dev

Pre-commit

Install pre-commit hooks:

poetry run pre-commit install

Starting manually:

poetry run pre-commit run --all-files --color always --verbose --show-diff-on-failure

License

This project is licensed under the terms of the MIT license. See the LICENSE for more information.

Name		Name	Last commit message	Last commit date
Latest commit History 370 Commits
.github		.github
assets		assets
benchmarking		benchmarking
docs		docs
examples		examples
pysatl_cpd		pysatl_cpd
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
example.py		example.py
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

PySATL-CPD

Requirements

Installation

Linux

Windows

Check if python and poetry in PATH (Optional)

Change point detection example:

Experiment pipeline example:

Development

Pre-commit

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 10

Uh oh!

Languages

License

PySATL/pysatl-cpd

Folders and files

Latest commit

History

Repository files navigation

PySATL-CPD

Requirements

Installation

Linux

Windows

Check if python and poetry in PATH (Optional)

Change point detection example:

Experiment pipeline example:

Development

Pre-commit

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 10

Uh oh!

Languages

Packages