OmniSafe is a comprehensive and reliable benchmark for safe reinforcement learning, covering a multitude of SafeRL domains and delivering a new suite of testing environments.
The simulation environments around OmniSafe and its series of reliable algorithm implementations help the SafeRL research community replicate and improve on existing work more easily, while also facilitating the validation of new ideas and new algorithms.
- Overview
- Implemented Algorithms
- SafeRL Environments
- Installation
- Getting Started
- The OmniSafe Team
- License
Here we provide a table comparing OmniSafe's algorithm core with existing SafeRL algorithm baselines.
| SafeRL Platform | Backend | Engine | # Safe Algo. | Parallel CPU/GPU | New Gym API (4) | Vision Input |
|---|---|---|---|---|---|---|
| Safety-Gym | TF1 | `mujoco-py` (1) | 3 | CPU only (`mpi4py`) | ❌ | minimally supported |
| safe-control-gym | PyTorch | PyBullet | 5 (2) | ❌ | ❌ | ❌ |
| Velocity-Constraints (3) | N/A | N/A | N/A | N/A | ❌ | ❌ |
| mujoco-circle | PyTorch | N/A | 0 | N/A | ❌ | ❌ |
| OmniSafe | PyTorch | MuJoCo 2.3.0+ | 25+ | `torch.distributed` | ✅ | ✅ |
(1): In maintenance mode (expect bug fixes and minor updates only); the last commit was on 19 Nov 2021. Safety-Gym depends on `mujoco-py` 2.0.2.7, which was last updated on Oct 12, 2019.
(2): We only count the safe algorithms.
(3): There is no official implementation of the velocity-constrained tasks; the associated cost constraints are constructed from the `info` dict (see the sketch below). Because the task is widely used in SafeRL research, we encapsulate it in OmniSafe.
(4): In the Gym 0.26.0 release, the environment interaction API was redesigned.
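As an illustration of footnote (3), here is a minimal sketch of how a velocity cost can be built from the `info` dict. The environment and the `VELOCITY_LIMIT` threshold are only examples for illustration, not OmniSafe's actual implementation.

```python
import gymnasium as gym

VELOCITY_LIMIT = 2.0  # hypothetical speed threshold; real tasks choose their own budget

# Gymnasium's MuJoCo locomotion tasks report the forward speed in info['x_velocity'],
# so a binary safety cost can be constructed alongside the unmodified reward signal.
env = gym.make('HalfCheetah-v4')
obs, info = env.reset(seed=0)
for _ in range(1000):
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    cost = float(info.get('x_velocity', 0.0) > VELOCITY_LIMIT)  # 1 if moving too fast, else 0
    if terminated or truncated:
        obs, info = env.reset()
```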
The currently supported algorithms include:
**Newly Published in 2022**

- [AAAI 2023] Augmented Proximal Policy Optimization for Safe Reinforcement Learning (APPO) (code contributed by the original authors of the paper)
- [NeurIPS 2022] Constrained Update Projection Approach to Safe Policy Optimization (CUP) (code contributed by the original authors of the paper)
- [NeurIPS 2022] Effects of Safety State Augmentation on Safe Exploration (Simmer)
- [NeurIPS 2022] Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
- [ICML 2022] Sauté RL: Almost Surely Safe Reinforcement Learning Using State Augmentation (SauteRL)
- [ICML 2022] Constrained Variational Policy Optimization for Safe Reinforcement Learning (CVPO)
- [IJCAI 2022] Penalized Proximal Policy Optimization for Safe Reinforcement Learning (P3O) (code contributed by the original authors of the paper)
- [ICLR 2022] Constrained Policy Optimization via Bayesian World Models (LAMBDA)
- [AAAI 2022] Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (CAP)
**On-Policy Safe**

- The Lagrange version of PPO (PPO-Lag) (a rough sketch of the Lagrangian approach appears after this list)
- The Lagrange version of TRPO (TRPO-Lag)
- [ICML 2017] Constrained Policy Optimization (CPO)
- [ICLR 2019] Reward Constrained Policy Optimization (RCPO)
- [ICML 2020] Responsive Safety in Reinforcement Learning by PID Lagrangian Methods (PID-Lag)
- [NeurIPS 2020] First Order Constrained Optimization in Policy Space (FOCOPS)
- [AAAI 2020] IPO: Interior-point Policy Optimization under Constraints (IPO)
- [ICLR 2020] Projection-Based Constrained Policy Optimization (PCPO)
- [ICML 2021] CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
**Off-Policy Safe**

- The Lagrange version of TD3 (TD3-Lag)
- The Lagrange version of DDPG (DDPG-Lag)
- The Lagrange version of SAC (SAC-Lag)
- [ICML 2019] Lyapunov-based Safe Policy Optimization for Continuous Control (SDDPG)
- [ICML 2019] Lyapunov-based Safe Policy Optimization for Continuous Control (SDDPG-modular)
- [ICML 2022] Constrained Variational Policy Optimization for Safe Reinforcement Learning (CVPO)
**Model-Based Safe**

- [NeurIPS 2021] Safe Reinforcement Learning by Imagining the Near Future (SMBPO)
- [CoRL 2021 (Oral)] Learning Off-Policy with Online Planning (SafeLOOP)
- [AAAI 2022] Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (CAP)
- [NeurIPS 2022] Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
- [ICLR 2022] Constrained Policy Optimization via Bayesian World Models (LAMBDA)
**Offline Safe**

- The Lagrange version of BCQ (BCQ-Lag)
- The Constrained version of CRR (C-CRR)
- [AAAI 2022] Constraints Penalized Q-learning for Safe Offline Reinforcement Learning (CPQ)
- [ICLR 2022 (Spotlight)] COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation
- [ICML 2022] Constrained Offline Policy Optimization (COPO)
**Others**

- Safe Exploration in Continuous Action Spaces (Safety Layer)
- [RA-L 2021] Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones
- [ICML 2022] Sauté RL: Almost Surely Safe Reinforcement Learning Using State Augmentation (SauteRL)
- [NeurIPS 2022] Effects of Safety State Augmentation on Safe Exploration
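Several of the entries above are "Lagrange versions" of standard RL algorithms. As a rough sketch of the idea (not OmniSafe's actual code), the policy objective is augmented with a multiplier-weighted cost term, and the multiplier is updated by gradient ascent on the constraint violation:

```python
import torch
import torch.nn.functional as F

cost_limit = 25.0  # example per-episode cost budget; a tunable hyperparameter

# Parameterize lambda through softplus so that it stays non-negative.
lambda_param = torch.zeros(1, requires_grad=True)
lambda_optimizer = torch.optim.Adam([lambda_param], lr=5e-3)

def lagrangian_policy_loss(ratio, reward_adv, cost_adv):
    """PPO-style surrogate that trades the reward advantage against the cost advantage."""
    lam = F.softplus(lambda_param).detach()
    return -(ratio * (reward_adv - lam * cost_adv)).mean() / (1.0 + lam)

def update_lambda(mean_episode_cost):
    """Ascend on lambda * (J_cost - limit): lambda grows while the constraint is violated."""
    lambda_loss = -F.softplus(lambda_param) * (mean_episode_cost - cost_limit)
    lambda_optimizer.zero_grad()
    lambda_loss.backward()
    lambda_optimizer.step()
```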
We designed a variety of safety-enhanced learning tasks around the latest version of Gymnasium, including safety-run, safety-circle, safety-goal, safety-button, etc., leading to a unified safety-enhanced learning benchmark environment called `safety-gymnasium`.

Furthermore, to facilitate the progress of community research, we redesigned Safety-Gym and removed its dependency on `mujoco-py`. We built it on top of MuJoCo and fixed some bugs. After careful testing, we confirmed that it has the same dynamics parameters and training environments as the original `safety-gym`, and we named it `safety-gymnasium`.
Here is a list of all the environments we support; some of them are still being tested against our baselines and will be released gradually within a month.
(Preview table of supported tasks, difficulty levels, and agents.)
Vision-based safe reinforcement learning lacks realistic scenarios. Although the original `safety-gym` minimally supported visual input, its scenarios were too homogeneous. To facilitate the validation of vision-based SafeRL algorithms, we developed a set of realistic vision-based SafeRL tasks, which are currently being validated on our baselines; we will release that part of the environment in `safety-gymnasium` within a month.

As an appetizer, the images are shown below.
Note: we support the new Gymnasium APIs.
```python
import safety_gymnasium

env_name = 'SafetyPointGoal1-v0'
env = safety_gymnasium.make(env_name)

obs, info = env.reset()
terminated, truncated = False, False
while not (terminated or truncated):
    act = env.action_space.sample()
    obs, reward, cost, terminated, truncated, info = env.step(act)
    env.render()
```
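Compared with standard Gymnasium, `step` returns an extra scalar `cost` alongside the reward. As a small illustrative variation of the loop above (our example, not part of the official API docs), you can accumulate both signals over an episode:

```python
import safety_gymnasium

# Illustrative only: track the episodic return and the cumulative safety cost.
env = safety_gymnasium.make('SafetyPointGoal1-v0')
obs, info = env.reset()
episode_return, episode_cost = 0.0, 0.0
terminated, truncated = False, False
while not (terminated or truncated):
    act = env.action_space.sample()
    obs, reward, cost, terminated, truncated, info = env.step(act)
    episode_return += reward
    episode_cost += cost
print(f'return = {episode_return:.2f}, cumulative cost = {episode_cost:.2f}')
```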
OmniSafe requires Python 3.8+ and PyTorch 1.10+.
```bash
git clone https://github.com/PKU-MARL/omnisafe
cd omnisafe
conda create -n omnisafe python=3.8
conda activate omnisafe
# Please refer to https://pytorch.org/get-started/previous-versions and install PyTorch

# Install safety-gymnasium
pip install -e envs/safety-gymnasium
# Install omnisafe
pip install -e .
```
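Optionally, a quick import check (our suggestion, not an official installation step) confirms that both packages are available:

```bash
# Both imports should succeed without errors after the editable installs above.
python -c "import omnisafe, safety_gymnasium; print('installation OK')"
```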
```bash
cd examples
python train_policy.py --env-id SafetyPointGoal1-v0 --algo PPOLag --parallel 1 --seed 0
```
- `algo`: PolicyGradient, PPO, PPOLag, NaturalPG, TRPO, TRPOLag, PDO, NPGLag, CPO, PCPO, FOCOPS, CPPOPid
- `env-id`: Safety{Robot-id}{Task-id}{0/1/2}-v0, where Robot-id is Point or Car and Task-id is Goal, Push, or Button (see the examples below)
- `parallel`: number of parallel training processes
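For example, combining the naming scheme above with other supported algorithms gives commands like the following (illustrative combinations; availability depends on which environments have been released):

```bash
# CPO on the Car-Push task at difficulty level 2
python train_policy.py --env-id SafetyCarPush2-v0 --algo CPO --parallel 1 --seed 0

# TRPO-Lag on the Point-Button task at difficulty level 1
python train_policy.py --env-id SafetyPointButton1-v0 --algo TRPOLag --parallel 1 --seed 0
```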
```python
import omnisafe

env = omnisafe.Env('SafetyPointGoal1-v0')

agent = omnisafe.Agent('PPOLag', env)
agent.learn()

# obs = env.reset()
# for i in range(1000):
#     action, _states = agent.predict(obs, deterministic=True)
#     obs, reward, cost, done, info = env.step(action)
#     env.render()
#     if done:
#         obs = env.reset()
# env.close()
```
```python
import omnisafe

env = omnisafe.Env('SafetyPointGoal1-v0')

custom_dict = {'epochs': 1, 'data_dir': './runs'}
agent = omnisafe.Agent('PPOLag', env, custom_cfgs=custom_dict)
agent.learn()

# obs = env.reset()
# for i in range(1000):
#     action, _states = agent.predict(obs, deterministic=True)
#     obs, reward, cost, done, info = env.step(action)
#     env.render()
#     if done:
#         obs = env.reset()
# env.close()
```
```bash
cd examples
python train_on_policy.py --env-id SafetyPointGoal1-v0 --algo PPOLag --parallel 5 --epochs 1
```
OmniSafe is currently maintained by Borong Zhang, Jiayi Zhou, JTao Dai, Weidong Huang, Ruiyang Sun, Xuehai Pan, and Jiaming Ji, under the supervision of Prof. Yaodong Yang. If you have any questions while using OmniSafe, don't hesitate to ask on the GitHub issue page; we will reply within 2-3 working days.
OmniSafe is released under Apache License 2.0.