**The performance of each algorithm was tested** (see the *Results* section on their respective pages); you can take a look at issues [#48](https://github.com/DLR-RM/stable-baselines3/issues/48) and [#49](https://github.com/DLR-RM/stable-baselines3/issues/49) for more details.

We also provide detailed logs and reports on the [OpenRL Benchmark](https://wandb.ai/openrlbenchmark/sb3) platform.
### Planned features
Since most of the features from the [original roadmap](https://github.com/DLR-RM/stable-baselines3/issues/1) have been implemented, there are no major changes planned for SB3; it is now *stable*.

If you want to contribute, you can look through the issues where [help is welcomed](https://github.com/DLR-RM/stable-baselines3/labels/help%20wanted) and the other [proposed enhancements](https://github.com/DLR-RM/stable-baselines3/labels/enhancement).

While SB3 development is now focused on bug fixes and maintenance (documentation updates, user experience, ...), there is more active development going on in the associated repositories:

- newer algorithms are regularly added to the [SB3 Contrib](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib) repository
- faster variants are developed in the [SBX (SB3 + Jax)](https://github.com/araffin/sbx) repository
- the training framework for SB3, the RL Zoo, has an active [roadmap](https://github.com/DLR-RM/rl-baselines3-zoo/issues/299)
## Migration guide: from Stable-Baselines (SB2) to Stable-Baselines3 (SB3)
We implement experimental features in a separate contrib repository: [SB3-Contrib](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib)

This allows SB3 to maintain a stable and compact core, while still providing the latest features, like Recurrent PPO (PPO LSTM), CrossQ, Truncated Quantile Critics (TQC), Quantile Regression DQN (QR-DQN) or PPO with invalid action masking (Maskable PPO).

Documentation is available online: [https://sb3-contrib.readthedocs.io/](https://sb3-contrib.readthedocs.io/)
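
As a minimal sketch (assuming SB3-Contrib is installed, e.g. with `pip install sb3-contrib`, and that the `Pendulum-v1` environment is available), the contrib algorithms follow the same API as the core SB3 algorithms:

```python
from sb3_contrib import TQC

# TQC (Truncated Quantile Critics) is created, trained and saved with the same
# calls as any core SB3 algorithm.
model = TQC("MlpPolicy", "Pendulum-v1", verbose=1)
model.learn(total_timesteps=10_000)
model.save("tqc_pendulum")
```
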
### Prerequisites
Stable Baselines3 requires Python 3.8+.
#### Windows
To install Stable-Baselines3 on Windows, please look at the [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/install.html#prerequisites).
### Install using pip
Install the Stable Baselines3 package:

```sh
pip install 'stable-baselines3[extra]'
```
This includes optional dependencies like Tensorboard, OpenCV or `ale-py` to train on Atari games. If you do not need those, you can use:
```sh
pip install stable-baselines3
```
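
Once installed, a quick sanity check (a minimal sketch; the choice of PPO, `CartPole-v1` and the tiny timestep budget here are arbitrary, not part of the installation instructions) is to train a small agent:

```python
from stable_baselines3 import PPO

# Train a tiny agent end to end just to verify the installation works.
model = PPO("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=1_000)
```
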
`docs/modules/ppo.rst`
.. note::

  PPO is meant to be run primarily on the CPU, especially when you are not using a CNN. To improve CPU utilization, try turning off the GPU and using ``SubprocVecEnv`` instead of the default ``DummyVecEnv``:

  .. code-block::

    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env
    from stable_baselines3.common.vec_env import SubprocVecEnv

  For more information, see :ref:`Vectorized Environments <vec_env>`, `Issue #1245 <https://github.com/DLR-RM/stable-baselines3/issues/1245#issuecomment-1435766949>`_ or the `Multiprocessing notebook <https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/sb3/multiprocessing_rl.ipynb>`_.
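
As a rough sketch of how the imports above might be turned into a complete multiprocessing setup (the ``CartPole-v1`` environment, the number of workers and the timestep budget below are illustrative choices, not values taken from this page), one could write:

.. code-block:: python

    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env
    from stable_baselines3.common.vec_env import SubprocVecEnv

    if __name__ == "__main__":
        # SubprocVecEnv runs each environment in its own process, so the entry
        # point must be guarded on platforms that use the "spawn" start method.
        vec_env = make_vec_env("CartPole-v1", n_envs=4, vec_env_cls=SubprocVecEnv)

        # Keep the model on the CPU: for small MLP policies the GPU usually
        # does not pay off with PPO.
        model = PPO("MlpPolicy", vec_env, device="cpu")
        model.learn(total_timesteps=25_000)
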