# Transition from gym to gymnasium #9

Open. Wants to merge 6 commits into base: main.
README.md: 140 changes (54 additions, 86 deletions)
@@ -1,8 +1,8 @@
# SofaGym

-Software toolkit to easily create an [OpenAI Gym](https://github.com/openai/gym) environment out of any [SOFA](https://github.com/sofa-framework/sofa) scene.
+Software toolkit to easily create a [Gymnasium](https://github.com/Farama-Foundation/Gymnasium) (previously Gym) environment out of any [SOFA](https://github.com/sofa-framework/sofa) scene.

-The toolkit provides an API based on the standard OpenAI Gym API, allowing to train classical Reinforcement Learning algorithms.
+The toolkit provides an API based on the standard Gymnasium API, making it possible to train classical Reinforcement Learning algorithms.

The toolkit also comprises example scenes based on the SoftRobots plugin for SOFA to illustrate how to include SOFA simulations and train learning algorithms on them.

@@ -19,8 +19,7 @@ with some mandatory plugins:

and optional plugins (for some examples):
* [ModelOrderReduction](https://github.com/SofaDefrost/ModelOrderReduction)

-[comment]: <> (SoftRobots.Inverse and Cosserat)
+* [Cosserat](https://github.com/SofaDefrost/plugin.Cosserat)

[Plugins installation](https://www.sofa-framework.org/community/doc/plugins/build-a-plugin-from-sources/#in-tree-build) with an in-tree build is preferred.

@@ -30,23 +29,22 @@ and optional plugins (for some examples):
We use Python 3.
Mandatory:
```bash
-pip install gym psutil pygame glfw pyopengl imageio
+pip install gymnasium psutil pygame glfw pyopengl imageio
```
* requierements.txt
* [stable-baselines3](https://github.com/DLR-RM/stable-baselines3)

Optional:
* [Actor with Variance Estimated Critic (AVEC)](https://github.com/yfletberliac/actor-with-variance-estimated-critic)



-### Install
+### Install SofaGym
```bash
python setup.py bdist_wheel
pip install -v -e .
```
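A quick sanity check of the editable install from a Python shell (a simple illustration, assuming SOFA's Python packages are already on `PYTHONPATH` as described in the Quick start below):

```python
import sofagym

# For an editable install (pip install -e .) this should point into the cloned repository.
print(sofagym.__file__)
```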



## Quick start

@@ -57,31 +55,30 @@
```bash
export PYTHONPATH=/sofa/build_dir/lib/python3/site-packages:$PYTHONPATH
```
```python
import sofagym
```

## Usage

The Gymnasium framework allows you to interact with an environment using well-known keywords:
- *step(a)*: performs a simulation step in which the agent applies the action *a*. Given the current state of the system *obs_t* and the action *a*, the environment changes to a new state *obs_{t+1}* and the agent receives the reward *rew*. The *terminated* flag becomes *True* when the goal is reached, and the *truncated* flag becomes *True* when the episode is cut short (e.g. by a time limit).
- *reset*: resets the environment.
- *render*: gives a visual representation of *obs_t*.

-The use of this interface allows intuitive interaction with any environment, and this is what SofaGym allows when the environment is a Sofa scene. For more information on Gym, check the official documentation page [here](https://gym.openai.com/docs/).
+This interface allows intuitive interaction with any environment, and this is what SofaGym provides when the environment is a Sofa scene. For more information on Gymnasium, check the official [documentation page](https://gymnasium.farama.org/).

-Example of use
+Example of use:

```python
-import gym
+import gymnasium as gym
 import sofagym.envs

 env = gym.make('trunk-v0')
-env.reset()
+observation, info = env.reset(seed=42)

 done = False
 while not done:
-    action = ...  # Your agent code here
-    state, reward, done, info = env.step(action)
+    action = env.action_space.sample()  # this is where you would insert your policy
+    observation, reward, terminated, truncated, info = env.step(action)
     env.render()
-    print("Step ", idx, " done : ", done, " state : ", state, " reward : ", reward)
+
+    done = terminated or truncated

 env.close()
```

@@ -93,103 +90,60 @@ The classic running of an episode is therefore:



-## Citing
-
-If you use the project in your work, please consider citing it with:
-```bibtex
-@misc{SofaGym,
-  authors = {Ménager, Etienne and Schegg, Pierre and Duriez, Christian and Marchal, Damien},
-  title = {SofaGym: An OpenAI Gym API for SOFASimulations},
-  year = {2020},
-  publisher = {GitHub},
-  journal = {GitHub repository},
-}
-```
-## The tools
-
-### Server/worker architecture
-
-The major difficulty encountered in this work is the fact that it is not possible to copy the *root* from a Sofa simulation. This implies that when two sequences of actions *A_1 = [a_1, ..., a_n, new_action_1]* and *A_2 = [a_1, ..., a_n, new_action_2]* have to be tried, it is necessary to start again from the beginning each time and simulate again *[a_1, ..., a_n]*. This leads to a huge loss of performance. To solve this problem a server/worker architecture is set up.
-
-A server takes care of distributing the calculations between several clients. Each client *i* is associated with an action sequence *A_i = [a_{i1}, ...., a_{in}]*. Given an action sequence *A = [a_{1}, ...., a_{n}]* and a new action *a*, the server looks for the client with the action sequence *A_i*. This client forks and the child executes the new action *a*. The father and son are referenced to the server as two separate clients and the action sequence *[a_{1}, ...., a_{n}]* and *[a_{1}, ...., a_{n}, a]* can be accessed.
-
-A cleaning system is used to close clients that are no longer used. This makes it possible to avoid having an exponential number of open clients.
-
-When it is not necessary to have access to the different states of the environment, i.e. when the actions are used sequentially, only one client is open and performs the calculations sequentially.
+## The environments

-### Vectorized environment
+|Image|Name|Description|Status|
+|----------|:-------------|:-------------|:-------------|
+| |[BubbleMotion](sofagym/envs/BubbleMotion/) bubblemotion-v0| |OK|
+| |[CartStem](sofagym/envs/CartStem/) cartstem-v0| |OK|
+| |[CartStemContact](sofagym/envs/CartStemContact/) cartstemcontact-v0| |OK|
+| |[CatchTheObject](sofagym/envs/CatchTheObject/) catchtheobject-v0| |OK|
+| |[ConcentricTubeRobot](sofagym/envs/CTR/) concentrictuberobot-v0| |OK|
+| |[DiamondRobot](sofagym/envs/Diamond/) diamondrobot-v0| |OSError: [Errno 36] File name too long|
+| |[Gripper](sofagym/envs/Gripper/) gripper-v0|The objective is to grasp a cube and bring it to a certain height. The closer the cube is to the target, the greater the reward.|OK|
+| |[Maze](sofagym/envs/Maze/) maze-v0|The Maze environment offers one scene of a ball navigating in a maze. The maze is attached to the tripod robot and the ball is moved by gravity by modifying the maze’s orientation. The tripod is actuated by three servomotors. Similarly to the Trunk environment, the Maze environment has a discrete action space of 6 actions, moving each servomotor by one increment, and could easily be extended to be continuous.|Importing your SOFA Scene Failed|
+| |[MultiGait Robot](sofagym/envs/MultiGaitRobot/) multigaitrobot-v0|The multigait soft robot has one scene. The goal is to move the robot forward in the *x* direction with the highest speed. `env = gym.make("multigaitrobot-v0")`|[ERROR] [SofaRuntime] ValueError: Object type MechanicalMatrixMapperMOR<Vec1d,Vec1d> was not created. The object is not in the factory. Needs the MOR plugin?|
+| |[SimpleMaze](sofagym/envs/SimpleMaze/) simple_maze-v0| |ValueError: Object type Sphere<> was not created|
+| |[StemPendulum](sofagym/envs/StemPendulum/) stempendulum-v0| |OK|
+| |[Trunk](sofagym/envs/Trunk/) trunk-v0|The Trunk environment offers two scenarios. Both are based on the trunk robot. The first is to bring the trunk’s tip to a certain position. The second scenario is to manipulate a cup using the trunk to get the cup’s center of gravity in a predefined position. The Trunk is controlled by eight cables that can be contracted or extended by one unit. There are therefore 16 possible actions. The action space presented here is discrete but could easily be extended to become continuous.|OK|
+| |[TrunkCup](sofagym/envs/TrunkCup/) trunkcup-v0| |OK|
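As a quick smoke test, any environment whose Status column reads OK can be instantiated with the id from the Name column (a sketch; `cartstem-v0` is just one example):

```python
import gymnasium as gym
import sofagym.envs  # registers the SofaGym environments

env = gym.make("cartstem-v0")  # any id from the Name column marked OK
observation, info = env.reset(seed=0)
print(env.action_space, env.observation_space)
env.close()
```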


-Simulation training can be time consuming. It is therefore necessary to be able to parallelise the calculations. Since the actions are chosen sequentially, it is not possible to parallelise the calculations for one environment. The result depends on the previous result. However, it is possible to parallelise on several environments, meaning to run several simulations in parallel. This is done with the baseline of OpenAI: SubprocVecEnv.


-### Separation between visualisation and computations
-
-SofaGym separates calculations and visualisation. In order to achieve this, two scenes must be created: a scene *A* with all visual elements and a scene *B* with calculation elements (solvers, ...). Scene *A* is used in a viewer and scene *B* in the clients. Once the calculations have been performed in scene *B*, the positions of the points are given to the viewer which updates scene *A*.

### Adding new environment


It is possible to define new environments using SofaGym. For this purpose, different elements have to be created:
- *NameEnv*: inherits from *AbstractEnv*. It specifies the characteristics of the environment, such as the action domain (discrete or continuous) and the configuration elements.
- *NameScene*: creates the Sofa scene. It must have the classic createScene function and return a *root*. To improve performance, the visual and computational aspects of the scene can be separated using the *mode* parameter (*'visu'* or *'simu'*), which selects the elements for the viewer-related scene or the client-related scene. We also integrate two Sofa.Core.Controller objects (rewardShaper and goalSetter) that incorporate the goal and the reward into the scene.
- *NameToolbox*: customizes the environment. It defines the functions that retrieve the reward and the state of the environment, as well as the command to apply to the system (the link between the Gym action and the Sofa command). Note that the Sofa.Core.Controller objects can also be defined here.

These different elements make it possible to create and personalise the task to be performed. See examples of environments for implementation.
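As a rough sketch of the *NameEnv* piece (the class name, config keys and spaces below are hypothetical; see the bundled environments such as sofagym/envs/Trunk/ for the real conventions):

```python
from gymnasium import spaces
from sofagym.AbstractEnv import AbstractEnv

class MyRobotEnv(AbstractEnv):
    """Hypothetical NameEnv: declares the action domain and configuration."""
    DEFAULT_CONFIG = {"scene": "MyRobot",    # would be resolved to a MyRobotScene
                      "deterministic": True,
                      "timer_limit": 50,
                      "timeout": 30,
                      "planning": False}

    def __init__(self, config=None):
        # Assumes AbstractEnv accepts a config dict overriding DEFAULT_CONFIG.
        super().__init__(config)
        # Discrete action domain: e.g. one increment per actuator and direction.
        self.action_space = spaces.Discrete(8)
```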

-## The environments
-
-### Gripper
-
-The Gripper Environment offers two different scenes. In both scenes, the objective is to grasp a cube and bring it to a certain height. The closer the cube is to the target, the greater the reward.
-
-The two scenes are distinguished by their action space. In one case the actions are discrete and correspond to a particular movement. We define a correspondence between a Gym action (int) and corresponding Sofa displacement and direction.
-
-```python
-env = gym.make("gripper-v0")
-```
-
-In the second case, the actions are continuous and correspond directly to a movement of the gripper’s fingers. This difference is indicated when defining the environment
-
-```python
-env = gym.make("continuegripper-v0")
-```
-
-### Trunk
-
-The Trunk environment offers two scenarios. Both are based on the trunk robot. The first is to bring the trunk’s tip to a certain position.
-
-```python
-env = gym.make("trunk-v0")
-```
-
-The second scenario is to manipulate a cup using the trunk to get the cup’s center of gravity in a predefined position.
-
-```python
-env = gym.make("trunkcup-v0")
-```
-
-The Trunk is controlled by eight cables that can be contracted or extended by one unit. There are therefore 16 possible actions. The action space presented here is discrete but could easily be extended to become continuous.
+## The tools
+
+### Server/worker architecture
+
+The major difficulty encountered in this work is the fact that it is not possible to copy the *root* from a Sofa simulation. This implies that when two sequences of actions *A_1 = [a_1, ..., a_n, new_action_1]* and *A_2 = [a_1, ..., a_n, new_action_2]* have to be tried, it is necessary to start again from the beginning each time and simulate *[a_1, ..., a_n]* again. This leads to a huge loss of performance. To solve this problem, a server/worker architecture is set up.
+
+A server takes care of distributing the calculations between several clients. Each client *i* is associated with an action sequence *A_i = [a_{i1}, ..., a_{in}]*. Given an action sequence *A = [a_{1}, ..., a_{n}]* and a new action *a*, the server looks for the client whose action sequence is *A*. This client forks, and the child executes the new action *a*. The parent and the child are registered with the server as two separate clients, so both action sequences *[a_{1}, ..., a_{n}]* and *[a_{1}, ..., a_{n}, a]* remain accessible.

-### MultiGait Robot
-
-The multigait Softrobot has one scene. The goal is to move the robot forward in the *x* direction with the highest speed.
+A cleaning system is used to close clients that are no longer used. This makes it possible to avoid having an exponential number of open clients.
+
+When it is not necessary to have access to the different states of the environment, i.e. when the actions are used sequentially, only one client is open and performs the calculations sequentially.
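The fork trick can be illustrated in a few lines of plain Python (a sketch of the principle only, with illustrative names, not SofaGym's actual server code):

```python
import os

def branch_client(simulate, past_actions, new_action):
    """Duplicate a worker so the child tries new_action while the parent
    keeps the simulation state reached by past_actions."""
    pid = os.fork()
    if pid == 0:
        simulate(new_action)            # child: state is now [a_1, ..., a_n, a]
        past_actions.append(new_action)
        # ...the child would register with the server as a new client here...
        os._exit(0)
    return pid                          # parent: state is still [a_1, ..., a_n]
```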

-```python
-env = gym.make("multigaitrobot-v0")
-```
+### Vectorized environment

-### Maze
-
-The Maze environment offers one scene of a ball navigating in a maze. The maze is attached to the tripod robot and the ball is moved by gravity by modifying the maze’s orientation.
+Simulation training can be time-consuming, so it is necessary to be able to parallelise the calculations. Since the actions are chosen sequentially, the calculations for a single environment cannot be parallelised: each result depends on the previous one. However, it is possible to parallelise over several environments, i.e. to run several simulations in parallel. This is done with `SubprocVecEnv` from stable-baselines3 (originally from OpenAI Baselines).
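For example, a sketch of running four SofaGym simulations in parallel (assuming a stable-baselines3 version with Gymnasium support; the environment id is just an example):

```python
import gymnasium as gym
from stable_baselines3.common.vec_env import SubprocVecEnv

import sofagym.envs  # registers the SofaGym environments

def make_env(rank):
    def _init():
        env = gym.make("cartstem-v0")
        env.reset(seed=rank)
        return env
    return _init

if __name__ == "__main__":
    # Each environment runs its own simulation in a separate process.
    vec_env = SubprocVecEnv([make_env(i) for i in range(4)])
    obs = vec_env.reset()
    obs, rewards, dones, infos = vec_env.step([vec_env.action_space.sample() for _ in range(4)])
    vec_env.close()
```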

-```python
-env = gym.make("maze-v0")
-```
-
-The tripod is actuated by three servomotors. Similarly to the Trunk Environment, the Maze environment has a discrete action space of 6 actions, moving each servomotor by one increment, and could easily be extended to be continuous.
+### Separation between visualisation and computations

+SofaGym separates calculations and visualisation. To achieve this, two scenes must be created: a scene *A* with all the visual elements and a scene *B* with the computation elements (solvers, etc.). Scene *A* is used in the viewer and scene *B* in the clients. Once the calculations have been performed in scene *B*, the positions of the points are sent to the viewer, which updates scene *A*.
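In code, the split boils down to a single createScene function that only instantiates what the requested mode needs (a minimal sketch; the component names are illustrative, not taken from a bundled scene):

```python
def createScene(rootNode, config=None, mode='simu'):
    # Elements common to both scenes (kinematic structure, mappings, ...)
    # would be created here.
    if mode == 'simu':
        # Scene B: computation only; solvers, collision handling, controllers.
        rootNode.addObject('EulerImplicitSolver')
        rootNode.addObject('CGLinearSolver', iterations=25, tolerance=1e-8)
    elif mode == 'visu':
        # Scene A: visualisation only; visual models, lights, camera.
        rootNode.addObject('VisualStyle', displayFlags='showVisualModels')
    return rootNode
```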

## Results

@@ -200,6 +154,20 @@
In this section we demonstrate some use cases of the environments available in SofaGym.
### Monte Carlo Tree Search: solving MazeEnv with planning


+## Citing
+
+If you use the project in your work, please consider citing it with:
+
+```bibtex
+@misc{SofaGym,
+  author = {Ménager, Etienne and Schegg, Pierre and Duriez, Christian and Marchal, Damien},
+  title = {SofaGym: An OpenAI Gym API for SOFA Simulations},
+  year = {2020},
+  publisher = {GitHub},
+  journal = {GitHub repository},
+}
+```


## Notes

1. At the moment the available action spaces are: continuous, discrete, tuple and dictionary.
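For reference, this is what those four space types look like in Gymnasium (illustrative shapes and sizes):

```python
import numpy as np
from gymnasium import spaces

continuous = spaces.Box(low=-1.0, high=1.0, shape=(3,), dtype=np.float32)
discrete = spaces.Discrete(6)
as_tuple = spaces.Tuple((discrete, continuous))
as_dict = spaces.Dict({"position": continuous, "mode": discrete})
```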
File renamed without changes.
Binary file added dist/sofagym-0.0.1-py3-none-any.whl
requierements.txt: 2 changes (0 additions, 2 deletions)
@@ -5,5 +5,3 @@ glfw
pyopengl
imageio


-# set PYTHONPATH=plugins\SofaPython3\lib\python3\site-packages:/plugins/STLIB/lib/python3/site-packages
sofagym/AbstractEnv.py: 38 changes (24 additions, 14 deletions)
@@ -8,8 +8,8 @@
__copyright__ = "(c) 2020, Robocath, CNRS, Inria"
__date__ = "Oct 7 2020"

-import gym
-from gym.utils import seeding
+import gymnasium as gym
+from gymnasium.utils import seeding

import numpy as np
import copy
@@ -147,7 +147,7 @@ def initialization(self):
        self.goal = None
        self.past_actions = []

-
+        self.num_envs = 40

        self.np_random = None
@@ -156,6 +156,7 @@

        self.viewer = None
        self.automatic_rendering_callback = None


        self.timer = 0
        self.timeout = self.config["timeout"]
@@ -281,13 +282,20 @@ def step(self, action):

        Returns:
        -------
-        obs:
+        obs (ObsType):
            The new state of the agent.
-        reward:
+        reward (float):
            The reward obtained after applying the action in the current state.
-        done:
+        terminated (bool):
+            Whether the agent reaches the terminal state.
+        truncated (bool):
+            Whether the truncation condition outside the scope of the MDP is satisfied.
+            Typically this is a time limit, but it could also indicate an agent physically going out of bounds.
+        info (dict):
+            Additional information (not used here).
+        done (bool, deprecated):
            Whether the goal is reached or not.
-        {}: additional information (not used here)

        """

        # assert self.action_space.contains(action), "%r (%s) invalid" % (action, type(action))
@@ -301,20 +309,22 @@
        # Request results from the server.
        # print("[INFO] >>> Result id:", result_id)
        results = get_result(result_id, timeout=self.timeout)

        obs = np.array(results["observation"])  # to work with baseline
        reward = results["reward"]
-        done = results["done"]
+        terminated = results["done"]

        # Avoid long explorations by using a timer.
+        truncated = False
        self.timer += 1
        if self.timer >= self.config["timer_limit"]:
            # reward = -150
-            done = True
+            truncated = True
+        info = {}  # (not used here)

        if self.config["planning"]:
            self.clean()

-        return obs, reward, done, {}
+        return obs, reward, terminated, truncated, info
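Callers written against the old 4-tuple API can adapt with a thin shim (a sketch for illustration, not part of this PR):

```python
class LegacyStepAdapter:
    """Collapses the Gymnasium 5-tuple back into the old (obs, reward, done, info)."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return obs, reward, terminated or truncated, info
```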

    def async_step(self, action):
        """Executes one action in the environment.
@@ -363,7 +373,7 @@ def reset(self):

        Returns:
        -------
-            None.
+            obs, info

        """
        self.close()
@@ -380,8 +390,8 @@

        self.timer = 0
        self.past_actions = []

        return

    def render(self, mode='rgb_array'):
        """See the current state of the environment.
sofagym/__init__ .py: 2 changes (1 addition, 1 deletion)
@@ -8,4 +8,4 @@
from sofagym.visualisation import visualisation

#sofagym envs for gym registration with sofa envs
-import sofagym.envs
+from sofagym.envs import *