@yasserglez
Contributor

This pull request is related to issue #6.

I added a simple 4x4 gridworld example that can be solved with discount = 1.0. Some of the implemented algorithms fail on this problem: PolicyIteration raises an error and QLearning returns a policy that differs from the expected one. The relevant test output is below. Please let me know if I can help track down the problems.

tests.test_PolicyIteration.test_PolicyIteration_gridworld ... ERROR
tests.test_PolicyIterationModified.TestPolicyIterationModified.test_gridworld ... ok
tests.test_QLearning.test_QLearning_gridworld ... FAIL
tests.test_ValueIteration.test_ValueIteration_gridworld ... ok
tests.test_ValueIterationGS.test_ValueIterationGS_gridworld ... ok

======================================================================
ERROR: tests.test_PolicyIteration.test_PolicyIteration_gridworld
----------------------------------------------------------------------
Traceback (most recent call last):
  File "nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "pymdptoolbox/src/tests/test_PolicyIteration.py", line 129, in test_PolicyIteration_gridworld
    pi.run()
  File "pymdptoolbox/src/mdptoolbox/mdp.py", line 810, in run
    self._evalPolicyMatrix()
  File "pymdptoolbox/src/mdptoolbox/mdp.py", line 799, in _evalPolicyMatrix
    (_sp.eye(self.S, self.S) - self.discount * Ppolicy), Rpolicy)
  File "numpy/linalg/linalg.py", line 381, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
  File "numpy/linalg/linalg.py", line 90, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
nose.proxy.LinAlgError: Singular matrix
-------------------- >> begin captured stdout << ---------------------
WARNING: check conditions of convergence. With no discount, convergence can not be assumed.

--------------------- >> end captured stdout << ----------------------

======================================================================
FAIL: tests.test_QLearning.test_QLearning_gridworld
----------------------------------------------------------------------
Traceback (most recent call last):
  File "nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "pymdptoolbox/src/tests/test_QLearning.py", line 63, in test_QLearning_gridworld
    assert qlearning.policy == policy_gridworld
AssertionError

----------------------------------------------------------------------
Ran 128 tests in 5.723s

FAILED (errors=1, failures=1)
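
For reference, here is a minimal sketch of why a `Singular matrix` error can appear when the policy is evaluated with discount = 1.0. It is not the pymdptoolbox code: the 3-state chain and the names `P_policy`, `R_policy`, and `evaluate_policy` are made up for illustration, and it only assumes that policy evaluation solves the linear system (I - discount * P) V = R, as the traceback suggests. Every row of a transition matrix sums to 1, so (I - P) maps the all-ones vector to zero and cannot be inverted.

```python
import numpy as np

# Hypothetical 3-state chain under a fixed policy (not the 4x4 gridworld
# from this pull request). State 2 is absorbing with zero reward.
P_policy = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
])
R_policy = np.array([-1.0, -1.0, 0.0])


def evaluate_policy(P, R, discount):
    """Solve (I - discount * P) V = R for the value vector V."""
    A = np.eye(P.shape[0]) - discount * P
    return np.linalg.solve(A, R)


# With discount < 1 the system has a unique solution.
V_discounted = evaluate_policy(P_policy, R_policy, 0.9)

# With discount = 1 the matrix I - P loses rank, because P @ ones == ones.
rank = np.linalg.matrix_rank(np.eye(3) - P_policy)

try:
    evaluate_policy(P_policy, R_policy, 1.0)
    raised = False
except np.linalg.LinAlgError:
    raised = True

# One possible workaround (an assumption, not what the library does):
# fix the absorbing state's value at 0 and solve only for the
# transient states 0 and 1.
transient = [0, 1]
A_transient = np.eye(2) - P_policy[np.ix_(transient, transient)]
V_transient = np.linalg.solve(A_transient, R_policy[transient])
```

In this toy chain the undiscounted values are still well defined once the absorbing state is excluded from the system, which may be one direction to look at for the PolicyIteration failure.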
