While training PPO on a custom-made environment, I got the following strange error:

  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/common.py", line 144, in train
    result = self._train()
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/ppo/ppo.py", line 120, in _train
    self.model.reward_filter)
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/ppo/rollout.py", line 125, in collect_samples
    trajectory, rewards, lengths, obs_f, rew_f = ray.get(next_trajectory)
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/worker.py", line 2058, in get
    raise RayGetError(object_ids, value)
ray.worker.RayGetError: Could not get objectid ObjectID(c8742107b043fb31798884ee5ea564b234c6bb86). It was created by remote function compute_steps which failed with:
Remote function compute_steps failed with:
Traceback (most recent call last):
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/worker.py", line 727, in _process_task
    self.actors[task.actor_id().id()], *arguments)
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/ppo/runner.py", line 243, in compute_steps
    trajectory = self.compute_trajectory(gamma, lam, horizon)
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/ppo/runner.py", line 206, in compute_trajectory
    self.env, horizon, self.observation_filter, self.reward_filter)
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/ppo/rollout.py", line 31, in rollouts
    observation = observation_filter(env.reset())
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/ppo/filter.py", line 120, in __call__
    x = x / (self.rs.std + 1e-8)
  File "/home/wojciech/miniconda3/envs/learning_to_run/lib/python3.6/site-packages/ray/rllib/ppo/filter.py", line 81, in std
    return np.sqrt(self.var)
AttributeError: 'float' object has no attribute 'sqrt'

It is hard to debug because it only shows up after an hour of training, but I have found that numpy raises the same error in the following case:

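import numpy as np
# With dtype='object', np.sqrt looks for a .sqrt method on the wrapped Python float
x = np.array(42.42, dtype='object')
np.sqrt(x)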
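For anyone hitting the same thing, here is a minimal sketch of the failure and one possible workaround. It assumes the underlying value is numeric and can be coerced to float64; `safe_sqrt` is a hypothetical helper name, not something from rllib. Depending on the numpy version, the failing call raises either AttributeError or TypeError, so the sketch catches both.

```python
import numpy as np


def safe_sqrt(x):
    """Hypothetical helper: coerce the input to float64 before np.sqrt.

    Assumes x holds plain numeric values that merely ended up with
    dtype=object.
    """
    return np.sqrt(np.asarray(x, dtype=np.float64))


# Reproduce the failure: a 0-d object array wrapping a Python float.
x = np.array(42.42, dtype=object)
try:
    np.sqrt(x)
    raised = False
except (AttributeError, TypeError):
    # Older numpy raises AttributeError, newer versions raise TypeError.
    raised = True

# The same value works once the dtype is a real floating-point type.
result = safe_sqrt(x)
```

This only works around the symptom; it does not explain where the object dtype comes from in the first place.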

Interesting, I can reproduce this by running:

python ray/python/ray/rllib/train.py --alg=PPO --env=CartPole-v0

It looks like the dtype of self.var in

ray/python/ray/rllib/ppo/filter.py Line 81 6ecc899