ERROR:root:module 'warp_drive.numba_includes.env_runner' has no attribute 'NumbaCustomEnvStep'
Finebouche opened this issue · 7 comments
Hi, thank you for your help setting this up.
Following the same tutorial and these examples, I have set up the CPU env in custom_env.py, where the CustomEnv class is, and the Numba env in custom_env_step_numba.py, where the NumbaCustomEnvStep function is.
The environment I register is the first one, custom_env.py, this way:
env_registrar.add_cuda_env_src_path(CustomEnv.name, "custom_env", env_backend="numba")
So I have been able to load my CPU environment without any problem:
env_wrapper = EnvWrapper(
    env_obj=CustomEnv(**run_config["env"]),
    env_name=CustomEnv.name,
    num_envs=run_config["trainer"]["num_envs"],
    env_backend="cpu",
    env_registrar=env_registrar,
)
However, when I try to set env_backend="numba", I get the following error:
ERROR:root:module 'warp_drive.numba_includes.env_runner' has no attribute 'NumbaCustomEnvStep'
I am not sure where this error is coming from. WarpDrive should have found the NumbaCustomEnvStep function in custom_env_step_numba.py, but it obviously did not.
What am I missing?
I should specify that this error message is raised when self.env.initialize_step_function_context is called in env_wrapper.py.
You can simply test from <<YOUR_ENV_NUMBA_LIB>> import * and see whether your NumbaCustomEnvStep can be imported successfully from a Python module in your running environment. This <<YOUR_ENV_NUMBA_LIB>> is what you should use here: env_registrar.add_cuda_env_src_path(CustomEnvironment.name, "<<YOUR_ENV_NUMBA_LIB>>", env_backend="numba")
There is no magic here: we just integrate your Numba module into the whole ecosystem behind the scenes, and essentially, if your from <<YOUR_ENV_NUMBA_LIB>> import * works, then the loading will work. If not, please report it to me. Other users have loaded multiple custom CUDA environments, but not Numba ones, so I can imagine there being some hiccup.
So I guess I wasn't using the proper <<YOUR_ENV_NUMBA_LIB>>.
I was using the custom_env file (the CPU version, with the class that calls the Numba step version) instead of custom_env_step_numba (the Numba file with only functions). I guess this wasn't too clear from the tutorials, especially because the tutorial says that you need to register the environment. One question remains, though: how does WarpDrive know about the CPU file 'custom_env' and the CustomEnv class? Is this done behind the scenes as well?
Anyway, after changing that, I tried what you said:
from custom_env_step_numba import *
NumbaCustomEnvStep
This loads custom_env_step_numba and correctly outputs:
CUDADispatcher(<function NumbaCustomEnvStep at 0x7fe0a451ee50>)
Edit: It seems to work now. I will do the EnvironmentCPUvsGPU test now.
As a remark, the tutorial should refer more carefully to Your_Env_Class and Your_Dual_Mode_Env_Class in the section on how to register the custom environment.
Edit: a second remark. One thing that wasn't obvious to debug is that my code in custom_env_step_numba was buggy, and this was preventing the NumbaCustomEnvStep function from loading. I wouldn't have realized that without your advice to test:
from custom_env_step_numba import *
NumbaCustomEnvStep
Maybe there is a way to indicate what went wrong when loading the file?
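One way to surface that kind of failure outside WarpDrive is to import the step module yourself and print the full traceback when it breaks. The sketch below uses only the standard library; diagnose_import is a hypothetical helper name, not part of WarpDrive, and the module and function names in the usage line are simply the ones from this thread.

```python
import importlib
import traceback


def diagnose_import(module_name, attr_name):
    """Try to import module_name and look up attr_name in it.

    Returns (ok, message). On an import failure the message carries the
    full traceback, so a bug inside the module is not silently hidden.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception:  # any error raised while the module body executes
        return False, f"import of {module_name!r} failed:\n{traceback.format_exc()}"
    if not hasattr(module, attr_name):
        return False, f"{module_name!r} imported, but has no attribute {attr_name!r}"
    return True, f"found {attr_name!r} in {module_name!r}"


if __name__ == "__main__":
    # Names taken from this thread; adjust them to your own files.
    ok, message = diagnose_import("custom_env_step_numba", "NumbaCustomEnvStep")
    print(message)
```

Running this before building the EnvWrapper separates "the module does not import at all" from "the module imports but the function name is wrong".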
The CPU model class is there when you call import custom_env. This is just a Python class; there is no magic, just a Python import like you would do for any Python code.
The special part is that you have a Numba step(), and this step() actually lives outside of your Python class, so WarpDrive needs to know where the source code of the Numba step() is and then integrate the compiled Numba step() into your Python class. This is done by the EnvWrapper, which is why you see the error there: you did not provide the path of the Numba step().
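As a rough illustration of that split, here is a toy sketch in plain Python. It is not WarpDrive's actual code: ToyWrapper, the in-memory stand-in module, and the step body are all assumptions made for the example. It only shows why a class imported from one file still needs to be told which module holds its module-level step function, and why a wrong module produces a "has no attribute" error.

```python
import types

# Stand-in for custom_env_step_numba.py: a module that holds only functions.
# (Built in memory here so the example is self-contained.)
step_module = types.ModuleType("custom_env_step_numba")
exec("def NumbaCustomEnvStep(state):\n    return state + 1\n", step_module.__dict__)

# Stand-in for custom_env.py: it holds the class, but no step function.
class_module = types.ModuleType("custom_env")


class CustomEnv:
    name = "CustomEnv"

    def __init__(self):
        self.step_function = None  # attached later by the wrapper


class ToyWrapper:
    """Hypothetical wrapper: looks the step function up by name in a module."""

    def __init__(self, env_obj, module, step_function_name):
        self.env = env_obj
        # Raises AttributeError if the module does not define the function.
        self.env.step_function = getattr(module, step_function_name)


wrapper = ToyWrapper(CustomEnv(), step_module, "NumbaCustomEnvStep")
print(wrapper.env.step_function(1))

try:
    ToyWrapper(CustomEnv(), class_module, "NumbaCustomEnvStep")
except AttributeError as err:
    print(err)
```

Pointing the toy wrapper at the class file instead of the functions file fails in the same way as the error at the top of this issue.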
It seems there is still something wrong with my implementation, as I get the following stack trace:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[13], line 1
----> 1 trainer.train()
File /project_ghent/warp-drive/warp_drive/training/trainer.py:422, in Trainer.train(self)
419 start_time = time.time()
421 # Generate a batched rollout for every CUDA environment.
--> 422 self._generate_rollout_batch()
424 # Train / update model parameters.
425 metrics = self._update_model_params(iteration)
File /project_ghent/warp-drive/warp_drive/training/trainer.py:469, in Trainer._generate_rollout_batch(self)
467 # Step through all the environments
468 start_event.record()
--> 469 self.cuda_envs.step_all_envs()
471 # Bookkeeping rewards and done flags
472 _, done_flags = self._bookkeep_rewards_and_done_flags(batch_index=batch_index)
File /project_ghent/warp-drive/warp_drive/env_wrapper.py:355, in EnvWrapper.step_all_envs(self, actions)
351 """
352 Step through all the environments
353 """
354 if not self.env_backend == "cpu":
--> 355 self.env.step()
356 result = None # Do not return anything
357 else:
File /project_ghent/collective_MARL/custom_env.py:598, in CustomEnv.step(self, actions)
595 if self.env_backend == "numba":
596 print("Try calling numba step function")
597 self.cuda_step[self.cuda_function_manager.grid, self.cuda_function_manager.block](
--> 598 *self.cuda_step_function_feed(args)
599 )
600 result = None # do not return anything
602 # CPU version of step()
603 else:
File /project_ghent/warp-drive/warp_drive/managers/function_manager.py:120, in CUDAFunctionFeed.__call__(self, arguments)
118 for arg in arguments:
119 if isinstance(arg, str):
--> 120 data_pointers.append(self.data_manager.device_data(arg))
121 elif isinstance(arg, tuple):
122 key = arg[0]
File /project_ghent/warp-drive/warp_drive/managers/data_manager.py:395, in CUDADataManager.device_data(self, name)
393 assert name in self._host_data
394 return self._host_data[name]
--> 395 assert name in self._device_data_pointer
396 return self._device_data_pointer[name]
AssertionError:
Here self.cuda_step_function_feed(args) is called in my custom_env.py file, but I don't understand what fails after that...
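You have some data array that is not registered in the data manager. The failing assertion in CUDADataManager.device_data checks that the requested name is in _device_data_pointer, so at least one name in your args has no registered device array.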
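To find which name is the culprit, one option is to compare the names in args against the names you know you registered. The helper below is a minimal sketch, not a WarpDrive function: find_unregistered is a hypothetical name, and registered_names is a plain set you would collect yourself. It mirrors what the traceback shows of CUDAFunctionFeed, which treats each argument as either a string or a tuple whose first element is the key. The array names in the usage lines are made up for illustration.

```python
def find_unregistered(arguments, registered_names):
    """Return the names in arguments that are absent from registered_names.

    Each argument is either a string (the name itself) or a tuple whose
    first element is the name, as in the traceback above.
    """
    missing = []
    for arg in arguments:
        if isinstance(arg, str):
            name = arg
        elif isinstance(arg, tuple):
            name = arg[0]
        else:
            continue  # not a name-like argument; ignored by this sketch
        if name not in registered_names:
            missing.append(name)
    return missing


# Hypothetical names, for illustration only.
args = ["loc_x", "loc_y", "rewards", ("n_agents", "meta")]
registered = {"loc_x", "loc_y", "n_agents"}
print(find_unregistered(args, registered))
```

Any name this reports is one that the step function asks for but that was never added to the data manager.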