facebookresearch/recipes

How to install ai_codesign?

chongxiaoc opened this issue · 1 comment

I'm running the torchrec Lightning recipe on a Ray cluster by following https://github.com/facebookresearch/recipes/tree/main/torchrecipes/rec.
Command:
torchx run -s ray -cfg working_dir=.,dashboard_address=localhost:31024 dist.ddp -j 1x4 --gpu 4 --script ./dlrm_main.py

Error:

ray/0 (CommandActor pid=2748) [2]:Traceback (most recent call last):
ray/0 (CommandActor pid=2748) [2]:  File "./dlrm_main.py", line 25, in <module>
ray/0 (CommandActor pid=2748) [2]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
ray/0 (CommandActor pid=2748) [2]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
ray/0 (CommandActor pid=2748) [2]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
ray/0 (CommandActor pid=2748) [2]:ModuleNotFoundError: No module named 'ai_codesign'
ray/0 (CommandActor pid=2748) [3]:Traceback (most recent call last):
ray/0 (CommandActor pid=2748) [3]:  File "./dlrm_main.py", line 25, in <module>
ray/0 (CommandActor pid=2748) [3]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
ray/0 (CommandActor pid=2748) [3]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
ray/0 (CommandActor pid=2748) [3]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
ray/0 (CommandActor pid=2748) [3]:ModuleNotFoundError: No module named 'ai_codesign'
ray/0 (CommandActor pid=2748) [1]:Traceback (most recent call last):
ray/0 (CommandActor pid=2748) [1]:  File "./dlrm_main.py", line 25, in <module>
ray/0 (CommandActor pid=2748) [1]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
ray/0 (CommandActor pid=2748) [1]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
ray/0 (CommandActor pid=2748) [1]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
ray/0 (CommandActor pid=2748) [1]:ModuleNotFoundError: No module named 'ai_codesign'
ray/0 (CommandActor pid=2748) [0]:Traceback (most recent call last):
ray/0 (CommandActor pid=2748) [0]:  File "./dlrm_main.py", line 25, in <module>
ray/0 (CommandActor pid=2748) [0]:    from torchrecipes.rec.modules.lightning_dlrm import LightningDLRM
ray/0 (CommandActor pid=2748) [0]:  File "/root/recipes/torchrecipes/rec/modules/lightning_dlrm.py", line 18, in <module>
ray/0 (CommandActor pid=2748) [0]:    from ai_codesign.benchmarks.dlrm.torchrec_dlrm.modules.dlrm_train import DLRMTrain
ray/0 (CommandActor pid=2748) [0]:ModuleNotFoundError: No module named 'ai_codesign'

Env:
Python 3.7
Torchrec: 0.1.1
Torch: 1.11.0 + cu113
TorchX: 0.2.0.dev0
Lightning: 1.6.3

I tried to install ai_codesign with pip, under both spellings of the name, but pip could not find a matching distribution:

[root@~/recipes/torchrecipes/rec #]pip3 install ai-codesign
ERROR: Could not find a version that satisfies the requirement ai-codesign (from versions: none)
ERROR: No matching distribution found for ai-codesign
WARNING: You are using pip version 20.2.2; however, version 22.1.2 is available.
You should consider upgrading via the '/usr/bin/python3.7 -m pip install --upgrade pip' command.
[root@~/recipes/torchrecipes/rec #]pip3 install ai_codesign
ERROR: Could not find a version that satisfies the requirement ai_codesign (from versions: none)
ERROR: No matching distribution found for ai_codesign
WARNING: You are using pip version 20.2.2; however, version 22.1.2 is available.
You should consider upgrading via the '/usr/bin/python3.7 -m pip install --upgrade pip' command.

How can I install this module?

cc @colin2328

Apologies for this. #24, which is being merged soon, should fix it.
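
To confirm whether the import resolves in a given environment (for example, after pulling the fix), a small check along these lines may help. This is only a sketch: the helper name `missing_modules` and the list of modules to check are illustrative assumptions, not part of the recipes repo. It uses the standard library's `importlib.util.find_spec` and only looks up top-level module names, without importing them.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on sys.path."""
    # find_spec returns None when no finder can locate a top-level module.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list: the two top-level packages named in the traceback.
    required = ["torchrecipes", "ai_codesign"]
    missing = missing_modules(required)
    if missing:
        print("Not importable here:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Since the traceback comes from the Ray worker processes, the check is most informative when run in the same environment the workers use rather than only on the machine that submits the job.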