[Windows] `torchvision=0.18.1` incompatible with `NumPy 2.x` leading to error initializing torch workspaces
A while back, NumPy released v2.x, which caused issues for packages compiled against NumPy 1.x. One area where `openfl` was specifically affected was the torch workspaces [Ref Issue #999].
While updating the workspaces to `torch==2.3.1` and `torchvision==0.18.1` seemed to work on Ubuntu, `torchvision==0.18.1` on Windows still appears to be incompatible with NumPy v2.x, resulting in errors when initializing torch workspaces on Windows.
We should look into updating `torch` and `torchvision` to later versions that are compatible across both Ubuntu and Windows.
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Documents\openfl\venv\lib\site-packages\torchvision\__init__.py", line 6, in <module>
from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
File "C:\Documents\openfl\venv\lib\site-packages\torchvision\models\__init__.py", line 2, in <module>
from .convnext import *
File "C:\Documents\openfl\venv\lib\site-packages\torchvision\models\convnext.py", line 8, in <module>
from ..ops.misc import Conv2dNormActivation, Permute
File "C:\Documents\openfl\venv\lib\site-packages\torchvision\ops\__init__.py", line 23, in <module>
from .poolers import MultiScaleRoIAlign
File "C:\Documents\openfl\venv\lib\site-packages\torchvision\ops\poolers.py", line 10, in <module>
from .roi_align import roi_align
File "C:\Documents\openfl\venv\lib\site-packages\torchvision\ops\roi_align.py", line 4, in <module>
import torch._dynamo
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\__init__.py", line 64, in <module>
torch.manual_seed = disable(torch.manual_seed)
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\decorators.py", line 50, in disable
return DisableContext()(fn)
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\eval_frame.py", line 410, in __call__
(filename is None or trace_rules.check(fn))
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\trace_rules.py", line 3378, in check
return check_verbose(obj, is_inlined_call).skipped
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\trace_rules.py", line 3361, in check_verbose
rule = torch._dynamo.trace_rules.lookup_inner(
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\trace_rules.py", line 3442, in lookup_inner
rule = get_torch_obj_rule_map().get(obj, None)
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\trace_rules.py", line 2782, in get_torch_obj_rule_map
obj = load_object(k)
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\trace_rules.py", line 2811, in load_object
val = _load_obj_from_str(x[0])
File "C:\Documents\openfl\venv\lib\site-packages\torch\_dynamo\trace_rules.py", line 2795, in _load_obj_from_str
return getattr(importlib.import_module(module), obj_name)
File "C:\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "C:\Documents\openfl\venv\lib\site-packages\torch\nested\_internal\nested_tensor.py", line 417, in <module>
values=torch.randn(3, 3, device="meta"),
C:\Documents\openfl\venv\lib\site-packages\torch\nested\_internal\nested_tensor.py:417: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:84.)
values=torch.randn(3, 3, device="meta"),
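The NumPy message above suggests downgrading to `numpy<2` as the quickest workaround for users. A minimal sketch for checking whether an environment is in that situation is shown here; it reads installed versions from package metadata, so it does not need to import `torch` or `torchvision`. The helper names (`installed_version`, `major`, `numpy2_hint`) are hypothetical and not part of OpenFL.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major(version_string):
    """Extract the leading major number from a version such as '2.1.1'."""
    return int(version_string.split(".")[0])


def numpy2_hint(numpy_version):
    """Return a workaround hint if NumPy 2.x is installed, otherwise None."""
    if numpy_version is not None and major(numpy_version) >= 2:
        return f"NumPy {numpy_version} detected; consider: pip install \"numpy<2\""
    return None


if __name__ == "__main__":
    # Report the versions relevant to this issue.
    for pkg in ("numpy", "torch", "torchvision"):
        print(pkg, installed_version(pkg))
    hint = numpy2_hint(installed_version("numpy"))
    if hint:
        print(hint)
```

This only detects the NumPy major version; it makes no claim about which `torch` or `torchvision` releases work with NumPy 2.x on Windows.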
To Reproduce
Steps to reproduce the behavior:
- Install OpenFL
- `fx workspace create --template torch_cnn_mnist --prefix my_workspace`
- `fx plan initialize`
- See error
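Since the traceback starts at `<stdin>` with the `torchvision` import, the failure can probably be narrowed down without creating a workspace. The sketch below imports a module in a fresh interpreter and looks for the `_ARRAY_API not found` text from the warning in this issue. It checks stderr rather than the exit code, because the message is emitted as a `UserWarning` and the import itself may still return successfully. The function names are hypothetical.

```python
import subprocess
import sys

# Text taken from the UserWarning quoted in this issue.
ABI_MARKER = "_ARRAY_API not found"


def has_numpy_abi_error(stderr_text):
    """Return True if the captured stderr contains the NumPy ABI warning."""
    return ABI_MARKER in stderr_text


def try_import(module_name):
    """Import `module_name` in a fresh interpreter; return (returncode, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    code, err = try_import("torchvision")
    if has_numpy_abi_error(err):
        print("NumPy ABI mismatch detected while importing torchvision")
    elif code != 0:
        print("Import failed for another reason:")
        print(err)
    else:
        print("torchvision imported without the NumPy ABI warning")
```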
Expected behavior
Workspaces should initialize without error.
The suggested fix is to upgrade OpenFL to use the latest PyTorch 2.x version across the board (TaskRunner hierarchy, FL workspaces, workflow API examples). This will also likely address the issue observed in Windows.