rotemtzaban/STIT

Click problems?

NoUserNameForYou opened this issue · 6 comments

While trying to run the first inversion step I get this:

```
Traceback (most recent call last):
  File "\STIT\train.py", line 124, in <module>
    main()
  File "\python\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "\python\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "\python\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "\python\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "\STIT\train.py", line 62, in main
    _main(**config, config=config)
  File "\STIT\train.py", line 76, in _main
    files = make_dataset(input_folder)
  File "\STIT\utils\data_utils.py", line 27, in make_dataset
    assert os.path.isdir(dir), '%s is not a valid directory' % dir
AssertionError: /data/obama is not a valid directory
```

(I removed the full path prefixes from the log for privacy.)

I got this error while trying to install requirements.txt:

```
qt5-tools 5.15.2.1.2 requires click~=7.0, but you have click 8.1.2 which is incompatible.
```

I'm on Windows 11 with Python 3.9.9 and CUDA 11.6.

If you could just release an easier way of setting this up, e.g. an Anaconda environment template file, I'd be happy.
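Until then, a clean virtual environment sidesteps the click conflict entirely, since qt5-tools' `click~=7.0` pin can't clash with the `click` 8.x that requirements.txt pulls in. A minimal sketch (the name `stit-env` is arbitrary, and conda users could equivalently run `conda create -n stit python=3.9`):

```shell
cd "$(mktemp -d)"             # throwaway directory just for this demo
python3 -m venv stit-env      # create an isolated environment
# . stit-env/bin/activate     # activate it (Windows: stit-env\Scripts\activate)
# pip install -r requirements.txt
test -f stit-env/pyvenv.cfg && echo "venv created"
```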

@NoUserNameForYou

Hi,

Did you install the requirements on a clean environment?
On my Linux machine it doesn't install qt5-tools; was that requirement pulled in by our requirements.txt?
Unfortunately I don't currently have a Windows machine to test on, so I can only test on Linux.

I had a few things installed before. But shouldn't the latest click work?

It does. It seems the qt5-tools package depends on an older version of click, which prevents the latest one from being installed.

Still the same error after `pip install --upgrade --no-deps --force-reinstall click`, and even after uninstalling qt5-tools.

I ran `python train.py --input_folder /data/obama --output_folder training_results --run_name obama --num_pti_steps 80` in cmd, by the way. I don't know how to use the provided scripts, as they have `\` at the end of each line and fail at the first line.
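For context, a trailing `\` is bash's line-continuation character, which Windows cmd doesn't understand; cmd uses `^` instead, or you can simply join the command onto one line. A quick sketch of the bash behaviour, using `echo` as a stand-in for the real `python train.py` invocation:

```shell
# In bash/sh, a trailing backslash continues the command on the next line;
# echo stands in for "python train.py" here.
echo train.py \
    --input_folder data/obama \
    --num_pti_steps 80
# In Windows cmd the continuation character is ^ instead:
#   python train.py ^
#       --input_folder data\obama
```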

I give up until it becomes more user friendly, maybe with a virtual env. Thanks for trying to help, at least.

@NoUserNameForYou
Looking at the error again, it doesn't seem to come from click at all.
It looks like there is some issue with the input folder you passed.
One possibility is that you used Unix-style slashes (e.g. `/`) instead of Windows-style backslashes (`\`) as the separator.
Also, I assume you've downloaded the sample videos and that they exist in the directory you pass there.

Thanks, that was it. Though not exactly that: using `--input_folder \data\obama` didn't work either. It was the leading backslash; `--input_folder data\obama` works now.
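That behaviour matches how Windows paths work: a leading slash or backslash anchors the path at the current drive's root, so both `/data/obama` and `\data\obama` resolve to something like `C:\data\obama` instead of the `data\obama` folder inside the repo. A small standard-library sketch of the distinction:

```python
from pathlib import PureWindowsPath

# A leading (back)slash makes the path "rooted": it is resolved from the
# current drive's root, not from the working directory.
print(PureWindowsPath(r"\data\obama").root)  # "\" -> rooted at the drive root
print(PureWindowsPath("/data/obama").root)   # "\" -> a forward slash roots it too
print(PureWindowsPath(r"data\obama").root)   # ""  -> relative to the working directory
```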

I got an out-of-memory error, so I'll dig into the max_split_size_mb option now. Thank you.

edit:

```
Traceback (most recent call last):
  File "STIT\torch_utils\ops\bias_act.py", line 48, in _init
    _plugin = custom_ops.get_plugin('bias_act_plugin', sources=sources, extra_cuda_cflags=['--use_fast_math'])
  File "STIT\torch_utils\custom_ops.py", line 64, in get_plugin
    raise RuntimeError(f'Could not find MSVC/GCC/CLANG installation on this computer. Check _find_compiler_bindir() in "{__file__}".')
RuntimeError: Could not find MSVC/GCC/CLANG installation on this computer. Check _find_compiler_bindir() in "STIT\torch_utils\custom_ops.py".
```

Since you're not pulling it from PATH, I'll try manually entering my custom VS 2019 install location.

Edit 2: I give up. `torch_utils\ops\upfirdn2d.py:34: UserWarning: Failed to build CUDA kernels for upfirdn2d. Falling back to slow reference implementation. Details:`

Providing a virtual environment shouldn't be hard man. See how this guy did it for GPEN: https://github.com/Cioscos/GPEN-Cioscos

Edit 3: I provided the correct cl.exe location in the custom ops file and now it's compiling. Awaiting the next error.
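For anyone hitting the same compiler error: before hard-coding a path into `custom_ops.py`, it may be worth checking where `cl.exe` actually lives. A hedged sketch (the Visual Studio directories below are typical install layouts, not guaranteed paths on any given machine):

```python
import glob
import os
import shutil

def find_cl_exe():
    """Return a directory containing MSVC's cl.exe, or None if not found."""
    # First try PATH (e.g. when launched from a "Developer Command Prompt").
    on_path = shutil.which("cl")
    if on_path:
        return os.path.dirname(on_path)
    # Otherwise probe common Visual Studio install locations (examples only).
    patterns = [
        r"C:\Program Files (x86)\Microsoft Visual Studio\2019\*\VC\Tools\MSVC\*\bin\Hostx64\x64",
        r"C:\Program Files\Microsoft Visual Studio\2022\*\VC\Tools\MSVC\*\bin\Hostx64\x64",
    ]
    for pattern in patterns:
        hits = sorted(glob.glob(pattern))
        if hits:
            return hits[-1]  # newest toolchain version found
    return None
```

On a machine without MSVC this simply returns None, which mirrors the RuntimeError path in the original code.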

Aaaand edit 4: As expected, here's the next error:

```
Traceback (most recent call last):
  File "\STIT\train.py", line 124, in <module>
    main()
  File "\python\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "\python\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "\python\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "\python\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "\STIT\train.py", line 62, in main
    _main(**config, config=config)
  File "\STIT\train.py", line 91, in _main
    ws = coach.train()
  File "\STIT\training\coaches\coach.py", line 142, in train
    generated_images = self.forward(w_pivot)
  File "\STIT\training\coaches\coach.py", line 93, in forward
    generated_images = self.G.synthesis(w, noise_mode='const', force_fp32=True)
  File "\python\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 463, in forward
  File "\python\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 397, in forward
  File "\python\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "<string>", line 296, in forward
  File "\STIT\torch_utils\ops\bias_act.py", line 88, in bias_act
    return _bias_act_cuda(dim=dim, act=act, alpha=alpha, gain=gain, clamp=clamp).apply(x, b)
  File "\STIT\torch_utils\ops\bias_act.py", line 153, in forward
    y = _plugin.bias_act(x, b, _null_tensor, _null_tensor, _null_tensor, 0, dim, spec.cuda_idx, alpha, gain, clamp)
RuntimeError: CUDA out of memory. Tried to allocate 128.00 MiB (GPU 0; 4.00 GiB total capacity; 2.58 GiB already allocated; 0 bytes free; 2.77 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

And I'm out.
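For anyone landing here with the same OOM: the allocator option the error message points at is set through an environment variable before launching training. A sketch (128 is just an example value, and note this only mitigates fragmentation; it cannot create memory a 4 GiB GPU doesn't have):

```shell
# Cap the caching allocator's split block size (example value).
# Windows cmd equivalent: set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

Then run `python train.py ...` from the same shell so the variable is inherited by the training process.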