bjing2016/alphaflow

TypeError: __init__() missing 1 required positional argument: 'no_column_attention'

Bloeci opened this issue · 5 comments

Hi,
I've been trying to run some experiments with your model and scripts and am running into a problem. The error arises when I run the following command from a testing directory inside the main project directory:

python3 ../predict.py \
    --weights ../model_weights/alphaflow_pdb_distilled_202402.pt \
    --mode alphafold \
    --input_csv ghsr.csv \
    --msa_dir ./msas \
    --samples 10 \
    --outpdb ./out \
    --noisy_first \
    --no_diffusion

I get the error:

2024-02-27 08:10:02,292 [iwe547170:52263] [INFO] Loading the model
Traceback (most recent call last):
  File "/media/data/software/alphaflow/workdir/../predict.py", line 138, in <module>
    main()
  File "/home/iwe34/anaconda3/envs/alphaflow/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/media/data/software/alphaflow/workdir/../predict.py", line 84, in main
    model = model_class(**ckpt['hyper_parameters'], training=False)
  File "/media/data/software/alphaflow/alphaflow/model/wrapper.py", line 496, in __init__
    self.model = AlphaFold(config,
  File "/media/data/software/alphaflow/alphaflow/model/alphafold.py", line 77, in __init__
    self.evoformer = EvoformerStack(
TypeError: __init__() missing 1 required positional argument: 'no_column_attention'

To figure out why that happened, I modified the alphaflow/config.py file by manually adding the flag no_column_attention: False, and realized later that this config is only used when the predict.py script is called with the additional flag --original_weights=True.

However, inspecting the loaded config ckpt from the lines

if args.weights:
    ckpt = torch.load(args.weights, map_location='cpu')
    model = model_class(**ckpt['hyper_parameters'], training=False)

showed that the hyperparameters stored in the model weights alphaflow_pdb_distilled_202402.pt don't contain the no_column_attention field. Prediction worked fine when I used the original weights params_model_1.npz (apart from a CUDA error, which is a separate problem).
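One way to double-check which keys the saved hyperparameters contain is to walk the nested structure and list every path that ends in a given key. This is only a minimal sketch: `find_key` is a hypothetical helper (not part of alphaflow or OpenFold), and the toy dictionary is a stand-in, since the real layout of `ckpt['hyper_parameters']` is not shown in this thread.

```python
def find_key(node, key, path=()):
    """Return every path (tuple of keys) in a nested mapping that ends in `key`."""
    hits = []
    # Treat anything with .items() as a mapping (plain dicts and dict-like configs).
    if hasattr(node, "items"):
        for k, v in node.items():
            if k == key:
                hits.append(path + (k,))
            hits.extend(find_key(v, key, path + (k,)))
    return hits


# Toy stand-in for ckpt['hyper_parameters']; the real structure may differ.
hparams = {"config": {"model": {"evoformer_stack": {"no_blocks": 48}}}}

print(find_key(hparams, "no_column_attention"))
print(find_key(hparams, "no_blocks"))
```

With a real checkpoint, the same call could be applied to `ckpt['hyper_parameters']` right after the `torch.load(...)` line quoted above. An empty result means the key is absent at every nesting level.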

Simple question: What am I doing wrong? Why can I provide new model weights when these could never be used because the EvoformerStack in openfold requires this argument?

Hm, I'm not sure why adding the line no_column_attention: False worked. However, if there is an error about no_column_attention, then the OpenFold version is probably wrong. Did you install OpenFold using the pip command in the README or some other way?

Hi,
thanks for the quick reply. To clarify: adding the flag no_column_attention: False to config.py doesn't work. I only tried it to see whether it would solve the problem. After that, I realized that this config is never read by the model_class (see the third code snippet in my first post), because the hyperparameters are loaded from alphaflow_pdb_distilled_202402.pt, and these weights don't contain the flag no_column_attention.

However, for some reason I can't trace back, the version I installed was not the correct one. In the correct openfold version at commit hash 103d037, the EvoformerStack class has no no_column_attention attribute. In older/newer versions this flag exists.
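A quick way to tell which variant of a class is installed is to inspect its constructor signature. The sketch below uses a hypothetical helper, `init_accepts`, and two dummy classes standing in for the two EvoformerStack variants; it does not import OpenFold itself.

```python
import inspect


def init_accepts(cls, name):
    """True if cls.__init__ explicitly declares a parameter called `name`."""
    return name in inspect.signature(cls.__init__).parameters


# Dummy stand-ins for a version without and with the argument.
class StackWithoutFlag:
    def __init__(self, c_m, c_z):
        self.c_m, self.c_z = c_m, c_z


class StackWithFlag:
    def __init__(self, c_m, c_z, no_column_attention):
        self.no_column_attention = no_column_attention


print(init_accepts(StackWithoutFlag, "no_column_attention"))
print(init_accepts(StackWithFlag, "no_column_attention"))
```

Assuming the installed class is importable as `openfold.model.evoformer.EvoformerStack`, calling `init_accepts(EvoformerStack, "no_column_attention")` in the alphaflow environment would show whether that installation expects the argument. Note that a constructor taking only `**kwargs` would report False here even if it accepted the name.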

When repeating the installation instructions, I ran into issues with pytorch-cuda support because, for some reason (I'm definitely not an expert), the instructions don't work on my machine (I use an NVIDIA GeForce RTX 3070, CUDA 11.4). I ran into an error when trying to install openfold. The main error has something to do with nvcc (I have cropped the output; the complete message is attached):

(alphaflow) Ξ software/alphaflow git:(master) ▶ pip install 'openfold @ git+https://github.com/aqlaboratory/openfold.git@103d037'
Collecting openfold@ git+https://github.com/aqlaboratory/openfold.git@103d037
  Cloning https://github.com/aqlaboratory/openfold.git (to revision 103d037) to /tmp/pip-install-jsvsku_v/openfold_a1fb3071bf26468684704ab751571b35
  Running command git clone --filter=blob:none --quiet https://github.com/aqlaboratory/openfold.git /tmp/pip-install-jsvsku_v/openfold_a1fb3071bf26468684704ab751571b35
  WARNING: Did not find branch or tag '103d037', assuming revision or ref.
  Running command git checkout -q 103d037
  Resolved https://github.com/aqlaboratory/openfold.git to commit 103d037
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: openfold
  Building wheel for openfold (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
[...]
FAILED: /tmp/pip-install-jsvsku_v/openfold_a1fb3071bf26468684704ab751571b35/build/temp.linux-x86_64-cpython-39/openfold/utils/kernel/csrc/softmax_cuda_kernel.o
      /usr/local/cuda-11.0/bin/nvcc  -I/tmp/pip-install-jsvsku_v/openfold_a1fb3071bf26468684704ab751571b35/openfold/utils/kernel/csrc/ -I/home/iwe34/anaconda3/envs/alphaflow/lib/python3.9/site-packages/torch/include -I/home/iwe34/anaconda3/envs/alphaflow/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -I/home/iwe34/anaconda3/envs/alphaflow/lib/python3.9/site-packages/torch/include/TH -I/home/iwe34/anaconda3/envs/alphaflow/lib/python3.9/site-packages/torch/include/THC -I/usr/local/cuda-11.0/include -I/home/iwe34/anaconda3/envs/alphaflow/include/python3.9 -c -c /tmp/pip-install-jsvsku_v/openfold_a1fb3071bf26468684704ab751571b35/openfold/utils/kernel/csrc/softmax_cuda_kernel.cu -o /tmp/pip-install-jsvsku_v/openfold_a1fb3071bf26468684704ab751571b35/build/temp.linux-x86_64-cpython-39/openfold/utils/kernel/csrc/softmax_cuda_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -std=c++14 -maxrregcount=50 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda -gencode arch=compute_86,code=sm_86 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=attn_core_inplace_cuda -D_GLIBCXX_USE_CXX11_ABI=0
      nvcc fatal   : Unsupported gpu architecture 'compute_86'
[...]

pip_openfold_error.txt

It seems like you are compiling with CUDA 11.0 (the log calls /usr/local/cuda-11.0/bin/nvcc). Although it should not be strictly necessary to use 11.6, 11.0 is probably too low. Could you upgrade your CUDA toolkit to 11.4, the maximum supported by your NVIDIA driver? Note that the CUDA version shown by nvidia-smi is the highest version the driver supports, not the toolkit version you actually have installed.
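To see which toolkit the build will actually use, one can parse the output of `nvcc --version` and compare it against the minimum release for the target architecture. The sketch below is illustrative: the helper names are hypothetical, the sample line merely follows the format nvcc prints, and the table lists only the architecture from this thread (support for compute_86 arrived in CUDA 11.1).

```python
import re

# Minimum CUDA toolkit release per target architecture (only the one seen here).
MIN_CUDA_FOR_ARCH = {"compute_86": (11, 1)}


def parse_nvcc_release(text):
    """Extract (major, minor) from text containing 'release X.Y'."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def nvcc_supports(text, arch):
    """True if the toolkit release in `text` is new enough for `arch`."""
    return parse_nvcc_release(text) >= MIN_CUDA_FOR_ARCH[arch]


# Illustrative line in the format printed by `nvcc --version`.
sample = "Cuda compilation tools, release 11.0, V11.0.194"
print(parse_nvcc_release(sample), nvcc_supports(sample, "compute_86"))
```

On a real machine the text could come from `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`, run with the same PATH that pip uses for the build.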

Oh cool. Thank you very much for the suggestion. That was the final trick I needed to get everything running.

@bjing2016 @Bloeci Following up on this issue: according to the latest version of openfold, these hyperparameters should be set at https://github.com/aqlaboratory/openfold/blob/main/openfold/config.py#L602 and https://github.com/aqlaboratory/openfold/blob/main/openfold/config.py#L576:
"no_column_attention": False, "opm_first": False, "fuse_projection_weights": False,
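As a rough illustration of the idea, the three defaults listed above can be merged into a stack configuration that lacks them, without overriding values that are already present. This is a sketch under assumptions: `with_defaults` is a hypothetical helper, and the plain dictionary stands in for whichever config sections the linked lines belong to, which this thread does not spell out.

```python
# Defaults quoted in the comment above.
DEFAULTS = {
    "no_column_attention": False,
    "opm_first": False,
    "fuse_projection_weights": False,
}


def with_defaults(stack_cfg, defaults=DEFAULTS):
    """Return a new dict: `defaults` overlaid with the keys already in stack_cfg."""
    merged = dict(defaults)
    merged.update(stack_cfg)  # existing values win over defaults
    return merged


# Hypothetical stack config saved without the newer keys.
old_cfg = {"no_blocks": 48}
print(with_defaults(old_cfg))
```

Building a new dictionary instead of mutating the loaded one keeps the original checkpoint contents intact, which makes it easier to compare before and after.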