It took ~51s on an A6000 when running the point cloud mouse example
zhixial2022 opened this issue · 1 comment
Got a bunch of warnings when trying the example:
FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
You are using a model of type opt to instantiate a model of type shape_opt. This is not supported for all configurations of models and can yield errors.
The model was loaded with `use_flash_attention_2=True`, which is deprecated and may be removed in a future release. Please use `attn_implementation="flash_attention_2"` instead.
You are attempting to use Flash Attention 2.0 without specifying a torch dtype. This might lead to unexpected behaviour
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dtype in ShapeOPT is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `torch_dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)`
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dtype in ShapeOPTModel is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `torch_dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)`
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dtype in ShapeOPTDecoder is torch.float32. You should run training or inference using Automatic Mixed-Precision via the `with torch.autocast(device_type='torch_device'):` decorator, or load the model with the `torch_dtype` argument. Example: `model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)`
The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
I'm not sure whether the warnings above slow down the run. Could someone take a look? Thank you!
Hi! That's fine. Just ignore these warnings.
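If you do want to silence the Flash Attention dtype warnings anyway, something like the sketch below should work. This is a minimal example, not the repo's actual loading code: it assumes the checkpoint is loaded through transformers' `from_pretrained`, and the checkpoint path is a placeholder.

```python
# Minimal sketch (assumptions: transformers-style loading; placeholder checkpoint path).
import torch
from transformers import AutoModel

# Option 1: load the weights directly in fp16 with Flash Attention 2 enabled,
# which avoids the "current dtype ... is torch.float32" warnings.
model = AutoModel.from_pretrained(
    "path/to/checkpoint",                      # placeholder, not this repo's real path
    attn_implementation="flash_attention_2",   # replaces the deprecated use_flash_attention_2=True
    torch_dtype=torch.float16,                 # FA2 requires fp16 or bf16
)

# Option 2: keep fp32 weights and run inference under autocast instead.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    # outputs = model(...)                     # your existing inference call goes here
    pass
```

Either option only changes how the warnings are handled; the example itself should run the same way.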