microsoft/promptbench

Is it possible to add phi-2?

tr5tn opened this issue · 6 comments

tr5tn commented

Very keen to test against Phi-2. Is support expected to be added soon?

Hi, promptbench now supports phi-2! Please clone the latest repository and use model = pb.LLMModel(model='phi-2', max_new_tokens=10, temperature=0.001) to load the phi-2 model. Please let us know if you run into any bugs.

tr5tn commented

Thanks @Immortalise! That was quick. I've got Phi-2 imported now, but PromptBench seems to assume CUDA, even though Phi-2 can run on CPU-only builds of Torch (which is how I'm running it). I get this error, where CUDA appears to be hard-coded in models.py:

Traceback (most recent call last):
  File "", line 8, in
  File "C:\src\promptbench-main\promptbench\models\__init__.py", line 155, in __call__
    return self.infer_model.predict(input_text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\src\promptbench-main\promptbench\models\models.py", line 74, in predict
    input_ids = self.tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "%USERPROFILE%\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\torch\cuda\__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
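For readers who hit the same traceback: the failure comes from the literal "cuda" passed to .to(). A common way around it is to choose the device at runtime. The snippet below is only a minimal sketch of that idea; pick_device is a hypothetical helper name, not something promptbench provides, and it is not the fix the maintainers shipped.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" only when a CUDA-enabled torch build can see a GPU."""
    try:
        # Imported lazily so the helper still works where torch is not installed.
        import torch
    except ImportError:
        return "cpu"
    # On a CPU-only torch build, is_available() returns False instead of raising.
    if prefer_gpu and torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(device)

# With a tokenizer already loaded, the failing line could then be written as:
# input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to(device)
```

On a CPU-only install this prints cpu, so tensors are never moved to a device that the Torch build cannot use.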

Hi, promptbench now supports loading it on CPU! Please clone the latest repository and follow examples/basic.ipynb to try it.

tr5tn commented

@Immortalise I think there is a further problem with specifying the float32 dtype (which I have found necessary when running Phi-2 on CPU). I get this error:

  File "\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\torch\nn\functional.py", line 2543, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

I tried to specify the float32 dtype in pb.LLMModel but it wouldn't accept float32.

  File "\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\transformers\modeling_utils.py", line 3422, in from_pretrained
    raise ValueError(
ValueError: torch_dtype can be either torch.dtype or "auto", but received float32
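The 'Half' error above suggests that the model was loaded in float16, for which this Torch build has no CPU LayerNorm kernel. One way to avoid that is to tie the precision to the device. The snippet below is a rough sketch under that assumption; default_dtype_name is a hypothetical helper, and the float16-on-GPU default is an illustrative choice rather than promptbench's documented behavior.

```python
def default_dtype_name(device: str) -> str:
    """Pick half precision on CUDA devices and full precision everywhere else."""
    # Accepts "cuda" as well as indexed forms such as "cuda:0".
    return "float16" if device.startswith("cuda") else "float32"


for dev in ("cpu", "cuda", "cuda:0"):
    print(dev, default_dtype_name(dev))

# With torch installed, the name can be turned into a real dtype object:
# import torch
# dtype = getattr(torch, default_dtype_name("cpu"))  # torch.float32
```

Returning a name rather than a dtype keeps the helper importable without Torch; the getattr step in the comment is where the actual torch.dtype object is obtained.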

Hi, could you try dtype=torch.float32? It works for me using the following code:
model = pb.LLMModel("phi-2", temperature=0, dtype=torch.float32, device="cpu")

tr5tn commented

Thanks! That seems to be working now. I did try this first, but I put torch.float32 in single quotes and that seemed to cause the same problem I mentioned above.

ValueError: torch_dtype can be either torch.dtype or "auto", but received torch.float32

It seems I tried all variants except removing the quotes. Thanks again for your help.
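For anyone landing here with the same ValueError: 'torch.float32' in quotes is a Python str, while torch.float32 without quotes is a torch.dtype object, and the check in from_pretrained accepts only the latter (or the literal string "auto"). If the dtype arrives as text, for example from a config file, it has to be converted first. A minimal sketch follows; resolve_dtype is a hypothetical helper and not part of promptbench or transformers.

```python
def resolve_dtype(value):
    """Map "float32" or "torch.float32" to torch.float32.

    torch.dtype objects and the string "auto" are passed through unchanged.
    """
    import torch  # imported lazily; the function needs torch only when called

    if isinstance(value, torch.dtype) or value == "auto":
        return value
    if isinstance(value, str):
        name = value.strip()
        if name.startswith("torch."):
            name = name[len("torch."):]
        candidate = getattr(torch, name, None)
        # Guard against names that exist on torch but are not dtypes.
        if isinstance(candidate, torch.dtype):
            return candidate
    raise ValueError(f"cannot interpret {value!r} as a torch dtype")


# Example usage (requires torch and promptbench to be installed):
# import promptbench as pb
# model = pb.LLMModel("phi-2", temperature=0,
#                     dtype=resolve_dtype("torch.float32"), device="cpu")
```

The isinstance check on the looked-up attribute matters: a string such as "nn" would otherwise resolve to a module instead of raising a clear error.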