pytorch/captum

Using meta-llama/Llama-3.2-3B-Instruct gives unexpected results.


šŸ› Bug

To Reproduce

Steps to reproduce the behavior:

I followed https://captum.ai/tutorials/Llama2_LLM_Attribution.
My code is below; the only difference is that I changed the model_name.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

from captum.attr import (
    FeatureAblation,
    ShapleyValues,
    LayerIntegratedGradients,
    LLMAttribution,
    LLMGradientAttribution,
    TextTokenInput,
    TextTemplateInput,
    ProductBaselines,
)


model_name = "meta-llama/Llama-3.2-1B-Instruct"



def load_model(model_name, bnb_config):
    n_gpus = torch.cuda.device_count()
    max_memory = "10000MB"

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto",  # dispatch the model efficiently across the available resources
        max_memory={i: max_memory for i in range(n_gpus)},
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, token=True)

    # Needed for LLaMA tokenizer
    tokenizer.pad_token = tokenizer.eos_token

    return model, tokenizer

def create_bnb_config():
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    return bnb_config

bnb_config = create_bnb_config()
model, tokenizer = load_model(model_name, bnb_config)

model.eval()



def prompt_fn(*examples):
    main_prompt = "Decide if the following movie review enclosed in quotes is Positive or Negative:\n'I really liked the Avengers, it had a captivating plot!'\nReply only Positive or Negative."
    subset = [elem for elem in examples if elem]
    if not subset:
        prompt = main_prompt
    else:
        prefix = "Here are some examples of movie reviews and classification of whether they were Positive or Negative:\n"
        prompt = prefix + " \n".join(subset) + "\n " + main_prompt
    return "[INST] " + prompt + "[/INST]"

input_examples = [
    "'The movie was ok, the actors weren't great' Negative", 
    "'I loved it, it was an amazing story!' Positive",
    "'Total waste of time!!' Negative", 
    "'Won't recommend' Negative",
]
sv = ShapleyValues(model) 

sv_llm_attr = LLMAttribution(sv, tokenizer)

#attr_res = sv_llm_attr.attribute(inp, target=target, num_trials=3)

inp = TextTemplateInput(
    prompt_fn, 
    values=input_examples,
)
attr_res = sv_llm_attr.attribute(inp)

attr_res.plot_token_attr(show=True)

```
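One thing worth noting: prompt_fn above wraps the prompt in Llama-2's [INST] ... [/INST] tags, while Llama-3.x Instruct models use a different chat template, which could by itself explain the unexpected generations. A minimal, untested sketch of building the prompt with the tokenizer's own template instead (the rest of the attribution code stays the same):

```python
# Untested sketch: use the tokenizer's chat template instead of Llama-2's
# [INST] tags, so the prompt matches what Llama-3.2-Instruct expects.
def prompt_fn(*examples):
    main_prompt = (
        "Decide if the following movie review enclosed in quotes is Positive or Negative:\n"
        "'I really liked the Avengers, it had a captivating plot!'\n"
        "Reply only Positive or Negative."
    )
    subset = [elem for elem in examples if elem]
    if subset:
        prefix = (
            "Here are some examples of movie reviews and classification "
            "of whether they were Positive or Negative:\n"
        )
        main_prompt = prefix + " \n".join(subset) + "\n " + main_prompt
    messages = [{"role": "user", "content": main_prompt}]
    # apply_chat_template emits the model-specific special tokens
    # (<|start_header_id|> etc.) rather than hard-coded [INST] markers.
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
```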

Expected behavior

It should generate 'Positive' or 'Negative' and plot the attribution scores between the prompt's examples and the output.

The actual output

The system prints the warning: "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:128001 for open-end generation."
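Setting the pad token explicitly may silence this warning. A minimal, untested sketch, assuming LLMAttribution.attribute forwards gen_args to model.generate (it accepts a gen_args dict in Captum 0.7):

```python
# Untested sketch: set pad_token_id explicitly so generate() stops guessing.
model.generation_config.pad_token_id = tokenizer.eos_token_id

# gen_args is forwarded to model.generate by LLMAttribution.attribute.
attr_res = sv_llm_attr.attribute(
    inp,
    gen_args={
        "pad_token_id": tokenizer.eos_token_id,
        "max_new_tokens": 25,
        "do_sample": False,
    },
)
```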

The plotted attribution is shown in the attached screenshot.

Environment

Describe the environment used for Captum


 - Captum: 0.7.0
 - Google Colab: Ubuntu 22.04.3 LTS
 - How Captum / PyTorch were installed: pip
 - Python version: 3.10.12
 - CUDA/cuDNN version:
     cuda-python                        12.2.1
     cupy-cuda12x                       12.2.0
     jax-cuda12-pjrt                    0.4.33
     jax-cuda12-plugin                  0.4.33
     nvidia-cuda-cupti-cu12             12.6.80
     nvidia-cuda-nvcc-cu12              12.6.77
     nvidia-cuda-runtime-cu12           12.6.77
 - GPU models and configuration:
 - PyTorch: 2.4.1+cu121
 - Transformers: 4.44.2
 
Possible problem

Is the attention_mask incorrect when Captum calls model.generate?
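One way to test this would be to run the generation outside Captum with an explicit attention_mask and check whether the output is already wrong. A minimal, untested sketch reusing prompt_fn and input_examples from above:

```python
# Untested sketch: run generate() directly with an explicit attention_mask.
# If the raw output is already wrong, the prompt format (not Captum's
# perturbation) is the likely culprit.
prompt = prompt_fn(*input_examples)
enc = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    input_ids=enc["input_ids"],
    attention_mask=enc["attention_mask"],  # silences the mask warning
    pad_token_id=tokenizer.eos_token_id,
    max_new_tokens=8,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][enc["input_ids"].shape[1]:], skip_special_tokens=True))
```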

Following https://captum.ai/tutorials/Llama2_LLM_Attribution with mistralai/Mistral-7B-Instruct-v0.3, the model generates "POSTIVE", but the attribution scores are all zero.
[screenshot: attribution plot with all-zero scores]