Help with get_activations

Question

timwfburton opened this issue 3 years ago · 4 comments

I've been having a lot of success with your keras attention library, and was hoping to use get_activations to understand the network behavior in more detail.

In this toy example (which I reduced to the smallest possible snippet), I have a dataset of 100 samples, each with 10 time steps, and each time step has 5 features. An initial dense model is wrapped in a TimeDistributed layer, which feeds into an LSTM that returns its sequence. The sequence is then passed to your Attention layer with 32 units.

My goal is to understand which of the 10 time steps for each of the 100 samples in the dataset is most important (i.e., receives the most attention), so I used the get_activations function. I was expecting a matrix of size 100x10, but I got 100x32. It seems like what I'm trying to do should be possible based on your examples. I'd really appreciate any advice you can give!

Answer 1 · 2022-05-31T02:29:44.000Z

@timwfburton thanks for the feedback! Appreciated! Can you paste the full code that I can run?

Answer 2 · 2022-05-31T13:09:26.000Z

Of course, thanks in advance - here it is!

import keras as k
import keract
from attention import Attention
import numpy as np

num_samples = 100
timeSteps = 10
featurePerStep = 5

coreModel =  k.models.Sequential()
coreModel.add(k.layers.Input(shape=(featurePerStep,), name = 'input'))
coreModel.add(k.layers.Dense(3))

coreModel.summary()

timeDistributedModel = k.models.Sequential()
timeDistributedModel.add(k.layers.TimeDistributed(coreModel))
timeDistributedModel.add(k.layers.LSTM(64, return_sequences=True, name = 'lstm'))
timeDistributedModel.add(Attention(units=32, name = "attention"))
timeDistributedModel.add(k.layers.Dense(1, name = "dense"))

timeDistributedModel.compile(optimizer=k.optimizers.Adam(),loss='binary_crossentropy')

mockTrainX = np.random.rand(num_samples,timeSteps,featurePerStep)
mockTrainY = np.round(np.random.uniform(0.0, 1.0, size = (num_samples,1)),0)

timeDistributedModel.fit(x=mockTrainX, y=mockTrainY, batch_size = 32, epochs=1)
timeDistributedModel.summary()

activations = keract.get_activations(timeDistributedModel, mockTrainX,layer_names = ['attention'],nested=True)['attention']
print("activations has a length of "+str(len(activations)))
print("activations[0] have a length of "+str(len(activations[0])))
print("activations[99] have a length of "+str(len(activations[99])))

Answer 3 · 2022-06-08T09:38:18.000Z

https://github.com/philipperemy/keras-attention-mechanism/blob/482b0c937b3888da5967b47478701838a4222269/examples/add_two_numbers.py#L95

What you want is the attention_weights (see the line linked above). It may not work well with the Sequential API, though, so try the functional API.
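To illustrate why the two shapes differ, here is a minimal NumPy sketch of a generic Luong-style attention step. It is not the library's code: the function name `luong_attention_sketch`, the random weight matrices, and the exact sequence of operations are assumptions made for illustration. It only shows that the per-time-step weights have shape (samples, time steps), while the final output is projected to (samples, units).

```python
import numpy as np

def luong_attention_sketch(hidden_states, units, rng):
    """Generic Luong-style attention (illustrative, not the library's code).

    Returns (attention_vector, attention_weights).
    """
    batch, time_steps, hidden = hidden_states.shape
    # Hypothetical random weights standing in for trained parameters.
    w_score = rng.standard_normal((hidden, hidden))
    w_out = rng.standard_normal((2 * hidden, units))

    h_last = hidden_states[:, -1, :]                     # (batch, hidden)
    projected = hidden_states @ w_score                  # (batch, time_steps, hidden)
    scores = np.einsum("bh,bth->bt", h_last, projected)  # (batch, time_steps)

    # Numerically stable softmax over the time axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # (batch, time_steps)

    # Weighted sum of hidden states, then projection down to `units`.
    context = np.einsum("bt,bth->bh", weights, hidden_states)  # (batch, hidden)
    combined = np.concatenate([context, h_last], axis=1)       # (batch, 2 * hidden)
    attention_vector = np.tanh(combined @ w_out)               # (batch, units)
    return attention_vector, weights

rng = np.random.default_rng(0)
lstm_output = rng.standard_normal((100, 10, 64))  # stand-in for the LSTM sequence output
attention_vector, attention_weights = luong_attention_sketch(lstm_output, units=32, rng=rng)

print(attention_vector.shape)   # (100, 32)
print(attention_weights.shape)  # (100, 10)

# Index of the most-attended time step for each sample.
most_attended_step = attention_weights.argmax(axis=1)
```

In this sketch, the 100x32 array corresponds to what the layer emits as a whole, and the 100x10 array is the intermediate softmax output, which is the quantity the question is after.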

Answer 4 · 2022-09-25T15:47:19.000Z

I'll close this issue (cf. answer above). If it's not clear, feel free to comment on it!