cezannec/capsule_net_pytorch

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

anirudhprabhu opened this issue · 5 comments

Hello,

I am very new to capsule networks and PyTorch in general. Thank you for the detailed and easy-to-understand explanations. While trying to run the code, I came across an error when training a model.

RuntimeError                              Traceback (most recent call last)
<ipython-input-16-ce644b7b7998> in <module>
      1 # training for 3 epochs
      2 n_epochs = 3
----> 3 losses = train(capsule_net, criterion, optimizer, n_epochs=n_epochs)

<ipython-input-15-54eb5db28cd7> in train(capsule_net, criterion, optimizer, n_epochs, print_every)
     34             optimizer.zero_grad()
     35             # get model outputs
---> 36             caps_output, reconstructions, y = capsule_net(images)
     37             # calculate loss
     38             loss = criterion(caps_output, target, images, reconstructions)

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

<ipython-input-11-1bf5514185a1> in forward(self, images)
     16         primary_caps_output = self.primary_capsules(self.conv_layer(images))
     17         caps_output = self.digit_capsules(primary_caps_output).squeeze().transpose(0,1)
---> 18         reconstructions, y = self.decoder(caps_output)
     19         return caps_output, reconstructions, y
     20 

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

<ipython-input-10-ffae35828f12> in forward(self, x)
     44         x = x * y[:, :, None]
     45         # flatten image into a vector shape (batch_size, vector_dim)
---> 46         flattened_x = x.view(x.size(0), -1)
     47         # create reconstructed image vectors
     48         reconstructions = self.linear_layers(flattened_x)

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

I have not changed any part of the code yet. I wanted to run the code as it is, before trying different things. Can you help me understand why such an error was caused and how to fix it?

Thank you!

EDIT: I just replaced the view function with reshape as suggested in the error and it works. Though I am still not sure of the difference between the two functions in this context.

It is a difference in how the tensor is stored in memory. Using `reshape`, or calling `.contiguous()` before `view`, should fix it. I made the latter change; thanks for raising the issue!
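
For anyone who wants to see what is happening, here is a minimal standalone sketch (not the repo's code, just an illustration): a `transpose` produces a non-contiguous tensor, which is the situation `caps_output` is in when it reaches the decoder, and both suggested fixes flatten it without error.

```python
import torch

# A transposed tensor shares storage with the original, so its memory
# layout is no longer contiguous (similar to caps_output after .transpose(0, 1)).
x = torch.randn(4, 10, 16).transpose(0, 1)
print(x.is_contiguous())  # False

# x.view(x.size(0), -1) would raise the RuntimeError from this issue.
# Either of these works instead:
flat_a = x.reshape(x.size(0), -1)            # copies only if it has to
flat_b = x.contiguous().view(x.size(0), -1)  # explicit copy, then a plain view
print(flat_a.shape, flat_b.shape)            # torch.Size([10, 64]) for both
```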


@cezannec I am confused. Can you explain what is going on underneath? Why does it make a difference?

Some references that might be helpful to check:

@anirudhprabhu I'm having the same issue; however, the conclusion isn't really what I wanted. Here's the reason why. Quoting the original post:

> I just replaced the `view` function with `reshape` as suggested in the error and it works. Though I am still not sure of the difference between the two functions in this context.

The reason `reshape` or `contiguous` work is that, unlike `view`, they may copy the tensor itself (see https://pytorch.org/docs/stable/generated/torch.Tensor.view.html). So if you're trying to avoid duplicating tensors, this isn't going to be an optimal solution for you. If you're fine with possibly duplicating tensors, then you might as well directly use `reshape` or `contiguous`.

In my case, I want to avoid tensor duplication. I think this issue occurs when you're making views of views, or views of indexed tensors (especially indexed tensors, since you're literally filtering the shape).

I'm mainly posting this for others to note.

TL;DR: `reshape` or `contiguous` work because they copy the tensor when needed, whereas `view` guarantees that the tensor won't be duplicated when it's called. But of course... sometimes you must duplicate the tensor :/
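
To make the "sometimes" concrete, here is a small standalone check (not from the repo) showing when `reshape` aliases the original storage and when it has to copy:

```python
import torch

base = torch.arange(24).reshape(2, 3, 4)

# Contiguous input: reshape can return a view, so no data is duplicated.
r1 = base.reshape(6, 4)
print(r1.data_ptr() == base.data_ptr())  # True -> shares storage

# Non-contiguous input (e.g. after a transpose): reshape must copy.
t = base.transpose(0, 2)                 # shape (4, 3, 2), non-contiguous
r2 = t.reshape(4, 6)
print(r2.data_ptr() == t.data_ptr())     # False -> new storage

# view never copies; it simply refuses when a copy would be required,
# which is the RuntimeError this issue is about.
print(t.is_contiguous())                 # False
```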

Hi,

Based on the error's suggestion, I used reshape() instead of view(). I also tested contiguous().view(-1), but I still face the same error. Would anyone be able to assist?

This is my error: RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

import torch
from tqdm import tqdm
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
with open("question_answer.txt", "r") as file:
    text = file.read()
questions = text.split("\n")[:-1]
answers = text.split("\n")[1:]

# Define the maximum number of lines for training
max_lines = 50

# Create a progress bar
progress_bar = tqdm(total=min(max_lines, len(questions)), desc="Processing")

batch_size = 4  # Adjust the batch size according to your memory capacity

inputs = []
target_texts = []

loss_values = []  # Store the losses for each batch
optimizer_values = []  # Store the optimizer values for each batch

for i, question in enumerate(questions[:max_lines]):
    if i % batch_size == 0 and i != 0:
        tokenized_inputs = tokenizer.batch_encode_plus(
            inputs,
            padding="longest",
            truncation=True,
            return_tensors="pt"
        )
        tokenized_targets = tokenizer.batch_encode_plus(
            target_texts,
            padding="longest",
            truncation=True,
            return_tensors="pt"
        )

        input_ids = tokenized_inputs["input_ids"]
        attention_mask = tokenized_inputs["attention_mask"]
        target_ids = tokenized_targets["input_ids"]
        decoder_attention_mask = tokenized_targets["attention_mask"]

        model.train()
        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
        loss_fn = torch.nn.CrossEntropyLoss()

        optimizer.zero_grad()
        outputs = model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            decoder_input_ids=target_ids[:, :-1],
            decoder_attention_mask=decoder_attention_mask[:, :-1],
            labels=target_ids[:, 1:]
        )

        lm_logits = outputs.logits
        loss = loss_fn(lm_logits.reshape(-1, lm_logits.size(-1)), target_ids[:, 1:].reshape(-1))
        loss.backward()
        optimizer.step()

        loss_values.append(loss.item())
        optimizer_values.append(optimizer.param_groups[0]['lr'])

        inputs = []
        target_texts = []

        print(f"Batch {i//batch_size}, Loss: {loss.item()}")

    # 'data' is assumed to be a pandas DataFrame defined elsewhere
    input_text = question.format_map({'item': data.iloc[0]})
    inputs.append(input_text)
    target_texts.append(answers[i])

    # Update the progress bar
    progress_bar.update(1)

# Process the remaining batch
if inputs:
    tokenized_inputs = tokenizer.batch_encode_plus(
        inputs,
        padding="longest",
        truncation=True,
        return_tensors="pt"
    )
    tokenized_targets = tokenizer.batch_encode_plus(
        target_texts,
        padding="longest",
        truncation=True,
        return_tensors="pt"
    )

    input_ids = tokenized_inputs["input_ids"]
    attention_mask = tokenized_inputs["attention_mask"]
    target_ids = tokenized_targets["input_ids"]
    decoder_attention_mask = tokenized_targets["attention_mask"]

    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.CrossEntropyLoss()

    optimizer.zero_grad()
    outputs = model(
        input_ids=input_ids,
        attention_mask=attention_mask,
        decoder_input_ids=target_ids[:, :-1],
        decoder_attention_mask=decoder_attention_mask[:, :-1],
        labels=target_ids[:, 1:]
    )

    lm_logits = outputs.logits
    loss = loss_fn(lm_logits.reshape(-1, lm_logits.size(-1)), target_ids[:, 1:].reshape(-1))

    loss_values.append(loss.item())
    optimizer_values.append(optimizer.param_groups[0]['lr'])

Hi @SudoSaba @cezannec, I am still facing the same issue; reshape() or contiguous() did not fix mine.