CQCL/lambeq

PicklingError

Closed this issue · 6 comments

Hi,

I am trying to run the tutorial script to train on the MC dataset (link) without any modifications. However, when I fit the model with:

trainer.fit(train_dataset, val_dataset, eval_interval=1, log_interval=5)

it raises a PicklingError:

PicklingError                             Traceback (most recent call last)
<ipython-input-20-744a780ae3be> in <cell line: 1>()
----> 1 trainer.fit(train_dataset, val_dataset, eval_interval=1, log_interval=5)

5 frames
/usr/local/lib/python3.10/dist-packages/lambeq/training/pytorch_model.py in get_diagram_output(self, diagrams)
    132 
    133         parameters = {k: v for k, v in zip(self.symbols, self.weights)}
--> 134         diagrams = pickle.loads(pickle.dumps(diagrams))  # deepcopy, but faster
    135         for diagram in diagrams:
    136             for b in diagram.boxes:

PicklingError: Can't pickle <class 'discopy.tensor.Box[float64]'>: attribute lookup Box[float64] on discopy.tensor failed
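
A note on the message above: "attribute lookup Box[float64] on discopy.tensor failed" is what pickle reports when it cannot find a class by name in the module the class claims to live in. The sketch below reproduces that kind of failure with a stand-in class. It assumes the class is created at runtime under a name that is never stored as a module attribute; the thread does not confirm that this is what DisCoPy 1.1.5 does, and `Box`, `specialise` and `pickling_error` are hypothetical names used only for illustration.

```python
import pickle


class Box:
    """Stand-in for a library class; this is not DisCoPy's Box."""

    def __init__(self, name):
        self.name = name


def specialise(base, dtype_name):
    # Hypothetical helper: build a subclass at runtime and name it
    # e.g. "Box[float64]". The class is never stored as a module
    # attribute under that name, so pickle cannot look it up later.
    return type(f"{base.__name__}[{dtype_name}]", (base,), {})


def pickling_error(obj):
    """Return the exception raised by pickle.dumps(obj), or None."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError) as exc:
        return exc
    return None


FloatBox = specialise(Box, "float64")

# Pickling an instance requires pickling a reference to its class,
# which is stored as "module name + class name" and must resolve.
print(pickling_error(FloatBox("f")))
```

Pickle stores classes by reference rather than by value, so any class whose name cannot be resolved on its module fails this way, regardless of what its instances contain.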

Hi @iknoorjobs,
This looks like an error with DisCoPy 1.1.5.

I'll raise an issue in the upstream repo. In the meantime, downgrading to discopy==1.1.4 fixes this issue.
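
To check whether an environment has the version reported here, a small guard can be run before training. This is a minimal sketch using only the standard library; `AFFECTED`, `installed_version` and `is_affected` are hypothetical names, and the check only covers 1.1.5 because that is the only version this thread identifies.

```python
from importlib.metadata import PackageNotFoundError, version

# The only version this thread reports as affected.
AFFECTED = "1.1.5"


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_affected(found):
    """True if the given version string is the one reported in this thread."""
    return found == AFFECTED


found = installed_version("discopy")
if found is None:
    print("discopy is not installed in this environment")
elif is_affected(found):
    print(f"discopy {found} is installed; consider pinning discopy==1.1.4")
else:
    print(f"discopy {found} is installed")
```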

Hi @nikhilkhatri

Thanks! It's fixed now.
One more thing: when I replace the sample MC dataset in the same script with my custom dataset, trainer.fit raises a RuntimeError because the tensors being stacked have different sizes. Any thoughts on why this might be happening?

RuntimeError                              Traceback (most recent call last)
<ipython-input-47-744a780ae3be> in <cell line: 1>()
----> 1 trainer.fit(train_dataset, val_dataset, eval_interval=1, log_interval=5)
5 frames
/usr/local/lib/python3.10/dist-packages/lambeq/training/pytorch_model.py in get_diagram_output(self, diagrams)
    144 
    145         with backend('pytorch'), tn.DefaultBackend('pytorch'):
--> 146             return torch.stack([tn.contractors.auto(
    147                 *d.to_tn(dtype=float)).tensor for d in diagrams])
    148 

RuntimeError: stack expects each tensor to be equal size, but got [2, 2] at entry 0 and [2] at entry 2

It's possible your dataset yields diagrams with different codomains. Most models expect all diagrams to have the same codomain size.
The UnifyCodomainRewriter should help: https://cqcl.github.io/lambeq/lambeq.rewrite.html#lambeq.rewrite.UnifyCodomainRewriter
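
To see which entries in a batch cause the mismatch, the output shapes can be compared against the first entry, which is the comparison the error message reports. The sketch below is plain Python and assumes the per-diagram output shapes have already been collected as tuples; `mismatched_entries` is a hypothetical helper, and the example shapes are chosen to be consistent with the RuntimeError quoted above.

```python
def mismatched_entries(shapes):
    """Return indices whose shape differs from the first entry's shape."""
    if not shapes:
        return []
    reference = tuple(shapes[0])
    return [i for i, shape in enumerate(shapes) if tuple(shape) != reference]


# Entries 0 and 1 have shape [2, 2]; entry 2 has shape [2].
shapes = [(2, 2), (2, 2), (2,)]
print(mismatched_entries(shapes))
```

The sentences at the reported indices are the ones whose diagrams end in a different codomain from the rest of the batch.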

Hi @nikhilkhatri
Thanks. I tried adding UnifyCodomainRewriter after creating the diagrams, but it now raises an AxiomError. Could you check the sample code below, please?

from discopy.tensor import Dim
from lambeq import AtomicType, BobcatParser, SpiderAnsatz, UnifyCodomainRewriter

# Parse two example sentences into string diagrams.
parser = BobcatParser(verbose='text')
train_diagrams = parser.sentences2diagrams(['man skillful prepares sauce .', 'dinner bakes man skillful .'])

# Rewrite every diagram so that all of them share the same codomain.
unify_rewriter = UnifyCodomainRewriter()
train_diagrams = [unify_rewriter.rewrite(diagram) for diagram in train_diagrams]

# Map noun and sentence wires to 2-dimensional tensors.
ansatz = SpiderAnsatz({AtomicType.NOUN: Dim(2),
                       AtomicType.SENTENCE: Dim(2)})

# The AxiomError is raised on this line.
train_circuits = [ansatz(diagram) for diagram in train_diagrams]

Error:

Tagging sentences.
Parsing tagged sentences.
Turning parse trees to diagrams.
---------------------------------------------------------------------------
AxiomError                                Traceback (most recent call last)
<ipython-input-1-c253b932cc38> in <cell line: 18>()
     16                        AtomicType.SENTENCE: Dim(2)})
     17 
---> 18 train_circuits = [ansatz(diagram) for diagram in train_diagrams]

16 frames
/usr/local/lib/python3.10/dist-packages/discopy/utils.py in assert_iscomposable(left, right)
    643     """
    644     if not left.is_composable(right):
--> 645         raise AxiomError(messages.NOT_COMPOSABLE.format(
    646             left, right, left.cod, right.dom))
    647 

AxiomError: Cup(n, n.r) @ s does not compose with MERGE_s_0: s != Ty().

When I remove UnifyCodomainRewriter, this step works, but the RuntimeError is raised later during trainer.fit (link).

Thanks

Hi @iknoorjobs, this would be easier to debug if we could look at the diagram for which this error is raised.
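
One way to isolate the offending diagram is to apply the failing step to each item separately and record which ones raise. The sketch below is generic Python; `collect_failures` and `toy_step` are hypothetical names, and the toy step only stands in for the real call.

```python
def collect_failures(items, fn):
    """Apply fn to each item; return (index, item, exception) for failures."""
    failures = []
    for index, item in enumerate(items):
        try:
            fn(item)
        except Exception as exc:  # keep going so every failure is reported
            failures.append((index, item, exc))
    return failures


def toy_step(text):
    # Stand-in for the real step; fails on inputs containing "bad".
    if "bad" in text:
        raise ValueError(f"cannot process {text!r}")
    return text.upper()


for index, item, exc in collect_failures(["ok", "bad one", "fine"], toy_step):
    print(index, item, type(exc).__name__, exc)
```

With the code from this thread, the call would presumably be `collect_failures(train_diagrams, ansatz)`, and each reported diagram could then be inspected, for example by drawing it with its `draw()` method if that is available in the installed version.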

This will be closed due to inactivity.