[FEA] Add easy functionality to generate a dict of tensors: a standard way to move array data across frameworks

Question

rnyak opened this issue 2 years ago · 1 comments

When we want to trace a PyTorch model, we call torch.jit.trace(model, train_dict, strict=True). Here train_dict is a dictionary of torch tensors; in the PyTorch documentation it corresponds to the example_inputs argument.
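For reference, the tracing call described above can be sketched with a toy model. TinyModel, its layer sizes, and the feature names item_id and price are hypothetical stand-ins for illustration only; they are not taken from the library discussed in this issue.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical model whose forward() takes a dict of tensors."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(10, 4)
        self.linear = nn.Linear(5, 1)

    def forward(self, inputs):
        item_vec = self.embed(inputs["item_id"])   # shape: (batch, 4)
        price = inputs["price"].unsqueeze(-1)      # shape: (batch, 1)
        return self.linear(torch.cat([item_vec, price], dim=-1))


model = TinyModel().eval()

# The dict of tensors plays the role of example_inputs.
train_dict = {
    "item_id": torch.tensor([1, 2, 3]),
    "price": torch.tensor([0.5, 1.0, 1.5]),
}

traced = torch.jit.trace(model, train_dict, strict=True)
output = traced(train_dict)
```

The traced module is called with a dict that has the same keys as the one used for tracing.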

Currently we get the dict of tensors as follows, but I don't think this is a practice we want users to adopt:

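# Wrap the first training file in a Dataset and attach it to the trainer
dataset = Dataset(train_paths[0])
trainer.train_dataset_or_path = dataset
# Build the dataloader and pull a single batch: a dict of torch tensors
loader = trainer.get_train_dataloader()
train_dict = next(iter(loader))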

Based on discussions with Karl, it looks like this is related to Columns and MerlinArray. We need a standard solution for this.

Answer 1 · 2023-04-04T18:18:21.000Z

We now have this via TensorTable and the related utility functions for converting back and forth between TensorTables, dataframes, and dictionaries.
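As a rough illustration of the kind of round trip described here, the sketch below converts a plain dictionary of columns to a dict of torch tensors and back. It deliberately does not use the TensorTable API: columns_to_tensor_dict and tensor_dict_to_columns are hypothetical helper names written only for this example, and the column names are made up.

```python
import numpy as np
import torch


def columns_to_tensor_dict(columns):
    """Hypothetical helper: {name: array-like} -> {name: torch.Tensor}."""
    return {
        name: torch.as_tensor(np.asarray(values))
        for name, values in columns.items()
    }


def tensor_dict_to_columns(tensors):
    """Hypothetical inverse: {name: torch.Tensor} -> {name: numpy.ndarray}."""
    return {name: t.detach().cpu().numpy() for name, t in tensors.items()}


columns = {"item_id": [1, 2, 3], "price": [0.5, 1.0, 1.5]}

train_dict = columns_to_tensor_dict(columns)     # usable as example_inputs
round_trip = tensor_dict_to_columns(train_dict)  # back to NumPy arrays
```

A dict built this way can be passed wherever a dict of torch tensors is expected, without pulling a batch from a dataloader first.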