[QST] How to construct schema based on a single "items" list feature?
ardulat opened this issue · 3 comments
What is your question?
Hello! First, a bit of context: I am preprocessing data with NVTabular for use in Transformers4Rec, so I am working on session-based recommendations. Currently, I have only one feature, an "items" list of product IDs (strings). How do I construct the Schema required by transformers4rec.torch.TabularSequenceFeatures?
More context: I went through some of the example notebooks in the Transformers4Rec documentation, but my main issue is with NVTabular preprocessing. I tried using nvt.Workflow to create a schema from a pandas data frame with an "items" list feature (as in the example), but I get the following:
In contrast, I am trying to get something like this:
The item_id-list column has tags marking it as a categorical feature (which TabularSequenceFeatures requires). How do I get the same tags if I already have an "items" list in my data frame?
When NVT infers a schema, it can figure out the dtypes and so forth, but it can't tell what the semantics of the fields are, so it leaves the tags blank. You can add any additional tags with the AddTags operator. It accepts a list of plain strings and will auto-convert them to Tags if needed.
["item_id-list"] >> AddTags(tags=["ITEM","LIST","ITEM_ID","ID"])
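For context, here is a rough sketch of how that one-liner might sit inside a full workflow. It is only an illustration under a few assumptions: the helper names build_item_list_workflow and fit_and_show_schema are made up for this example, and the Categorify step is my addition (it is the usual way to encode string IDs, but it is not part of the reply above). The NVTabular imports are placed inside the functions so the file still loads where NVTabular is not installed.

```python
# Tags taken from the reply above, passed as plain strings.
ITEM_LIST_TAGS = ["ITEM", "LIST", "ITEM_ID", "ID"]


def build_item_list_workflow(column="item_id-list"):
    """Return an NVTabular Workflow that encodes and tags one list column.

    Hypothetical helper written for this example only.
    """
    # Imported lazily so this module can be loaded without NVTabular.
    import nvtabular as nvt
    from nvtabular.ops import AddTags, Categorify

    # Assumption: Categorify maps the string product IDs to integer codes.
    # AddTags then attaches the semantic tags that schema inference cannot guess.
    item_ids = [column] >> Categorify() >> AddTags(tags=ITEM_LIST_TAGS)
    return nvt.Workflow(item_ids)


def fit_and_show_schema(df, column="item_id-list"):
    """Fit the workflow on a pandas DataFrame and return its output schema.

    Hypothetical helper written for this example only.
    """
    import nvtabular as nvt

    workflow = build_item_list_workflow(column)
    workflow.fit(nvt.Dataset(df))
    return workflow.output_schema
```

If the data frame column is called "items" rather than "item_id-list", something like fit_and_show_schema(df, column="items") should apply the same tags to that column instead.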
@rnyak, yes, the issue was solved at NVIDIA-Merlin/Transformers4Rec#703.