IBM/unitxt

No documentation for Serializers

Closed this issue · 0 comments

Serializers are not documented, so it is not clear how they are meant to be used.

For example, I created one for a list of actions:

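from typing import Any, Dict, List, TypedDict, Union

from unitxt import load_dataset
from unitxt.serializers import SingleTypeSerializer
from unitxt.task import Task
from unitxt.type_utils import register_type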

class Action(TypedDict):
    id: str
    description: str

class ActionList(TypedDict):
    actions: List[Action]

register_type(ActionList)

class ActionListSerializer(SingleTypeSerializer):
    serialized_type = ActionList

    def serialize(self, value: ActionList, instance: Dict[str, Any]) -> str:
        # Render one "- <id>: <description>" line per action.
        return "\n".join(
            f'- {action["id"]}: {action["description"]}' for action in value["actions"]
        )

task = Task(
    input_fields={
        "utterance": str,
        "prior_sequence": Union[List[str], str],
        "prior_context": List[str],
        "actions": ActionList,
    },
    reference_fields={"expected_output_sequence": List[str]},
    prediction_type=str,
    metrics=["metrics.normalized_sacrebleu"],
)

# `card` and `template` are defined elsewhere and not shown in this snippet.
dataset = load_dataset(
    card=card,
    template=template,
    max_test_instances=10,
    serializer=ActionListSerializer(),
)

  1. Is there a way to define a serializer for List[Action] directly?
  2. We use the term "serialize" in much the same way as we use "verbalize", which is confusing.
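
To make the intended output concrete, here is a minimal standalone sketch of the text the serialize method above is meant to produce. It deliberately does not use unitxt: `format_actions` and the sample actions are hypothetical names introduced only for illustration, and the function body reuses the same expression as `ActionListSerializer.serialize`.

```python
from typing import List, TypedDict


class Action(TypedDict):
    id: str
    description: str


class ActionList(TypedDict):
    actions: List[Action]


def format_actions(value: ActionList) -> str:
    # Hypothetical helper (not part of unitxt): one "- <id>: <description>"
    # line per action, joined with newlines.
    return "\n".join(
        f'- {action["id"]}: {action["description"]}' for action in value["actions"]
    )


# Made-up sample data, for illustration only.
example: ActionList = {
    "actions": [
        {"id": "open", "description": "Open the file"},
        {"id": "close", "description": "Close the file"},
    ]
}

rendered = format_actions(example)
print(rendered)
# - open: Open the file
# - close: Close the file
```

This is only the formatting step; whether the same logic can be attached to List[Action] without the ActionList wrapper is exactly what question 1 asks.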