aws/sagemaker-huggingface-inference-toolkit

Support multiple return sequences


Is there any way to generate multiple return sequences for a text generation prompt? At the moment, I call the predictor sequentially n times.

I think linked issue #85 would also solve my request, since I could then pass num_return_sequences=n as a kwarg to the HF pipeline.
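
To make the request concrete, here is a minimal sketch of the payload I would like to send. It assumes the endpoint forwards a "parameters" dict to the pipeline as keyword arguments, which is what the linked issue is about. The helper name build_generation_payload is hypothetical and not part of the toolkit.

```python
from typing import Any, Dict


def build_generation_payload(prompt: str, n: int) -> Dict[str, Any]:
    """Build a request body asking for n continuations of a single prompt.

    Assumption: the endpoint passes "parameters" through to the
    text-generation pipeline as kwargs. This helper is illustrative only.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        "inputs": prompt,
        "parameters": {
            "num_return_sequences": n,
            # Plain greedy decoding yields one sequence only, so enable
            # sampling whenever more than one sequence is requested.
            "do_sample": n > 1,
        },
    }


# Intended usage, given an already deployed predictor:
#   predictor.predict(build_generation_payload("Once upon a time", 3))
```

With this, a single call would replace the n sequential calls I make today.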

In the meantime, does HuggingFacePredictor.predict support batched inputs? This would be an improvement over my current implementation.
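
For illustration, this is the kind of batched request I have in mind. Whether the endpoint accepts a list under "inputs" is exactly the open question, so treat it as an assumption. The helper name build_batched_payload is hypothetical.

```python
from typing import Any, Dict


def build_batched_payload(prompt: str, n: int) -> Dict[str, Any]:
    """Repeat one prompt n times so a single request could yield n outputs.

    Assumption: the endpoint accepts a list of strings under "inputs".
    The copies only produce different texts if the model samples.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return {"inputs": [prompt] * n}


# Intended usage, given an already deployed predictor:
#   predictor.predict(build_batched_payload("Once upon a time", 3))
```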