aws-solutions/improving-forecast-accuracy-with-machine-learning

RTS features must be numeric (demo data has strings)

athewsey opened this issue · 2 comments

Describe the bug

As per the Amazon Forecast Related Time Series validation doc:

  • Related time series feature data must be of the int or float datatypes.

The pre-prepared nyctaxi_weather_auto demo data uses a string field, day_hour_name, which produces the following error notification (email):

There was an error running the forecast job for dataset group nyctaxi_weather_auto

Message: An error occurred (InvalidInputException) when calling the CreatePredictor operation: The attribute(s) [day_hour_name] present in the RELATED_TIME_SERIES schema should be of numeric type such as integer or float, or be added as a forecast dimension
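For anyone wanting to check a related time series file before uploading it, the following is a minimal sketch of a pre-flight check that flags columns containing non-numeric values. It assumes a CSV with a header row; the helper name `non_numeric_columns`, the `timestamp` and `item_id` column names, and the sample rows (including the `Tue_00` values) are invented for illustration and are not taken from the demo dataset.

```python
import csv
import io


def non_numeric_columns(csv_text, skip=("timestamp", "item_id")):
    """Return names of columns (outside `skip`) that hold any non-numeric value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad = []
    for row in reader:
        for name, value in row.items():
            if name in skip or name in bad:
                continue
            try:
                float(value)
            except (TypeError, ValueError):
                # TypeError covers missing cells (None); ValueError covers strings
                bad.append(name)
    return bad


# Hypothetical sample resembling a related time series file with one string feature
sample = """timestamp,item_id,temperature,day_hour_name
2019-01-01 00:00:00,taxi_zone_1,3.5,Tue_00
2019-01-01 01:00:00,taxi_zone_1,3.1,Tue_01
"""

flagged = non_numeric_columns(sample)
print(flagged)
```

Any column this reports would be expected to trigger the same InvalidInputException unless it is declared as a forecast dimension.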

To Reproduce

Deploy the solution with the optional pre-prepared NYC taxi demo data (and manually download/re-upload the datasets to kick off the pipeline if #9 is not yet fixed).

Expected behavior

Pipeline should deploy and create forecasts without errors on the demo dataset.

Please complete the following information about the solution:

  • Version: v1.4.0

To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0123) Improving Forecast Accuracy with Machine Learning v1.3.0[...]".

  • Region: us-east-1
  • Was the solution modified from the version published on this repository? No
  • If the answer to the previous question was yes, are the changes available on GitHub?
  • Have you checked your service quotas for the services this solution uses?
  • Were there any errors in the CloudWatch Logs?

Additional context

This field is not present in the TTS schema, so I'm not sure it could be added to the list of forecast dimensions even if that made sense (which, for this field, I believe it might not).
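One possible stopgap, until the demo data is fixed, would be to drop the string column from the related time series file before re-uploading it. The sketch below is an assumption about how that could be done, not something the solution does itself; the helper name `drop_columns`, the header row, and the sample values are hypothetical.

```python
import csv
import io


def drop_columns(csv_text, names):
    """Return the CSV text with the named columns removed."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Indexes of the columns to keep, in their original order
    keep = [i for i, col in enumerate(header) if col not in names]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([header[i] for i in keep])
    for row in reader:
        writer.writerow([row[i] for i in keep])
    return out.getvalue()


# Hypothetical sample; the real demo file layout may differ
sample_rts = """timestamp,item_id,temperature,day_hour_name
2019-01-01 00:00:00,taxi_zone_1,3.5,Tue_00
"""

cleaned = drop_columns(sample_rts, {"day_hour_name"})
print(cleaned)
```

If the file has no header row, the column would have to be selected by position instead, and the matching attribute would also need to be removed from the RELATED_TIME_SERIES schema.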

Thank you for reaching out. We will investigate the issue and try to reproduce it.

This issue is still happening on a brand new CF stack I created following the [official instructions](https://docs.aws.amazon.com/solutions/latest/improving-forecast-accuracy-with-machine-learning/step-1-launch-the-stack.html).