jthomperoo/predictive-horizontal-pod-autoscaler

Allow fetching Holt-Winters parameters at runtime

Closed this issue · 13 comments

The alpha, beta, and gamma values for Holt-Winters are currently predetermined and set at configuration time. A useful feature would be to allow these parameters to be fetched/calculated at runtime - for example with a grid search.

Originally posted by @shubhamitc in #26 (comment)

Proposed solution is providing some kind of hook functionality that would allow configuration of an HTTP endpoint or a Python script to run to determine these values at runtime.

@jthomperoo I would like to work on it. Can we discuss the way forward?

Hi, that's great to hear! I'll outline some steps here just to get you started on the codebase from the README.

Developing this project

Environment

Developing this project requires these dependencies:

  • Golang >= 1.13
  • Golint
  • Docker

Viewing the docs locally has its own dependencies, listed in the README.
Commands

  • make - builds the Predictive HPA binary.
  • make docker - builds the Predictive HPA image.
  • make lint - lints the code.
  • make unittest - runs the unit tests.
  • make vendor - generates a vendor folder.
  • make doc - hosts the documentation locally, at 127.0.0.1:8000.

So make sure you have Golang >=1.13, Docker and Golint installed on your development machine.
You will also need a Kubernetes cluster to be able to test your changes out on, either from a cloud provider (for example I like to use DigitalOcean) or running the cluster locally with MiniKube or k3d.

To test out your changes, install the Custom Pod Autoscaler Operator on your cluster, see the install guide here.

Any changes made should be documented (see the docs/ folder), unit tested, and pass the linter (make lint).

Now onto the changes themselves.

So the proposed solution is some kind of hook functionality, which would be configurable by users. The config would look something like this:

models:
- type: HoltWinters
  name: HoltWintersPrediction
  perInterval: 1
  holtWinters:
    alphaFetch:
      method: "GET"
      url: "https://<url>/<endpoint_that_returns_alpha_value>"
      successCodes:
        - 200
      parameterMode: query
    betaFetch:
      method: "GET"
      url: "https://<url>/<endpoint_that_returns_beta_value>"
      successCodes:
        - 200
      parameterMode: query
    gammaFetch:
      method: "GET"
      url: "https://<url>/<endpoint_that_returns_gamma_value>"
      successCodes:
        - 200
      parameterMode: query
    seasonLength: 6
    storedSeasons: 4
    method: "additive"
decisionType: "maximum"
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50

The key bit here is the new alphaFetch, betaFetch, and gammaFetch hooks - these should be built on the 'method' system from the Custom Pod Autoscaler framework, which would save you having to rewrite lots of code:

https://github.com/jthomperoo/custom-pod-autoscaler/tree/master/execute
https://custom-pod-autoscaler.readthedocs.io/en/latest/user-guide/methods/

You could specify that they have to conform to a certain format, e.g. in JSON:

{
    "value": 0.2
}

Or you could just allow them to return the raw value; it's up to you, and I'm not sure which of these approaches is best.

Also, in the configuration, if these hooks are provided the normal hardcoded alpha, beta, and gamma values should be ignored; if they are not provided, the hardcoded values should be used.
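A minimal sketch of how that fallback might look in Go, assuming the hook output arrives as a string (the names parseHookValue and resolveParam are hypothetical, not from the codebase, and the JSON wrapper is the {"value": 0.2} format suggested above):

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// hookResult is a hypothetical wrapper for the suggested JSON
// format, e.g. {"value": 0.2}.
type hookResult struct {
	Value float64 `json:"value"`
}

// parseHookValue accepts either the JSON wrapper or a raw numeric
// string, covering both response formats discussed above.
func parseHookValue(raw string) (float64, error) {
	raw = strings.TrimSpace(raw)
	var r hookResult
	if err := json.Unmarshal([]byte(raw), &r); err == nil {
		return r.Value, nil
	}
	return strconv.ParseFloat(raw, 64)
}

// resolveParam falls back to the hardcoded value when no hook
// output is available, matching the behaviour described above.
func resolveParam(hookOutput *string, hardcoded float64) (float64, error) {
	if hookOutput == nil {
		return hardcoded, nil
	}
	return parseHookValue(*hookOutput)
}

func main() {
	v, _ := parseHookValue(`{"value": 0.2}`)
	fmt.Println(v) // prints 0.2

	hard, _ := resolveParam(nil, 0.9)
	fmt.Println(hard) // prints 0.9
}
```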

The Holt-Winters code should then be updated to execute these hooks, if they are provided, to fetch these tuning values:

https://github.com/jthomperoo/predictive-horizontal-pod-autoscaler/blob/master/prediction/holtwinters/holtwinters.go

I think that covers most of it, I'm sure there are some other complexities that I've missed, if you have any questions feel free to ask me!

Thanks for taking the time to help out!

IMHO, you can serve all three of alpha, beta, and gamma via a single API response. So if parameterMode is query, we only need to define one endpoint and one parameterMode. See if this makes sense:

models:
- type: HoltWinters
  name: HoltWintersPrediction
  perInterval: 1
  holtWinters:
    paramFetchEndpoint: "https://<url>/<endpoint_that_returns_param_values>"
    paramStatusCode: 200
    parameterMode: query
    paramFetchMethod: "GET"
    seasonLength: 6
    storedSeasons: 4
    method: "additive"
decisionType: "maximum"
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50

This keeps the configuration pretty small and leaves the option of using either the old hardcoded parameter values or the fetch API. Let me know your thoughts.

Yes, that's a much better approach! A single query has much lower overhead than three separate ones, good thinking. Perhaps there could be a standard response format, for example:

{
    "alpha_value": 0.2,
    "beta_value": 0.5,
    "gamma_value": 0.7
}

The only thing I would be clear about is using the 'method' pattern from the Custom Pod Autoscaler; much of the work around HTTP requests (timeouts, headers, parameters) has already been done and could save a lot of effort. The other benefit of using the method pattern is that it supports alternatives to HTTP (for example a shell script can be called), and there are future plans for gRPC, protobuf, and websockets support, which the PHPA would receive for free just by upgrading the dependency version.

Here's some snippets from the Custom Pod Autoscaler codebase showing how to use the methods:

Initialisation: https://github.com/jthomperoo/custom-pod-autoscaler/blob/d00bb5f72d382ba07f9b23437e2f8ce844edd18c/cmd/custom-pod-autoscaler/main.go#L154-L167

Configuration: https://github.com/jthomperoo/custom-pod-autoscaler/blob/d00bb5f72d382ba07f9b23437e2f8ce844edd18c/config/config.go#L66-L73

Execution: https://github.com/jthomperoo/custom-pod-autoscaler/blob/d00bb5f72d382ba07f9b23437e2f8ce844edd18c/metric/metric.go#L87-L90

Upon some research I found that the Python statsmodels module can perform the grid search internally using AIC values.
Calling fit(optimized=True) computes optimised values for these parameters, which fit.summary() then reports. This takes some time to compute, and it would be an overhead to repeat the same work in Go. Below are some of the proposals:

  • I propose to transition the prediction task to Python code.
  • Figure out a fixed API contract between the Python application and metric.go.
  • Check the snippet below, which shows the Holt-Winters code from the statsmodels lib.
  • This will open up a way to use multiple prediction algorithms supported by statsmodels, including ARMA and ARIMA.
  • Let me know your thoughts.

import pandas as pd
from matplotlib import pyplot as plt
from statsmodels.tsa.holtwinters import ExponentialSmoothing as HWES

#read the data file. the date column is expected to be in the mm-dd-yyyy format.
df = pd.read_csv('retail_sales_used_car_dealers_us_1992_2020.csv', header=0, infer_datetime_format=True, parse_dates=[0], index_col=[0])
df.index.freq = 'MS'

#plot the data
df.plot()
plt.show()

#split between the training and the test data sets. The last 12 periods form the test data
df_train = df.iloc[:-12]
df_test = df.iloc[-12:]

#build and train the model on the training data
model = HWES(df_train, seasonal_periods=12, trend='add', seasonal='mul')
fitted = model.fit(optimized=True, use_brute=True)

#print out the training summary
print(fitted.summary())

#create an out of sample forecast for the next 12 steps beyond the final data point in the training data set
sales_forecast = fitted.forecast(steps=12)

#plot the training data, the test data and the forecast on the same plot
fig = plt.figure()
fig.suptitle('Retail Sales of Used Cars in the US (1992-2020)')
past, = plt.plot(df_train.index, df_train, 'b.-', label='Sales History')
future, = plt.plot(df_test.index, df_test, 'r.-', label='Actual Sales')
predicted_future, = plt.plot(df_test.index, sales_forecast, 'g.-', label='Sales Forecast')
plt.legend(handles=[past, future, predicted_future])
plt.show()

Currently, the build is failing with error:

go get vbom.ml/util: unrecognized import path "vbom.ml/util" (https fetch: Get https://vbom.ml/util?go-get=1: dial tcp: lookup vbom.ml: no such host)

Since I am not a go developer, I don't know how to fix it.

In response to using Python for Holt-Winters and other methods, I think that could be a great idea; would certainly save a lot of work and open up a bunch of algorithms to add in. Some of the concerns I'd have with it are:

  • Performance - not hugely important, but good to get some kind of benchmarks to make sure it's not going to eat lots of CPU/RAM.
  • Designing a contract/interface for Go-Python interop. It should be fairly flexible to allow non-breaking changes; perhaps this could be abstracted out to execute any shell command rather than being limited to Python, in case it needs to change in the future.
  • Base image concerns - size mainly.
  • Python dependencies - should these be bundled with the base image?

I think if we can get some good benchmarks/thought put into these the Python approach could be the way to go, and would save a whole bunch of time.

I think if we make sure the logic executed in Python is minimal, restricted to just the algorithms themselves (ARIMA, Holt-Winters, etc.), and Go handles all configuration and just feeds into it, that could address all of those.
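As a rough illustration of that split, here is a sketch assuming a JSON-over-stdin/stdout contract, where Go prepares the input and shells out to the algorithm (all type, field, and function names here are hypothetical, not the project's actual interface):

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"
)

// algorithmInput and algorithmResult sketch a possible Go<->Python
// contract: Go sends the series and season config as JSON on stdin,
// and the algorithm replies with its forecast as JSON on stdout.
type algorithmInput struct {
	Series       []float64 `json:"series"`
	SeasonLength int       `json:"seasonLength"`
}

type algorithmResult struct {
	Forecast []float64 `json:"forecast"`
}

// runAlgorithm executes an external command (in practice something
// like `python3 holtwinters.py`), feeding it the JSON input and
// decoding the JSON it writes back.
func runAlgorithm(command string, args []string, in algorithmInput) (*algorithmResult, error) {
	payload, err := json.Marshal(in)
	if err != nil {
		return nil, err
	}
	cmd := exec.Command(command, args...)
	cmd.Stdin = bytes.NewReader(payload)
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("algorithm failed: %w", err)
	}
	var result algorithmResult
	if err := json.Unmarshal(out, &result); err != nil {
		return nil, err
	}
	return &result, nil
}

func main() {
	// Demonstrate the round trip with `cat`, which simply echoes the
	// JSON back; a real deployment would invoke the Python script.
	res, err := runAlgorithm("cat", nil, algorithmInput{Series: []float64{1, 2, 3}, SeasonLength: 3})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", res)
}
```

Keeping the contract at the process boundary like this means the executable does not have to be Python at all, which matches the "any shell command" flexibility mentioned above.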

I'll create a new ticket to track that.

As to your build failing, I'm not sure why that's happening; I tried with a fresh clone and it builds fine for me with both make and make docker - could you post your Go version?

go version

Let me try and answer these:

  • Performance - should be OK considering the amount of data we pass in.
  • Go-Python interop contract - I believe we should ship that as a separate container and call it over HTTP/gRPC. That will allow the algorithms to be abstracted and self-contained, while the application builds on something like an abstract pattern.
  • Base image size - just like above, a separate container should solve this problem.
  • Python dependencies - as I said, we should pack them into that separate container.

Let me know your thoughts.

Go version

go version go1.13.5 darwin/amd64

I'm using Go version go1.14.6. I don't think the version of Go being used should affect it at all; try running go mod vendor to pull down any dependencies.

I'm not entirely sure that putting the Python logic into its own container is the best approach - this would add a lot of complexity, handling multiple containers and publishing both to be pulled down at deploy time. I'm not clear on the benefits of having it in its own container - could it not just be triggered by executing a shell command against some Python code that is baked into the same image?

I think the advantage of having the Python code in its own container is that it allows users to bring their own logic, while the project can focus on supporting certain frameworks (Holt-Winters, ARIMA, etc.). As long as they support the same contract we do, it should be OK.
Additionally, it could allow us to use something like CloudWatch and Stackdriver predictions too.
Let me know your thoughts.

Hey, sorry for the late reply. I think that sounds like a good approach - perhaps you could put together a prototype/proof of concept of the two-container setup so we could work on it a bit further and flesh out the Go-Python contract.

Thanks!

Great, sorry for the late reply from my end too. Let me try and figure something out by next week.

The pull request #39 adds in runtime Holt-Winters tuning; I'm going to have a look at swapping the algorithms to Python as part of #38.