TensorFlow Model Deployment for R

Overview

While TensorFlow models are typically defined and trained using R or Python code, it is possible to deploy TensorFlow models in a wide variety of environments without any runtime dependency on R or Python:

  • TensorFlow Serving is an open-source software library for serving TensorFlow models using a gRPC interface.

  • CloudML is a managed cloud service that serves TensorFlow models using a REST interface.

  • RStudio Connect provides support for serving models using the same REST API as CloudML, but on a server within your own organization.

TensorFlow models can also be deployed to mobile and embedded devices including iOS and Android mobile phones and Raspberry Pi computers.

The R interface to TensorFlow includes a variety of tools designed to make exporting and serving TensorFlow models straightforward. The basic process for deploying TensorFlow models from R is as follows (a condensed sketch of these steps appears after the list):

  • Train a model using the keras, tfestimators, or tensorflow R packages.

  • Call the export_savedmodel() function on your trained model to write it to disk as a TensorFlow SavedModel.

  • Use the predict_savedmodel() and serve_savedmodel() functions from the tfdeploy package to generate predictions from the exported model and to run a local test server that supports the same REST API as CloudML and RStudio Connect.

  • Deploy your model using TensorFlow Serving, CloudML, or RStudio Connect.
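In condensed form, and assuming a trained model object named model plus a list of new observations called new_instances (both hypothetical placeholders), the workflow looks roughly like this:

library(tfdeploy)

# 1. train a model with keras, tfestimators, or tensorflow (assumed here as `model`)

# 2. export the trained model to disk as a TensorFlow SavedModel
export_savedmodel(model, "savedmodel")

# 3. test locally: generate predictions from the exported model and
#    serve the same REST API used by CloudML and RStudio Connect
predict_savedmodel(new_instances, "savedmodel", type = "export")
serve_savedmodel("savedmodel")

# 4. deploy the exported "savedmodel" directory using TensorFlow Serving,
#    CloudML (e.g. cloudml::cloudml_deploy()), or RStudio Connect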

Getting Started

Begin by installing the tfdeploy package from GitHub as follows:

devtools::install_github("rstudio/tfdeploy")

Next, we'll walk through an end-to-end example using a model trained with the keras package. After that, we'll describe in more depth the specific requirements and options associated with exporting models. Finally, we'll cover the various deployment options and provide links to additional documentation.

Keras Example

We'll use a Keras model that recognizes handwritten digits from the MNIST dataset as an example. MNIST consists of 28 x 28 grayscale images of handwritten digits.

The dataset also includes a label for each image, telling us which digit it is. For example, the labels for the first four training images are 5, 0, 4, and 1.
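For example, we can check the first few training labels directly from R (this assumes the keras package and the dataset_mnist() helper used in the model code below):

library(keras)

# labels for the first four MNIST training images
mnist <- dataset_mnist()
mnist$train$y[1:4]
# 5 0 4 1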

Here's the complete source code for the model:

library(keras)

# load data
c(c(x_train, y_train), c(x_test, y_test)) %<-% dataset_mnist()

# reshape and rescale
x_train <- array_reshape(x_train, dim = c(nrow(x_train), 784)) / 255
x_test <- array_reshape(x_test, dim = c(nrow(x_test), 784)) / 255

# one-hot encode response
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)

# define and compile model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784),
              name = "image") %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax',
              name = "prediction") %>%
  compile(
    loss = 'categorical_crossentropy',
    optimizer = optimizer_rmsprop(),
    metrics = c('accuracy')
  )

# train model
history <- model %>% fit(
  x_train, y_train,
  epochs = 30, batch_size = 128,
  validation_split = 0.2
)

Note that we have given the first and last layers names ("image" and "prediction" respectively). You should always provide sensible names for your input and output layers when creating Keras models that you plan to deploy.

As a baseline, let's use the in-memory Keras model to predict the digits for the first 10 images in the test set (next, we'll see how to generate predictions from an exported model, with no reference to the original code used to train it):

preds <- predict(model, x_test[1:10,])
round(preds, 4)
        [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]
 [1,] 0.0160 0.0544 0.0420 0.1529 0.1236 0.4482 0.0252 0.0356 0.0579 0.0443
 [2,] 0.0164 0.0568 0.0447 0.1324 0.0926 0.5013 0.0301 0.0252 0.0587 0.0419
 [3,] 0.0135 0.0494 0.0366 0.1349 0.0893 0.5265 0.0253 0.0247 0.0557 0.0440
 [4,] 0.0196 0.0641 0.0480 0.1545 0.1416 0.4003 0.0308 0.0411 0.0583 0.0419
 [5,] 0.0163 0.0612 0.0474 0.1376 0.1595 0.4141 0.0272 0.0392 0.0549 0.0426
 [6,] 0.0136 0.0467 0.0365 0.1349 0.0849 0.5299 0.0253 0.0237 0.0585 0.0460
 [7,] 0.0136 0.0562 0.0400 0.1391 0.1513 0.4449 0.0254 0.0322 0.0545 0.0426
 [8,] 0.0145 0.0575 0.0440 0.1382 0.1568 0.4320 0.0254 0.0312 0.0521 0.0483
 [9,] 0.0157 0.0541 0.0391 0.1347 0.1064 0.4903 0.0305 0.0275 0.0610 0.0407
[10,] 0.0159 0.0558 0.0439 0.1383 0.1706 0.4041 0.0261 0.0365 0.0611 0.0476

The values displayed represent the probabilities that each image is the respective digit (so the prediction for a given image is the column with the largest probability).
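For example, the predicted digit for each image is simply the column with the largest probability (minus 1, since column 1 corresponds to digit 0):

# convert the probability matrix into predicted digits (columns 1-10 = digits 0-9)
apply(preds, 1, which.max) - 1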

It's straightforward to generate predictions when we have access to the original R code used to train the model, but what if we want to deploy the model in an environment where R isn't available? The next sections cover doing this with the tfdeploy package.

Exporting the Model

After training, the next step is to export the model as a TensorFlow SavedModel using the export_savedmodel() function:

library(tfdeploy)
export_savedmodel(model, "savedmodel")

This will create a "savedmodel" directory that contains a saved version of your MNIST model. You can view the graph of your model using TensorBoard with the view_savedmodel() function:

view_savedmodel("savedmodel")
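If you're curious what export_savedmodel() actually wrote to disk, you can list the contents of the directory. A SavedModel consists of a serialized graph plus a variables subdirectory containing the weights (the exact file names may vary with the TensorFlow version):

# inspect the files written by export_savedmodel()
list.files("savedmodel", recursive = TRUE)
# e.g. "saved_model.pb", "variables/variables.data-00000-of-00001",
#      "variables/variables.index"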

Generating Predictions

You can generate predictions from the exported model using the predict_savedmodel() function. Unlike the predict() function above, which takes an N-dimensional array whose first dimension indexes the instances, predict_savedmodel() requires a list() of instances to generate predictions for. So we first transform the 10x784 matrix of test images into a 10-element list of length-784 vectors:

test_images <- x_test[1:10,]
test_images <- lapply(1:nrow(test_images), function(i) test_images[i,])

Now we can call predict_savedmodel(), passing the test images (as a list) and specifying the directory where the model was saved:

preds <- predict_savedmodel(
  test_images,
  "savedmodel", 
  type = "export"
)
preds
$predictions
                                                                       prediction
1  0.0160, 0.0544, 0.0420, 0.1529, 0.1236, 0.4482, 0.0252, 0.0356, 0.0579, 0.0443
2  0.0164, 0.0568, 0.0447, 0.1324, 0.0926, 0.5013, 0.0301, 0.0252, 0.0587, 0.0419
3  0.0135, 0.0494, 0.0366, 0.1349, 0.0893, 0.5265, 0.0253, 0.0247, 0.0557, 0.0440
4  0.0196, 0.0641, 0.0480, 0.1545, 0.1416, 0.4003, 0.0308, 0.0411, 0.0583, 0.0419
5  0.0163, 0.0612, 0.0474, 0.1376, 0.1595, 0.4141, 0.0272, 0.0392, 0.0549, 0.0426
6  0.0136, 0.0467, 0.0365, 0.1349, 0.0849, 0.5299, 0.0253, 0.0237, 0.0585, 0.0460
7  0.0136, 0.0562, 0.0400, 0.1391, 0.1513, 0.4449, 0.0254, 0.0322, 0.0545, 0.0426
8  0.0145, 0.0575, 0.0440, 0.1382, 0.1568, 0.4320, 0.0254, 0.0312, 0.0521, 0.0483
9  0.0157, 0.0541, 0.0391, 0.1347, 0.1064, 0.4903, 0.0305, 0.0275, 0.0610, 0.0407
10 0.0159, 0.0558, 0.0439, 0.1383, 0.1706, 0.4041, 0.0261, 0.0365, 0.0611, 0.0476

Note that this function can be called without defining or loading the Keras model (in fact no reference to the code originally used to build and train the model is required). As we will see below, we can even generate predictions from other languages using an HTTP/REST interface to the model!

Local Server

The tfdeploy package includes a local server which you can use to test the HTTP/REST interface to your model before deployment. To serve a model locally, use the serve_savedmodel() function:

serve_savedmodel("savedmodel")
Starting server under http://127.0.0.1:8089 with the following API entry points:
  http://127.0.0.1:8089/api/serving_default/predict/

You can now call the model from another R session using the predict_savedmodel() function. Here we'll pass the same MNIST test images we used earlier, but this time to the HTTP endpoint rather than to the exported model on disk:

library(keras)

# prepare list of test images
c(c(x_train, y_train), c(x_test, y_test)) %<-% dataset_mnist()
x_test <- array_reshape(x_test, dim = c(nrow(x_test), 784)) / 255
test_images <- x_test[1:10,]
test_images <- lapply(1:nrow(test_images), function(i) test_images[i,])

# invoke webapi to generate predictions
library(tfdeploy)
preds <- predict_savedmodel(
  test_images,
  "http://localhost:8089/api/serving_default/predict", 
  type = "webapi"
)

You could also call the model from another language entirely, since the interface is just HTTP and REST. For example, if the following JSON is submitted to the endpoint as a request body, it will return predictions:

{
  "instances": [
    {
      "image_input": [0,0,0,...,0,0]
    },
    {
      "image_input": [0,0,0,...,0,0]
    },
    {
      "image_input": [0,0,0,...,0,0]
    },
    
    // ...more instances
  ]
}
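For example, here is a minimal sketch of posting such a request from R using the httr package (httr isn't used elsewhere in this document; it simply stands in for whatever HTTP client your language of choice provides, and the all-zero pixel vectors are placeholders for real image data):

library(httr)

# two instances, each supplying the "image_input" tensor as a length-784 vector
body <- list(
  instances = list(
    list(image_input = rep(0, 784)),
    list(image_input = rep(0, 784))
  )
)

# POST the request as JSON to the local prediction endpoint
response <- POST(
  "http://localhost:8089/api/serving_default/predict",
  body = body,
  encode = "json"
)

# parse the JSON response into an R list of predictions
content(response, as = "parsed")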

Additional details on the JSON schema used for the REST interface are provided in the Model Deployment section below.

Remote Deployment

Once you have tested your model locally you can deploy it to a server. There are a number of available options for this including TensorFlow Serving, CloudML, and RStudio Connect. For example, if we wanted to deploy our saved model to CloudML we could do this:

library(cloudml)
cloudml_deploy("savedmodel", name = "keras_mnist", version = "keras_mnist_1")

Now that we've walked through a simple end-to-end example, we'll describe the processes of Model Export and Model Deployment in more detail.

Model Export

TensorFlow SavedModel defines a language-neutral format to save machine-learned models that is recoverable and hermetic. It enables higher-level systems and tools to produce, consume and transform TensorFlow models.

The export_savedmodel() function creates a SavedModel from a model trained using the keras, tfestimators, or tensorflow R packages. There are subtle differences in how this works in practice depending on the package you are using.

keras

The Keras Example above includes complete example code for creating and using SavedModel instances from Keras so we won't repeat all of those details here.

To export a TensorFlow SavedModel from a Keras model, simply call the export_savedmodel() function on any Keras model:

export_savedmodel(model, "savedmodel")
Keras learning phase set to 0 for export (restart R session before doing additional training)

Note that the message printed indicates that a side effect of exporting the model was setting the Keras "learning phase" (whether the model is training or doing inference) to 0. This is necessary to export Keras models to TensorFlow, and it implies that you shouldn't do additional training within your R session after calling export_savedmodel() (Keras will be hard-coded to inference mode, which means it won't update weights as data flows through its graph).

tfestimators

Exporting a TensorFlow SavedModel from a TF Estimators model works exactly the same way: simply call export_savedmodel() on the estimator. Here is a complete example:

library(tfestimators)

mtcars_input_fn <- function(data, num_epochs = 1) {
  input_fn(data,
           features = c("disp", "cyl"),
           response = "mpg",
           batch_size = 32,
           num_epochs = num_epochs)
}

cols <- feature_columns(column_numeric("disp"), column_numeric("cyl"))

model <- linear_regressor(feature_columns = cols)

indices <- sample(1:nrow(mtcars), size = 0.80 * nrow(mtcars))
train <- mtcars[indices, ]
test  <- mtcars[-indices, ]

model %>% train(mtcars_input_fn(train, num_epochs = 10))

export_savedmodel(model, "savedmodel")

We can now generate predictions from the model as follows:

# transform data frame records into list of named lists
test_records <- mtcars[1:5, c("disp", "cyl")]
test_records <- lapply(1:nrow(test_records), function(i) as.list(test_records[i,]))

# generate predictions
library(tfdeploy)
preds <- predict_savedmodel(
  test_records,
  "savedmodel",  
  type = "export",
  signature_name = "predict"
)
preds
$predictions
  predictions
1     13.3829
2     13.3829
3      9.1437
4     20.5719
5     28.4791

tensorflow

The tensorflow package provides a lower-level interface to the TensorFlow API. You can also use the export_savedmodel() function to export models created with this API; however, you need to provide some additional parameters indicating which tensors represent the inputs and outputs of your model.

For example, here's an MNIST model using the core TensorFlow API along with the requisite call to export_savedmodel():

library(tensorflow)

sess <- tf$Session()
datasets <- tf$contrib$learn$datasets
mnist <- datasets$mnist$read_data_sets("MNIST-data", one_hot = TRUE)

x <- tf$placeholder(tf$float32, shape(NULL, 784L))
W <- tf$Variable(tf$zeros(shape(784L, 10L)))
b <- tf$Variable(tf$zeros(shape(10L)))
y <- tf$nn$softmax(tf$matmul(x, W) + b)
y_ <- tf$placeholder(tf$float32, shape(NULL, 10L))
cross_entropy <- tf$reduce_mean(
  -tf$reduce_sum(y_ * tf$log(y), reduction_indices=1L)
)

optimizer <- tf$train$GradientDescentOptimizer(0.5)
train_step <- optimizer$minimize(cross_entropy)

init <- tf$global_variables_initializer()
sess$run(init)

for (i in 1:1000) {
  batches <- mnist$train$next_batch(100L)
  batch_xs <- batches[[1]]
  batch_ys <- batches[[2]]
  sess$run(train_step,
           feed_dict = dict(x = batch_xs, y_ = batch_ys))
}

export_savedmodel(
  sess,
  "savedmodel",
  inputs = list(images = x),
  outputs = list(scores = y))

You can then generate predictions as follows:

# prepare list of test images
test_images <- mnist$test$next_batch(10L)[[1]]
test_images <- lapply(1:nrow(test_images), function(i) test_images[i,])

# generate predictions
library(tfdeploy)
preds <- predict_savedmodel(
  test_images, 
  "savedmodel",
  type = "export"
)

Model Deployment

There are a variety of ways to deploy a TensorFlow SavedModel, each of which is described below. Of the four methods described, three (the local server, CloudML, and RStudio Connect) share the same REST interface, which is described in detail here: https://cloud.google.com/ml-engine/docs/v1/predict-request.

Local Server

The first place you are likely to "deploy" a SavedModel is on your local system, for the purpose of testing and refining the prediction API. The serve_savedmodel() function runs a local server that serves your model:

serve_savedmodel("savedmodel")
Starting server under http://127.0.0.1:8089 with the following API entry points:
  http://127.0.0.1:8089/api/serving_default/predict/

The REST API used by the local server is based on the CloudML predict request API.

If you navigate to http://localhost:8089 you'll see a web page that describes the REST interface to your model.

You can request predictions remotely from any language or environment using this REST API. You can also call the predict_savedmodel() function from R:

predict_savedmodel(
  input_data,
  "http://localhost:8089/api/serving_default/predict", 
  type = "webapi"
)

CloudML

You can deploy TensorFlow SavedModels to Google's CloudML service using functions from the cloudml package. For example:

library(cloudml)
cloudml_deploy("savedmodel", name = "keras_mnist")

You can generate predictions using the cloudml_predict() function:

cloudml_predict(test_images, name = "keras_mnist")

See the Deploying Models article on the CloudML package website for additional details.

RStudio Connect

RStudio Connect is a server publishing platform for applications, reports, and APIs created with R.

An upcoming version of RStudio Connect will include support for hosting TensorFlow SavedModels, using the same REST interface as is supported by the local server and CloudML.

TensorFlow Serving

TensorFlow Serving is an open-source library and server implementation that allows you to serve TensorFlow SavedModels using a gRPC interface.

Once you have exported a TensorFlow model using export_savedmodel() it's straightforward to deploy it using TensorFlow Serving. See the documentation at https://www.tensorflow.org/serving for additional details.