Google has just moved to a production release of TensorFlow Serving, its open source library for serving machine-learned models in production environments. A beta version of the technology was released in February.
Part of Google’s TensorFlow machine intelligence project, the TensorFlow Serving 1.0 library is intended to aid the deployment of algorithms and experiments while maintaining the same server architecture and APIs. TensorFlow Serving lets you push out multiple versions of models over time, as well as roll them back.
The library integrates with TensorFlow models out of the box, but it can also be extended to serve other model types.
A pre-built binary is available for the software. A Docker container can be used to install the server binary on non-Linux systems.
The three major aspects of TensorFlow Serving 1.0 are:
- A set of C++ libraries that offer standard support for serving and learning TensorFlow models, as well as a generic core platform not tied to TensorFlow.
- Binaries incorporating best practices and featuring reference Docker containers and tutorials.
- A hosted service with the Google Cloud ML platform, and an internal instance used by many Google products.
The principal concepts in TensorFlow Serving are:
- Servables, which are underlying objects used by clients for computation and the central abstraction in TensorFlow Serving.
- Loaders, which manage a servable's life cycle through standardized APIs for loading and unloading it.
- Sources, which are plugin modules that originate servables.
- Managers, which handle the full life cycle of servables, including loading, serving, and unloading them.
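To make the relationship between these four concepts concrete, here is a minimal sketch in plain Python. It is a toy illustration only: TensorFlow Serving itself is a C++ library, and every class, method, and model name below (`Servable`, `Loader`, `Source`, `Manager`, `serve`, `rollback`, `"scaler"`) is hypothetical, chosen to mirror the vocabulary above rather than the library's real API. It also shows, under those assumptions, how pushing a new version and rolling it back might look.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A "model" in this toy is just a function from float to float.
Model = Callable[[float], float]


@dataclass
class Servable:
    """The object a client computes with (hypothetical stand-in)."""
    name: str
    version: int
    predict: Model


class Loader:
    """Loads and unloads exactly one version of one servable."""

    def __init__(self, name: str, version: int, factory: Callable[[], Model]):
        self.name = name
        self.version = version
        self._factory = factory
        self.servable: Optional[Servable] = None

    def load(self) -> None:
        self.servable = Servable(self.name, self.version, self._factory())

    def unload(self) -> None:
        self.servable = None


class Source:
    """Originates servables by producing one loader per known version."""

    def __init__(self, name: str, factories: Dict[int, Callable[[], Model]]):
        self.name = name
        self.factories = factories

    def loaders(self) -> List[Loader]:
        return [Loader(self.name, v, f) for v, f in sorted(self.factories.items())]


class Manager:
    """Tracks loaders and decides which version is currently served."""

    def __init__(self) -> None:
        self._loaders: Dict[Tuple[str, int], Loader] = {}
        self._history: Dict[str, List[int]] = {}  # served versions, newest last

    def add(self, loader: Loader) -> None:
        self._loaders[(loader.name, loader.version)] = loader

    def serve(self, name: str, version: int) -> None:
        loader = self._loaders[(name, version)]
        if loader.servable is None:
            loader.load()
        self._history.setdefault(name, []).append(version)

    def rollback(self, name: str) -> None:
        history = self._history.get(name, [])
        if len(history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        retired = history.pop()
        # Unload the retired version unless an earlier entry still needs it.
        if retired not in history:
            self._loaders[(name, retired)].unload()

    def get(self, name: str) -> Servable:
        version = self._history[name][-1]
        servable = self._loaders[(name, version)].servable
        assert servable is not None  # every version in the history stays loaded
        return servable


# Usage: push version 1, then version 2, then roll back to version 1.
source = Source("scaler", {1: lambda: (lambda x: x * 2.0),
                           2: lambda: (lambda x: x * 3.0)})
manager = Manager()
for loader in source.loaders():
    manager.add(loader)

manager.serve("scaler", 1)
manager.serve("scaler", 2)
print(manager.get("scaler").version)  # 2
manager.rollback("scaler")
print(manager.get("scaler").version)  # 1
```

The point of the split is that clients only ever talk to the manager and the servable it returns, so versions can change underneath them without any change to the calling code.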
The TensorFlow Serving 1.0 release deprecates the legacy SessionBundle model format. It is replaced with the SavedModel format introduced with TensorFlow 1.0.
Google says more than 800 of its own projects use TensorFlow Serving in production.