Over the past year, Google’s TensorFlow has established itself as a popular open source toolkit for deep learning. But training a TensorFlow model can be cumbersome and slow, especially when the goal is to take a dataset and model setup built by someone else and refine the training process behind it. The sheer number of moving parts and variations in any model-training process is enough to make even deep-learning experts take a deep breath.
This week, Google open-sourced a project intended to cut down on the amount of work in configuring a deep learning model for training. Tensor2Tensor, or T2T for short, is a Python-powered workflow organization library for TensorFlow training jobs. It lets developers specify the key elements used in a TensorFlow model and define the relationships among them.
Here are the key elements:
- Datasets: T2T has built-in support for several common datasets used for training. You can add new datasets to your individual workflows, or add them to the core T2T project via a pull request.
- Problems and modalities: These describe what kind of task the training is for, such as speech recognition versus machine translation, and what kinds of data it consumes and produces. For example, an image-recognition problem would take in images and return text labels.
- Models: Many commonly used models are already registered with T2T, but you can add more.
- Hyperparameters: You can create sets of the various settings that control the training process, so you can switch among them or batch them together as needed.
- Trainers: You can separately specify the parameters passed to the actual training binary.
T2T comes with defaults for each element, which is what’s most immediately useful about it. Several common models and datasets are baked into T2T, so you can get started quickly by deploying one of the defaults, then reusing or expanding on it as needed.
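Tinkering with a default typically means starting from a base hyperparameter set and overriding only the handful of values you want to change. The sketch below shows that layering idea in plain Python; the set names and values are made up for illustration, not T2T’s shipped defaults.

```python
# Sketch of layered hyperparameter sets: a base set of defaults, plus
# named variants that override only what differs. Values are illustrative.

def base_hparams():
    """The shared defaults every variant starts from."""
    return {"batch_size": 4096, "learning_rate": 0.2, "num_layers": 6}

def small_hparams():
    """A variant for quick experiments: fewer layers, same everything else."""
    hp = base_hparams()       # start from the defaults...
    hp.update(num_layers=2)   # ...and override only what differs
    return hp

hp = small_hparams()
print(hp["num_layers"])   # → 2
print(hp["batch_size"])   # → 4096, inherited from the base set
```

Keeping variants as small deltas on a base set makes it easy to switch among configurations, or to sweep several of them in a batch, without copying the full settings each time.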
What T2T doesn’t do is provide a larger context beyond TensorFlow for how to organize a deep learning project. Theoretically, it could become part of an end-to-end, data-to-prediction system for building machine learning solutions, but right now it simply makes the job of using TensorFlow easier—and that is absolutely worth having.