Earlier this year, Google unveiled its Tensor Processing Unit, custom hardware for speeding up prediction-making with machine learning models.
Now Microsoft is trying something similar with its Project Brainwave hardware, which supports many major deep learning systems in wide use. Project Brainwave shares many of the same goals as Google’s TPU: to speed up how predictions are served from machine learning models (in Brainwave’s case, models hosted in Azure and run on custom hardware deployed at scale in Microsoft’s cloud).
Of course, developers need tools to do deep learning via Brainwave. Microsoft says Brainwave already supports Google’s TensorFlow and Microsoft’s own Cognitive Toolkit, and that it plans to support “many others.” Brainwave converts existing models built with those frameworks to a format that can run natively on Microsoft’s silicon, although it isn’t yet clear how much of a bottleneck that conversion creates when porting existing models.
With Brainwave, Microsoft claims speed improvements above and beyond just using dedicated silicon. Its FPGAs (field-programmable gate arrays) are attached directly to the network fabric in the datacenter, so a deep neural network can be tied as closely as possible to a specific FPGA. Microsoft claims this high-throughput design makes it easier to create deep learning applications that run in real time, rather than requiring long offline training periods.
Microsoft also claims it is taking a different approach to running predictions on FPGAs. Deep learning predictions can be delivered faster by trading some accuracy for speed, and the Brainwave FPGAs can choose among data types of higher or lower precision, depending on the needs of the specific problem.
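To make the precision-for-speed trade concrete, here is a minimal Python sketch of generic symmetric linear quantization at several bit widths. It is an illustration only: the article does not describe Brainwave's actual data formats, and the function names (`quantize`, `dequantize`, `max_error`) and the random test weights are assumptions made for this example.

```python
import random


def quantize(values, bits):
    """Map floats to signed integer codes of the given bit width.

    Generic symmetric linear quantization, used here for illustration;
    it is NOT Brainwave's actual number format.
    """
    qmax = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from integer codes."""
    return [c * scale for c in codes]


def max_error(values, bits):
    """Largest absolute round-trip error at the given bit width."""
    codes, scale = quantize(values, bits)
    restored = dequantize(codes, scale)
    return max(abs(v - r) for v, r in zip(values, restored))


def dot(a, b):
    """Plain dot product, standing in for one neuron's weighted sum."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(0)                  # hypothetical weights and inputs
    weights = [rng.uniform(-1, 1) for _ in range(256)]
    inputs = [rng.uniform(-1, 1) for _ in range(256)]
    exact = dot(weights, inputs)
    for bits in (16, 8, 4):
        codes, scale = quantize(weights, bits)
        approx = dot(dequantize(codes, scale), inputs)
        print(f"{bits:2d} bits: max weight error {max_error(weights, bits):.6f}, "
              f"dot-product error {abs(exact - approx):.6f}")
```

Narrower codes take less memory and can be processed with simpler arithmetic, which is where the speed comes from; the cost is the rounding error the script reports, which grows as the bit width shrinks. How much error is tolerable depends on the model and the problem.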