Announcing TensorFlow Lite

November 14, 2017


Posted by the TensorFlow team
Today, we're happy to announce the developer preview of TensorFlow Lite, TensorFlow’s lightweight solution for mobile and embedded devices! TensorFlow has always run on many platforms, from racks of servers to tiny IoT devices, but as the adoption of machine learning models has grown exponentially over the last few years, so has the need to deploy them on mobile and embedded devices. TensorFlow Lite enables low-latency inference of on-device machine learning models.

It is designed from scratch to be:
  • Lightweight: Enables inference of on-device machine learning models with a small binary size and fast initialization/startup
  • Cross-platform: A runtime designed to run on many different platforms, starting with Android and iOS
  • Fast: Optimized for mobile devices, including dramatically improved model loading times and support for hardware acceleration
More and more mobile devices today incorporate purpose-built custom hardware to process ML workloads more efficiently. TensorFlow Lite supports the Android Neural Networks API to take advantage of these new accelerators as they become available.
TensorFlow Lite falls back to optimized CPU execution when accelerator hardware is not available, which ensures your models can still run fast on a large set of devices.
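As a rough illustration of how an app opts in, here is a minimal sketch that requests NNAPI acceleration through the C++ API described in the Architecture section below. It assumes the developer-preview source layout under tensorflow/contrib/lite and that the Interpreter exposes a UseNNAPI toggle; the model file name is a placeholder and error handling is omitted.

```cpp
#include <memory>

#include "tensorflow/contrib/lite/interpreter.h"
#include "tensorflow/contrib/lite/kernels/register.h"
#include "tensorflow/contrib/lite/model.h"

int main() {
  // Load a TensorFlow Lite Model File and build an Interpreter for it.
  auto model = tflite::FlatBufferModel::BuildFromFile("mobilenet.tflite");
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);

  // Request hardware acceleration via the Android Neural Networks API
  // (assumed toggle). If NNAPI or a suitable driver is not available,
  // execution stays on the optimized CPU kernels.
  interpreter->UseNNAPI(true);

  interpreter->AllocateTensors();
  interpreter->Invoke();
  return 0;
}
```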

Architecture

The following diagram shows the architectural design of TensorFlow Lite:
The individual components are:
  • TensorFlow Model: A trained TensorFlow model saved on disk.
  • TensorFlow Lite Converter: A program that converts the model to the TensorFlow Lite file format.
  • TensorFlow Lite Model File: A model file format based on FlatBuffers that has been optimized for maximum speed and minimum size.
The TensorFlow Lite Model File is then deployed within a Mobile App, where:
  • Java API: A convenience wrapper around the C++ API on Android
  • C++ API: Loads the TensorFlow Lite Model File and invokes the Interpreter. The same library is available on both Android and iOS (a minimal usage sketch follows this list).
  • Interpreter: Executes the model using a set of operators. The interpreter supports selective operator loading; without operators it is only 70KB, and 300KB with all the operators loaded. This is a significant reduction from the 1.5MB required by TensorFlow Mobile (with a normal set of operators).
  • On select Android devices, the Interpreter will use the Android Neural Networks API for hardware acceleration, or default to CPU execution if none are available.
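To make the flow above concrete, here is a minimal sketch of the C++ API path: load the Model File, build the Interpreter with the builtin op resolver, and run one inference. The file name, the 224x224x3 float input, and the float output are illustrative assumptions rather than part of the announcement, and the headers assume the developer-preview layout under tensorflow/contrib/lite.

```cpp
#include <algorithm>
#include <memory>

#include "tensorflow/contrib/lite/interpreter.h"
#include "tensorflow/contrib/lite/kernels/register.h"
#include "tensorflow/contrib/lite/model.h"

int main() {
  // Load the FlatBuffer-based TensorFlow Lite Model File from disk.
  auto model = tflite::FlatBufferModel::BuildFromFile("mobilenet.tflite");
  if (model == nullptr) return 1;

  // The op resolver maps the operators referenced by the model to kernel
  // implementations; linking only the operators you need keeps the binary small.
  tflite::ops::builtin::BuiltinOpResolver resolver;

  // Build the Interpreter that will execute the model and allocate its tensors.
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  if (interpreter == nullptr || interpreter->AllocateTensors() != kTfLiteOk) return 1;

  // Fill the first input tensor (a 224x224x3 float image is assumed here).
  float* input = interpreter->typed_input_tensor<float>(0);
  std::fill(input, input + 224 * 224 * 3, 0.0f);

  // Run inference and read the scores back from the first output tensor.
  if (interpreter->Invoke() != kTfLiteOk) return 1;
  const float* scores = interpreter->typed_output_tensor<float>(0);
  return scores != nullptr ? 0 : 1;
}
```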
Developers can also implement custom kernels using the C++ API, which the Interpreter can then use, as sketched below.
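The following is a minimal sketch of what such a custom kernel could look like; the op name "MyCustomOp", the registration function, and the no-op Prepare/Eval bodies are placeholders, and the headers again assume the developer-preview layout under tensorflow/contrib/lite.

```cpp
#include "tensorflow/contrib/lite/context.h"
#include "tensorflow/contrib/lite/kernels/register.h"

namespace {

// Placeholder kernel: a real kernel would read its input tensors from `node`,
// compute, and write results into its output tensors.
TfLiteStatus MyOpPrepare(TfLiteContext* context, TfLiteNode* node) {
  return kTfLiteOk;
}

TfLiteStatus MyOpEval(TfLiteContext* context, TfLiteNode* node) {
  return kTfLiteOk;
}

}  // namespace

TfLiteRegistration* Register_MY_CUSTOM_OP() {
  static TfLiteRegistration registration = {nullptr, nullptr, MyOpPrepare, MyOpEval};
  return &registration;
}

// Make the kernel visible to the Interpreter under the name used in the model,
// then build the Interpreter with this resolver as in the sketch above.
void RegisterMyCustomOp(tflite::ops::builtin::BuiltinOpResolver* resolver) {
  resolver->AddCustom("MyCustomOp", Register_MY_CUSTOM_OP());
}
```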

Models

TensorFlow Lite already has support for a number of models that have been trained and optimized for mobile:
  • MobileNet: A family of vision models able to distinguish among 1,000 different object classes, specifically designed for efficient execution on mobile and embedded devices
  • Inception v3: An image recognition model, similar in functionality to MobileNet, that offers higher accuracy but also has a larger size
  • Smart Reply: An on-device conversational model that provides one-touch replies to incoming chat messages. First-party and third-party messaging apps use this feature on Android Wear.
Inception v3 and MobileNets have been trained on the ImageNet dataset. You can easily retrain these on your own image datasets through transfer learning.

What About TensorFlow Mobile?

As you may know, TensorFlow already supports mobile and embedded deployment of models through the TensorFlow Mobile API. Going forward, TensorFlow Lite should be seen as the evolution of TensorFlow Mobile, and as it matures it will become the recommended solution for deploying models on mobile and embedded devices. With this announcement, TensorFlow Lite is made available as a developer preview, while TensorFlow Mobile remains available to support production apps.
The scope of TensorFlow Lite is large and still under active development. With this developer preview, we have intentionally started with a constrained platform to ensure performance on some of the most important common models. We plan to prioritize future functional expansion based on the needs of our users. The goals for our continued development are to simplify the developer experience, and enable model deployment for a range of mobile and embedded devices.
We are excited that developers are getting their hands on TensorFlow Lite. We plan to support and engage with our external community with the same intensity as the rest of the TensorFlow project. We can't wait to see what you can do with TensorFlow Lite.
For more information, check out the TensorFlow Lite documentation pages.
Stay tuned for more updates.
Happy TensorFlow Lite coding!