Search for "LiteRT"

25 results

  • FEB. 3, 2026 / AI

    Easy FunctionGemma finetuning with Tunix on Google TPUs

    Finetuning the FunctionGemma model is made fast and easy using the lightweight JAX-based Tunix library on Google TPUs, a process demonstrated here using LoRA for supervised finetuning. This approach delivers significant accuracy improvements with high TPU efficiency, culminating in a model ready for deployment.

  • JAN. 28, 2026 / Mobile

    LiteRT: The Universal Framework for On-Device AI

    LiteRT, the evolution of TFLite, is now the universal framework for on-device AI. It delivers up to 1.4x faster GPU performance, new NPU support, and streamlined GenAI deployment for models like Gemma.

  • DEC. 8, 2025 / Mobile

    MediaTek NPU and LiteRT: Powering the next generation of on-device AI

    LiteRT and MediaTek are announcing the new LiteRT NeuroPilot Accelerator, a ground-up successor to the TFLite NeuroPilot delegate. It brings a seamless deployment experience, state-of-the-art LLM support, and advanced performance to millions of devices worldwide.

  • NOV. 24, 2025 / Mobile

    Unlocking Peak Performance on Qualcomm NPU with LiteRT

    LiteRT's new Qualcomm AI Engine Direct (QNN) Accelerator unlocks dedicated NPU power for on-device GenAI on Android. It offers a unified mobile deployment workflow, SOTA performance (up to 100x speedup over CPU), and full model delegation. This enables smooth, real-time AI experiences, with FastVLM-0.5B achieving over 11,000 tokens/sec prefill on Snapdragon 8 Elite Gen 5 NPU.

  • OCT. 8, 2025 / Web

    Own your AI: Learn how to fine-tune Gemma 3 270M and run it on-device

    This guide shows you how to fine-tune the Gemma 3 270M model for custom tasks, like an emoji translator. Learn to quantize and convert the model for on-device use, deploying it in a web app with MediaPipe or Transformers.js for a fast, private, and offline-capable user experience.

  • SEPT. 24, 2025 / Mobile

    On-device GenAI in Chrome, Chromebook Plus, and Pixel Watch with LiteRT-LM

    Google AI Edge provides the tools to run AI features on-device, and its new LiteRT-LM runtime is a significant leap forward for generative AI. LiteRT-LM offers an open-source C++ API, cross-platform compatibility, and hardware acceleration, and is designed to efficiently run large language models like Gemma and Gemini Nano across a vast range of hardware. Its key innovation is a flexible, modular architecture that can scale to power complex, multi-task features in Chrome and Chromebook Plus, while also being lean enough for resource-constrained devices like the Pixel Watch. This versatility is already enabling a new wave of on-device generative AI, bringing capabilities like WebAI and smart replies to users.

  • SEPT. 9, 2025 / Mobile

    Google AI Edge Gallery: Now with audio and on Google Play

    Google AI Edge has expanded the Gemma 3n preview to include audio support. Users can try it on their own mobile phones using the Google AI Edge Gallery, which is now available in Open Beta on the Play Store.

  • SEPT. 4, 2025 / Gemma

    Introducing EmbeddingGemma: The Best-in-Class Open Model for On-Device Embeddings

    EmbeddingGemma is a new embedding model from Google designed for efficient on-device AI applications. This open model is the highest-ranking text-only multilingual embedding model under 500M parameters on the MTEB benchmark, enabling powerful features like RAG and semantic search directly on mobile devices without an internet connection.

  • AUG. 14, 2025 / Gemma

    Introducing Gemma 3 270M: The compact model for hyper-efficient AI

    Google's new Gemma 3 270M is a compact, 270-million parameter model offering energy efficiency, production-ready quantization, and strong instruction-following, making it a powerful solution for task-specific fine-tuning in on-device and research settings.

  • MAY 20, 2025 / AI Edge

    LiteRT: Maximum performance, simplified

    LiteRT now makes more effective use of GPUs and NPUs to boost AI model performance and efficiency on mobile devices, while requiring significantly less code and simplifying hardware accelerator selection.
