Posts by Weiyi Wang

2 results

  • APRIL 23, 2026 / Mobile

    Building real-world on-device AI with LiteRT and NPU

    LiteRT is a production-ready framework designed to help mobile developers unlock the power of Neural Processing Units (NPUs), overcoming the performance and battery limitations of traditional CPU or GPU processing. By providing a unified API that abstracts away hardware complexities, it allows industry leaders like Google Meet and Epic Games to deploy sophisticated AI models for real-time video, animation, and speech recognition with significantly higher efficiency. The platform further supports developers through benchmarking tools and cross-platform compatibility, enabling seamless AI deployment across mobile devices, AI PCs, and industrial IoT hardware.

  • NOVEMBER 24, 2025 / Mobile

    Unlocking Peak Performance on Qualcomm NPU with LiteRT

    LiteRT's new Qualcomm AI Engine Direct (QNN) Accelerator unlocks dedicated NPU power for on-device GenAI on Android. It offers a unified mobile deployment workflow, state-of-the-art performance (up to 100x speedup over CPU), and full model delegation. This enables smooth, real-time AI experiences, with FastVLM-0.5B achieving over 11,000 tokens/sec prefill on the Snapdragon 8 Elite Gen 5 NPU.
