Posts by Andrew Zhang

1 result

  • NOV. 24, 2025 / Mobile

    Unlocking Peak Performance on Qualcomm NPU with LiteRT

    LiteRT's new Qualcomm AI Engine Direct (QNN) Accelerator unlocks the dedicated NPU for on-device GenAI on Android. It offers a unified mobile deployment workflow, state-of-the-art performance (up to 100x speedup over CPU), and full model delegation. This enables smooth, real-time AI experiences, with FastVLM-0.5B achieving over 11,000 tokens/sec prefill on the Snapdragon 8 Elite Gen 5 NPU.
