Posts by Ian Ballantyne

4 results

Clear filters
  • JUNE 10, 2026 / AI

    DiffusionGemma: The Developer Guide

    DiffusionGemma is an experimental text-generation model built on the Gemma 4 architecture that uses diffusion-based parallel generation instead of token-by-token autoregression, enabling much faster inference, bidirectional context awareness, and real-time self-correction while remaining deployable on consumer GPUs. Its architecture generates and refines 256-token blocks in parallel through iterative denoising, allowing it to handle complex constraint-based tasks such as Sudoku more effectively than traditional language models and demonstrating strong gains from fine-tuning. The model integrates with vLLM and other popular inference frameworks, giving developers access to a new non-autoregressive approach that combines high performance, efficient long-context scaling, and straightforward customization and deployment.

    diffusion_banner
  • OCT. 8, 2025 / Web

    Own your AI: Learn how to fine-tune Gemma 3 270M and run it on-device

    This guide shows you how to fine-tune the Gemma 3 270M model for custom tasks, like an emoji translator. Learn to quantize and convert the model for on-device use, deploying it in a web app with MediaPipe or Transformers.js for a fast, private, and offline-capable user experience.

    OYOAI_Metadata_RD2-V01
  • SEPT. 4, 2025 / AI

    From Fine-Tuning to Production: A Scalable Embedding Pipeline with Dataflow

    Learn how to use Google's EmbeddingGemma, an efficient open model, with Google Cloud's Dataflow and vector databases like AlloyDB to build scalable, real-time knowledge ingestion pipelines.

    EG+Dataflow_Metadatal
  • JUNE 26, 2025 / Gemma

    Introducing Gemma 3n: The developer guide

    The Gemma 3n model has been fully released, building on the success of previous Gemma models and bringing advanced on-device multimodal capabilities to edge devices with unprecedented performance. Explore Gemma 3n's innovations, including its mobile-first architecture, MatFormer technology, Per-Layer Embeddings, KV Cache Sharing, and new audio and MobileNet-V5 vision encoders, and how developers can start building with it today.

    Introducing Gemma 3n: The Developer Guide