4 results

  • AUG. 27, 2025 / Google Labs

    Stop “vibe testing” your LLMs. It's time for real evals.

    Stax, an experimental developer tool, replaces ad hoc "vibe testing" of LLMs with a streamlined evaluation lifecycle. It lets users rigorously test their AI stack and make data-driven decisions using human labeling and scalable LLM-as-a-judge auto-raters.

  • AUG. 12, 2025 / Google Labs

    Meet Jules’ sharpest critic and most valuable ally

    Jules' critic functionality targets issues such as subtle bugs and missed edge cases in AI-generated code by acting as a peer reviewer within the generation process. In this "critic-augmented generation", proposed code changes undergo adversarial review, allowing Jules to improve its output and deliver higher-quality, pre-reviewed code.

  • JULY 24, 2025 / Google Labs

    Introducing Opal: describe, create, and share your AI mini-apps

    Opal is a new experimental tool from Google Labs that helps you compose prompts into dynamic, multi-step mini-apps using natural language, with no code required. Users can build and deploy shareable AI apps with powerful features and seamless integration with existing Google tools.

  • MAY 20, 2025 / Gemini

    From idea to app: Introducing Stitch, a new way to design UIs

    Stitch, a new Google Labs experiment, uses AI to generate UI designs and frontend code from text prompts and images, aiming to streamline the design and development workflow. Features include UI generation from natural language or images, rapid iteration, and seamless paste to Figma and export of frontend code.
