The Live API gives developers the tools to build applications and intelligent agents that process streaming audio, video, and text with low latency. That speed is essential for interactive experiences such as customer support solutions, educational platforms, and real-time monitoring services.
Recently we announced the preview launch of the Live API for Gemini models – a significant step forward in enabling developers to build robust and scalable real-time applications. Try the latest features now using the Gemini API in Google AI Studio and in Vertex AI.
Since our experimental launch in December, we've been listening closely to your feedback and have added new features and capabilities to make the Live API production-ready. Find full details in the Live API documentation:
Session resumption (session_resumption) to reconnect and resume where you left off.
A GoAway server message indicating when a connection is about to close, allowing for graceful handling before termination.
Client messages (activityStart, activityEnd) for manual turn control.
Speech settings in speechConfig.
Usage reporting in the usageMetadata field of server messages, broken down by modality and prompt/response phases.
To inspire your next project, we're showcasing developers who are already using the Live API in their applications:
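Before the showcase, here is a rough sketch of how a client might react to the server messages named in the feature list above. It is not the official SDK: SessionState and handle_server_message are hypothetical helpers, and the JSON key spellings (goAway, sessionResumptionUpdate, newHandle, promptTokensDetails, responseTokensDetails, modality, tokenCount) are assumptions that should be checked against the Live API documentation.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionState:
    """Client-side bookkeeping for one session (hypothetical helper)."""
    resume_handle: Optional[str] = None  # latest handle to reconnect with
    closing_soon: bool = False           # set once a GoAway notice arrives
    tokens: dict = field(default_factory=dict)  # (phase, modality) -> count


def handle_server_message(raw: str, state: SessionState) -> SessionState:
    """Update `state` from one JSON-encoded server message.

    All key names below are assumed spellings, not confirmed by this post.
    """
    msg = json.loads(raw)

    # Session resumption: keep the newest handle for a later reconnect.
    update = msg.get("sessionResumptionUpdate") or {}
    if update.get("newHandle"):
        state.resume_handle = update["newHandle"]

    # GoAway: the connection is about to close, so stop starting new turns.
    if "goAway" in msg:
        state.closing_soon = True

    # usageMetadata: accumulate counts per phase and modality.
    usage = msg.get("usageMetadata") or {}
    for phase in ("promptTokensDetails", "responseTokensDetails"):
        for entry in usage.get(phase) or []:
            key = (phase, entry.get("modality", "UNKNOWN"))
            state.tokens[key] = state.tokens.get(key, 0) + entry.get("tokenCount", 0)

    return state
```

A caller would feed each incoming message through handle_server_message and, once closing_soon is set, finish the current turn and reconnect with resume_handle. Keeping this state in one small object makes the reconnect path easy to test without a live connection.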
Daily integrates Live API support into the Pipecat Open Source SDKs for Web, Android, iOS and C++.
By using the power of the Live API, Pipecat Daily has created a voice-based word guessing game – Word Wrangler. Test your description skills in this AI-powered twist on classic word games and see how you can build one for yourself!
LiveKit integrates Live API support into LiveKit Agents. This framework for building voice AI agents provides a fully open-source platform for creating server-side agentic applications.
“Until the Live API, no other LLM offered a developer interface that could directly ingest streaming video.”
– Russell d’Sa, CEO
Check out their demo, in which they built an AI copilot that can browse the internet alongside you while sharing thoughts about what it sees in real time.
Hey Bubba is an agentic, voice-first AI application built specifically for truck drivers. Using the Live API, it supports multi-language voice communication so drivers can operate hands-free. Key functionalities include:
The Live API powers both driver interaction (leveraging function calling and context caching for queries like future pickups) and Bubba's ability to interact during phone calls for negotiation and booking. This makes Hey Bubba a comprehensive AI tool for the largest and most diverse job sector in the USA.
The Live API is ready to power your next real-time voice application. To get started:
Happy building!