Imagine gazing into a mirror and seeing not just your reflection, but a gateway to information, creativity, and a touch of enchantment. This is precisely what the Gemini-backed Magic Mirror project brings to life. Moving beyond a simple display, this project showcases the interactive capabilities of the Gemini API and the JavaScript GenAI SDK, transforming a familiar object into a new chat interface.
This project creates its interactive experience using several features of the Gemini API:
The foundation of the magic mirror's interactivity is the Live API, which allows for continuous, real-time voice interactions. You speak, and the mirror doesn't just listen for a single command; it engages in a flowing conversation, processing your speech as you talk and responding in either text or audio for a more natural back-and-forth dialogue.
The Live API also detects when you speak during playback and treats that interruption as a cue to pivot the narrative and conversation based on your input, allowing for dynamic spoken conversations alongside text.
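To make the interruption behavior concrete, here is a minimal sketch of how a client might react to messages streamed from a Live API session. It assumes server messages carry a `serverContent` object with an `interrupted` flag, a `turnComplete` flag, and audio chunks under `modelTurn.parts[].inlineData`, which is our understanding of the message shape rather than something taken from the project's code. `createPlaybackQueue` and `handleServerMessage` are hypothetical helpers introduced only for illustration.

```javascript
// Hypothetical playback queue: holds audio chunks waiting to be played.
function createPlaybackQueue() {
  const chunks = [];
  return {
    enqueue(chunk) { chunks.push(chunk); },
    clear() { chunks.length = 0; },
    size() { return chunks.length; },
  };
}

// Route one server message from the session (assumed shape, see above).
function handleServerMessage(message, queue) {
  const content = message.serverContent;
  if (!content) return 'ignored';

  // The user spoke over the model: drop any audio that has not played yet.
  if (content.interrupted) {
    queue.clear();
    return 'interrupted';
  }

  // Otherwise queue any audio parts from the model's current turn.
  const parts = (content.modelTurn && content.modelTurn.parts) || [];
  for (const part of parts) {
    if (part.inlineData && part.inlineData.data) {
      queue.enqueue(part.inlineData.data);
    }
  }
  return content.turnComplete ? 'turn-complete' : 'streaming';
}
```

In a real client, a function like this would likely be called from the session's message callback, with the queue feeding the audio output.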
Beyond holding a conversation through the Live API, the magic mirror can be customized to weave tales using the Gemini model's advanced generation capabilities. You do this by providing specific system instructions and updating the speech configuration during initialization to set different dialects or accents, voices, and a variety of other attributes.
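As a rough illustration of that customization, the sketch below builds a session configuration with a storyteller persona and a chosen voice. The field names (`responseModalities`, `systemInstruction`, `speechConfig`, `voiceConfig.prebuiltVoiceConfig.voiceName`, `languageCode`) reflect our understanding of what the SDK accepts when a Live session is opened and should be checked against the current reference. The persona text, the `Kore` voice, the `en-GB` language code, and the `buildStorytellerConfig` helper are illustrative assumptions, not the project's actual settings.

```javascript
// Build a session config for a storyteller persona.
// Field names are assumed to match the SDK's Live session config.
function buildStorytellerConfig({ voiceName, languageCode }) {
  return {
    // Ask the model to answer with spoken audio.
    responseModalities: ['AUDIO'],
    // Illustrative persona; the project's real instructions may differ.
    systemInstruction:
      'You are an enchanted mirror. Reply as a warm storyteller and keep answers short.',
    speechConfig: {
      languageCode, // accent or dialect, e.g. 'en-GB'
      voiceConfig: { prebuiltVoiceConfig: { voiceName } },
    },
  };
}

const storytellerConfig = buildStorytellerConfig({
  voiceName: 'Kore',
  languageCode: 'en-GB',
});
```

An object like `storytellerConfig` would then be supplied as the configuration when the session is initialized.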
While conversations and stories are great, sometimes you want to know about the world around you as it's happening. This magic mirror project uses Grounding with Google Search to give the model access to grounded, up-to-date information.
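Enabling this is typically a matter of adding a search tool to the session configuration. The sketch below assumes the tool is expressed as `{ googleSearch: {} }` in the config's `tools` array, which is our understanding of the SDK's format. `withGoogleSearch` is a hypothetical helper, not part of the SDK or the project.

```javascript
// Return a copy of a session config with Google Search grounding enabled.
// The `{ googleSearch: {} }` tool entry is an assumed format (see above).
function withGoogleSearch(config) {
  const tools = config.tools ? [...config.tools] : [];
  const alreadyEnabled = tools.some((tool) => 'googleSearch' in tool);
  if (!alreadyEnabled) {
    tools.push({ googleSearch: {} });
  }
  // Spread into a new object so the caller's config is left unchanged.
  return { ...config, tools };
}

const groundedConfig = withGoogleSearch({ responseModalities: ['AUDIO'] });
```

Returning a copy rather than mutating the input keeps the base configuration reusable for sessions that should not search the web.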
Using Function Calling with the Gemini API, the magic mirror can generate visuals based on your descriptions, adding depth to stories and enriching the experience of interacting with the Gemini model. The Gemini model determines that your request requires image generation, then calls a predefined function based on the characteristics you state, passing along the detailed prompt it derives from your spoken words.
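The flow described above can be sketched in two parts: a declaration that tells the model a function exists, and a dispatcher that runs the function when the model asks for it. The declaration schema and the tool-call shape (`functionCalls` entries with `id`, `name`, and `args`, answered by `functionResponses`) follow our understanding of the Gemini API. The function name `generate_image`, its `prompt` parameter, and the `answerToolCall` helper are hypothetical, and the project's real function and image backend may differ.

```javascript
// Hypothetical declaration for an image tool; name and schema are illustrative.
const generateImageDeclaration = {
  name: 'generate_image',
  description: 'Create an image from a detailed visual description.',
  parameters: {
    type: 'OBJECT',
    properties: {
      prompt: { type: 'STRING', description: 'Detailed description of the image.' },
    },
    required: ['prompt'],
  },
};

// Run every function call in a tool-call message and collect the responses.
// `handlers` maps function names to async implementations supplied by the app.
async function answerToolCall(toolCall, handlers) {
  const functionResponses = [];
  for (const call of toolCall.functionCalls || []) {
    const handler = handlers[call.name];
    const response = handler
      ? { output: await handler(call.args || {}) }
      : { error: `Unknown function: ${call.name}` };
    // Echo the call id and name so the model can match the response.
    functionResponses.push({ id: call.id, name: call.name, response });
  }
  return { functionResponses };
}
```

The declaration would be registered in the session's tools, and the object returned by `answerToolCall` would be sent back to the session as the tool response so the model can continue the conversation.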
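While the user experience is intended to hide the technical details, several powerful features of the Gemini models work in concert to create this magical experience: the Live API, system instructions and speech configuration, Grounding with Google Search, and Function Calling.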
This Gemini-enabled Magic Mirror is more than a novelty; it's a powerful demonstration of how sophisticated AI can be woven into our physical environment to create helpful, engaging, and even enchanting interactions. The flexibility of the Gemini API opens the door to countless other applications, from ultra-personalized assistants to dynamic educational tools and immersive entertainment platforms.
You can view the code for this entire project on GitHub, as well as a complete technical tutorial on Hackster.io.
We encourage you to imagine the possibilities. What would your magic mirror do?
Be sure to share your ideas and Gemini-enabled creations with us on X and LinkedIn.