JUNE 3, 2026 / AI
The newly released Gemma 4 12B is a dense, multimodal model designed for high-performance local AI execution on consumer devices. By introducing a novel, encoder-free architecture, it bypasses traditional visual and audio encoders to feed multimodal data directly into the LLM backbone.
FEB. 19, 2025 / Gemma
PaliGemma 2 mix, an upgraded vision-language model, is now available in various sizes, offering capabilities such as image captioning, OCR, and object detection.
DEC. 5, 2024 / Gemma
PaliGemma 2, the next evolution in tunable vision-language models, comes with new features such as scalable performance, long captioning, and expanded capabilities. Get started with pre-trained models, documentation, and tutorials.