Google Introduces PaliGemma, A New Visual Language Model
Written by Kay Ewbank   
Monday, 20 May 2024

Last week's Google I/O saw the introduction of PaliGemma, an open vision-language model (VLM), together with some details of what's coming in Gemma 2. 

Gemma is Google's family of lightweight open models, built from the same research and technology used to create Google's Gemini models. The existing Gemma models are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.

 


PaliGemma is described as a powerful open VLM inspired by PaLI-3. The model is built on open components, including the SigLIP vision model and the Gemma language model, and is designed for a wide range of vision-language tasks, including image and short video captioning, visual question answering, understanding text in images, object detection, and object segmentation.

Hugging Face Space running PaliGemma

Google is providing both pre-trained and fine-tuned checkpoints at multiple resolutions, as well as checkpoints specifically tuned to a mixture of tasks for immediate exploration.
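As an illustration of how one checkpoint can cover the tasks listed earlier, the publicly documented convention for the mixture checkpoints is to select the task with a short text prefix in the prompt, such as "caption en" or "detect cat". The sketch below only builds those prompt strings and does not load a model. The helper name `build_prompt` and the task keys are hypothetical, and the exact prefixes should be checked against the model card for the checkpoint in use.

```python
# Sketch only: builds PaliGemma-style task prompts, no model is loaded.
# build_prompt and the task keys are hypothetical names for illustration;
# the prefix strings follow the convention described in the public model card.

TASK_TEMPLATES = {
    "caption": "caption {lang}",       # image captioning
    "vqa": "answer {lang} {text}",     # visual question answering
    "ocr": "ocr",                      # reading text in the image
    "detect": "detect {text}",         # object detection
    "segment": "segment {text}",       # object segmentation
}


def build_prompt(task: str, text: str = "", lang: str = "en") -> str:
    """Return the prompt string for one task, raising on bad input."""
    if task not in TASK_TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    template = TASK_TEMPLATES[task]
    if "{text}" in template and not text:
        raise ValueError(f"task {task!r} needs a text argument")
    # str.format ignores keyword arguments the template does not use.
    return template.format(lang=lang, text=text)


if __name__ == "__main__":
    print(build_prompt("caption"))
    print(build_prompt("vqa", "how many cats are there?"))
    print(build_prompt("detect", "cat"))
```

The resulting string would be passed, together with the image, to whichever inference library is used.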

PaliGemma is being released for various platforms and resources, with free options including Kaggle and Colab notebooks. Academic researchers seeking to push the boundaries of vision-language research can also apply for Google Cloud credits to support their work. The model joins CodeGemma and RecurrentGemma, which were released earlier in the year.

Google also shared some information about Gemma 2, which will be available in new sizes and is based on a new architecture. It includes a 27B-parameter model that Google says outperforms models twice its size and runs on a single TPUv5e. According to Google, Gemma 2 will deliver performance comparable to Llama 3 70B at less than half the size. Google also says the updated design means it needs less than half the compute of comparable models, and that the 27B model is optimized to run on NVIDIA's GPUs.
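The "less than half the size" claim can be checked with simple arithmetic on the parameter counts quoted above. The sketch below also gives a rough estimate of the memory needed just to hold the weights. That estimate assumes 2 bytes per parameter (16-bit weights), which is an assumption made here and not a figure from Google, and it ignores activations and any other runtime memory.

```python
# Back-of-envelope arithmetic on the parameter counts quoted in the article.
# Assumption (not from Google): weights stored at 2 bytes per parameter.

GEMMA2_PARAMS = 27e9   # 27B parameters
LLAMA3_PARAMS = 70e9   # 70B parameters


def size_ratio(small: float, large: float) -> float:
    """Fraction of the larger model's parameter count."""
    return small / large


def weight_memory_gb(params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    ratio = size_ratio(GEMMA2_PARAMS, LLAMA3_PARAMS)
    print(f"27B is {ratio:.1%} of 70B")
    print(f"27B weights: {weight_memory_gb(GEMMA2_PARAMS):.0f} GB")
    print(f"70B weights: {weight_memory_gb(LLAMA3_PARAMS):.0f} GB")
```

On parameter count alone, 27B is roughly 39% of 70B, which is consistent with the "less than half" wording.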

Google also says Gemma 2 will provide developers with "robust tuning capabilities across a diverse ecosystem of platforms and tools".

Google PaliGemma is available now. 


More Information

Google Gemma

Related Articles

Google Releases Gemma Open Models

Gemini 1.5 Pro Now Available

Google Rebrands Bard With Subscription

Google Adds Gemini To Bard

Google Adds Code Generation To Bard


Last Updated ( Monday, 20 May 2024 )