Google Introduces PaliGemma, A New Visual Language Model
Written by Kay Ewbank   
Monday, 20 May 2024

Last week's Google I/O saw the introduction of PaliGemma, an open vision-language model (VLM), together with some details of what's coming in Gemma 2. 

Gemma is Google's family of lightweight open models, built from the same research and technology used to create Google's Gemini models. The existing Gemma models are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.

 

PaliGemma is described as a powerful open VLM inspired by PaLI-3. The model is built on open components, including the SigLIP vision model and the Gemma language model, and is designed for a wide range of vision-language tasks, including image and short video captioning, visual question answering, understanding text in images, object detection, and object segmentation.
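To make the object detection task more concrete, here is a minimal sketch of post-processing a detection-style text output. It assumes, rather than takes from this article, that the model returns each box as four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max, each a bin from 0 to 1023, followed by a label, with boxes separated by `;`. The helper name `parse_detections` and the sample string are hypothetical; check the model card for the actual format before relying on this.

```python
import re

# Assumed convention (not stated in the article): four <locNNNN> tokens per box,
# ordered y_min, x_min, y_max, x_max, each an integer bin in the range 0..1023.
LOC_BOX = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^<;]+)"
)


def parse_detections(text, image_width, image_height):
    """Turn location tokens into pixel boxes: (x_min, y_min, x_max, y_max, label)."""
    boxes = []
    for match in LOC_BOX.finditer(text):
        # Normalise each bin to a fraction of the image side.
        y_min, x_min, y_max, x_max = (int(g) / 1024 for g in match.groups()[:4])
        label = match.group(5).strip()
        boxes.append((
            round(x_min * image_width),
            round(y_min * image_height),
            round(x_max * image_width),
            round(y_max * image_height),
            label,
        ))
    return boxes


# Hypothetical model output for a 640x480 image.
sample = "<loc0256><loc0128><loc0768><loc0896> cat ; <loc0000><loc0512><loc0512><loc1023> dog"
detections = parse_detections(sample, image_width=640, image_height=480)
for box in detections:
    print(box)
```

Keeping the parsing separate from the model call means the same helper can be reused whichever platform runs the checkpoint.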

Hugging Face Space running PaliGemma

Google is providing both pre-trained and fine-tuned checkpoints at multiple resolutions, as well as checkpoints specifically tuned to a mixture of tasks for immediate exploration.

PaliGemma is being released for various platforms and resources, with free options including Kaggle and Colab notebooks. Academic researchers seeking to push the boundaries of vision-language research can also apply for Google Cloud credits to support their work. The model joins CodeGemma and RecurrentGemma, which were released earlier in the year.

Google also shared some information about Gemma 2, previewing a 27B-parameter model that it says outperforms models twice its size and runs on a single TPUv5e. Gemma 2 will be available in new sizes and is based on a new architecture. Google says Gemma 2 will deliver performance comparable to Llama 3 70B at less than half the size. The updated design means it can fit on less than half the compute of comparable models, with the 27B model optimized to run on NVIDIA's GPUs.
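The "less than half the size" claim can be checked with simple arithmetic on the parameter counts quoted above. This sketch compares parameter counts only; it says nothing about memory use or speed, and the names are illustrative.

```python
# Parameter counts in billions, taken from the figures quoted in the article.
GEMMA_2_PARAMS_B = 27
LLAMA_3_PARAMS_B = 70


def size_ratio(small_b, large_b):
    """Return the smaller model's parameter count as a fraction of the larger one's."""
    return small_b / large_b


ratio = size_ratio(GEMMA_2_PARAMS_B, LLAMA_3_PARAMS_B)
print(f"Gemma 2 27B has {ratio:.1%} of the parameters of Llama 3 70B")
print("Less than half the size:", ratio < 0.5)
```

At 27B against 70B, the ratio is a little under 39 percent, which is consistent with Google's description.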

Google also says Gemma 2 will provide developers with "robust tuning capabilities across a diverse ecosystem of platforms and tools".

Google PaliGemma is available now. 

More Information

Google Gemma

Related Articles

Google Releases Gemma Open Models

Gemini 1.5 Pro Now Available

Google Rebrands Bard With Subscription

Google Adds Gemini To Bard

Google Adds Code Generation To Bard


Last Updated ( Monday, 20 May 2024 )