Google Announces New Gemini 1.5 Pro with a Large Context Window

Feb 16, 2024 | 0 comments

Gemini 1.5 Pro

Google introduced the next version of their AI model, the Gemini 1.5 Pro, with a much larger context window as compared to the Gemini 1.0 Pro and other AI models. The Gemini 1.5 Pro uses the popular Mixture-of-Experts (MoE) approach for more efficient training and higher-quality responses. 

The Gemini 1.5 Pro is much better than the Gemini 1.0 Pro in text, vision, and audio benchmarks. It gives good competition to Google’s best AI model, Gemini Ultra 1.0. The Gemini 1.5 Pro also uses less computation power as compared to the Gemini Ultra 1.0.

Google Gemini 1.5 Pro with a 1 Million token Context Window 

Google’s CEO, Sundar Pichai, took to social media platform X and said the Gemini 1.5 Pro will soon roll out with a 128K-token context window.

Pichai also said that enterprise users and developers can try the Gemini 1.5 Pro with a whopping experimental 1 million token context window.

This large context window easily beats Gemini 1.0, GPT-4 Turbo, and Claude 2.1, which come with 32K, 128K, and 200K context windows, respectively.

With the 1 million token context window, enterprise users and developers can upload entire PDFs, entire code repos, and long videos. It can take 11 hours of audio, 1 hour of video, and 7,00,000 words at once in a single context. However, Google said users can expect a longer latency time as it is still an experimental feature of the model.

Gemini 1.5 Pro Comparison with other AI Models

Google Gemini 1.5 Pro with a 10 million token Context Window

Google also did research in which Gemini 1.5 was tested on 10 million tokens for text, 2 million tokens for audio, and 2.8 million tokens for video, and the model was able to remember these tokens with great accuracy.

This will easily beat every other model currently present in the AI industry.

Capabilities of the Gemini 1.5 Pro

Google also gave us a demo showing the capabilities of their new Gemini 1.5 model. In the demo video, a 402-page PDF transcript of the Appollo 11 moon landings was uploaded to Google Visual Studio. The model was asked to find three comedic moments and list quotes from the transcript and emojis. The Gemini Pro 1.5 was able to extract three quotes and comedic moments accurately from the PDF. 

Further, it was given a drawing showing a man’s foot, and the model was asked, “What moment is this?” The model correctly identified it as Neil Armstrong’s first step on the moon, which shows the multimodal capabilities of the model.

When it was further asked to cite the timecode of this moment in the transcript, the model was able to cite the timecode accurately.

Google also uploaded two more demo videos in which the Gemini 1.5 Pro model was able to process more than 100K lines of code in Three.js. It was also able to process 44 minutes of the Buster Keaton film, find small details in the film, and understand plot points.

How to access Gemini 1.5 Pro

Google said that the Gemini 1.5 Pro with a 1 million token context window can currently only be accessed by developers and enterprise users. 
Developers interested in testing 1.5 Pro can sign up and access it via Google AI Studio, while enterprise users can access it via their Vertex AI account team.

Conclusion

Google is introducing new AI technologies and features day by day. It seems that they are very serious and want to be ahead of everyone in the AI race. With a 1 million-token context window, users can do many tasks they never thought of, such as uploading full movies to get reviews or uploading full code files to know which function does what in a program. Google has also researched Gemini 1.5 with a 10 million token context window, and it worked with great accuracy. If Google is able to pull this off, then no other AI model will be able to compete with Gemini.

Competitors like OpenAI will have to work hard to stay relevant and compete with Google. The battle between OpenAI and Google is increasing day by day, as OpenAI has also announced their text-to-video generator, which is directly going to compete with Google’s Lumiere. It is fun to see the tech giants compete in this AI race.

Read More

OpenAI’s Sora: A Groundbreaking New Text-to-Video AI Generator Announced

New Slack AI Features are here to Boost Productivity

RECENT POSTS

Elon Musk’s xAI Announces Grok 1.5 with Great Capabilities

Elon Musk’s xAI Announces Grok 1.5 with Great Capabilities

Image Credits: xAI Elon Musk's xAI launched Grok last year in November to compete with chatbots from big tech giants like Google, Microsoft, and OpenAI. Elon Musk's xAI is soon launching the next version of their chatbot, Grok 1.5, which performs really well as...

Meta’s Ray-Ban Smart Glasses Are Getting New AI Features

Meta’s Ray-Ban Smart Glasses Are Getting New AI Features

Image Credits: Meta Meta’s $300 smart glasses, made in collaboration with Ray-Ban, allow users to take pictures, record videos, make calls, hear music, and do much more. Now, new AI features are being added to Meta's Ray-Ban smart glasses.  New AI Features in Meta’s...

Claude 3 beats GPT-4 for the First Time on LMSYS Leaderboard

Claude 3 beats GPT-4 for the First Time on LMSYS Leaderboard

Anthropic released the Claude 3 model family earlier this month, and they have become highly popular since their release. Now Anthropic's Claude 3 Opus Model beats OpenAI's GPT-4 model for the first time on the LMSYS Chatbot Arena Leaderboard. LMSYS Chatbot Arena is a...

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *