A Look Inside Google's Most Capable AI Model Family
Gemini is the first model family from Google built to be natively **multimodal**, meaning it can understand, operate across, and combine different types of information from the ground up—including text, code, audio, images, and video. This makes it incredibly flexible and powerful for a vast range of tasks.
- **MAXIMUM PERFORMANCE (Gemini Ultra):** The most powerful model for highly complex tasks requiring deep reasoning and understanding.
- **SCALED PERFORMANCE (Gemini 1.5 Pro):** The best all-around model, offering advanced performance with a massive 1M-token context window.
- **SPEED & EFFICIENCY (Gemini 1.5 Flash):** A lighter-weight model optimized for high-speed, high-volume tasks where latency matters (see the streaming sketch after this list).
- **ON-DEVICE TASKS (Gemini Nano):** The most efficient model, designed to run directly on mobile devices for fast, offline capabilities.
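To illustrate the speed-and-efficiency tier, here is a minimal sketch of a streaming request with Gemini 1.5 Flash, assuming the `google-generativeai` Python SDK; the API key and prompt are placeholders, and model identifiers may vary with SDK version and availability.

```python
# Minimal streaming sketch (assumes the google-generativeai SDK is installed).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")

# Streaming returns partial chunks as they are generated, which keeps
# perceived latency low for high-volume, interactive workloads.
for chunk in model.generate_content("Summarize this support ticket: ...", stream=True):
    print(chunk.text, end="", flush=True)
```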
This chart shows how the different Gemini models compare on key performance metrics. Each model is optimized for a different balance of power, speed, and cost, allowing you to choose the perfect tool for your specific application.
Follow this simple guide to find the Gemini model that best fits your needs. Start with your primary requirement and follow the path to the recommended model (a small selection helper is sketched after the list):

- Need the deepest reasoning for highly complex tasks? Use Gemini 1.0 Ultra.
- Need high speed and high volume with low latency? Use Gemini 1.5 Flash.
- Need strong all-around performance with a very long context window? Use Gemini 1.5 Pro.
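The mapping can be captured in a tiny, hypothetical helper. The requirement keys and model identifiers below are assumptions for illustration; actual availability differs by API and region (Gemini 1.0 Ultra, for example, is not offered through every endpoint).

```python
# Hypothetical helper mirroring the decision guide above.
def pick_gemini_model(requirement: str) -> str:
    """Map a primary requirement to a Gemini model name (illustrative only)."""
    choices = {
        "deep_reasoning": "gemini-1.0-ultra",   # most complex tasks
        "low_latency": "gemini-1.5-flash",      # high-speed, high-volume
        "long_context": "gemini-1.5-pro",       # 1M-token context window
    }
    return choices.get(requirement, "gemini-1.5-pro")  # sensible default

print(pick_gemini_model("long_context"))  # -> gemini-1.5-pro
```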
📄 Gemini 1.5 Pro can process up to 1 million tokens of information at once, the largest context window of any large-scale foundation model. That's equivalent to analyzing 1,500 pages of text, a 1-hour video, or 30,000 lines of code.
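Before sending a very large input, you can check its size against the window. A minimal sketch, assuming the `google-generativeai` Python SDK and a hypothetical local file `long_report.txt`:

```python
# Count tokens before submitting a large document (illustrative sketch).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro")

with open("long_report.txt", encoding="utf-8") as f:
    document = f.read()

token_count = model.count_tokens(document).total_tokens
print(f"Document size: {token_count} tokens (limit: 1,000,000)")

if token_count <= 1_000_000:
    response = model.generate_content([document, "Summarize the key findings."])
    print(response.text)
```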
🎨 Unlike models that stitch together different modalities, Gemini was built from the ground up to understand and reason about text, images, video, and audio seamlessly, leading to more sophisticated understanding and interaction.
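In practice, this means a single request can mix modalities. A minimal sketch, assuming the `google-generativeai` Python SDK; the image path and API key are placeholders:

```python
# Send an image and a text prompt together in one multimodal request.
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-pro")

image = PIL.Image.open("chart.png")  # hypothetical local image

response = model.generate_content(
    [image, "Describe the trend shown in this chart in two sentences."]
)
print(response.text)
```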