Balance cost & reliability with our new Flex & Priority inference tiers in the Gemini API!

Flex: Pay 50% less for cost-sensitive & latency-tolerant workloads
Priority: Highest reliability for your most critical, interactive apps (with premium pricing)

Together with the async Batch API, these synchronous tiers give you a complete set of options for any workload. Just swap the tier with a single line of code and keep building.

Learn more ⬇️
New ways to balance cost and reliability in the Gemini API
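
As a rough illustration of how the options above might map onto workloads, here is a minimal sketch. It does not call the Gemini API: the tier labels, the `Workload` fields, and the `choose_tier` and `flex_cost` helpers are all hypothetical names invented for this example, and the "default" tier and the baseline for the 50% figure are assumptions rather than anything stated in the post.

```python
from dataclasses import dataclass

# From the announcement: Flex costs 50% less.
# Assumption: the discount is measured against the default synchronous price.
FLEX_DISCOUNT = 0.50


@dataclass
class Workload:
    interactive: bool          # a user is actively waiting on the response
    critical: bool             # delays or failures are unacceptable
    needs_sync_response: bool  # the caller needs the result in the same request


def choose_tier(w: Workload) -> str:
    """Return a hypothetical tier label for a workload (illustrative only)."""
    if not w.needs_sync_response:
        return "batch"      # async Batch API for offline jobs
    if w.interactive and w.critical:
        return "priority"   # highest reliability, premium pricing
    if not w.interactive:
        return "flex"       # cost-sensitive, latency-tolerant
    return "default"        # assumed fallback tier, not named in the post


def flex_cost(default_cost: float) -> float:
    """Estimated Flex cost given the default-tier cost for the same work."""
    return default_cost * (1 - FLEX_DISCOUNT)


if __name__ == "__main__":
    nightly_summary = Workload(interactive=False, critical=False, needs_sync_response=True)
    print(choose_tier(nightly_summary), flex_cost(10.0))
```

The actual parameter name and accepted values for selecting a tier should be taken from the linked post and the API reference, not from this sketch.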