Google releases Gemini 3.5 Flash, claims it outruns rival frontier models fourfold on speed

Wednesday, 20 May 2026 at 15:19

At its annual I/O conference, Google unveiled Gemini 3.5 Flash, pitching it as a model that matches large flagship AI systems on key performance metrics while running at the speeds typical of smaller, lighter models. According to Google, Gemini 3.5 Flash delivers frontier performance for agents and coding, with a particular focus on complex long-horizon tasks.

Gemini 3.5 Flash is now available to everyone via the Gemini app and AI Mode in Google Search, to developers through Google Antigravity and the Gemini API in Google AI Studio and Android Studio, and to enterprises through the Gemini Enterprise Agent Platform and Gemini Enterprise.

Google vs Google vs Competition

The clearest measure of progress is how 3.5 Flash compares to what came before it. Google says 3.5 Flash outperforms Gemini 3.1 Pro on challenging coding and agentic benchmarks, including Terminal-Bench 2.1 (76.2%), GDPval-AA (1656 Elo), and MCP Atlas (83.6%). It also leads in multimodal understanding, scoring 84.2% on CharXiv Reasoning.

The significance of these results is that 3.5 Flash, a speed-optimised model, is besting a previous-generation Pro model, the larger and more capable tier in Google's AI model lineup.

Google frames the competitive positioning around the speed-quality trade-off that has long defined the AI model market. The company says that when measured by output tokens per second, 3.5 Flash is four times faster than other frontier models. It adds that 3.5 Flash lands in the top-right quadrant of the Artificial Analysis index, delivering frontier-level intelligence at exceptional speed.
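To make the speed metric concrete, here is a minimal Python sketch of how output tokens per second and a relative speedup are typically computed. The function names and the sample numbers are illustrative assumptions, not Google's measurements or its benchmarking method.

```python
def output_tokens_per_second(output_tokens: int, generation_seconds: float) -> float:
    """Throughput: generated tokens divided by wall-clock generation time."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return output_tokens / generation_seconds


def speedup(candidate_tps: float, baseline_tps: float) -> float:
    """How many times faster the candidate is than the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return candidate_tps / baseline_tps


# Hypothetical figures chosen only to illustrate a fourfold gap.
fast_tps = output_tokens_per_second(output_tokens=1200, generation_seconds=4.0)
slow_tps = output_tokens_per_second(output_tokens=1200, generation_seconds=16.0)
ratio = speedup(fast_tps, slow_tps)

print(f"fast: {fast_tps} tok/s, slow: {slow_tps} tok/s, speedup: {ratio}x")
```

Under these made-up inputs, the same 1,200-token answer arrives in a quarter of the time, which is what a "four times faster" claim on this metric would mean in practice.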

On cost, Google claims that Gemini 3.5 Flash can help complete tasks that previously took developers days or auditors weeks, often at less than half the cost of other frontier models.

Focus: Agentic Capabilities and Enterprise Use

Much of Google's emphasis at I/O was on agentic workflows: tasks where an AI model plans, executes, and iterates across multiple steps without constant human prompting. When paired with Google's updated Antigravity harness, Gemini 3.5 Flash can deploy collaborative subagents to tackle problems at scale for demanding use cases, reliably executing multi-step workflows and coding tasks under supervision.

Several enterprise partners are already piloting the model. Macquarie Bank is testing how 3.5 Flash can speed up customer onboarding by reasoning over complex documents of more than 100 pages. Salesforce is integrating it into Agentforce to automate complicated enterprise tasks using multiple subagents. Shopify is running subagents in parallel to analyse complex data for merchant growth forecasts.

The model is also powering Gemini Spark, Google's new personal AI agent, which the company describes as running around the clock and able to take action on a user's behalf. Google says it is beginning to roll out Gemini Spark to trusted testers, with a Beta planned for Google AI Ultra subscribers in the US the following week.

A more capable model is on the way. Google says it is working on Gemini 3.5 Pro, which is already being used internally, with a public rollout planned for the following month.