Google has detailed the security approach behind Chrome’s agentic features, which can automate actions such as booking tickets or shopping, and the measures it takes to address the data and financial risks these capabilities introduce.
To manage agentic actions, Google employs several models. A User Alignment Critic, built with Gemini, scrutinizes the action items generated by the planner model. If the critic determines that planned tasks do not align with the user’s goals, it prompts the planner to re-evaluate its strategy. Notably, the critic model accesses only the metadata of proposed actions, not actual web content.
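To make the division of labor concrete, the following TypeScript sketch shows one way a planner/critic loop could work, assuming a critic that sees only action metadata (action type, target origin, a short description) rather than page content. The types and function names are illustrative, not Chrome’s actual API:

```typescript
// Hypothetical sketch of a planner/critic loop. The critic judges the plan
// against the user's stated goal using metadata only; names are illustrative.

interface ActionMetadata {
  kind: "navigate" | "click" | "type" | "submit";
  targetOrigin: string; // origin the action touches
  description: string;  // short, model-generated summary of the action
}

interface CriticVerdict {
  aligned: boolean;
  reason?: string;
}

// Stand-in for the Gemini-based User Alignment Critic.
function critiquePlan(userGoal: string, plan: ActionMetadata[]): CriticVerdict {
  for (const action of plan) {
    // Example heuristic: flag purchase-like actions when the goal
    // does not mention buying or booking anything.
    if (action.kind === "submit" && !/buy|book|purchase/i.test(userGoal)) {
      return { aligned: false, reason: `"${action.description}" exceeds the user's goal` };
    }
  }
  return { aligned: true };
}

// The planner is asked to re-plan until the critic accepts its plan.
function planWithCritic(
  userGoal: string,
  propose: (feedback?: string) => ActionMetadata[],
  maxAttempts = 3,
): ActionMetadata[] {
  let feedback: string | undefined;
  for (let i = 0; i < maxAttempts; i++) {
    const plan = propose(feedback);
    const verdict = critiquePlan(userGoal, plan);
    if (verdict.aligned) return plan;
    feedback = verdict.reason; // critic prompts the planner to re-evaluate
  }
  throw new Error("No aligned plan found within the attempt budget");
}
```

Keeping the critic on metadata alone means a compromised or adversarial page cannot feed the critic text that would talk it into approving a misaligned plan.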
Google uses Agent Origin Sets to prevent the agent from reaching disallowed or untrustworthy sites. These sets confine the model to specific read-only and read-writable origins. Read-only origins define the content that Gemini can consume; for example, product listings on a shopping site are relevant, but banner ads are not. The agent can only interact with designated iframes on a page. According to Google, “This delineation enforces that only data from a limited set of origins is available to the agent, and this data can only be passed on to the writable origins. This bounds the threat vector of cross-origin data leaks. This also gives the browser the ability to enforce some of that separation, such as by not even sending to the model data that is outside the readable set.”
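The sketch below illustrates the idea under stated assumptions: two allow-lists, one for origins the model may read from and one for origins actions may write to. The class and method names are hypothetical, since Chrome’s internals are not spelled out in the post:

```typescript
// Illustrative model of an Agent Origin Set. Content from outside the
// readable set is filtered out before it ever reaches the model, and read
// data may only flow to writable origins, bounding cross-origin leaks.

class AgentOriginSet {
  constructor(
    private readonly readable: Set<string>, // content the model may consume
    private readonly writable: Set<string>, // origins actions may target
  ) {}

  // Data outside the readable set is never sent to the model.
  filterContentForModel(snippets: { origin: string; text: string }[]): string[] {
    return snippets
      .filter((s) => this.readable.has(s.origin))
      .map((s) => s.text);
  }

  // Writes (form fills, clicks with side effects) only go to writable origins.
  canWriteTo(origin: string): boolean {
    return this.writable.has(origin);
  }
}

// Usage: product listings are readable; only the shop itself is writable.
const originSet = new AgentOriginSet(
  new Set(["https://shop.example"]),
  new Set(["https://shop.example"]),
);

const visible = originSet.filterContentForModel([
  { origin: "https://shop.example", text: "Product: hiking boots, $89" },
  { origin: "https://ads.example", text: "Banner ad" }, // dropped before the model
]);
console.log(visible);                                   // ["Product: hiking boots, $89"]
console.log(originSet.canWriteTo("https://ads.example")); // false
```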
In addition to Agent Origin Sets, Google monitors page navigation with a separate observer model to block navigation to harmful, model-generated URLs.
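A minimal sketch of such a navigation guard follows, assuming every agent-initiated navigation is vetted before the browser follows it. The `classifyUrl` callback is a hypothetical stand-in for the observer model; in practice reputation checks would also apply:

```typescript
// Hedged sketch: two layers of navigation checks, origin-set membership
// first, then an observer model scoring the (possibly model-generated) URL.

type NavigationVerdict = { allow: boolean; reason?: string };

async function observeNavigation(
  url: string,
  allowedOrigins: Set<string>,
  classifyUrl: (url: string) => Promise<NavigationVerdict>,
): Promise<NavigationVerdict> {
  const origin = new URL(url).origin;
  if (!allowedOrigins.has(origin)) {
    return { allow: false, reason: `origin ${origin} is outside the agent's origin set` };
  }
  // Second layer: the observer model judges the URL itself.
  return classifyUrl(url);
}
```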
For sensitive tasks, Google requires user consent. If the agent attempts to access sensitive sites, such as banking or medical platforms, it requests the user’s permission. Should a site require signing in, Chrome will prompt for permission to use the password manager; the agent’s model does not access password data. Users are also asked before the agent initiates consequential actions like making a purchase or sending a message, preserving transparency and control over automated actions.
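One way to picture these checkpoints is as a consent gate in front of every agent action, sketched below. The `categorize` and `askUser` callbacks are assumptions standing in for Chrome’s site classification and permission prompts:

```typescript
// Illustrative consent gate: sensitive sites and consequential actions
// require an explicit user prompt; plain browsing proceeds without one.

type SensitiveCategory = "banking" | "medical" | null;

interface PendingAction {
  kind: "purchase" | "send_message" | "sign_in" | "browse";
  origin: string;
}

async function gateAction(
  action: PendingAction,
  categorize: (origin: string) => SensitiveCategory,
  askUser: (question: string) => Promise<boolean>,
): Promise<boolean> {
  const category = categorize(action.origin);
  if (category !== null) {
    // Sensitive sites always require explicit permission first.
    if (!(await askUser(`Allow the agent to access this ${category} site?`))) {
      return false;
    }
  }
  if (action.kind === "sign_in") {
    // The password manager fills credentials; the model never sees them.
    return askUser("Allow Chrome to sign in with your saved password?");
  }
  if (action.kind === "purchase" || action.kind === "send_message") {
    return askUser(`Confirm before the agent continues: ${action.kind}?`);
  }
  return true; // ordinary browsing needs no prompt
}
```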
Google also employs a prompt-injection classifier to prevent unwanted actions and is evaluating the agentic capabilities against attacks developed by researchers. This approach is part of a broader industry effort: earlier this month, Perplexity released an open-source content detection model to counter prompt-injection attacks against agents.
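Conceptually, such a classifier sits between the page and the planner, as in the sketch below. The scoring function here is a generic stand-in, not Google’s or Perplexity’s actual model, and the threshold is an assumed parameter:

```typescript
// Hedged sketch of a prompt-injection gate: page text is classified before
// it is handed to the planner, and suspected injections are withheld.

interface InjectionScore {
  probability: number; // likelihood the text tries to steer the agent
}

async function sanitizePageText(
  text: string,
  classify: (text: string) => Promise<InjectionScore>,
  threshold = 0.8, // assumed cutoff for illustration
): Promise<string | null> {
  const score = await classify(text);
  // Text flagged as a likely injection never reaches the model.
  return score.probability >= threshold ? null : text;
}
```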




