[go: up one dir, main page]

Discovering Chrome’s pre-defined
Custom Handlers

In a previous post I described some architectural changes I’ve applied in Chrome to move the Custom Handlers logic into a new component. One of the advantages of this new architecture is an easier extensibility for embedders and less friction with the //chrome layer.

Igalia and Protocol Labs has been collaborating for some years already to improve the multi-protocol capabilities in the main browsers; most of our work has been done in Chrome, but we have interesting contributions in Firefox and of course we have plans to increase or activity in Safari.

As I said in the mentioned post, one of the long-term goals that Protocol Labs has with this work is to improve the integration of the IPFS protocol in the main browsers. On this regard, its undeniable that the Brave browser is the one leading this effort, providing a native implementation of the IPFS protocol that can work with both, a local IPFS node and the HTTPS gateway. For the rest of the browsers, Protocol Labs would like to at least implement the HTTPS gateway support, as an intermediate step and to increase the adoption and use of the protocol.

This is where the Chrome’s pre-defined custom handlers feature could provide us a possible approach. This feature allows Chrome to register handlers for specific schemes at compile-time. The ChromeOS embedders use this mechanism to register handlers for the mailto and webcal schemes, to redirect the HTTP requests to mail.google.com and google.com/calendar sites respectively.

The idea would be that Chrome could use the same approach to register pre-defined handlers for the IPFS scheme, redirecting the IPFS request to the HTTPS gateways.

In order to understand better how this feature would work, I\’ll explain first how Chrome deals with the needs of the different embedders regarding the schemes.

The SchemesRegistry

The SchemeRegistry data structure is used to declare the schemes registered in the browser and assign some specific properties to different subsets of such group of schemes.

The data structure defines different lists to provide these specific features to the schemes; the main and more basic list is perhaps the standard schemes, which is defined as follows:

A standard-format scheme adheres to what RFC 3986 calls “generic URI syntax” (https://tools.ietf.org/html/rfc3986#section-3).

There are other lists to lend specific properties to a well-defined set of schemes:

// Schemes that are allowed for referrers.
std::vector<std::string> referrer_schemes = {
  {kHttpsScheme, SCHEME_WITH_HOST_PORT_AND_USER_INFORMATION},
  {kHttpScheme, SCHEME_WITH_HOST_PORT_AND_USER_INFORMATION},
};
// Schemes that do not trigger mixed content warning.
std::vector<std::string> secure_schemes = {
  kHttpsScheme, kAboutScheme, kDataScheme, 
  kQuicTransportScheme, kWssScheme,
};
</std::string></std::string>

The following class diagram shows how the SchemeRegistry data structure is defined, with some duplication to avoid layering violations, and its relationship with the Content Public API, used by the Chrome embedders to define its own behavior for some schemes

It’s worth mentioning that this relationship between the //url, //content and //blink layers shows some margin to be improved, as it was noted down by Dimitry Gozman (one of the Content Public API owners) in one of the reviews of the patches I’ve been working on. I hope we could propose some improvements on this code, as part of the goals that Igalia and Protocol Labs will define for 2023.

Additional schemes for Embedders

As it was commented on the preliminary discussions I had as part of the analysis of the initial refactoring work described in the first section, Chrome already provides a way for embedders to define new schemes and even modifying the behavior of the ones already registered. This is done via the Content Public API, as usual.

The Chrome’s Content Layer offers a Public API that embedders can implement to define its own behavior for certain features.

One of these abstract features is the addition of new schemes to be handled internally. The ContentClient interface has a method called AddAdditionalSchemes precisely for this purpose. On the other hand, the ContentBrowserClient interface provides the HasCustomSchemeHandler and IsHandledURL methods to allow embedders implement their specific behavior to handler such schemes.

The following class diagram shows how embedders implements the Content Client interfaces to provide specific behavior to some schemes:

I’ve applied a refactoring to allow defining pre-defined handlers via the SchemeRegistry. This is some excerpt of such refactoring:

// Schemes with a predefined default custom handler.
std::map<std::string, std::string=""> predefined_handler_schemes;
 
void AddPredefinedHandlerScheme(const char* new_scheme, const char* handler) {
  DoAddSchemeWithHandler(
      new_scheme, handler,      
      GetSchemeRegistryWithoutLocking().predefined_handler_schemes);
}
 
const std::map<std::string, std::string="">
GetPredefinedHandlerSchemes() {
  return GetSchemeRegistry().predefined_handler_schemes;
}
</std::string,></std::string,>

Finally, embedders would just need to implement the ChromeContentClient::AddAdditionalSchemes interface to register a scheme and its associated handler.

For instance, in order to install in Android a predefined handler for the IPFS protocol, we would just need to add in the AwContentClient::AddAdditionalSchemes function the following logic:

 schemes.predefined_handler_schemes.emplace_back(
      "ipfs", "https://dweb.link/ipfs/?uri=%s");
  schemes.predefined_handler_schemes.emplace_back(
      "ipns", "https://dweb.link/ipns/?uri=%s");

Conclusion

The pre-defined handlers allow us to introduce in the browser a custom behavior for specific schemes.

Unlike the registerProtocolHandler method, we could bypass some if the restrictions imposed by the HTML API, given that in the context of an embedder browser its possible to have more control on the navigation use cases.

The custom handlers, either using the pre-defined handlers approach or the registerProtocolHandler method, is not the ideal way of introducing multi-protocol support in the browser. I personally consider it a promising intermediate step, but a more solid solution must be designed for these use cases of the browser.

New Custom Handlers component for Chrome

The HTML Standard section on Custom Handlers describes the procedure to register custom protocol handlers for URLs with specific schemes. When a URL is followed, the protocol dictates which part of the system handles it by consulting registries. In some cases, handlers may fall to the OS level and and launch a desktop application (eg. mailto:// may launch a native application) or, the browser may redirect the request to a different HTTP(S) URL (eg. mailto:// can go to gmail).

There are multiple ways to invoke the registerProtocolHandler method from the HTML API. The most common is through Browser Extensions, or add-ons, where the user must install third-party software to add or modify a browser’s feature. However, this approach puts on the users the responsibility of ensuring the extension is secure and that it respects their privacy. Ideally, downstream browsers could ship built-in protocol handlers that take this load off of users, but this was previously difficult.

During the last years Igalia and Protocol Labs have been working on improving the support of this HTML API in several of the main web engines, with the long term goal of getting the IPFS protocol a first-class member inside the most relevant browsers.

In this post I’m going to explain the most recent improvements, especially in Chrome, and some side advantages for Chromium embedders that we got on the way.

Componentization of the Custom Handler logic

The first challenge faced by Chromium-based browsers wishing to add support for a new protocol was the architecture. Most of the logic and relevant codebase lived in the //chrome layer. Thus, any intent to introduce a behavior change on this logic would require a direct fork and patch the //chrome layers with the new logic. The design wasn’t defined to allow Chrome embedders to extend it with new features

The Chromium project provides a Public Content API to allow embedders reusing a great part of the browser’s common logic and to implement, if needed, specific behavior though the specialization of these APIs. These would seem to be the ideal way of extending the browser capabilities to handle new protocols. However, accessing //chrome from any layer (eg. //content or //components) is forbidden and constitutes a layering violation.

After some discussion with Chrome engineers (special thanks to Kuniko Yasuda and Colin Blundell) we decided to move the Protocol Handlers logic to the //components layer and create a new Custom Handlers component.

The following diagram provides a high-level overview of the architectural change:

Custom Handlers logic moved to the //component layer
Changes in the Chrome’s component design

The new component makes the Protocol Handler logic Chrome independent, bringing us the possibility of using the Content Public API to introduce the mentioned behavior changes, as we’ll see later in this post.

Only the registry’s Delegate and the Factory classes will remain in the //chrome layer.

The Delegate class was defined to provide OS integration, like setting as default browser, which needs to access the underlying operating systems interfaces. It also deals with other browser-specific logic, like Profiles. With the new componentized design, embedders may provide their own Delegate implementations with different strategies to interact with the operating system’s desktop or even support additional flavors.

On the other hand, the Factory (in addition to the creation of the appropriated Delegate’s instance) provides a BrowserContext that allows the registry to operate under Incognito mode.

Lets now get a deeper look into the component, trying to understand the responsibilities of each class. The following class diagram illustrates the multi-process architecture of the Custom Handler logic:

Class diagram of the registerProtocolHandler logic in Chrome's multi-process architecture
Multi-process architecture class diagram

The ProtocolHandlerRegistry class has been modified to allow operating without user preferences storage; this is particularly useful to move most of the browser tests to the new component and avoid the dependency with the prefs and user_prefs components (eg. testing). In this scenario the registered handlers will be stored in memory only.

It’s worth mentioning that as part of this architectural change, I’ve applied some code cleanup and refactoring, following the policies dictated in the base::Value Code Health task-force. Some examples:

The ProtocolHandlerThrottle implements URL redirection with the handler provided by the registry. The ChromeContentBrowserClient and ShellBrowserContentClient classes use it to implement the CreateURLLoaderThrottles method of the ContentBrowserClient interface.

The specialized request is the responsible for handling the permission prompt dialog response, except for the the content-shell implementations, which bypass the Permission APIs (more on this later). I’ve also been working on a new design that relies on the PermissionContextBase::CreatePermissionRequest interface, so that we could create our own custom request and avoid the direct call to the PermissionRequestManager. Unfortunately this still needs some additional support to deal with the RegisterProtocolHandlerPermissionRequest‘s specific parameters needed for creating new instances, but I won’t extend on this, as it deserves its own post.

Finally, I’d like to remark that the new component provides a simple registry factory, designed mainly for testing purposes. It’s basically the same as the one implemented in the //chrome layer except some limitations of the new component:

  • No incognito mode
  • Use of a mocking Delegate (no actual OS interaction)

Single-Point security checks

I also wanted to apply some refactoring as well to improve the inter-process consistency and increase the amount of shared code between the renderer and browser process.

One of the most relevant parts of this refactoring was the one applied to the implement the Custom Handlers’ normalization process, which goes from some URL’s syntax validation to the security and privacy checks. I’ve landed some patches (CL#3507332, CL#3692750) to implement functions that are shared among the renderer and browser processes to implement the mentioned normalization procedure.

Refactoring to implement a single-point security and privacy checks
Security and privacy logic refactoring

The code has been moved to blink/common so that it could be invoked also by the new Custom Handlers component described before. Both the WebContentsImpl (run by the Browser process) and the NavigatorContentUtils (by the Renderer process) rely on this common code now.

Additionally, I have moved all the security and privacy checks from the Browser class, in Chrome, to the WebContentsImpl class. This implies that any embedder that implements the WebContentsDelegate interface shouldn’t need to bother about these checks, assuming that any ProtocolHandler instance is valid.

Conclusion

In the introduction I mentioned that the main goal of this refactoring was to decouple the Custom Handlers logic from the Chrome’s specific codebase; this would allow us to modify or even extend the implementation of the Custom Handlers APIs.

The componentization of some parts of the Chrome’s codebase has many advantages, as it’s described in the Browser Components design document. Additionally, a new Custom Handlers component would provide other interesting advantages on different fronts.

One of the most interesting use cases is the definition of Web Platform Tests for the Custom Handlers APIs. This goal has 2 main challenges to address:

  • Testing Automation
  • Content Shell support

Finally, we would like this new component to be useful for Chrome embedders, and even browsers built directly on top of a full featured Chrome, to implement new Custom Handlers related features. This line of work would eventually be useful to improve the support of distributed protocols like IPFS, as an intermediate step towards a native implementation of the protocols in the browser.