Preventing token theft
When you log into a service you’re given an authentication token. Each further request to the site includes that token, allowing the server to figure out who you are and ensuring that you have access to your data. Depending on site policy, this token may either be stored in memory (and so vanish if you restart your browser) or on disk. The token is the proof of your identity. As far as the site is concerned, anyone with your token is you. These tokens may be traditional browser cookies, but they may also be stored in either site local storage or (if you’re not using a browser) in some other storage location.
In recent years we’ve seen infostealer malware (like LummaC2) gain the ability to exfiltrate user tokens, allowing attackers to gain access to the user’s data without needing to retain access to the user’s machine. This attack is viable even if the site has strong MFA requirements, so passkeys don’t help. Encrypting the tokens on disk doesn’t prevent the malware from scraping them out of the browser’s RAM or obtaining whatever key is used to encrypt them. This feels like a pretty hard problem to solve.
But that hasn’t stopped people from trying! Dirk Balfanz wrote an IETF draft describing a mechanism for using self-signed certificates for TLS authentication. This uses the mutual authentication feature of the TLS protocol that requires both sides to prove their identity to each other. In regular TLS, the remote site presents a signed certificate that tells you who it is. When performing mutual authentication, you then present a certificate to the remote site telling it who you are. These client certificates are largely unused outside enterprise environments because they’re a huge pain to deploy. It’s not so much that this has sharp edges, it’s that it’s entirely made of sharp edges. Managing certificate deployment to your devices is hard. Browsers get confused if the certificates change under them. You have one certificate and it lives forever, so sites you present it to can track your identity. Users are prompted to choose a certificate to authenticate with, and if they pick the wrong one everything breaks and is hard to recover. I’ve deployed this and I did not have a good time.
But Balfanz’s idea was simple. Rather than require certificates to be deployed, browsers would simply generate a certificate on the fly. The goal wasn’t to prove the device or user’s identity in any global way - but it would associate a TLS session with a specific certificate. You could then, for example, include a hash of the certificate in the cookie, and if someone tried to use that cookie without presenting that certificate then the cookie could be rejected. If the browser used a hardware-backed private key for the certificate then it would be impossible for an attacker to steal it. Sure, you could still steal cookies, but you wouldn’t be able to use them.
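A minimal sketch of that binding, from the server's point of view, might look like the following. Everything here is an assumption made for illustration: the function names, the cookie layout, the use of SHA-256 and an HMAC to protect the cookie, and the demo secret are not taken from the draft, and the client certificate is treated as opaque bytes handed over by the TLS layer.

```python
import base64
import hashlib
import hmac

SERVER_SECRET = b"demo-only-secret"  # hypothetical; a real server would use a managed key


def cert_fingerprint(cert_der: bytes) -> str:
    """SHA-256 of the client certificate bytes seen on the TLS session."""
    return hashlib.sha256(cert_der).hexdigest()


def issue_cookie(user: str, cert_der: bytes) -> str:
    """Issue a cookie naming the user and the certificate it is bound to."""
    payload = f"{user}|{cert_fingerprint(cert_der)}".encode()
    tag = hmac.new(SERVER_SECRET, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(tag).decode()
    )


def check_cookie(cookie: str, presented_cert_der: bytes):
    """Return the user if the cookie is genuine and bound to this certificate, else None."""
    try:
        payload_b64, tag_b64 = cookie.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        return None  # malformed cookie
    expected = hmac.new(SERVER_SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # tampered with, or not issued by this server
    user, _, bound = payload.decode().rpartition("|")
    if not hmac.compare_digest(bound, cert_fingerprint(presented_cert_der)):
        return None  # presented over a session using a different certificate
    return user
```

A cookie copied off the machine still passes the integrity check, but it is rejected unless the thief can also complete a TLS handshake with the original certificate, which requires the private key.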
This was written almost 15 years ago, and seems simple, elegant, and functional. It didn’t happen. Part of the reason for that is that, well, it wasn’t quite so simple. One problem was privacy related. Cookies are only sent after the TLS session is established, so anyone monitoring the network doesn’t know anything about the user identity. A naive implementation of this approach would have meant the client certificate being sent before session establishment, and now user identity can be tracked (no longer an issue if this was implemented on top of TLS 1.3, but this was a long time ago). This was avoided by reordering the client handshake, but that meant modifying the TLS specification and updating implementations to support it. Another was that figuring out the granularity of the certificates was difficult. You’d want to use different certificates for every site to avoid them effectively becoming tracking cookies, but you need to provide the certificate before cookies are set, and you don’t know what origin the site is going to set in its cookies. If you generate a certificate for a.example.com and a different one for b.example.com, and a.example.com sets a cookie for *.example.com and includes the certificate you used for a.example.com, that cookie isn’t going to work on b.example.com and things are broken. This meant supporting it wasn’t as straightforward as it seemed - you’d need to ensure that your cookie scope was compatible with the certificate scope. You could probably make this work well enough by aligning it with the Public Suffix List, but there was still some risk of expectations not being aligned.
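The scope mismatch is easy to see in a toy model. The sketch below is illustrative only: the function names are made up, and the two-entry suffix set is a stand-in for the real Public Suffix List, which is far larger.

```python
def cert_scope_per_host(host: str) -> str:
    """Naive scoping: one client certificate per exact hostname."""
    return host


def cert_scope_registrable(
    host: str, public_suffixes: frozenset = frozenset({"com", "co.uk"})
) -> str:
    """Scope certificates to public suffix plus one label (eTLD+1)."""
    labels = host.split(".")
    # Try the longest candidate suffix first so "co.uk" is checked before "uk".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in public_suffixes:
            return ".".join(labels[max(i - 1, 0):])
    return host  # no known suffix: fall back to the full hostname


def cookie_usable(cookie_domain: str, issued_on: str, sent_to: str, scope) -> bool:
    """A bound cookie works only if the receiving host is inside the cookie's
    domain AND the browser uses the same certificate there as on the issuing host."""
    in_domain = sent_to == cookie_domain or sent_to.endswith("." + cookie_domain)
    return in_domain and scope(issued_on) == scope(sent_to)


# Cookie set by a.example.com for the whole of example.com, then sent to b.example.com.
print(cookie_usable("example.com", "a.example.com", "b.example.com", cert_scope_per_host))
print(cookie_usable("example.com", "a.example.com", "b.example.com", cert_scope_registrable))
```

With per-host certificates the cookie is in scope for b.example.com but bound to the wrong certificate, so it fails. Scoping certificates to the registrable domain makes the two line up, at the cost of every host under that domain seeing the same certificate.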
And, perhaps most importantly, TLS session resumption (replaced by pre-shared keys in TLS 1.3) somewhat defeats the purpose of the exercise - clients store state that allows them to re-establish a TLS connection without performing certificate exchange (this reduces overhead if a connection gets interrupted or you switch to a new network or anything along those lines), and anyone in a position to steal cookies could steal that state as well.
The followup attempt was channel IDs. This simplified the implementation somewhat - rather than certificates, a raw public key would be sent, along with proof of possession of the private key in the form of a signature over a portion of the TLS handshake. This was required even in the event of session resumption, which avoided having to worry about theft of session secrets. The timing of the exchange was after the encrypted session had been established, so user identity couldn’t be leaked that way either. Cookies could then be bound to this identifier. Unfortunately it didn’t really deal with the problem of scoping keys in a way that would match cookie requirements, and the spec suggests that the right way of handling this is to scope keys to TLDs, which would enable user tracking across sites (Chrome’s implementation apparently restricted it to eTLD+1, which would match the third party cookie policy and avoid the tracking risk).
Chrome added support for this, but it was removed in early 2018. The discussion of some of the pain points in that message is interesting, explicitly calling out problems with connection coalescing across domains and the incompatibility with zero-RTT TLS1.3. The overall consensus at the time seems to be that trying to solve this entirely at the TLS layer has too many rough edges, and a different approach should be taken.
And so almost 7 years after the initial draft for origin bound certificates, we come to token binding. This ended up being a rather more complex endeavour, covering 3 different RFCs describing how it impacts TLS, how to incorporate it into HTTP, and how to manage all the various parties involved in the process. The short version is that it’s pretty similar to channel ID, except that there’s also a documented mechanism for allowing tokens to be bound to one party and consumed by another, avoiding any need for widely scoped keys. Token binding effectively solved all the issues in the original proposal, but at the cost of somewhat more complexity.
The RFC was finalised in October 2018. Chrome removed its (incomplete, draft) support for token binding in November 2018. Edge carried support until late 2024. Despite getting all the way through the RFC process, it’s functionally dead.
The process up until this point had been largely initiated by Google, with Microsoft contributing significantly to the token binding standards. The work had been focused on identifying a generic solution to the problem rather than tying it to any specific authentication flow. The next step was in a different direction - rather than trying to fix this for the entire internet, how about we try to fix it for OAuth?
RFC 8705 is titled “OAuth 2.0 Mutual-TLS Client Authentication and Certificate-Bound Access Tokens”. This is basically the 2011 approach, but (a) with an explicit definition of how the certificate should be incorporated into issued auth cookies, and (b) with a proviso that well uh if you’re going to use tokens issued by your IdP to authenticate to someone else then well you’re going to need to use the same cert for both. This is probably fine for the company-owned-laptop case where you’re actually fine with multiple sites being able to tie identities together (that’s kind of the point here!), and also works for “I am using an app and not a browser”, but doesn’t work for more generic scenarios. It also doesn’t seem to take the session resumption case into account at all? Support for RFC8705 seems poor, as far as I can tell of the big players only Auth0 implements it. In theory it works fine with self-signed client certs but in reality that’s going to be almost as difficult to support across multiple platforms as just issuing proper client certs in the first place, so deployment is going to be kind of a pain. But the good news is it doesn’t rely on any TLS extensions or custom browser behaviour, so at the client side it works fine with any browser.
Which brings us on to RFC 9449, “OAuth 2.0 Demonstrating Proof of Possession (DPoP)”. This goes even further than RFC8705 in terms of reducing the burden of deployment - it works fine with existing browsers, and it doesn’t even require any certs. The client generates a keypair and provides the pubkey when requesting the cookie. The cookie contains the pubkey. Every request to the service now provides the cookie with the pubkey and also provides a signature over the URI and HTTP method. If the signature matches the pubkey in the token then clearly the signature came from the machine the token was issued to, and everything is good.
This does come with some downsides, though. The first is that it uses browser interfaces to generate the keys (typically crypto.subtle.generateKey()) and as far as I can tell there are no browsers that guarantee that that key is going to be generated in hardware even if it’s marked non-exportable, so anyone able to steal the cookies can also steal the keys. The second is that the signature only covers the URI and HTTP method, and not the message content or any other headers, so anyone able to exfiltrate a valid signature can replay it against the same URI with different message content. The recommended way to handle this is to reject any signatures that weren’t generated within the last few seconds, which is a wonderful additional way to allow clock skew to give you a Bad Day. And the third is that every single request has to be separately signed. That’s not intrinsically a problem because computers are fast and have multiple cores, but if you’re trying to solve the first problem by sticking the key in a TPM then you’re dealing with something that’s slow and single threaded. That’s maybe acceptable if you’re using client certificates (because there’s going to be one signature per session and you can use the same session for multiple requests), but probably not if you’re dealing with a user opening a browser that restores previous tabs and each of those is a webapp that fires off 100 requests in parallel.
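To make the second point concrete, here is a rough sketch of the checks a server could apply to a DPoP proof once the signature itself has been verified (that step is assumed and not shown). The claim names htm, htu and iat are the ones RFC 9449 uses for the method, URI and issue time; the function name and the five second window are assumptions for illustration.

```python
def proof_acceptable(
    claims: dict, method: str, uri: str, server_now: float, max_age: float = 5.0
) -> bool:
    """Check the signed claims of a proof against the request being made.

    Note what is NOT an input: the request body and the other headers.
    The proof says nothing about them, so they cannot be checked here.
    """
    if claims.get("htm") != method or claims.get("htu") != uri:
        return False  # proof was made for a different method or URI
    iat = claims.get("iat")
    if not isinstance(iat, (int, float)):
        return False  # missing or malformed issue time
    # Freshness window: compares the client's clock against the server's.
    return abs(server_now - iat) <= max_age


proof = {"htm": "POST", "htu": "https://api.example.com/transfer", "iat": 1000}

# Two seconds later, the same proof is accepted whatever body accompanies it.
print(proof_acceptable(proof, "POST", "https://api.example.com/transfer", 1002))
# A legitimate client whose clock runs ten seconds fast is rejected.
print(proof_acceptable(proof, "POST", "https://api.example.com/transfer", 990))
```

The window is the only thing standing between a captured proof and a replay with different content, and it is also the thing that turns ordinary clock drift into authentication failures.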
In case it wasn’t clear, I don’t like DPoP. It doesn’t feel like it actually solves the underlying problem that we see in the real world (malware running in a context where if it can grab the tokens it can grab the keys), it adds a massive amount of overhead, and it has baked in replay vulnerabilities. I don’t know why it exists and I’m incredibly suspicious of vendors telling me that it fixes my problems, because if they’re telling me that then I’m going to end up assuming that they either don’t understand my problems or they don’t understand their technology, and neither of those is good.
Still. Then we get to the thing that prompted me to write this - Chrome’s announcement that they had launched device-bound session credentials (DBSC). This is interesting because it’s a Chrome feature that’s explicitly intended to counter on-device malware, which was one of the things that was out of scope in 2018 when token binding was being removed. Since this is entirely at the web level it doesn’t have to be an RFC, and so is instead defined by W3C. I’m going to handwave all the complexity and say that it’s basically a way to register a public key when a cookie is issued, and then prove possession of the private key when it’s time to renew the cookie. By making the cookies shortlived and having support for rotating them in the background, user impact is basically zero and while it’s still possible for an attacker to exfiltrate and use a cookie they’ll only be able to do so for a short window before it needs to be refreshed - something the attacker can’t do, since they don’t have the private key. This avoids the DPoP overhead because you only need to do signing once per cookie per cookie lifetime, and not on every single request. I don’t like this due to the window where exfiltrated tokens can be used, but it feels like a strict improvement over the status quo. An extension called device-bound session credentials for enterprise allows pre-enrollment of device keys, so even though the actual runtime DBSC flow doesn’t involve certificates, certificates can be used for device registration in enterprise environments and you can make sure that auth cookies only go to trusted devices. Unfortunately this is Chrome-only, and so we’re going to need to wait for it to be backported to all the random app frameworks for it to have widespread support on mobile or for almost everyone’s desktop app that’s actually three websites in an Electron wrapper. Mozilla’s current position is that they’re not in favour of it, so I guess we’ll see where Safari lands in terms of broad uptake.
The last thing on my list is another client cert/OAuth binding, this one still in draft state at the time of writing. This one is aimed primarily at the use of agent-driven tooling, where you have something running in the background using a whole bunch of tools that are each acting on your behalf. Authenticating to all of them separately isn’t a fun time, but giving broadly scoped access tokens to a non-deterministic agent and trusting that it’ll never post them somewhere public also isn’t a fun time. The key distinction between it and RFC8705 is that it’s aimed at connections rather than sessions, which avoids the worries about session resumption. This is done with TLS Exporters, which in TLS 1.3 should be unique to the connection even over session resumption (TLS 1.2 may reuse some of the same key material for exporters over session resumption, so it’s recommended to enforce 1.3 for this). By providing a new signature alongside the cookie on every new connection, the client proves that it still has access to the private key. This is a very new spec and I haven’t had much time to work through it yet, but my naive understanding is that unlike RFC8705 this would require some additional client support to be able to regenerate the client signature on every TLS reconnection.
This doesn’t avoid all the problems that RFC8705 has, including how to scope certificates. For the agentic use case that probably doesn’t matter - all these tools are acting on behalf of the same user, it’s fine if all the sites involved know they’re the same user. But it doesn’t solve the general purpose user use case, and right now DBSC seems like the best we have there.
But. Part of me still wonders whether Dirk Balfanz’s approach was the right one. Yes, there’s risk associated with TLS session resumption, but in the worst case you could just switch that off for high risk setups. The cookie scope argument is real, and also in cases where it could violate privacy the site owner could already choose to broaden their cookie scope and violate your privacy, and in cases where it breaks things you could just not make use of it. The other problems are largely fixed by TLS 1.3, and then we’re just left with “Browsers handle client certificates badly” to which my answer is “Yes, and we should fix that anyway”.
Despite having a pretty good answer to this problem over a decade ago, the closest we have to actual deployment is something that offers strictly worse security guarantees. And tokens keep getting stolen, and compromises keep occurring, and for the most part people shrug and get on with things.