Subverting HTTPS with BREACH
An attack against encrypted web traffic (i.e. HTTPS) that can reveal sensitive information to observers was presented at the Black Hat security conference. The attack does not actually decrypt HTTPS traffic, but it can nevertheless determine whether certain data is present in the page source. That data might include email addresses, security tokens, account numbers, or other potentially sensitive items.
The attack uses a modification of the CRIME (compression ratio info-leak made easy) technique, but instead of targeting browser cookies, the new attack focuses on the pages served from the web server side. Dubbed BREACH (browser reconnaissance and exfiltration via adaptive compression of hypertext—security researchers are nothing if not inventive with names), the attack was demonstrated on August 1. Both CRIME and BREACH require that the session use compression, but CRIME needs it at the Transport Layer Security (TLS, formerly Secure Sockets Layer, SSL) level, while BREACH only requires the much more common HTTP compression. In both cases, because the data is compressed, just comparing message sizes can reveal important information.
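The size side channel is simple to demonstrate with DEFLATE, the algorithm underlying gzip HTTP compression. A minimal sketch using Python's zlib (the page body and secret are made up for illustration):

```python
import zlib

# Hypothetical page body containing a secret, as a server might serve it.
PAGE = b"<html><body>secret_token=d8f3a91c</body></html>"

def compressed_size(reflected: bytes) -> int:
    """Compressed size of the page with an attacker-chosen string appended."""
    return len(zlib.compress(PAGE + reflected))

# A probe that duplicates a string already present in the page is replaced
# by a short back-reference, so it compresses better than an unrelated
# probe of the same length.
match_size = compressed_size(b"secret_token=d8f3a91c")
other_size = compressed_size(b"jq0wXv9K3pZ7uY4RmB6Td")
print(match_size < other_size)  # the matching probe yields a shorter message
```

An eavesdropper cannot read the encrypted bytes, but TLS does not hide the length of the payload, so this size difference is visible on the wire.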
In order to perform the attack, multiple probes need to be sent from a victim's browser to the web site of interest. That requires that the victim get infected with some kind of browser-based malware that can perform the probes. The usual mechanisms (e.g. email, a compromised web site, or man-in-the-middle) could be used to install the probe. A wireless access point and router would be one obvious place to house this kind of attack, as it has the man-in-the-middle position to see the responses along with the ability to insert malware into any unencrypted web page visited.
The probes are used as part of an "oracle" attack. An oracle attack is one where the attacker can send multiple different requests to the vulnerable software and observe the responses. It is, in some ways, related to the "chosen plaintext" attack against a cryptography algorithm. When trying to break a code, arranging for the "enemy" to encrypt your message in their code can provide a wealth of details about the algorithm. With computers, it is often the case that an almost unlimited number of probes can be made and the results analyzed. The only limit is typically time or bandwidth.
BREACH can only be used against sites that reflect the user input from requests in their responses. That allows the site to, in effect, become an oracle. Because the HTTP compression will replace repeated strings with shorter constructs (as that is the goal of the compression), a probe response with a (server-reflected) string that duplicates one that is already present in the page will elicit a shorter response than a probe for an unrelated string. Finding that a portion of the string is present allows the probing tool to add an additional digit or character to the string, running through all the possibilities checking for a match.
For data that has a fixed or nearly fixed format (e.g. email addresses, account numbers, cross-site request forgery tokens), each probe can try a variant (e.g. "@gmail.com" or "Account number: 1") and compare the length of the reply to that of one without the probe. Shorter responses correlate to correct guesses, because the duplicated string gets compressed out of the response. Correspondingly, longer responses are for incorrect guesses. The researchers report that 30 seconds of probing is enough to essentially brute-force email addresses and other sensitive information.
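The guess-and-compare step can be simulated locally. In this sketch (the page, the reflection mechanism, and the candidate list are all assumptions for illustration), zlib stands in for the server's HTTP compression, and taking `len()` of the result stands in for measuring encrypted response sizes on the wire:

```python
import zlib

# Hypothetical page that reflects attacker input and contains a secret.
PAGE = b"<html>Contact: alice@gmail.com <p>You searched for: %s</p></html>"

def response_size(probe: bytes) -> int:
    # The server reflects the probe into the page and compresses it; the
    # attacker on the wire observes only the (compressed) response length.
    return len(zlib.compress(PAGE.replace(b"%s", probe)))

# Probe with each candidate; the guess that duplicates a string already
# in the page gets compressed out, so the shortest response marks a hit.
candidates = [b"@yahoo.com", b"@gmail.com", b"@hotmail.com"]
best = min(candidates, key=response_size)
print(best.decode())  # -> "@gmail.com"
```

A real attack would repeat this over the network via injected requests from the victim's browser, averaging over noise, but the comparison logic is the same.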
Unlike CRIME, which can be avoided by disabling TLS compression, BREACH will be more difficult to deal with. The researchers behind BREACH list a number of mitigations, starting with disabling HTTP compression. While that is a complete fix for the problem, it is impractical for most sites because of the additional bandwidth that uncompressed responses would require; it would also increase page load times.
Perhaps the most practical solution is to rework applications so that user input is not reflected onto pages with sensitive information. That way, probing will not be effective, but it does mean a potentially substantial amount of work on the web application. Other possibilities like randomizing or masking the sensitive data will also require application rework. At the web server level, one could potentially add a random amount of data to responses (to obscure the length) or rate-limit requests, but both of those are problematic from a performance perspective.
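The length-obscuring idea can be sketched with deterministic padding: rounding every response up to a fixed boundary, which is an illustrative variant of the random-padding approach mentioned above, not something the researchers specifically prescribe. Reusing the toy page from earlier:

```python
import zlib

PAGE = b"<html><body>secret_token=d8f3a91c</body></html>"
BUCKET = 128  # pad every response up to the next 128-byte boundary

def padded_size(reflected: bytes) -> int:
    raw = len(zlib.compress(PAGE + reflected))
    # Round up to the bucket boundary so small size differences vanish.
    return -(-raw // BUCKET) * BUCKET

# Matching and non-matching probes now look identical on the wire,
# at the cost of wasted bandwidth on padding bytes.
size_match = padded_size(b"secret_token=d8f3a91c")
size_other = padded_size(b"jq0wXv9K3pZ7uY4RmB6Td")
print(size_match == size_other)
```

Purely random padding, by contrast, only adds noise: an attacker can average it away by sending more probes, which is why padding is listed as a mitigation rather than a fix.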
Over the years, various attacks against HTTPS have been found. That is to be expected, really, since cryptographic systems always get weaker over time. There's nothing to indicate that HTTPS is fatally flawed, though this side-channel attack is fairly potent. With governments actively collecting traffic—and using malware—it's not much of a stretch to see the two being combined. Governments don't much like encryption or anonymity, and flaws like BREACH will unfortunately be available to help thwart both, now and in the future.