HTML Subresource Integrity

By Nathan Willis
June 29, 2016

The World Wide Web Consortium (W3C) has approved a new specification intended to thwart cross-site scripting and content-injection attacks in web pages that include content served from multiple sites. Subresource Integrity (SRI) defines a mechanism for browsers to verify that third-party resources like scripts and images match the exact contents expected by the author of the surrounding page.

Injection attacks can take a number of forms. DNS poisoning can be used to redirect HTTP requests to malicious servers, images can be replaced on compromised content-delivery networks (CDNs) or caches, and so on. The usage of HTTP Strict Transport Security (HSTS) and browsers that block HTTP elements in pages served over HTTPS mitigate the most obvious such risks, but there are still avenues for exploitation.

In particular, the SRI specification notes that a substantial number of sites rely on third-party services to deliver page content, from CDNs to partner sites to open-source JavaScript frameworks. At any particular time, a site administrator may feel reasonably confident about the security of their own servers, but such confidence does not extend to the external, third-party servers involved. Thus, it is in the site owner's interest to be able to attest to the expected content of a resource in a way that the browser can validate.

SRI is designed to combat injection attacks that come through third-party content. The originating site can include cryptographic hashes of third-party script and image files, enabling the user's browser to hash the corresponding files it receives from the third-party servers and verify that the hashes match. It also provides the means for browsers to report validation failures back to site owners. The specification was developed by the W3C Web Application Security Working Group, and received "Recommendation" status (the W3C equivalent of final approval) on June 23.

Content

In its present form, SRI adds an integrity property to the HTML <script> element and to <link> elements of type stylesheet. The expectation is that future revisions of the standard will expand the coverage to include additional HTML elements—perhaps every possible subresource type (images, audio and video elements, iframes, plugin objects, and all hyperlinks).

In its most basic form, the integrity property's value should be a string starting with the hash algorithm used, followed by a dash, then the base64-encoded hash. Support for the SHA-256, SHA-384, and SHA-512 hash functions is required; support for additional functions is optional (although SHA-1 and MD5 are marked as functions that browsers should reject due to their cryptographic weakness).

So a site owner would hash a script of interest, and add that information to their own site's <script> tag:

    <script src="https://example.com/privacy-friendly-analytics.js"
            integrity="sha384-H8BRh8j48O9oYatfu5AZzq6A9RINhZO5H16dQZngK7T62em8MUt1FLm52t+eX6xO"
            crossorigin="anonymous"></script>
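
As a rough illustration of the format described above, the following Python sketch computes an integrity value from a resource's bytes. The function name `sri_value` and the sample input are assumptions made for this example; they do not come from the specification.

```python
import base64
import hashlib

# Hash functions that the specification requires browsers to support.
SUPPORTED = ("sha256", "sha384", "sha512")


def sri_value(data: bytes, algorithm: str = "sha384") -> str:
    """Return '<algorithm>-<base64 digest>' for the given resource bytes.

    Hypothetical helper, for illustration only.
    """
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # In practice the bytes would be read from the script or stylesheet file.
    print(sri_value(b"alert('hello');\n"))
```

The printed string is what would be pasted into the integrity property. It has to be regenerated whenever the third-party file changes, even by a single byte.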

Naturally, SRI only provides integrity protection if the HTML page is retrieved over a secure connection. If the page is sent over unencrypted HTTP, attackers can simply replace the integrity value with whatever they choose.

Access control

The crossorigin property shown in the example, while not part of SRI itself, is required. It comes from the Cross-Origin Resource Sharing (CORS) access-control API, which can be used to restrict access to scripts based on the origin of the request and other information. In a CORS-enabled FETCH request, the origin (typically the URL) of the surrounding document is sent to the server along with the value of the crossorigin property (either "anonymous" or "use-credentials"). The server can then grant or deny the request based on whatever access controls it has defined. Filtering out requests based on the request origin is simple enough, although attackers would likely forge that header. The use-credentials option supports additional authentication mechanisms, like HTTP cookies.

In the context of SRI, CORS is used to protect against a particular type of side-channel attack in which the attacker tries to infer information hidden in a resource by pre-computing hashes. For example, if a stylesheet includes some kind of interesting token (whether that is an API key, session ID, username, or something else), an attacker could compute hashes of likely values and send repeated FETCH requests, logging those requests that do not result in a 404 error.

Using CORS, however, the server hosting the stylesheet can turn on the use-credentials option, enabling HTTP cookie-based authentication of every request. Since the attacker cannot supply valid authentication cookies with its brute-force requests, the stylesheet server will drop those requests silently, preventing any information leaks.

Options and reporting

SRI allows integrity properties to include a space-separated list of several hashes; for example:

    <link rel="stylesheet" href="https://example.org/fancy-grid.css"
          integrity="sha384-H8BRh8j48O9oYatfu5AZzq6A9RINhZO5H16dQZngK7T62em8MUt1FLm52t+eX6xO
                     sha384-NWFxpV6Pjs1JsG7lQ/N8EGnddVuWW2ft08xHm/X0rsXB5TrAokLI/BsbADXmXVRX"
          crossorigin="anonymous">

If a resource matches any of the supplied hashes, it is regarded as having been validated. Thus, site owners can support multiple algorithms or provide hashes for several variants of the same resource. For instance, the version of a stylesheet returned to the browser might differ based on whether or not the user is logged into the site. There is also a mechanism defined for expressing additional options in the integrity property, though none have yet been defined.

The SRI specification mandates that browsers refuse to load or render any element that fails its validation test. It also requires the browser to return an error response to the server of the originating page. Since the error is a response to the specific FETCH request, servers can catch it for diagnostic purposes as well as provide a fallback resource if necessary.

Moving forward

In strict terms, SRI seems like a rather common-sense addition to HTML. Some may even wonder why it has taken this long to standardize, given how long scripting- and content-injection attacks have plagued web users. To some degree, the answer is simply that the web-development community has historically preferred to innovate rapidly and learn security lessons a bit more slowly.

The longer answer is that the increasingly dynamic nature of the web makes it more difficult to canonically describe how HTTP resources are assembled into a page. An example of this can be found in this issue filed against the SRI specification. As it turns out, SRI in its current form only applies to "first level" subresources like CSS files; any resource linked to from within that CSS file (a sub-subresource), such as an image or font, is not checked by the SRI validation scheme. Moreover, because SRI is defined in terms of FETCH requests, fixing the sub-subresource problem may require altering the CSS specification, which is surely not a simple task.

Nevertheless, SRI is clearly a positive step forward. The good news for users and developers is that SRI is already supported in Firefox (as of version 45), Chrome (as of its version 45), Opera (in Opera 38), and in the newest release of the Android browser. CORS support is available in all major browsers and in most major web servers and frameworks. There is still much to be done to make the full contents of web pages cryptographically verifiable but, with this new specification, matters have taken a big step forward.


HTML Subresource Integrity

Posted Jun 30, 2016 2:55 UTC (Thu) by pr1268 (guest, #24648) [Link] (22 responses)

Meh... This is just another band-aid fix to a major open wound. While I respect the W3C's draft specification proposal, I'm still thinking that this is just another layer of complexity being added to fix a long-standing problem with malicious cross-site content.

If the W3C really wanted to enforce a stronger security model, why did they allow cross-site content (other than e.g. images¹ or plain text)?

¹ Yes, I know that "hot-linked" image files can contain malicious content; but then that's something for the primary host to verify before hot-linking. Or for the client browser to inspect and mitigate.

HTML Subresource Integrity

Posted Jun 30, 2016 7:38 UTC (Thu) by epa (subscriber, #39769) [Link]

I think the web as originally designed doesn't have the concept of a 'site'. In principle there's no reason to treat URIs differently depending on whether they have the same domain name as the page that links to them. Indeed the old-style web hosting providers such as Geocities had many unrelated pages sharing a single 'site'. Nowadays, of course, we have walled gardens where even content from outside is often mirrored so that nobody has to leave the 'site'. The big exception to that is advertising content. The 'site' model will run out of steam one day and I don't think it would be a good idea to bake it into web standards.

HTML Subresource Integrity

Posted Jun 30, 2016 13:38 UTC (Thu) by rmayr (subscriber, #16880) [Link]

Agreed. For privacy reasons, it would be much better just to ship the Javascript etc. resources from the same server instead of sending the browser off to 10+ other monitoring/tracking networks. It's not actually hard to do so...

And for the same reason why resources are being linked from other sites, I don't think that this will see too much use: one benefit of linking to the "origin" instead of shipping a local copy is that it automatically updates to the latest version (for better or worse) without touching the local site. There is no real difference between updating a SHA256 value and updating the complete Javascript/whatever package on the local web server from an administrative overhead point of view (there is from a privacy point of view). Therefore I don't believe this to be a step forward.

HTML Subresource Integrity

Posted Jun 30, 2016 21:43 UTC (Thu) by jnareb (subscriber, #46500) [Link] (18 responses)

> If the W3C really wanted to enforce a stronger security model, why did they allow cross-site content (other than e.g. images or plain text)?

Geo-aware anycast CDNs (Content Delivery Networks) are a thing. Instead of getting e.g. minified jQuery or Bootstrap from the original server over its pipe, you are getting it from a server close to you over a big pipe.

HTML Subresource Integrity

Posted Jul 3, 2016 3:13 UTC (Sun) by raven667 (subscriber, #5198) [Link] (17 responses)

> Geo-aware anycast CDN

CDNs aren't hosting this stuff out of the good of their hearts; they are harvesting referrer and IP logs for analytics. By hosting a high-profile resource like jQuery or Bootstrap, they can collect a vast amount of intelligence that would otherwise be inaccessible and sell it to the highest bidder. That there might also be performance benefits is incidental.

HTML Subresource Integrity

Posted Jul 3, 2016 4:22 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (16 responses)

You overestimate the importance of statistics. URL shorteners also expected to roll in cash selling this preciousssss data, and mostly went out of business.

HTML Subresource Integrity

Posted Jul 3, 2016 9:53 UTC (Sun) by oever (guest, #987) [Link] (15 responses)

What is the business model behind fonts.googleapis.com and ajax.googleapis.com if not to spy on web traffic?

Owning these sites gives the power to change the behavior of many websites at once and slowly increase the spyware load.

At least with SRI, the original site can determine that the page should only load one specific version of the font, script or css.

HTML Subresource Integrity

Posted Jul 3, 2016 10:54 UTC (Sun) by roc (subscriber, #30627) [Link] (1 responses)

Not everything Google does has an obvious business model.

There are groups at Google that sometimes do things just because it's good for the Web. (And of course other Google groups work in the other direction...)

HTML Subresource Integrity

Posted Jul 4, 2016 9:13 UTC (Mon) by hkario (subscriber, #94864) [Link]

except that this is very much in-line with their core competency and core business - ads

HTML Subresource Integrity

Posted Jul 3, 2016 15:04 UTC (Sun) by flussence (guest, #85566) [Link] (12 responses)

>What is the business model behind fonts.googleapis.com and ajax.googleapis.com if not to spy on web traffic?
They probably did the numbers and decided running a free CDN was more cost-effective than adding extra storage to all their phones to handle thousands of duplicates in the browser cache.

HTML Subresource Integrity

Posted Jul 3, 2016 17:40 UTC (Sun) by oever (guest, #987) [Link] (11 responses)

If that's the case they'll be very quick to implement SRI in Chrome and on their sites since SRI saves even more cache and bandwidth.

HTML Subresource Integrity

Posted Jul 4, 2016 12:58 UTC (Mon) by flussence (guest, #85566) [Link] (10 responses)

SRI was already in Chrome several weeks (months?) ago.

HTML Subresource Integrity

Posted Jul 4, 2016 16:30 UTC (Mon) by oever (guest, #987) [Link] (9 responses)

I've just tested SRI with Chromium 51 and Firefox 47. Both browsers check SRI on the <script> and <link> elements and refuse to use the linked CSS or JavaScript when the checksum is not correct. This is in agreement with what is written in the parent article.

What I've not checked is whether the browser will use the checksum to retrieve the file from the cache instead of loading it from a 3rd party server.

This is a quick way to get a base64-encoded checksum for use in HTML with SRI:

    cat file | openssl dgst -binary -sha256 | base64

HTML Subresource Integrity

Posted Jul 4, 2016 19:35 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (8 responses)

> What I've not checked is whether the browser will use the checksum to retrieve the file from the cache instead of loading it from a 3rd party server.

Hopefully the URL also has to match. Some interesting XSS-like attacks come to mind otherwise.

HTML Subresource Integrity

Posted Jul 4, 2016 19:47 UTC (Mon) by oever (guest, #987) [Link] (7 responses)

I'm not a security researcher and lack the creativity needed to see a possible attack. Do you have an example?

If the browser uses the hash as an identifier a lot of network traffic can be avoided.

Git works by using the hash, sha1 even, for identification of files. I think it would be safe to use for the entire web too.

HTML Subresource Integrity

Posted Jul 4, 2016 20:32 UTC (Mon) by dtlin (subscriber, #36537) [Link] (6 responses)

One obvious one:

Perhaps you visited http://a.com, and the browser fetches a cachable resource http://a.com/style.css with checksum 1234. Or not.

Now you visit http://b.com, which says

<link rel="stylesheet" href="http://b.com/sneaky.css" integrity="1234">

— while it 404's that path (or anything that doesn't get cached).

If the browser were to skip fetching http://b.com/sneaky.css because it already had content with hash 1234 in cache, then b.com would quite reliably be able to determine whether the user had visited a.com just by looking to see which of their visitors doesn't make a request to http://b.com/sneaky.css. (It's already possible to be tricksy by measuring timing to load things in Javascript, but that's kind of a problem too.)
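
The scenario above can be modelled with a toy simulation. Everything here is hypothetical: `HashKeyedBrowser` is an imagined browser whose cache is keyed only by content hash, a dictionary stands in for the web, and SHA-256 hex digests stand in for the "1234" checksum.

```python
import hashlib


class HashKeyedBrowser:
    """Toy browser whose cache is keyed only by content hash (the risky design)."""

    def __init__(self):
        self.cache = {}        # content hash -> content
        self.request_log = []  # URLs actually requested over the network

    def load(self, url, web, integrity=None):
        if integrity is not None and integrity in self.cache:
            return self.cache[integrity]  # served from cache, no request sent
        self.request_log.append(url)
        content = web.get(url)  # None models a 404
        if content is not None:
            self.cache[hashlib.sha256(content).hexdigest()] = content
        return content


# A dictionary stands in for the web; sneaky.css is deliberately absent (404).
WEB = {"http://a.com/style.css": b"body { color: black }\n"}
STYLE_HASH = hashlib.sha256(WEB["http://a.com/style.css"]).hexdigest()


def b_com_sees_request(visited_a_com: bool) -> bool:
    """Return True if b.com's server observes a request for sneaky.css."""
    browser = HashKeyedBrowser()
    if visited_a_com:
        browser.load("http://a.com/style.css", WEB)
    browser.load("http://b.com/sneaky.css", WEB, integrity=STYLE_HASH)
    return "http://b.com/sneaky.css" in browser.request_log


if __name__ == "__main__":
    print(b_com_sees_request(visited_a_com=True))
    print(b_com_sees_request(visited_a_com=False))
```

In this model the absence of a request at b.com is exactly the signal that the visitor already had a.com's stylesheet cached.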

HTML Subresource Integrity

Posted Jul 4, 2016 21:27 UTC (Mon) by oever (guest, #987) [Link]

You're right. That's a deal breaker. The URL will have to be the same too.

HTML Subresource Integrity

Posted Jul 5, 2016 8:35 UTC (Tue) by paulj (subscriber, #341) [Link] (4 responses)

How can b.com determine that? All it means is that there is a cache _somewhere_ between the user and b.com. That cache could be tied to the user (i.e. in the user's browser profile), or it could be shared amongst many, many people (some large-network level transparent cache).

HTML Subresource Integrity

Posted Jul 5, 2016 11:52 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (1 responses)

Cache control headers can prevent caching by most software. Plus, the cached file could ping back (e.g., CSS font stuff or an explicit XHR) and then you know at least that the file was cached before, and not having a GET query paired with it means that it was visited before. I imagine network-level caches to be rare enough as to not matter much.

HTML Subresource Integrity

Posted Jul 6, 2016 16:26 UTC (Wed) by zlynx (guest, #2285) [Link]

It may be somewhat rare but there are still many places that force caches to ignore cache-control because the benefits exceed the annoying but rare need to shift-reload in the browser.

Satellite links, mesh networks, or exceedingly slow radio links are all examples.

Sites that try to force the 304 If-Modified check just for user stats are, in my opinion, evil and stupid, and forcing it to be treated as Expires is almost always a good thing.

HTML Subresource Integrity

Posted Jul 5, 2016 12:18 UTC (Tue) by excors (subscriber, #95769) [Link] (1 responses)

b.com could return HTTP headers that forbid caching, and assume that all caches between the server and user will respect those headers. (A browser that used the SRI hash as the cache key would never contact b.com and would never see those headers, so nothing would stop it using its own cache.)

Or it could use URLs like http://b.com/sneaky.css?unique_random_number so it won't be cached by anything that uses the URL as the cache key.

Or it could set sneaky.css to *not* have the correct hash, and detect whether the browser successfully returns the original style.css (if it uses its SRI-hash-keyed cache because the user had previously visited a.com) or generates an error event (if it tries to load from b.com). That wouldn't require any involvement from the server.

You could try to restrict the SRI-hash cache to be per-origin (i.e. use the domain of the requested resource), but I don't think that would help. The sneaky site could use href="http://a.com/unrelated.css" integrity="[hash of http://a.com/style.css]", and it will either successfully load http://a.com/style.css's content (on a cache hit) or return an error (on a cache miss). If you restrict the cache based on the origin of the requesting page, then you lose the benefit of sharing a cached jquery.js across all sites, so you might as well not bother with the cache.

HTML Subresource Integrity

Posted Jul 5, 2016 12:46 UTC (Tue) by paulj (subscriber, #341) [Link]

Interesting. Cheers.

HTML Subresource Integrity

Posted Jul 2, 2016 11:35 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Won't this also help defend against bit flip errors? The numbers from the recent defcon talk were an estimated 600k bit errors per day in the wild (I think that may exclude the amplification of bit flips from within a CDN cache). And that might just be for DNS-related bit errors.

HTML Subresource Integrity

Posted Jun 30, 2016 7:34 UTC (Thu) by epa (subscriber, #39769) [Link] (1 responses)

Why wasn't this simply added to the URI specification? Informal schemes like https://foo#sha256=fred are already used.

HTML Subresource Integrity

Posted Jul 2, 2016 10:05 UTC (Sat) by robert_s (subscriber, #42402) [Link]

Because URIs are used in so many more places and it would be hard to come up with a scheme that was backward compatible without the danger of misinterpreting existing URIs.

Content Centric Networking

Posted Jun 30, 2016 10:00 UTC (Thu) by paulj (subscriber, #341) [Link] (2 responses)

This is basically a crappy form of content-centric networking (CCN == retrieving data from the network by addressing the content desired, rather than the location - allowing the network to do caching).

Content Centric Networking

Posted Jun 30, 2016 13:21 UTC (Thu) by oever (guest, #987) [Link] (1 responses)

What is crappy about it?

With SRI, caching and privacy can be improved. Caching can be improved because identical files from different servers only need to be downloaded once. This in turn will help to improve privacy because the advantage of CDN servers for common fonts, scripts and styles disappears. And even when CDN servers are still used, the number of requests to these servers can go down because files will often already be in cache.

Content Centric Networking

Posted Jul 1, 2016 13:01 UTC (Fri) by paulj (subscriber, #341) [Link]

Well, clearly, what is desired is to be able to address content on the "network" (rather than a location). However, as the network they have to work on doesn't provide that ability, this adds another mechanism to allow user-agents to verify that some data retrieved from whatever location is the desired content.

It's not really the fault of SRI, but it's basically working around the lack of CCN.

HTML Subresource Integrity

Posted Jul 1, 2016 8:51 UTC (Fri) by pabs (subscriber, #43278) [Link] (13 responses)

Does anyone know if it can be used on <a> tags too? That would be really useful for download links.

HTML Subresource Integrity

Posted Jul 1, 2016 10:14 UTC (Fri) by oever (guest, #987) [Link] (12 responses)

In this version the integrity attribute is only allowed on <link> and <script>. However, the intent is to extend that:

A future revision of this specification is likely to include integrity support for all possible subresources, i.e., a, audio, embed, iframe, img, link, object, script, source, track, and video elements.

HTML Subresource Integrity

Posted Jul 1, 2016 10:26 UTC (Fri) by micka (subscriber, #38720) [Link] (11 responses)

But then, for downloads, I'm not sure you'd hash the actual resource. Imagine it's 4GB.
Wouldn't you rather hardcode a kind of challenge in the <a> element that doesn't require downloading and storing the whole thing before checking it?

HTML Subresource Integrity

Posted Jul 1, 2016 10:42 UTC (Fri) by oever (guest, #987) [Link] (10 responses)

Why wouldn't you hash a 4GB file? Download sites often already have lists of checksums. An integrity attribute on the link would make the check machine readable. So the browser could verify the download.

Large files usually do not change often so the hash does not have to be calculated often.

If every link on a site would use integrity on the links, then simply checking the hash of /index.html would be enough to know if the site changed.

HTML Subresource Integrity

Posted Jul 1, 2016 11:59 UTC (Fri) by micka (subscriber, #38720) [Link] (9 responses)

Well, I'm talking about the user side. If they need to download it to check the resource (you can't hash a file without downloading it totally), then it's really no different than putting both the link and the hash on the HTML page.

HTML Subresource Integrity

Posted Jul 1, 2016 12:09 UTC (Fri) by james (guest, #1325) [Link]

Except if it's done automatically, by your browser, then the end users don't have to worry about it.

It means Linux distributions can put a download link on an HTTPS page, store the actual resource on a mirror network, and tell new users that as long as they're using a recent browser, they don't need to bother running weird checksum utilities to verify their download.

(Yes, there are well-known limitations with HTTPS, but it's what we're ultimately stuck with: telling new users to get a GPG key is no improvement if the page telling them which GPG key to get is served over HTTPS...)

HTML Subresource Integrity

Posted Jul 2, 2016 23:34 UTC (Sat) by lsl (guest, #86508) [Link] (7 responses)

The browser can just tee it into the hash function while the download is running anyway. You don't start to verify gigantic files without having been instructed to download them, of course.

HTML Subresource Integrity

Posted Jul 3, 2016 0:55 UTC (Sun) by flussence (guest, #85566) [Link] (6 responses)

That's true, but I can see this interfering with all the speculative loading stuff browsers do to reduce page load times. After all, you'd want that hash to be valid *before* you go off parsing potentially evil Javascript/CSS/HTML-imports.

(Of course, it's only a performance issue on first load. Afterwards you can just use the hash as a cache key — and the timing becomes a privacy issue :)

HTML Subresource Integrity

Posted Jul 3, 2016 5:08 UTC (Sun) by ianmcc (guest, #88379) [Link] (5 responses)

Break the file into blocks and hash the blocks separately.

HTML Subresource Integrity

Posted Jul 3, 2016 9:46 UTC (Sun) by oever (guest, #987) [Link] (4 responses)

This is what the BitTorrent protocol does. It's not too popular for browsing yet. There is more attention for the distributed web lately, so it might become more popular.

Using checksums for blocks is a nice suggestion for the next version of SRI.

HTML Subresource Integrity

Posted Jul 4, 2016 9:20 UTC (Mon) by hkario (subscriber, #94864) [Link] (3 responses)

problem is, that if you want to verify only part of the file, you need to have the _whole_ Merkle tree, not only the root node, to do that

and a whole Merkle tree for 4KiB fragments of a 1MiB file is 8KiB long (using SHA-256, double that for SHA-512)

HTML Subresource Integrity

Posted Jul 5, 2016 15:34 UTC (Tue) by nybble41 (subscriber, #55106) [Link] (2 responses)

> problem is, that if you want to verify only part of the file, you need to have the _whole_ Merkle tree, not only the root node, to do that

You don't necessarily need the whole Merkle tree, if it's organized properly. You just need the root and the siblings of the nodes on the path to the block you want. For example:

Root
|-- A0
|...|-- B0
|...|...|-- C0
|...|...|...|-- D0
|...|...|...|...|-- E0
|...|...|...|...|...|-- Block 0
|...|...|...|...|...\-- Block 1
|...|...|...|...\-- E1
|...|...|...|...|...|-- Block 2
|...|...|...|...|...\-- Block 3
|...|...|...\-- D1 (elided)
|...|...\-- C1 (elided)
|...\-- B1 (elided)
\-- A1 (elided)

The value of each node is a hash of its immediate children. A six-level binary tree can describe up to 2^6=64 blocks (or 256 KiB at 4 KiB per block), but to verify block 0 you only need the hashes of the root, A1, B1, C1, D1, E1, and block 1, for a total of 7 hashes (224 bytes of SHA-256). The intermediate hashes stand in for the parts which weren't downloaded, so in general the more complete subtrees you download the fewer hashes you need. If you download the entire file you only need to know the root hash.
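
A minimal Python sketch of that check, assuming SHA-256, a block count that is a power of two, and parent hashes formed by hashing the concatenation of the two child hashes. The function names are made up for this example, and nothing like this exists in SRI itself.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return the tree as a list of levels, from leaf hashes up to the root."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof_for(levels, index):
    """Sibling hashes on the path from leaf `index` up to the root, bottom-up."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # the other child of the same parent
        index //= 2
    return proof


def verify_block(block, index, proof, root):
    """Recompute the root from one block plus its sibling hashes."""
    node = h(block)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


if __name__ == "__main__":
    blocks = [bytes([i]) * 4096 for i in range(64)]  # 64 blocks of 4 KiB
    levels = build_levels(blocks)
    root = levels[-1][0]
    proof = proof_for(levels, 0)  # H(Block 1), E1, D1, C1, B1, A1
    print(len(proof) + 1, "hashes,", (len(proof) + 1) * 32, "bytes")
    print(verify_block(blocks[0], 0, proof, root))
```

For block 0 the proof holds six sibling hashes; together with the root that makes the seven hashes (224 bytes) counted above.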

HTML Subresource Integrity

Posted Jul 7, 2016 8:59 UTC (Thu) by hkario (subscriber, #94864) [Link] (1 responses)

the suggestion was so that the browser can do partial rendering as soon as the data is read, that means you need all the leaf nodes as the browser will want to update the rendering each time a new chunk is downloaded and checksummed

By writing 8KiB I meant exactly 8192 bytes of data i.e. only the leaf nodes.

In general yes, if getting parts of the tree requires less time (latency) than downloading the data, you don't need the full Merkle tree, but that doesn't assume that the whole point of using it is to reduce latency.

HTML Subresource Integrity

Posted Jul 7, 2016 16:57 UTC (Thu) by nybble41 (subscriber, #55106) [Link]

> the suggestion was so that the browser can do partial rendering as soon as the data is read

Sorry, I wasn't looking at the bigger picture, just the statement that one needs the full Merkle tree to verify part of a file. If you want to download the full file while verifying each part as it's received then you will need (almost) the full Merkle tree.

Still, it isn't necessary to download the Merkle tree in full before you can start verifying the data. You could stream the hashes from the Merkle tree in the order that they'll be needed to verify each block, e.g.:

(the root hash is already known from the SRI attribute)
Fetch A1, B1, C1, D1, E1, H(Block 1), and Block 0
Compute H(Block 0)
Compute E0 = H(H(Block 0)|H(Block 1))
Compute D0 = H(E0|E1)
Compute C0 = H(D0|D1)
Compute B0 = H(C0|C1)
Compute A0 = H(B0|B1)
Verify H(A0|A1) = root hash
Fetch Block 1
Verify H(Block 1)
Fetch H(Block 3) and Block 2
Compute H(Block 2)
Verify H(H(Block 2)|H(Block 3)) = E1
Fetch Block 3
Verify H(Block 3)
Fetch E3, H(Block 5), and Block 4
Compute H(Block 4)
Compute E2 = H(H(Block 4)|H(Block 5))
Verify H(E2|E3) = D1
Fetch Block 5
Verify H(Block 5)
etc.

The expectation is that the hashes would be provided in this order in a separate file alongside the content. With this approach you can verify each block as it's downloaded. Even better, to do so you only need to download the hashes for the odd-numbered blocks and intermediate nodes; the even-numbered hashes can be computed from the content under the assumption that the even-numbered nodes are downloaded and verified first within each subtree. For larger files this should cut the number of downloaded hashes almost in half.
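
One way to model that streaming order in Python, as a sketch only. It assumes SHA-256 and a power-of-two block count, and the `fetch_hash` and `fetch_block` callbacks are hypothetical stand-ins for reading the hash file and the content from the network.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Server side: levels[k][i] is the hash of node i at height k (0 = leaves)."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def stream_verify(root, n_blocks, fetch_hash, fetch_block):
    """Verify blocks in order; return (verified blocks, hashes downloaded)."""
    height = n_blocks.bit_length() - 1  # n_blocks must be a power of two
    verified = []
    fetched = 0

    def verify_subtree(trusted, level, index):
        nonlocal fetched
        # Walk down the leftmost path, fetching each right-hand sibling hash.
        pending = []
        lvl, idx = level, index
        while lvl > 0:
            pending.append((fetch_hash(lvl - 1, 2 * idx + 1), lvl - 1, 2 * idx + 1))
            fetched += 1
            lvl, idx = lvl - 1, 2 * idx
        block = fetch_block(idx)
        node = h(block)
        for right, _, _ in reversed(pending):
            node = h(node + right)  # recompute each parent, bottom-up
        if node != trusted:
            raise ValueError(f"block {idx} or its accompanying hashes failed verification")
        verified.append(block)
        # The sibling hashes are now trusted; verify their subtrees bottom-up.
        for right, r_lvl, r_idx in reversed(pending):
            verify_subtree(right, r_lvl, r_idx)

    verify_subtree(root, height, 0)
    return verified, fetched


if __name__ == "__main__":
    blocks = [bytes([i]) * 4096 for i in range(64)]
    levels = build_levels(blocks)
    out, fetched = stream_verify(
        levels[-1][0], len(blocks),
        fetch_hash=lambda lvl, idx: levels[lvl][idx],
        fetch_block=lambda idx: blocks[idx],
    )
    total = sum(len(level) for level in levels[:-1])  # every hash except the root
    print(out == blocks, fetched, "of", total, "hashes downloaded")
```

In this model each internal node costs exactly one downloaded hash (its right-hand child), which is where the saving of roughly half comes from.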

HTML Subresource Integrity

Posted Aug 30, 2016 7:56 UTC (Tue) by freddyb_ (guest, #110913) [Link] (1 responses)

> The SRI [..] requires the browser to return an error response to the server of the originating page. Since the error is a response to the specific FETCH request, servers can catch it for diagnostic purposes as well as provide a fallback resource if necessary.

That's wrong, by the way. The error is reported in the HTML/JS world, i.e., it must be caught with an event handler in JavaScript. There's no built in error reporting (yet).

HTML Subresource Integrity

Posted Aug 31, 2016 12:01 UTC (Wed) by flussence (guest, #85566) [Link]

Not in this spec, but Content Security Policy v2 gets a bit further using a combination of the hash-source and report-uri fields. The error report to the server is explicitly specced as a non-blocking background request though, so any fallback will still end up being hacky.