The present and potential future of progressive image rendering - JakeArchibald.com
When I set about writing this article, I intended it to be a strong argument for progressive rendering. But after digging into it, my feelings are less certain.
When I was talking about monitoring web performance yesterday, I linked to the CrUX data for The Session.
CrUX is a contraction of Chrome User Experience Report. CrUX just sounds better than CEAR.
It’s data gathered from actual Chrome users worldwide. It can be handy as part of a balanced performance-monitoring diet, but it’s always worth remembering that it only shows a subset of your users: those on Chrome.
The actual CrUX data is imprisoned in some hellish Google interface, so some kindly people have put more humane interfaces on it. I like Calibre’s CrUX tool as well as Treo’s.
What’s nice is that you can look at the numbers for any reasonably popular website, not just your own. Lest I get too smug about the performance metrics for The Session, I can compare them to the numbers for Wikipedia or the BBC. Both of those sites are made by people who prioritise speed, and it shows.
If you scroll down to the numbers on navigation types, you’ll see something interesting. Across the board, whether it’s The Session, Wikipedia, or the BBC, the BFcache—back/forward cache—is used around 16% to 17% of the time. This is when users use the back button (or forward button).
Unless you do something to stop them, browsers will make sure that those navigations are super speedy. You might inadvertently be sabotaging the BFcache if you’re sending a Cache-Control: no-store header or if you’re using an unload event handler in JavaScript.
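To make that concrete, here’s a minimal sketch in PHP (the snippet is mine, not anything from The Session) showing both culprits and a BFcache-friendly alternative to the unload handler:

<?php
// Sketch of what breaks the BFcache: this header tells browsers never to
// store the response, which generally makes the page ineligible.
header('Cache-Control: no-store');
?>
<script>
// Likewise, an unload event handler blocks the BFcache in most browsers.
// Listening for pagehide instead keeps the page eligible:
addEventListener('pagehide', (event) => {
  // event.persisted is true when the page is heading into the BFcache
});
</script>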
I guess it’s unsurprising the BFcache numbers are relatively consistent across three different websites. People are people, whatever website they’re browsing.
Where it gets interesting is in the differences. Take a look at pre-rendering. It’s 4% for the BBC and just 0.4% for Wikipedia. But on The Session it’s a whopping 35%!
That’s because I’m using speculation rules. They’re quite straightforward to implement and they pair beautifully with full-page view transitions for a slick, speedy user experience.
It doesn’t look like Wikipedia or the BBC are using speculation rules at all, which kind of surprises me.
Then again, because they’re a hidden technology, I can understand why they’d slip through the cracks.
On any web project, I think it’s worth having a checklist of The Invisibles—things that aren’t displayed directly in the browser, but that can make a big difference to the user experience.
Some examples:
- meta elements in the head of documents so they “unroll” nicely when the link is shared (there’s a sketch of these below).
- A Speculation-Rules header that points to a JSON file.
- The lang attribute on the body of every page.

If you’ve got a checklist like that in place, you can at least ask “Whose job is this?” All too often, these things are missing because there’s no clarity on who’s responsible for them. They’re sorta back-end and sorta front-end.
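To make the first item on that list concrete, the kind of meta elements that make a shared link “unroll” nicely are Open Graph tags in the head. This is a generic sketch (the values and image URL are made up), not the markup from any particular site:

<meta property="og:title" content="Page title">
<meta property="og:description" content="A one-line description of the page">
<meta property="og:image" content="https://example.com/images/share-card.png">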
Many interactions are not possible without JavaScript, but that doesn’t mean we should look to write more than we have to. The server doing something useful is a requirement for building an interesting business. The client doing something is often a nice-to-have.
There’s also this:
It’s really fast
One of the arguments for a SPA is that it provides a more reactive customer experience. I think that’s mostly debunked at this point, due to the performance creep and complexity that comes in with a more complicated client-server relationship.
Earlier this year I wrote about some performance improvements to The Session using the content-visibility property in CSS.
If you say content-visibility: auto you’re telling the browser not to bother calculating the layout and paint for an element until it needs to. But you need to combine it with the contain-intrinsic-block-size property so that the browser knows how much space to leave for the element.
I mentioned the browser support:
Right now content-visibility is only supported in Chrome and Edge. But that’s okay. This is a progressive enhancement. Adding this CSS has no detrimental effect on the browsers that don’t understand it (and when they do ship support for it, it’ll just start working).
Well, that’s happened! Safari 18 supports content-visibility. I didn’t have to do a thing and it just started working.
But …I think I’ve discovered a little bug in Safari’s implementation.
(I say I think it’s a bug with the browser because, like Jim, I’ve made the mistake in the past of thinking I had discovered a browser bug when in fact it was something caused by a browser extension. And when I say “in the past”, I mean yesterday.)
So here’s the issue: if you apply content-visibility: auto to an element that contains an SVG, and that SVG contains a text element, then Safari never paints that text to the screen.
To see an example, take a look at the fourth setting of Cooley’s reel on The Session archive. There’s a text element with the word “slide” (actually the text is inside a tspan element inside a text element). On Safari, that text never shows up.
I’m using a link to the archive of The Session I created recently rather than the live site because on the live site I’ve removed the content-visibility declaration for Safari until this bug gets resolved.
I’ve also created a reduced test case on Codepen. The only HTML is the element containing the SVGs. The only CSS—apart from the content-visibility stuff—is just a little declaration to push the content below the viewport so you have to scroll it into view (which is when the bug happens).
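The reduced test case boils down to something like this (a sketch based on the description above, not the actual Codepen markup):

<div style="content-visibility: auto; contain-intrinsic-block-size: 100px; margin-block-start: 150vh;">
  <svg viewBox="0 0 300 60" width="300" height="60">
    <text x="10" y="40"><tspan>slide</tspan></text>
  </svg>
</div>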
I’ve filed a bug report. I know it’s a fairly niche situation, but there are some other issues with Safari’s implementation of content-visibility so it’s possible that they’re all related.
After I wrote positively about the speculation rules API I got an email from David Cizek with some legitimate concerns. He said:
I think that this kind of feature is not good, because someone else (web publisher) decides that I (my connection, browser, device) have to do work that very often is not needed. All that blurred by blackbox algorithm in the browser.
That’s fair. My hope is that the user will indeed get more say, whether that’s at the level of the browser or the operating system. I’m thinking of a prefers-reduced-data setting, much like prefers-color-scheme or prefers-reduced-motion.
But this issue isn’t something new with speculation rules. We’ve already got service workers, which allow the site author to unilaterally declare that a bunch of pages should be downloaded.
I’m doing that for Resilient Web Design—when you visit the home page, a service worker downloads the whole site. I can justify that decision to myself because the entire site is still smaller in size than one article from Wired or the New York Times. But still, is it right that I get to make that call?
So I’m very much in favour of browsers acting as true user agents—doing what’s best for the user, even in situations where that conflicts with the wishes of a site owner.
Going back to speculation rules, David asked:
Do we really need this kind of (easily turned to evil) enhancement in the current state of (web) affairs?
That question could be asked of many web technologies.
There’s always going to be a tension with any powerful browser feature. The more power it provides, the more it can be abused. Animations, service workers, speculation rules—these are all things that can be used to improve websites or they can be abused to do things the user never asked for.
Or take the elephant in the room: JavaScript.
Right now, a site owner can link to a JavaScript file that’s tens of megabytes in size, and the browser has no alternative but to download it. I’d love it if users could specify a limit. I’d love it even more if browsers shipped with a default limit, especially if that limit is related to the device and network.
I don’t think speculation rules will be abused nearly as much as client-side JavaScript is already abused.
There’s a new addition to the latest version of Chrome called speculation rules. This already existed before with a different syntax, but the new version makes more sense to me.
Notice that I called this an addition, not a standard. This is not a web standard, though it may become one in the future. Or it may not. It may wither on the vine and disappear (like most things that come from Google).
The gist of it is that you give the browser one or more URLs that the user is likely to navigate to. The browser can then pre-fetch or even pre-render those links, making that navigation really snappy. It’s a replacement for the abandoned link rel="prerender".
Because this is a unilateral feature, I’m not keen on shipping the code to all browsers. The old version of the API required a script element with a type value of “speculationrules”. That doesn’t do any harm to browsers that don’t support it—it’s a progressive enhancement. But unlike other progressive enhancements, this isn’t something that will just start working in those other browsers one day. I mean, it might. But until this API is an actual web standard, there’s no guarantee.
That’s why I was pleased to see that the new version of the API allows you to use an external JSON file with your list of rules.
I say “rules”, but they’re really more like guidelines. The browser will make its own evaluation based on bandwidth, battery life, and other factors. This feature is more like srcset than source: you give the browser some options, but ultimately you can’t force it to do anything.
I’ve implemented this over on The Session. There’s a JSON file called speculationrules.json with the simplest of suggestions:
{
  "prerender": [{
    "where": {
      "href_matches": "/*"
    },
    "eagerness": "moderate"
  }]
}
The eagerness value of “moderate” says that any link can be pre-rendered if the user hovers over it for 200 milliseconds (the nuclear option would be to use a value of “immediate”).
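For comparison, a rule set can also list specific URLs instead of using a pattern. Here’s a hedged sketch (the URLs are made up, not real pages on The Session) with the more aggressive eagerness value:

{
  "prerender": [{
    "urls": ["/popular-page", "/another-likely-page"],
    "eagerness": "immediate"
  }]
}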
I still need to point to that JSON file from my HTML. Usually this would be done with something like a link element, but for this particular API, I can send a response header instead:
Speculation-Rules: "/speculationrules.json"
I like that. The response header is being sent to every browser, regardless of whether they support speculation rules or not, but at least it’s just a few bytes. Those other browsers will ignore the header—they won’t download the JSON file.
Here’s the PHP I added to send that header:
header('Speculation-Rules: "/speculationrules.json"');
There’s one extra thing I had to do. The JSON file needs to be served with a MIME type of “application/speculationrules+json”. Here’s how I set that up in the .conf file for The Session on Apache:
<IfModule mod_headers.c>
  <FilesMatch "speculationrules.json">
    Header set Content-type application/speculationrules+json
  </FilesMatch>
</IfModule>
A bit of a faff, that.
You can see it in action on The Session. Open up Chrome or Edge (same same but different), fire up the dev tools and keep the network tab open while you navigate around the site. Notice how hovering over a link will trigger a new network request. Clicking on that link will get you that page lickety-split.
Mind you, in the case of The Session, the navigations were already really fast—performance is a feature—so it’s hard to gauge how much of a practical difference it makes in this case, but it still seems like a no-brainer to me: taking a few minutes to add this to your site is worth doing.
Oh, there’s one more thing to be aware of when you’re implementing speculation rules. You have the option of excluding URLs from being pre-fetched or pre-rendered. You might need to do this if you’ve got links for adding items to shopping carts, or logging the user out. But my advice would instead be: stop using GET requests for those actions!
Most of the examples given for unsafe speculative loading conditions are textbook cases of when not to use links. Links are for navigating. They’re idempotent. For everything else, we’ve got forms.
You can’t create a complex modern web application like Google Mail without JavaScript and a SPA architecture. Google Mail is a webmail client and webmail clients existed some time before JavaScript became the language it is today or frameworks like Angular JS or Angular BS existed. However, you cannot create a complex modern web application like Google Mail without JavaScript. Google Mail itself offers a basic HTML version that works perfectly well without JavaScript of any form—let alone a 300KB bundle. But, still, you cannot create a complex modern web application like Google Mail without JavaScript. Just keep saying that. Keep repeating that line in perpetuity. Keep adding more and more JavaScript and calling it good.
This really is a disgusting exclusionary state of affairs.
I hate to be judgy, but I honestly wonder how the people behind some of these decisions can call themselves web developers.
I wrote a little while back about improving performance on The Session by reducing runtime JavaScript in favour of caching on the server. This is on the pages for tunes, where the SVGs for the sheetmusic are now inlined instead of being generated on the fly.
It worked. But I also wrote:
A page like that with lots of sheetmusic and plenty of comments is going to have a hefty page weight and a large DOM size. I’ve still got a fair bit of main-thread work happening, but now the bulk of it is style and layout, whereas previously I had the JavaScript overhead on top of that.
Take a tune like Out On The Ocean. It has 27 settings. That’s a lot of SVG markup that needs to be parsed, styled and rendered, even if it’s inline.
Then I remembered a very handy CSS property called content-visibility:
It enables the user agent to skip an element’s rendering work (including layout and painting) until it is needed — which makes the initial page load much faster.
Sounds great! But there are two gotchas.
The first gotcha is that if a browser doesn’t paint the element, it doesn’t know how much space the element should take up. So you need to provide dimensions. At the very least you need to provide a height value. Otherwise when the element comes into view and gets rendered, it pushes down on the content below it. You’d see a sudden jump in the scrollbar position.
The solution is to provide a value for contain-intrinsic-size. If your content is dynamic—from, say, a CMS—then you’re out of luck. You don’t know how long the content is.
Luckily, in my case, I could take a stab at it. I know how many lines of sheetmusic there are for each tune setting. Each line takes up roughly the same amount of space. If I multiply that amount of space by the number of lines then I’ve got a pretty good approximation of the height of the sheetmusic. I apply this with the contain-intrinsic-block-size property.
So each piece of sheetmusic has an inline style attribute with declarations like this:
content-visibility: auto;
contain-intrinsic-block-size: 380px;
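Here’s a rough sketch in PHP of how that calculation might work. The helper, the class name, and the per-line height of 95 pixels are my assumptions for illustration, not the actual code from The Session:

<?php
// Sketch: estimate the height of a piece of sheetmusic from its line count
// so the browser knows roughly how much space to reserve.
function sheetmusicStyle(int $lineCount, int $pixelsPerLine = 95): string {
    $estimatedHeight = $lineCount * $pixelsPerLine;
    return "content-visibility: auto; contain-intrinsic-block-size: {$estimatedHeight}px;";
}

// Four lines of sheetmusic ≈ 380 pixels
echo '<div class="notation" style="' . sheetmusicStyle(4) . '">';
?>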
It works a treat. I did a before-and-after check with PageSpeed Insights on the page for Out On The Ocean. The “style and layout” part of the main thread work went down considerably. Total blocking time went from more than 600 milliseconds to less than 400 milliseconds.
Not a bad result for a little bit of CSS!
I said there was a second gotcha. That’s browser support.
Right now content-visibility is only supported in Chrome and Edge. But that’s okay. This is a progressive enhancement. Adding this CSS has no detrimental effect on the browsers that don’t understand it (and when they do ship support for it, it’ll just start working). I’ve said it before and I’ll say it again: the forgiving error-parsing in HTML and CSS is a killer feature of the web. Browsers just ignore what they don’t understand. That’s what makes progressive enhancement like this possible.
And actually, there’s something you can do for all browsers. Even browsers that don’t support content-visibility still understand containment. So they’ll understand contain-intrinsic-size. Pair that with a contain declaration like this to tell the browser that this chunk of content isn’t going to reflow or get repainted:
contain: layout paint;
Here’s what MDN says about contain:
The contain CSS property indicates that an element and its contents are, as much as possible, independent from the rest of the document tree. Containment enables isolating a subsection of the DOM, providing performance benefits by limiting calculations of layout, style, paint, size, or any combination to a DOM subtree rather than the entire page.
So if you’ve got a chunk of static content, you might as well apply contain to it.
Again, not bad for a little bit of CSS!
There are absolutely use-cases for SPAs (media sites, primarily). Most of the other things we use them for make the user experience notably worse or band-aid over the real underlying issues without addressing them.
I received this email recently:
Subject: multi-page web apps
Hi Jeremy,
lately I’ve been following you through videos and texts and I’m curious as to why you advocate the use of multi-page web apps and not single-page ones.
Perhaps you can refer me to some sources where your position and reasoning is evident?
Here’s the response I sent…
Hi,
You can find a lot of my reasoning laid out in this (short and free) online book I wrote called Resilient Web Design:
https://resilientwebdesign.com/
The short answer to your question is this: user experience.
The slightly longer answer…
For most use cases, a website (or multi-page app if you prefer) is going to provide the most robust experience for the greatest number of users. That’s because a user’s web browser takes care of most of the heavy lifting.
Navigating from one page to another? That’s taken care of with links.
Gathering information from a user to process on a server? That’s taken care of with forms.
This frees me up to concentrate on the content and the design without having to reinvent the wheels of links and form fields.
These (let’s call them) multi-page apps are stateless, and for most use cases that’s absolutely fine.
There are some cases where you’d want a state to persist across pages. Let’s say you’re playing a song, or a podcast episode. Ideally you’d want that player to continue seamlessly playing even as the user navigates around the site. In that situation, a single-page app would be a suitable architecture.
But that architecture comes at a cost. Now you’ve got to stop the browser doing what it would normally do with links and forms. It’s up to you to recreate that functionality. And you can’t do it with HTML, a robust fault-tolerant declarative language. You need to reimplement all that functionality in JavaScript, a less tolerant, more brittle language.
Then you’ve got to ship all that code to the user before they can use your site. It might be JavaScript code you’ve written yourself or it might be a third-party library designed for building single-page apps. Either way, the user pays a download tax (and a parsing tax, and an execution tax). Whereas with links and forms, all of that functionality is pre-bundled into the user’s web browser.
So that’s my reasoning. At least nine times out of ten, a multi-page approach is leaner, more robust, and simpler.
Like I said, there are times when a single-page approach makes sense—it all comes down to whether state needs to be constantly preserved. But these use cases are the exceptions, not the rule.
That’s why I find the framing of your question a little concerning. It should be inverted. The default approach should be to assume a multi-page approach (which is the way the web works by default). Deciding to take a JavaScript-driven single-page approach should be the exception.
It’s kind of like when people ask, “Why don’t you have children?” Surely the decision to have a child should require deliberation and commitment, rather than the other way around.
When it comes to front-end development, I’m worried that we’ve reached a state where the more complex over-engineered approach is viewed as the default.
I may be committing a fundamental attribution error here, but I think that we’ve reached this point not because of any consideration for users, but rather because of how it makes us developers feel. Perhaps building an old-fashioned website that uses HTML for navigations feels too easy, like it’s beneath us. But building an “app” that requires JavaScript just to render text on a screen feels like real programming.
I hope I’m wrong. I hope that other developers will start to consider user experience first and foremost when making architectural decisions.
Anyway. That’s my answer. User experience.
Cheers,
Jeremy
A great talk from Addy on just how damaging client-side JavaScript can be to the user experience …and what you can do about it.
JavaScript is great. I love using it, and it does amazing things. But maybe it’s time we stop repeating these same patterns of development over and over again. Maybe we can use JavaScript more responsibly, and focus more effort on HTML and CSS.
I woke up today to a very annoying new bug in Firefox. The browser shits the bed in an unpredictable fashion when rounding up single pixel line widths in SVG. That’s quite a problem on The Session where all the sheet music is rendered in SVG. Those thin lines in sheet music are kind of important.
Browser bugs like these are very frustrating. There’s nothing you can do from your side other than filing a bug. The locus of control is very much with the developers of the browser.
Still, the occasional regression in a browser is a price I’m willing to pay for a plurality of rendering engines. Call me old-fashioned but I still value the ecological impact of browser diversity.
That said, I understand the argument for converging on a single rendering engine. I don’t agree with it but I understand it. It’s like this…
Back in the bad old days of the original browser wars, the browser companies just made shit up. That made life a misery for web developers. The Web Standards Project knocked some heads together. Netscape and Microsoft would agree to support standards.
So that’s where the bar was set: browsers agreed to work to the same standards, but competed by having different rendering engines.
There’s an argument to be made for raising that bar: browsers agree to work to the same standards, and have the same shared rendering engine, but compete by innovating in all other areas—the browser chrome, personalisation, privacy, and so on.
Like I said, I understand the argument. I just don’t agree with it.
One reason for zeroing in on a single rendering engine is that it’s just too damned hard to create or maintain an entirely different rendering engine now that web standards are incredibly powerful and complex. Only a very large company with very deep pockets can hope to be a rendering engine player. Google. Apple. Heck, even Microsoft threw in the towel and abandoned their rendering engine in favour of Blink and V8.
And yet. Andreas Kling recently wrote about the Ladybird browser. How we’re building a browser when it’s supposed to be impossible:
The ECMAScript, HTML, and CSS specifications today are (for the most part) stellar technical documents whose algorithms can be implemented with considerably less effort and guesswork than in the past.
I’ll be watching that project with interest. Not because I plan to use the browser. I’d just like to see some evidence against the complexity argument.
Meanwhile most other browser projects are building on the raised bar of a shared browser engine. Blisk, Brave, and Arc all use Chromium under the hood.
Arc is the most interesting one. Built by the wonderfully named Browser Company of New York, it’s attempting to inject some fresh thinking into everything outside of the rendering engine.
Experiments like Arc feel like they could have more in common with tools-for-thought software like Obsidian and Roam Research. Those tools build knowledge graphs of connected nodes. A kind of hypertext of ideas. But we’ve already got hypertext tools we use every day: web browsers. It’s just that they don’t do much with the accumulated knowledge of our web browsing. Our browsing history is a boring reverse chronological list instead of a cool-looking knowledge graph or timeline.
For inspiration we can go all the way back to Vannevar Bush’s genuinely seminal 1945 article, As We May Think. The device Bush imagined, the Memex, was a direct inspiration on Douglas Engelbart, Ted Nelson, and Tim Berners-Lee.
The article describes a kind of hypertext machine that worked with microfilm. Thanks to Tim Berners-Lee’s World Wide Web, we now have a global digital hypertext system that we access every day through our browsers.
But the article also described the idea of “associative trails”:
Wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails running through them, ready to be dropped into the memex and there amplified.
Our browsing histories are a kind of associative trail. They’re as unique as fingerprints. Even if everyone in the world started on the same URL, our browsing histories would quickly diverge.
Bush imagined that these associative trails could be shared:
The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities.
Heck, making a useful browsing history could be a real skill:
There is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record.
Taking something personal and making it public isn’t a new idea. It was what drove the wave of web 2.0 startups. Before Flickr, your photos were private. Before Delicious, your bookmarks were private. Before Last.fm, what music you were listening to was private.
I’m not saying that we should all make our browsing histories public. That would be a security nightmare. But I am saying there’s a lot of untapped potential in our browsing histories.
Let’s say we keep our browsing histories private, but make better use of them.
From what I’ve seen of large language model tools, the people getting the most use out of them are training them on a specific corpus. Like, “take this book and then answer my questions about the characters and plot” or “take this codebase and then answer my questions about the code.” If you treat these chatbots as calculators for words they can be useful for some tasks.
Large language model tools are getting smaller and more portable. It’s not hard to imagine one getting bundled into a web browser. It feeds on your browsing history. The bigger your browsing history, the more useful it can be.
Except, y’know, for the times when it just makes shit up.
Vannevar Bush didn’t predict a Memex that would hallucinate bits of microfilm that didn’t exist.
It sounds like Remix takes a sensible approach to progressive enhancement.
There are two JavaScripts.
One for the server - where you can go wild.
One for the client - that should be thoughtful and careful.
Yes! This! I’m always astounded to see devs apply the same mindset to backend and frontend development, just because it happens to be in the same language. I don’t care what you use on your own machine or your own web server, but once you’re sending something down the wire to end users, you need to prioritise their needs over your own.
It’s the JavaScript on the client side that’s the problem. What’s given to the visitor.
I’d ask you, if you’re still reading, that you consider a separation of JavaScript between client and server. If you’re a dev, consider the payload, your bundle and work to reduce the cost to your visitor. Heck, think progressive enhancement.
Spoiler: the answer to the question in the title is a resounding “hell yeah!”
Scott brings receipts.
Brian recounts the sordid messy history of user-agent strings.
I remember somebody once describing a user-agent string as “a reverse-chronological history of web browsers.”
Safari is very opinionated about which features they will support and which they won’t. And that is fine for their browser. But I don’t want the Safari team to choose for all browsers on the iOS platform.
A terrific piece from Niels pushing back on the ridiculous assertion that Apple’s ban on rival rendering engines in iOS is somehow a noble battle against a monopoly (rather than the abuse of monopoly power it actually is). If there were any truth to the idea that Apple’s browser ban is the only thing stopping everyone from switching to Chrome, then nobody would be using Safari on macOS, where users are free to choose whichever rendering engine they want.
The Safari team is capable enough not to let their browser become irrelevant. And Apple has enough money to support the Safari team to take on other browsers. It does not need some artificial App Store rule to protect it from the competition.
WebKit-only proponents are worried about losing control and Google becoming too powerful. And they feel preventing Google from controlling the web is more important than giving more power to users. They believe they are protecting users against themselves. But that is misguided.
Users need to be in control because if you take power away from users, you are creating the future you want to prevent, where one company sets the rules for everybody else. It is just somebody else who is pulling the strings.
Firefox as the asphyxiating canary in the coalmine of the web.