Tim Berners-Lee Invented the World Wide Web. Now He Wants to Save It | The New Yorker
A profile of Tim and the World Wide Web.
Looked at as a cultural phenomenon, LLM usage and promotion have all the markings of a status game. The material gains from the LLM (which are usually quite marginal) really aren’t why people are doing it: they’re doing it because in many spaces, using ChatGPT and being very optimistic about AI being the “future” raises their social status. It’s important not only to be using it, but to be seen using it and be seen supporting it and telling people who don’t use it that they’re stupid luddites who’ll inevitably be left behind by technology.
As it currently stands, both the rapid growth of AI-generated content overwhelming online spaces and aggressive web-crawling practices by AI firms threaten the sustainability of essential online resources. The current approach taken by some large AI companies—extracting vast amounts of data from open-source projects without clear consent or compensation—risks severely damaging the very digital ecosystem on which these AI models depend.
AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers: anyone who is working to make quality information universally available on the internet.
Make yourself a nice cup of tea and settle in with Julian Gough’s magnum opus:
How early, sustained, supermassive black hole jets carved out cosmic voids, shaped filaments, and generated magnetic fields
Google Fonts only lets you download .ttf files, meaning that if you want to self-host your fonts (and you should), you first have to convert them to .woff2 files.
Luckily this tool has been online for over a decade, doing what Google Fonts should be doing by default.
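If you’d rather do the conversion yourself, here’s a minimal sketch using the fontTools library (the filename is just a stand-in for whatever you downloaded):

```python
# Convert a .ttf downloaded from Google Fonts into a self-hostable .woff2.
# Requires: pip install fonttools brotli
from fontTools.ttLib import TTFont

font = TTFont("roboto.ttf")  # stand-in filename for your downloaded font
font.flavor = "woff2"        # re-save with WOFF2 (Brotli) compression
font.save("roboto.woff2")
```

The resulting file can then be served from your own domain with an @font-face rule.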
AI has the same problem that I saw ten years ago at IBM. And remember that IBM has been at this AI game for a very long time. Much longer than OpenAI or any of the new kids on the block. All of the shit we’re seeing today? Anyone who worked on or near Watson saw or experienced the same problems long ago.
LLMs are good at transforming text into less text
Laurie is really onto something with this:
This is the biggest and most fundamental thing about LLMs, and a great rule of thumb for what’s going to be an effective LLM application. Is what you’re doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it’s probably going to be great at it. If you’re asking it to convert into a roughly equal amount of text it will be so-so. If you’re asking it to create more text than you gave it, forget about it.
Depending on how much of the hype around AI you’ve taken on board, the idea that they “take text and turn it into less text” might seem like a gigantic back-pedal away from previous claims of what AI can do. But taking text and turning it into less text is still an enormous field of endeavour, and a huge market. It’s still very exciting, all the more exciting because it’s got clear boundaries and isn’t hype-driven over-reaching, or dependent on LLMs overnight becoming way better than they currently are.
Interesting—this is exactly the same framing I used to talk about design systems a few years ago.
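To make Laurie’s rule of thumb concrete, here’s a minimal sketch of the “less text” sweet spot; the model name and prompt are illustrative assumptions, not anything from the original post:

```python
# A minimal sketch of the "text into less text" sweet spot:
# hand the model a long document, ask for a short summary.
# Requires: pip install openai (and an OPENAI_API_KEY in your environment)
from openai import OpenAI

client = OpenAI()

def summarise(long_text: str) -> str:
    # "gpt-4o-mini" is an illustrative model name; use whatever you have access to.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarise the following in three sentences."},
            {"role": "user", "content": long_text},
        ],
    )
    return response.choices[0].message.content
```

The inverse task, asking for more text than you put in, is exactly where the rule of thumb says things fall apart.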
Oh, this is a very handy service from Paul—given the URL of an RSS feed that only has summaries, it will attempt to get the full post content from the HTML.
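I don’t know how Paul built it, but the core idea can be sketched in a few lines: read the summary-only feed, fetch each entry’s page, and pull out the article body. The `<article>` heuristic here is an assumption; real pages need more robust extraction.

```python
# A rough sketch of the idea (not Paul's actual implementation): read a
# summary-only feed, fetch each entry's page, and pull out the article body.
# Requires: pip install feedparser requests beautifulsoup4
import feedparser
import requests
from bs4 import BeautifulSoup

def full_content(feed_url: str) -> list[dict]:
    posts = []
    for entry in feedparser.parse(feed_url).entries:
        html = requests.get(entry.link, timeout=10).text
        article = BeautifulSoup(html, "html.parser").find("article")
        posts.append({
            "title": entry.title,
            "link": entry.link,
            # Fall back to the feed's own summary if no <article> element is found.
            "content": article.get_text(strip=True) if article else entry.summary,
        })
    return posts
```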
This magnificent piece by Maxwell Neely-Cohen—with some tasteful art-direction—is right up my alley!
This piece looks at a single question. If you, right now, had the goal of digitally storing something for 100 years, how should you even begin to think about making that happen? How should the bits in your stewardship be stored with such a target in mind? How do our methods and platforms look when considered under the harsh unknowns of a century? There are plenty of worthy related subjects and discourses that this piece does not touch at all. This is not a piece about the sheer volume of data we are creating each day, and how we might store all of it. Nor is it a piece about the extremely tough curatorial process of deciding what is and isn’t worth preserving and storing. It is about longevity, about the potential methods of preserving what we make for future generations, about how we make bits endure. If you had to store something for 100 years, how would you do it? That’s it.
If someone uses an LLM as a replacement for search, and the output they get is correct, this is just by chance. Furthermore, a system that is right 95% of the time is arguably more dangerous than one that is right 50% of the time. People will be more likely to trust the output, and likely less able to fact check the 5%.
Hypertext links are an information-density multiplier.
The way I’ve long thought about it is that traditional writing — like for print — feels two-dimensional. Writing for the web adds a third dimension. It’s not an equal dimension, though. It doesn’t turn writing from a flat plane into a full three-dimensional cube. It’s still primarily about the same two dimensions as old-fashioned writing. What hypertext links provide is an extra layer of depth. Just the fact that the links are there — even if you, the reader, don’t follow them — makes a sentence read slightly differently. It adds meaning in a way that is unique to the web as a medium for prose.
- People only understand things relative to things they already understand
- People only understand things in context
- People rely on patterns and consistency
- People seek to minimize cognitive load
- People have varying levels of expertise and familiarity
- People are goal-oriented
- People often don’t know what they’re looking for
- Information is more useful when it’s actionable
For many archivists, alarm bells are ringing. Across the world, they are scraping up defunct websites or at-risk data collections to save as much of our digital lives as possible. Others are working on ways to store that data in formats that will last hundreds, perhaps even thousands, of years.
David is on board. Who else?
What Trys describes here mirrors my experience too—it really is worth occasionally taking a little time to catch the low-hanging fruit of your site’s web performance (and accessibility):
I’ve shaved nearly half a megabyte off the page size and improved the accessibility along the way. Not bad for an evening of tinkering.
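In the same spirit, here’s a rough sketch of the kind of low-hanging-fruit check you can automate, flagging images that are missing lazy-loading or explicit dimensions (the URL is a placeholder):

```python
# A rough sketch of an automated low-hanging-fruit check: flag images on a
# page that are missing lazy-loading or explicit dimensions.
# Requires: pip install requests beautifulsoup4
import requests
from bs4 import BeautifulSoup

def audit_images(page_url: str) -> None:
    html = requests.get(page_url, timeout=10).text
    for img in BeautifulSoup(html, "html.parser").find_all("img"):
        src = img.get("src", "(no src)")
        if img.get("loading") != "lazy":
            print(f'{src}: consider loading="lazy"')
        if not (img.get("width") and img.get("height")):
            print(f"{src}: missing width/height (risk of layout shift)")

audit_images("https://example.com/")  # placeholder URL
```

One caveat: don’t lazy-load images above the fold, where eager loading is what you want.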
A fascinating look at the connections between hypertext and film editing. I’m a sucker for any article that cites both Ted Nelson and Walter Murch.
What podcasting holds in the promise of its open format is the proof that an open web can still thrive and be relevant, that it can inspire new systems that are similarly open to take root and grow. Even the biggest companies in the world can’t displace these kinds of systems once they find their audiences.