[go: up one dir, main page]

Simon Willison’s Weblog

Subscribe

January 2026

96 posts: 9 entries, 32 links, 14 quotes, 4 notes, 37 beats

Jan. 1, 2026

Tool Hyperviz — Create real-time audio-reactive visualizations with eight distinct modes including plasma, particles, tunnel effects, kaleidoscope patterns, Matrix rain, terrain, fire, and starfield animations. The visualizer captures frequency data from your microphone and responds dynamically to bass, mid, and high frequencies, with beat detection triggering visual flashes and intensity changes.

Introducing gisthost.github.io

I am a huge fan of gistpreview.github.io, the site by Leon Huang that lets you append ?GIST_id to see a browser-rendered version of an HTML page that you have saved to a Gist. The last commit was ten years ago and I needed a couple of small changes so I’ve forked it and deployed an updated version at gisthost.github.io.

[... 952 words]

Jan. 2, 2026

[Claude Code] has the potential to transform all of tech. I also think we’re going to see a real split in the tech industry (and everywhere code is written) between people who are outcome-driven and are excited to get to the part where they can test their work with users faster, and people who are process-driven and get their meaning from the engineering itself and are upset about having that taken away.

Ben Werdmuller

# 12:48 am / coding-agents, ai-assisted-programming, claude-code, generative-ai, ai, llms

I sent the December edition of my sponsors-only monthly newsletter. If you are a sponsor (or if you start a sponsorship now) you can access a copy here. In the newsletter this month:

  • An in-depth review of LLMs in 2025
  • My coding agent projects in December
  • New models for December 2025
  • Skills are an open standard now
  • Claude's "Soul Document"
  • Tools I'm using at the moment

Here's a copy of the November newsletter as a preview of what you'll get. Pay $10/month to stay a month ahead of the free copy!

# 4:33 am / newsletter

Tool Red Annotation Extractor — # Red Annotation Extractor Documentation

The most popular blogs of Hacker News in 2025 (via) Michael Lynch maintains HN Popularity Contest, a site that tracks personal blogs on Hacker News and scores them based on how well they perform on that platform.

The engine behind the project is the domain-meta.csv CSV on GiHub, a hand-curated list of known personal blogs with author and bio and tag metadata, which Michael uses to separate out personal blog posts from other types of content.

I came top of the rankings in 2023, 2024 and 2025 but I'm listed in third place for all time behind Paul Graham and Brian Krebs.

I dug around in the browser inspector and was delighted to find that the data powering the site is served with open CORS headers, which means you can easily explore it with external services like Datasette Lite.

Here's a convoluted window function query Claude Opus 4.5 wrote for me which, for a given domain, shows where that domain ranked for each year since it first appeared in the dataset:

with yearly_scores as (
  select 
    domain,
    strftime('%Y', date) as year,
    sum(score) as total_score,
    count(distinct date) as days_mentioned
  from "hn-data"
  group by domain, strftime('%Y', date)
),
ranked as (
  select 
    domain,
    year,
    total_score,
    days_mentioned,
    rank() over (partition by year order by total_score desc) as rank
  from yearly_scores
)
select 
  r.year,
  r.total_score,
  r.rank,
  r.days_mentioned
from ranked r
where r.domain = :domain
  and r.year >= (
    select min(strftime('%Y', date)) 
    from "hn-data"
    where domain = :domain
  )
order by r.year desc

(I just noticed that the last and r.year >= ( clause isn't actually needed here.)

My simonwillison.net results show me ranked 3rd in 2022, 30th in 2021 and 85th back in 2007 - though I expect there are many personal blogs from that year which haven't yet been manually added to Michael's list.

Also useful is that every domain gets its own CORS-enabled CSV file with details of the actual Hacker News submitted from that domain, e.g. https://hn-popularity.cdn.refactoringenglish.com/domains/simonwillison.net.csv. Here's that one in Datasette Lite.

# 7:10 pm / hacker-news, sql, sqlite, datasette, datasette-lite, cors

My experience is that real AI adoption on real problems is a complex blend of: domain context on the problem, domain experience with AI tooling, and old-fashioned IT issues. I’m deeply skeptical of any initiative for internal AI adoption that doesn’t anchor on all three of those. This is an advantage of earlier stage companies, because you can often find aspects of all three of those in a single person, or at least across two people. In larger companies, you need three different organizations doing this work together, this is just objectively hard

Will Larson, Facilitating AI adoption at Imprint

# 7:57 pm / leadership, llms, ai, will-larson

Jan. 3, 2026

Research SQLite Time Limit Extension — Designed as a Python C extension, the SQLite Time Limit Extension introduces a function, execute_with_timeout, enabling SQL queries against a SQLite database to be terminated if they exceed a specified millisecond threshold. This is achieved using SQLite's progress handler, ensuring that long-running queries do not block application responsiveness. Usage is simple via standard import, and rigorous tests are provided with pytest to validate both normal operation and timeouts.

Was Daft Punk Having a Laugh When They Chose the Tempo of Harder, Better, Faster, Stronger? (via) Depending on how you measure it, the tempo of Harder, Better, Faster, Stronger appears to be 123.45 beats per minute.

This is one of those things that's so cool I'm just going to accept it as true.

(I only today learned from the Hacker News comments that Veridis Quo is "Very Disco", and if you flip the order of those words you get Discovery, the name of the album.)

# 5:57 am / music

Jan. 4, 2026

I'm not joking and this isn't funny. We have been trying to build distributed agent orchestrators at Google since last year. There are various options, not everyone is aligned... I gave Claude Code a description of the problem, it generated what we built last year in an hour.

It's not perfect and I'm iterating on it but this is where we are right now. If you are skeptical of coding agents, try it on a domain you are already an expert of. Build something complex from scratch where you can be the judge of the artifacts.

[...] It wasn't a very detailed prompt and it contained no real details given I cannot share anything propriety. I was building a toy version on top of some of the existing ideas to evaluate Claude Code. It was a three paragraph description.

Jaana Dogan, Principal Engineer at Google

# 3:03 am / anthropic, claude, ai, claude-code, llms, ai-assisted-programming, google, generative-ai

Something I like about our weird new LLM-assisted world is the number of people I know who are coding again, having mostly stopped as they moved into management roles or lost their personal side project time to becoming parents.

AI assistance means you can get something useful done in half an hour, or even while you are doing other stuff. You don't need to carve out 2-4 hours to ramp up anymore.

If you have significant previous coding experience - even if it's a few years stale - you can drive these things really effectively. Especially if you have management experience, quite a lot of which transfers to "managing" coding agents - communicate clearly, set achievable goals, provide all relevant context. Here's a relevant recent tweet from Ethan Mollick:

When you see how people use Claude Code/Codex/etc it becomes clear that managing agents is really a management problem

Can you specify goals? Can you provide context? Can you divide up tasks? Can you give feedback?

These are teachable skills. Also UIs need to support management

This note started as a comment.

# 3:43 pm / careers, ai-agents, ai, llms, ethan-mollick, ai-assisted-programming, coding-agents, generative-ai

With enough users, every observable behavior becomes a dependency - regardless of what you promised. Someone is scraping your API, automating your quirks, caching your bugs.

This creates a career-level insight: you can’t treat compatibility work as “maintenance” and new features as “real work.” Compatibility is product.

Design your deprecations as migrations with time, tooling, and empathy. Most “API design” is actually “API retirement.”

Addy Osmani, 21 lessons from 14 years at Google

# 4:40 pm / api-design, addy-osmani, careers, google

It genuinely feels to me like GPT-5.2 and Opus 4.5 in November represent an inflection point - one of those moments where the models get incrementally better in a way that tips across an invisible capability line where suddenly a whole bunch of much harder coding problems open up.

# 11:21 pm / anthropic, claude, openai, ai, llms, gpt-5, ai-assisted-programming, generative-ai, claude-4

Jan. 5, 2026

Oxide and Friends Predictions 2026, today at 4pm PT (via) I joined the Oxide and Friends podcast last year to predict the next 1, 3 and 6 years(!) of AI developments. With hindsight I did very badly, but they're inviting me back again anyway to have another go.

We will be recording live today at 4pm Pacific on their Discord - you can join that here, and the podcast version will go out shortly afterwards.

I'll be recording at their office in Emeryville and then heading to the Crucible to learn how to make neon signs.

# 4:53 pm / podcasts, ai, llms, oxide

It’s hard to justify Tahoe icons (via) Devastating critique of the new menu icons in macOS Tahoe by Nikita Prokopov, who starts by quoting the 1992 Apple HIG rule to not "overload the user with complex icons" and then provides comprehensive evidence of Tahoe doing exactly that.

In my opinion, Apple took on an impossible task: to add an icon to every menu item. There are just not enough good metaphors to do something like that.

But even if there were, the premise itself is questionable: if everything has an icon, it doesn’t mean users will find what they are looking for faster.

And even if the premise was solid, I still wish I could say: they did the best they could, given the goal. But that’s not true either: they did a poor job consistently applying the metaphors and designing the icons themselves.

# 7:30 pm / apple, design, macos, usability

Research pymemchr — pymemchr is a Python library that provides ultra-fast byte and substring search functions by binding to the memchr Rust crate, leveraging SIMD optimizations for superior performance. Using PyO3 and Maturin for cross-language integration, pymemchr offers efficient routines for finding single bytes, searching for multiple bytes, and locating substring patterns, both forwards and backwards, with highly competitive speedup over native Python methods.
Research sqlite3-wasm Investigation Report — Seeking to enable Python's SQLite interface with WebAssembly, the project developed a `sqlite3_wasm` library—a drop-in replacement for Python's standard `sqlite3` module. By compiling SQLite 3.45.3 to WASM with wasi-sdk and wrapping the resulting binary with a Python API, the solution delivers fully functional, in-memory, WASM-powered database operations using the wasmtime runtime.
Research pymemchr-c: C Implementation of memchr Library — Offering a pure C reimplementation of the Rust-based pymemchr, pymemchr-c delivers high-performance byte and substring search functions to Python with extensive SIMD (SSE2/AVX2/NEON) optimizations and runtime CPU feature detection. Its unique "Packed Pair" substring search algorithm enables the C version to outperform both Python's built-in methods (up to 28x faster) and the original Rust extension (up to 1.5x faster for substring operations), all while removing the need for a Rust toolchain.

Jan. 6, 2026

Release datasette-public 0.4a0 — Make selected Datasette databases and tables visible to the public
Tool v86 Linux Emulator — Run Linux commands directly in your browser using x86 emulation powered by v86. The emulator boots a minimal Linux distribution and provides a terminal interface where you can execute shell commands and view their output in real-time. Boot times typically range from 10-30 seconds depending on your device and connection speed.

A field guide to sandboxes for AI (via) This guide to the current sandboxing landscape by Luis Cardoso is comprehensive, dense and absolutely fantastic.

He starts by differentiating between containers (which share the host kernel), microVMs (their own guest kernel behind hardwae virtualization), gVisor userspace kernels and WebAssembly/isolates that constrain everything within a runtime.

The piece then dives deep into terminology, approaches and the landscape of existing tools.

I think using the right sandboxes to safely run untrusted code is one of the most important problems to solve in 2026. This guide is an invaluable starting point.

# 10:38 pm / sandboxing, ai, generative-ai, llms

Jan. 7, 2026

AGI is here! When exactly it arrived, we’ll never know; whether it was one company’s Pro or another company’s Pro Max (Eddie Bauer Edition) that tip-toed first across the line … you may debate. But generality has been achieved, & now we can proceed to new questions. [...]

The key word in Artificial General Intelligence is General. That’s the word that makes this AI unlike every other AI: because every other AI was trained for a particular purpose. Consider landmark models across the decades: the Mark I Perceptron, LeNet, AlexNet, AlphaGo, AlphaFold … these systems were all different, but all alike in this way.

Language models were trained for a purpose, too … but, surprise: the mechanism & scale of that training did something new: opened a wormhole, through which a vast field of action & response could be reached. Towering libraries of human writing, drawn together across time & space, all the dumb reasons for it … that’s rich fuel, if you can hold it all in your head.

Robin Sloan, AGI is here (and I feel fine)

# 12:54 am / robin-sloan, llms, ai, generative-ai

[...] the reality is that 75% of the people on our engineering team lost their jobs here yesterday because of the brutal impact AI has had on our business. And every second I spend trying to do fun free things for the community like this is a second I'm not spending trying to turn the business around and make sure the people who are still here are getting their paychecks every month. [...]

Traffic to our docs is down about 40% from early 2023 despite Tailwind being more popular than ever. The docs are the only way people find out about our commercial products, and without customers we can't afford to maintain the framework. [...]

Tailwind is growing faster than it ever has and is bigger than it ever has been, and our revenue is down close to 80%. Right now there's just no correlation between making Tailwind easier to use and making development of the framework more sustainable.

Adam Wathan, CEO, Tailwind Labs

# 5:29 pm / ai-ethics, css, generative-ai, ai, llms, open-source

Jan. 8, 2026

How Google Got Its Groove Back and Edged Ahead of OpenAI (via) I picked up a few interesting tidbits from this Wall Street Journal piece on Google's recent hard won success with Gemini.

Here's the origin of the name "Nano Banana":

Naina Raisinghani, known inside Google for working late into the night, needed a name for the new tool to complete the upload. It was 2:30 a.m., though, and nobody was around. So she just made one up, a mashup of two nicknames friends had given her: Nano Banana.

The WSJ credit OpenAI's Daniel Selsam with un-retiring Sergei Brin:

Around that time, Google co-founder Sergey Brin, who had recently retired, was at a party chatting with a researcher from OpenAI named Daniel Selsam, according to people familiar with the conversation. Why, Selsam asked him, wasn’t he working full time on AI. Hadn’t the launch of ChatGPT captured his imagination as a computer scientist?

ChatGPT was on its way to becoming a household name in AI chatbots, while Google was still fumbling to get its product off the ground. Brin decided Selsam had a point and returned to work.

And we get some rare concrete user numbers:

By October, Gemini had more than 650 million monthly users, up from 450 million in July.

The LLM usage number I see cited most often is OpenAI's 800 million weekly active users for ChatGPT. That's from October 6th at OpenAI DevDay so it's comparable to these Gemini numbers, albeit not directly since it's weekly rather than monthly actives.

I'm also never sure what counts as a "Gemini user" - does interacting via Google Docs or Gmail count or do you need to be using a Gemini chat interface directly?

Update 17th January 2025: @LunixA380 pointed out that this 650m user figure comes from the Alphabet 2025 Q3 earnings report which says this (emphasis mine):

"Alphabet had a terrific quarter, with double-digit growth across every major part of our business. We delivered our first-ever $100 billion quarter," said Sundar Pichai, CEO of Alphabet and Google.

"[...] In addition to topping leaderboards, our first party models, like Gemini, now process 7 billion tokens per minute, via direct API use by our customers. The Gemini App now has over 650 million monthly active users.

Presumably the "Gemini App" encompasses the Android and iPhone apps as well as direct visits to gemini.google.com - that seems to be the indication from Google's November 18th blog post that also mentioned the 650m number.

# 3:32 pm / google, ai, openai, generative-ai, llms, gemini, nano-banana

LLM predictions for 2026, shared with Oxide and Friends

Visit LLM predictions for 2026, shared with Oxide and Friends

I joined a recording of the Oxide and Friends podcast on Tuesday to talk about 1, 3 and 6 year predictions for the tech industry. This is my second appearance on their annual predictions episode, you can see my predictions from January 2025 here. Here’s the page for this year’s episode, with options to listen in all of your favorite podcast apps or directly on YouTube.

[... 1,741 words]

Jan. 9, 2026

Fly’s new Sprites.dev addresses both developer sandboxes and API sandboxes at the same time

Visit Fly's new Sprites.dev addresses both developer sandboxes and API sandboxes at the same time

New from Fly.io today: Sprites.dev. Here’s their blog post and YouTube demo. It’s an interesting new product that’s quite difficult to explain—Fly call it “Stateful sandbox environments with checkpoint & restore” but I see it as hitting two of my current favorite problems: a safe development environment for running coding agents and an API for running untrusted code in a secure sandbox.

[... 1,560 words]

Jan. 10, 2026

Release datasette-transactions 0.1a0 — API for executing multiple queries within a transaction
Release denobox 0.1a0 — Run JavaScript code and WASM in a Deno sandbox
Release denobox 0.1a1 — Run JavaScript code and WASM in a Deno sandbox
Release denobox 0.1a2 — Run JavaScript code and WASM in a Deno sandbox