May 2022
49 posts: 6 entries, 20 links, 23 beats
May 1, 2022
May 2, 2022
sqlite-utils 3.26.1 (via) I released sqlite-utils 3.36.1 with one tiny but exciting feature: I fixed its one dependency that wasn’t published as a pure Python wheel, which means it can now be used with Pyodide—Python compiled to WebAssembly running in your browser!
May 3, 2022
Web Scraping via Javascript Runtime Heap Snapshots (via) This is an absolutely brilliant scraping trick. Adrian Cooney figured out a way to use Puppeteer and the Chrome DevTools protocol to take a heap snapshot of all of the JavaScript running on a web page, then recursively crawl through the heap looking for any JavaScript objects that have a specified selection of properties. This allows him to scrape data from arbitrarily complex client-side web applications. He built a JavaScript library and command line tool that implements the pattern.
Simple declarative schema migration for SQLite (via) This is an interesting, clearly explained approach to the database migration problem. Create a new in-memory database and apply the current schema, then run some code to compare that with the previous schema—which tables are new, and which tables have had columns added. Then apply those changes.
I’d normally be cautious of running something like this because I can think of ways it could go wrong—but SQLite backups are so quick and cheap (just copy the file) that I could see this being a relatively risk-free way to apply migrations.
May 4, 2022
Datasette Lite: a server-side Python web application running in a browser
Datasette Lite is a new way to run Datasette: entirely in a browser, taking advantage of the incredible Pyodide project which provides Python compiled to WebAssembly plus a whole suite of useful extras.
[... 4,800 words]SIARD: Software Independent Archiving of Relational Databases (via) I hadn’t heard of this before but it looks really interesting: the Federal Archives of Switzerland developed a standard for archiving any relational database as a zip file full of XML which is “is used in over 50 countries around the globe”.
May 6, 2022
Weeknotes: Datasette Lite, nogil Python, HYTRADBOI
My big project this week was Datasette Lite, a new way to run Datasette directly in a browser, powered by WebAssembly and Pyodide. I also continued my research into running SQL queries in parallel, described last week. Plus I spoke at HYTRADBOI.
[... 1,434 words]May 7, 2022
May 13, 2022
sqlite-utils: a nice way to import data into SQLite for analysis (via) Julia Evans on my sqlite-utils Python library and CLI tool.
May 14, 2022
May 15, 2022
Why Rust’s postfix await syntax is good (via) C J Silverio explains postfix await in Rust—where you can write a line like this, with the ? causing any errors to be caught and turned into an error return from your function:
let count = fetch_all_animals().await?.filter_for_hedgehogs().len();
How Materialize and other databases optimize SQL subqueries. Jamie Brandon offers a survey of the state-of-the-art in optimizing correlated subqueries, across a number of different database engines.
May 16, 2022
Heroku: Core Impact (via) Ex-Heroku engineer Brandur Leach pulls together some of the background information circulating concerning the now more than a month long Heroku security incident and provides some ex-insider commentary on what went right and what went wrong with a platform that left a huge, if somewhat underappreciated impact on the technology industry at large.
Weeknotes: Camping, a road trip and two new museums
Natalie and I took a week-long road trip and camping holiday. The plan was to camp on Santa Rosa Island in the California Channel Islands, but the boat to the island was cancelled due to bad weather. We treated ourselves to a Central Californian road trip instead.
[... 872 words]Supercharging GitHub Actions with Job Summaries (via) GitHub Actions workflows can now generate a rendered Markdown summary of, well, anything that you can think to generate as part of the workflow execution. I particularly like the way this is designed: they provide a filename in a $GITHUB_STEP_SUMMARY environment variable which you can then append data to from each of your steps.
May 17, 2022
simonw/datasette-screenshots (via) I started a new GitHub repository to automate taking screenshots of Datasette for marketing purposes, using my shot-scraper browser automation tool.
May 18, 2022
Comby (via) Describes itself as “Structural search and replace for any language”. Lets you execute search and replace patterns that look a little bit like simplified regular expressions, but with some deep OCaml-powered magic that makes them aware of comment, string and nested parenthesis rules for different languages. This means you can use it to construct scripts that automate common refactoring or code upgrade tasks.
May 19, 2022
May 21, 2022
GOV.UK Guidance: Documenting APIs (via) Characteristically excellent guide from GOV.UK on writing great API documentation. “Task-based guidance helps users complete the most common integration tasks, based on the user needs from your research.”