Simon Willison’s Weblog

January 2023

77 posts: 5 entries, 29 links, 8 quotes, 35 beats

Jan. 1, 2023

TIL Querying the GitHub archive with the ClickHouse playground — Via [this comment](https://news.ycombinator.com/item?id=34197637) on Hacker News I started exploring the [ClickHouse Playground](https://clickhouse.com/docs/en/getting-started/playground/). It's really cool, and among other things it allows CORS-enabled API hits that can query a decade of history from the GitHub events archive in less than a second.
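A minimal sketch of what such an API hit might look like from Python. The endpoint URL, the `play` user, the `default_format` parameter value, and the `github_events` table and `event_type` column are assumptions based on the public playground and are not confirmed by this post. The helper `build_query_request` is a hypothetical name; it only builds the request, so nothing is sent until you pass it to `urlopen`.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed public endpoint and read-only user for the ClickHouse Playground.
PLAYGROUND_URL = "https://play.clickhouse.com/"
PLAYGROUND_USER = "play"


def build_query_request(sql: str) -> Request:
    """Build (but do not send) an HTTP POST carrying a SQL query as its body."""
    params = urlencode({"user": PLAYGROUND_USER, "default_format": "JSON"})
    return Request(
        f"{PLAYGROUND_URL}?{params}",
        data=sql.encode("utf-8"),
        method="POST",
    )


# Assumed table and column names; adjust to whatever the playground exposes.
request = build_query_request(
    "SELECT count() FROM github_events WHERE event_type = 'WatchEvent'"
)
```

To actually run the query, pass `request` to `urllib.request.urlopen` and decode the JSON response body.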

In 2022, web3 went just great. Molly White’s essential roundup of 2022 in cryptocurrency. “$4.27 billion was stolen in various hacks and scams this year alone”.

# 5:13 am / web3, blockchain, molly-white

Jan. 2, 2023

nanoGPT. “The simplest, fastest repository for training/finetuning medium-sized GPTs”—by Andrej Karpathy, in about 600 lines of Python.

# 11:27 pm / python, ai, gpt-3, andrej-karpathy, generative-ai, llms

Petals (via) The challenge with large language models in the same scale ballpark as GPT-3 is that they’re large, really large. Far too big to run on a single machine at home. Petals is a fascinating attempt to address that problem: it works a little bit like BitTorrent, in that each user of Petals runs a subset of the overall language model on their machine and participates in a larger network to run inference across potentially hundreds of distributed GPUs. I tried it just now in Google Colab and it worked exactly as advertised, after downloading an 8GB subset of the 352GB BLOOM-176B model.

# 11:29 pm / ai, gpt-3, generative-ai, llms, bloom, gpus

Jan. 3, 2023

Release openai-to-sqlite 0.1a0 — Save OpenAI API results to a SQLite database
Release datasette-openai 0.1a0 — SQL functions for calling OpenAI APIs

Jan. 4, 2023

TIL Geopoly in SQLite — I noticed this morning that one of my Datasette installations had the [Geopoly](https://www.sqlite.org/geopoly.html) SQLite extension enabled. I don't know how it got there - it has to be compiled specifically - but since it was there I decided to try it out.

Jan. 8, 2023

TIL Loading SQLite extensions in Python on macOS — I finally found a workaround for this error when attempting to load a SQLite extension in Python on macOS:
Release shapefile-to-sqlite 0.4.2 — Load shapefiles into a SQLite (optionally SpatiaLite) database

Jan. 9, 2023

Release datasette-publish-fly 1.3 — Datasette plugin for publishing data using Fly
Release datasette 0.64 — An open source multi-tool for exploring and publishing data
Release datasette-auth-passwords 1.1 — Datasette plugin for authentication using passwords

Datasette 0.64, with a warning about SpatiaLite

I released Datasette 0.64 this morning. This release is mainly a response to the realization that it’s not safe to run Datasette with the SpatiaLite extension loaded if that Datasette instance is configured to enable arbitrary SQL queries from untrusted users.

[... 675 words]

Jan. 10, 2023

Retiring Pinafore (via) Nolan Lawson built Pinafore, which became my default Mastodon client on both desktop and mobile over the past month. He thoughtfully explains why he’s ending his involvement in the project, and why, for trust reasons, he’s not planning on handing over the reins to someone else. Pinafore is everything I want a good SPA to be: it loads fast, works offline and packs a whole lot of functionality into a tiny package. I’m sad to see Nolan’s involvement come to an end; it’s a superb piece of software.

# 2:05 am / javascript, mastodon, nolan-lawson

Mapping Python to LLVM (via) Codon is a fascinating new entry in the “compile Python code to something else” world—this time targeting LLVM. Ariya Shajii describes in great detail how it pulls this off, including tricks such as transforming Python generators to LLVM coroutines. Codon doesn’t promise that all Python code will work—it’s best thought of as a Python-like language which can be used to create compiled modules which can then be imported back into regular Python projects.

# 2:08 am / compilers, llvm, python

Release datasette-openai 0.1a1 — SQL functions for calling OpenAI APIs
Release json-to-files 0.1 — Create separate files on disk based on a JSON object
TIL Scraping the Sky News Westminster Accounts, a Flourish application — Sky News in partnership with [Tortoise](https://www.tortoisemedia.com/) published a fantastic piece of investigative data reporting: [the Westminster Accounts](https://news.sky.com/story/westminster-accounts-methodology-12764656), a database of money in UK politics that brought together data from three different sources and made it explorable.

You will not use the Software for any act that may undermine China's national security and national unity, harm the public interest of society, or infringe upon the rights and interests of human beings.

The GLM-130B License

# 10:45 pm / machine-learning, licenses, ai, generative-ai, llms, ai-in-china, glm

Jan. 11, 2023

Release datasette-faiss 0.1a0 — Maintain a FAISS index for specified Datasette tables
Release datasette 0.64.1 — An open source multi-tool for exploring and publishing data
Release git-history 0.7a0 — Tools for analyzing Git history using SQLite
TIL Upgrading a pipx application to an alpha version — I wanted to upgrade my [git-history](https://datasette.io/tools/git-history) installation to a new alpha version.
Release datasette-cookies-for-magic-parameters 0.1 — UI for setting cookies to populate magic parameters
Release datasette-cookies-for-magic-parameters 0.1.1 — UI for setting cookies to populate magic parameters

Jan. 12, 2023

Release datasette-openai 0.1a2 — SQL functions for calling OpenAI APIs
Release datasette-cookies-for-magic-parameters 0.1.2 — UI for setting cookies to populate magic parameters

Jan. 13, 2023

Examples of floating point problems (via) I learned so much practical stuff from this post by Julia Evans. There are no 32-bit floating point numbers between 262144.0 and 262144.03125, which breaks code that attempts to keep incrementing by 0.01. I knew about the JavaScript tweet ID problem (JavaScript can’t handle numbers like 1612850010110005250) but I didn’t realize it affected jq as well. Lots more great examples in here.

# 3:41 pm / javascript, jq, julia-evans
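Both problems mentioned above can be reproduced with a short Python sketch using only the standard library. The helper names `to_float32` and `next_float32` are hypothetical, and Python's `float` (a 64-bit double) stands in here for a JavaScript number.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def next_float32(x: float) -> float:
    """Return the next representable 32-bit float above a positive, finite x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]


# The gap between adjacent 32-bit floats at 262144.0 (2**18) is 2**-5.
gap = next_float32(262144.0) - 262144.0

# Adding 0.01 and storing the result as a 32-bit float rounds straight back
# down, so a loop that keeps incrementing by 0.01 stops making progress.
stuck = to_float32(262144.0 + 0.01)

# A 64-bit double cannot hold this tweet ID exactly, so the round trip
# through float changes the value.
tweet_id = 1612850010110005250
round_tripped = int(float(tweet_id))

print(gap)                        # 0.03125
print(stuck == 262144.0)          # True
print(round_tripped == tweet_id)  # False
```

Keeping large IDs as strings is the usual way to avoid the second problem.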

Release openai-to-sqlite 0.2 — Save OpenAI API results to a SQLite database
