Simon Willison’s Weblog

October 2019

58 posts: 5 entries, 15 links, 38 beats

Oct. 3, 2019

SQL queries don’t start with SELECT. This is really useful. Understanding that SELECT (and its associated window functions) happens after WHERE, GROUP BY and HAVING helps explain why, for example, you can’t filter a query based on the results of a window function.
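
A minimal sketch of that ordering in practice, using Python's built-in sqlite3 module. The `sales` table and its rows are made up for illustration, and it assumes a SQLite build with window function support (3.25 or later). Filtering on the window function directly in WHERE is rejected, while wrapping the window function in a subquery and filtering in the outer query works.

```python
import sqlite3

# Hypothetical example table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("north", 30), ("south", 20), ("south", 5)],
)

# WHERE is evaluated before the window functions in the SELECT list,
# so using a window function inside WHERE is an error.
try:
    conn.execute(
        "SELECT region, amount FROM sales "
        "WHERE rank() OVER (PARTITION BY region ORDER BY amount DESC) = 1"
    )
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# Workaround: compute the window function in a subquery, then filter
# on its result in the outer query.
top_per_region = conn.execute(
    """
    SELECT region, amount FROM (
        SELECT region, amount,
               rank() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
        FROM sales
    )
    WHERE rnk = 1
    ORDER BY region
    """
).fetchall()
```

The outer WHERE can see `rnk` because, from its point of view, the subquery has already run to completion, SELECT step included.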

# 8:56 pm / sql, julia-evans

Oct. 4, 2019

NGINX: Authentication Based on Subrequest Result (via) TIL about this neat feature of NGINX: you can use the auth_request directive to cause NGINX to make an HTTP subrequest to a separate authentication server for each incoming HTTP request. The authentication server can see the cookies on the incoming request and tell NGINX if it should fulfill the parent request (via a 2xx status code) or if it should be denied (by returning a 401 or 403). This means you can run NGINX as an authenticating proxy in front of any HTTP application and roll your own custom authentication code as a simple webhook-receiving endpoint.
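
Here is a rough sketch of what such an authentication endpoint might look like, using only the Python standard library. The cookie name `session`, the `VALID_SESSIONS` set and port 9000 are all assumptions made up for this example. It also assumes NGINX has been configured so that the URI named in auth_request proxies to this server.

```python
from http.cookies import SimpleCookie
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical session store; a real service would check a database
# or verify a signed token instead.
VALID_SESSIONS = {"secret-session-token"}


def auth_status(cookie_header):
    """Return the HTTP status the auth endpoint should send to NGINX."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("session")  # "session" is an assumed cookie name
    if morsel is None:
        return 401  # no credentials supplied
    if morsel.value in VALID_SESSIONS:
        return 204  # any 2xx tells NGINX to fulfill the parent request
    return 403  # credentials supplied but not accepted


class AuthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The subrequest carries the original request's Cookie header.
        self.send_response(auth_status(self.headers.get("Cookie")))
        self.end_headers()


def serve(port=9000):
    """Run the auth endpoint on localhost (the port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), AuthHandler).serve_forever()
```

Keeping the decision in a plain function like `auth_status` means the logic can be tested without starting a server: `auth_status("session=secret-session-token")` returns 204, while a missing cookie returns 401.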

# 3:36 pm / authentication, nginx, webhooks

Oct. 5, 2019

Client-Side Certificate Authentication with nginx. I’m intrigued by client-side browser certificates, which allow you to lock down a website such that only browsers with a specific certificate installed can access it. They work on both laptops and mobile phones. I followed the steps in this tutorial and managed to get an nginx instance running which only allows connections from my personal laptop and iPhone.

# 5:26 pm / certificates, nginx, security, dogsheep

Get your own Pocket OAuth token (via) I hate it when APIs make you jump through extensive hoops just to get an access token for pulling data directly from your own personal account. I’ve been playing with the Pocket API today and it has a pretty complex OAuth flow, so I built a tiny Flask app on Glitch which helps go through the steps to get an API token for your own personal Pocket account.

# 9:56 pm / glitch, mozilla, oauth

Oct. 6, 2019

Streamlit: Turn Python Scripts into Beautiful ML Tools (via) A really interesting new tool / application development framework. Streamlit is designed to help machine learning engineers build usable web frontends for their work. It does this by providing a simple, productive Python environment which lets you declaratively build up a sort-of Notebook style interface for your code. It includes the ability to insert a DataFrame, geospatial map rendering, chart or image into the application with a single Python function call. It’s hard to describe how it works, but the tutorial and demo worked really well for me: “pip install streamlit” and then “streamlit hello” to get a full-featured demo in a browser, then you can run through the tutorial to start building a real interactive application in a few dozen lines of code.

# 3:52 am / python

Release twitter-to-sqlite 0.6 — Save data from Twitter to a SQLite database

twitter-to-sqlite 0.6, with track and follow. I shipped a new release of my twitter-to-sqlite command-line tool this evening. It now includes experimental features for subscribing to the Twitter streaming API: you can track keywords or follow users, and matching Tweets will be written to a SQLite database in real time as they come in through the API. Since Datasette now supports mutable databases, you can run Datasette against the database and query the tweets as they are inserted into the tables.

# 4:54 am / projects, realtime, twitter, dogsheep

Oct. 7, 2019

Release twitter-to-sqlite 0.7 — Save data from Twitter to a SQLite database
Release pocket-to-sqlite 0.1 — Create a SQLite database containing data from your Pocket account
Release datasette-auth-github 0.10 — Datasette plugin that authenticates users against GitHub

Weeknotes: Dogsheep

Having figured out my Stanford schedule, this week I started getting back into the habit of writing some code.

[... 1,367 words]

SQL Murder Mystery in Datasette (via) “A crime has taken place and the detective needs your help. The detective gave you the crime scene report, but you somehow lost it. You vaguely remember that the crime was a murder that occurred sometime on Jan.15, 2018 and that it took place in SQL City. Start by retrieving the corresponding crime scene report from the police department’s database.” A really fun game from the NU Knight Lab for exercising your SQL skills. I loaded their SQLite database into Datasette so you can play in your browser.
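
As a sketch of that first step, here is the kind of query the prompt is asking for, run against an in-memory stand-in. The `crime_scene_report` table, its column names, the integer date format and the placeholder rows are all assumptions made for illustration, so check the real schema in Datasette before relying on any of them.

```python
import sqlite3

# Stand-in database: the schema and rows below are assumptions,
# not the game's real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crime_scene_report "
    "(date INTEGER, type TEXT, description TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO crime_scene_report VALUES (?, ?, ?, ?)",
    [
        (20180115, "murder", "placeholder description A", "SQL City"),
        (20180115, "theft", "placeholder description B", "SQL City"),
        (20180116, "murder", "placeholder description C", "Other City"),
    ],
)

# Narrow the reports down using the three facts given in the prompt:
# the date, the type of crime and the city.
reports = conn.execute(
    "SELECT description FROM crime_scene_report "
    "WHERE date = ? AND type = ? AND city = ?",
    (20180115, "murder", "SQL City"),
).fetchall()
```

Against the real database the same WHERE clause can be pasted into Datasette's SQL editor, adjusted for whatever the actual column names and date format turn out to be.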

# 11:37 pm / projects, sql, sqlite, datasette

Oct. 10, 2019

Tracking PG&E outages by scraping to a git repo

PG&E have cut off power to several million people in northern California, supposedly as a precaution against wildfires.

[... 868 words]

Oct. 11, 2019

Release twitter-to-sqlite 0.8 — Save data from Twitter to a SQLite database
Release twitter-to-sqlite 0.9 — Save data from Twitter to a SQLite database

Oct. 13, 2019

Release github-to-sqlite 0.5 — Save data from GitHub to a SQLite database

Oct. 14, 2019

goodreads-to-sqlite (via) This is so cool! Tobias Kunze built a Python CLI tool to import your Goodreads data into a SQLite database, inspired by github-to-sqlite and my various other Dogsheep tools. It’s the first Dogsheep-style tool I’ve seen that wasn’t built by me, and Tobias’ write-up includes some neat examples of queries you can run against your Goodreads data. I’ve now started using Goodreads and I’m importing my books into my own private Dogsheep Datasette instance.

# 4:07 am / books, cli, sqlite, datasette, dogsheep

Release datasette-render-timestamps 0.1 — Datasette plugin for rendering timestamps
Release datasette-render-timestamps 0.2 — Datasette plugin for rendering timestamps
Release datasette-leaflet-geojson 0.3 — Datasette plugin that replaces any GeoJSON column values with a Leaflet map.
Release datasette-auth-github 0.11 — Datasette plugin that authenticates users against GitHub

Weeknotes: PG&E outages, and Open Source works!

My big focus this week was the PG&E outages project. I’m really pleased with how this turned out: the San Francisco Chronicle used data from it for their excellent PG&E outage interactive (mixing in data on wind conditions) and it earned a bunch of interest on Twitter and some discussion on Hacker News.

[... 452 words]

μPlot (via) “An exceptionally fast, tiny time series chart. [...] from a cold start it can create an interactive chart containing 150,000 data points in 40ms. [...] at < 10 KB, it’s likely the smallest and fastest time series plotter that doesn’t make use of WebGL shaders or WASM”

# 11:03 pm / canvas, charting, graphing, javascript

Oct. 15, 2019

Release twitter-to-sqlite 0.10 — Save data from Twitter to a SQLite database

Oct. 16, 2019

2018 Central Park Squirrel Census in Datasette (via) The Squirrel Census project released their data! 3,000 squirrel observations in Central Park, each with fur color and latitude and longitude and behavioral observations. I love this data so much. I’ve loaded it into a Datasette running on Glitch.

# 6:01 pm / squirrels, datasette

Release twitter-to-sqlite 0.11 — Save data from Twitter to a SQLite database
Release swarm-to-sqlite 0.2 — Create a SQLite database containing your checkin history from Foursquare Swarm
Release twitter-to-sqlite 0.11.1 — Save data from Twitter to a SQLite database

Oct. 17, 2019

Release twitter-to-sqlite 0.12 — Save data from Twitter to a SQLite database

Oct. 18, 2019

Release datasette 0.29.3 — An open source multi-tool for exploring and publishing data
