A unified TLS API for Python
Back at the 2016 Python Language Summit, Cory Benfield and Christian Heimes gave a presentation on the future of the language's ssl module. The module provides the standard library's TLS support, but suffers from a number of problems, many of them baggage that has accreted over the years. Now Benfield has proposed a PEP to, essentially, move on from ssl, and to create a set of abstract base classes (ABCs) that would define the API for TLS going forward.
Benfield posted the PEP (as yet without a number) to the Python Security SIG mailing list, but it has been in progress since last October in a GitHub repository. Since the pip installation tool uses ssl, it is important that the module works on all of the systems that Python gets installed on, Benfield said. But ssl is reliant on OpenSSL, which is not well supported on macOS and Windows:
To address that, a more generic API for TLS is needed, he said. There are two implementations that he considered:
As might be guessed based on that description, Benfield has chosen the latter approach. The in-progress PEP proposes a set of ABCs that are meant to allow any underlying TLS library to provide a compliant implementation. That would allow users, distributors, and others to swap in other TLS libraries as needed, something that is not really possible with today's ssl.
The PEP outlines the various pieces that need attention and where they are currently implemented in ssl (if they are):
- Configuring TLS, currently implemented by the SSLContext class in the ssl module.
- Wrapping a socket object, currently implemented by the SSLSocket class in the ssl module.
- Providing an in-memory buffer for doing in-memory encryption or decryption with no actual I/O (necessary for asynchronous I/O models), currently implemented by the SSLObject class in the ssl module.
- Specifying TLS cipher suites. There is currently no code for doing this in the standard library: instead, the standard library uses OpenSSL cipher suite strings.
- Specifying application-layer protocols that can be negotiated during the TLS handshake.
- Specifying TLS versions.
- Reporting errors to the caller, currently implemented by the SSLError class in the ssl module.
- Specifying certificates to load, either as client or server certificates.
- Specifying which trust database should be used to validate certificates presented by a remote peer.
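
For orientation, here is a minimal sketch of how several of the listed pieces look in today's ssl module: configuration on an SSLContext, an OpenSSL cipher-suite string, application-layer protocol selection, and an in-memory SSLObject. The host name `example.org` is only a placeholder, and no network I/O takes place.

```python
import ssl

# Configuration lives on SSLContext; create_default_context() also
# selects the default trust database for validating the peer.
ctx = ssl.create_default_context()
ctx.set_ciphers("ECDHE+AESGCM")             # an OpenSSL cipher-suite string
ctx.set_alpn_protocols(["h2", "http/1.1"])  # application-layer protocols to offer

# In-memory TLS: an SSLObject reads and writes MemoryBIO buffers, not a socket.
incoming = ssl.MemoryBIO()   # bytes received from the peer would be written here
outgoing = ssl.MemoryBIO()   # bytes destined for the peer are read from here
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.org")

client_hello = b""
try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # The handshake stalls until the peer replies; the first handshake
    # message is now waiting in the outgoing buffer.
    client_hello = outgoing.read()
```

Each of those calls is specific to ssl, and the cipher string is OpenSSL's own syntax, which is part of what ties calling code to one backend.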
The PEP then proposes ABCs to support each of those (though the TLS cipher suite specification is still listed as "Todo"). It also briefly looks at the other standard library modules that will need to be revised to use the interfaces. Benfield lists seven different modules (such as asyncio, http.client, imaplib, and smtplib) that would need to change.
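
To give a feel for the approach, the following is a rough sketch of what ABC-based TLS interfaces could look like. The class and method names (`AbstractTLSContext`, `AbstractTLSBuffer`, `wrap_buffers()`, and so on) are invented for illustration and are not taken from the PEP; the "passthrough" backend is a stand-in that performs no encryption at all.

```python
from abc import ABC, abstractmethod
from typing import Optional


class AbstractTLSBuffer(ABC):
    """Hypothetical in-memory TLS object: byte buffers only, no sockets."""

    @abstractmethod
    def write(self, plaintext: bytes) -> int:
        """Queue application data; return the number of bytes accepted."""

    @abstractmethod
    def take_outgoing(self) -> bytes:
        """Return (and clear) the bytes that should be sent to the peer."""


class AbstractTLSContext(ABC):
    """Hypothetical context that any TLS backend could implement."""

    @abstractmethod
    def wrap_buffers(self, server_hostname: Optional[str]) -> AbstractTLSBuffer:
        """Create an in-memory TLS object for one connection."""


# Stand-in backend for illustration only: it does NOT encrypt anything.
class PassthroughBuffer(AbstractTLSBuffer):
    def __init__(self) -> None:
        self._pending = bytearray()

    def write(self, plaintext: bytes) -> int:
        self._pending += plaintext
        return len(plaintext)

    def take_outgoing(self) -> bytes:
        data = bytes(self._pending)
        self._pending.clear()
        return data


class PassthroughContext(AbstractTLSContext):
    def wrap_buffers(self, server_hostname: Optional[str]) -> AbstractTLSBuffer:
        return PassthroughBuffer()


def send_request(context: AbstractTLSContext, host: str, payload: bytes) -> bytes:
    """Calling code depends only on the ABCs, so backends can be swapped."""
    tls = context.wrap_buffers(host)
    tls.write(payload)
    return tls.take_outgoing()
```

A function like `send_request()` never names a concrete TLS library, so a context backed by OpenSSL or by a platform-native library could be passed in without changing the caller.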
The reaction to the posting was largely favorable, though there were plenty of technical concerns mentioned, many of which have been addressed with changes to the PEP. It is already a large specification, but there were questions about additional features. Benfield, Heimes, and others were resistant to calls for more features in the first round, though Heimes brought up the need to support SRV-ID (used to identify different service types) eventually. Benfield wanted to take a wait-and-see attitude for features like that:
Heimes agreed and, in another sub-thread, pushed back on some fairly exotic features that were being proposed. Though Benfield sees the server-side TLS support as mandatory, Heimes would even be willing to leave that behind in order to get something out there more quickly.
There is, it seems, a deadline of sorts at hand. As Donald Stufft reported to the Distutils SIG mailing list recently, the content delivery network (CDN) that PyPI and other Python infrastructure uses will be phasing out TLS 1.0 and 1.1 support over the next year and a half. Some solution needs to be in place before June 2018 or macOS pip clients will not be able to run; others may well be affected too.
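
Whether a given interpreter is affected depends on the TLS library that its ssl module is linked against. As a rough sketch, assuming Python 3.7 or later (where `ssl.HAS_TLSv1_2` and `ssl.TLSVersion` exist), one could check for TLS 1.2 support and refuse older protocol versions like this; the `tls12_report()` helper is made up for this example.

```python
import ssl


def tls12_report() -> dict:
    """Summarize the linked TLS library and whether it offers TLS 1.2."""
    return {
        "library": ssl.OPENSSL_VERSION,  # version string of the linked library
        "tls1_2": ssl.HAS_TLSv1_2,       # True if TLS 1.2 is available
    }


report = tls12_report()

ctx = ssl.create_default_context()
if report["tls1_2"]:
    # Refuse to negotiate anything older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
```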
So Heimes is interested in a bare-bones solution, but one that gets the job done: "Personally I would rather remove half of the PEP than add new things." He is concerned that more features will make it that much harder for implementers in the time frame available. Stufft agreed as well: "Getting too lost in the weeds over advanced features like hot-config-reload I agree is a bad use of resources."
But Wes Turner was concerned that adding a TLS configuration object later, as was being advocated, would be hard to do. Nick Coghlan, on the other hand, thought that was the proper approach:
It turns out that Turner was actually looking beyond simply configuring the underlying library and was instead concerned with adding an interface to determine whether a given configuration was consistent with a particular security policy. It is something that will need to be addressed, eventually, but once again is "vastly beyond the scope of the problem I'm trying to solve here", Benfield said. It is tempting to add all of the needed pieces at once, he continued, so that all of the TLS dragons can be slayed in one go, but he has a different vision moving forward:
A more iterative culture around TLS is likely something that Python needs. The ssl module has been a problem child for some time now, at least partly because it mostly works—though some of its defaults are insecure—and lacks for developer time. But protocols change over time and older versions get deprecated. At the moment, that is providing some impetus to speed up the changes that have been needed for some time, which will help. But it would be nice to see proactive efforts at keeping up with TLS down the road.