Smuggling email inside of email

By Jake Edge
January 3, 2024

Normally, when a new vulnerability is discovered and releases are coordinated with those affected, the announcement is done at a convenient time—not generally right before the end-of-year holidays, for example. The SMTP Smuggling vulnerability has taken a different path, however, with its announcement landing on December 18. That may well have been unpleasant for some administrators that had not yet updated, but it was particularly problematic for some projects that had not been made aware of the vulnerability at all—though it was known to affect several open-source mailers.

Discovery and disclosure

The vulnerability was discovered by Timo Longin of SEC Consult back in June; the company contacted three affected vendors (GMX, Microsoft, and Cisco) in July. GMX fixed the issue in August, and Microsoft did so in October, but Cisco responded that the "identified vulnerability is just a feature of the software and not a bug/vulnerability". That led SEC Consult to contact the CERT Coordination Center (CERT/CC) for further assistance, using the Vulnerability Information and Coordination Environment (VINCE) tool:

There, we submitted all our research details and explained the case to the vendors involved. We received feedback from Cisco that our identified research is not a vulnerability, but a feature and that they will not change the default configuration nor inform their customers. Other vendors did not respond in VINCE but were contacted by CERT/CC.
Based on this feedback and as multiple other vendors were included in this discussion through the CERT/CC VINCE platform without objecting, we wrongly assessed the broader impact of the SMTP smuggling research. Because of this assumption, we asked CERT/CC end of November regarding publication of the details and received confirmation to proceed.

A talk about the vulnerability was accepted for the 37th Chaos Communication Congress (37C3), which was held at the end of December. So the vulnerability needed to be announced before that happened. The talk acceptance occurred on December 3 and the announcement came out roughly two weeks later—just before the holidays. The "wrongly assessed" wording in the quote above seems to indicate that SEC Consult recognizes that it made a mistake here. In addition, the talk is said to have contained "a decent apology".

One of the other "vendors" mentioned in the (admirably detailed) blog post is Sendmail, but Postfix is mentioned elsewhere as well. It is clear that SEC Consult was fully aware that those two mailers were vulnerable, but the company apparently relied on CERT/CC to involve additional affected projects, which just as clearly did not happen. For example, Postfix creator and maintainer Wietse Venema called the vulnerability announcement "part of a non-responsible disclosure process"; he softened that stance somewhat in a Postfix SMTP Smuggling page, but still noted that "[critical] information provided by the researcher was not passed on to Postfix maintainers before publication of the attack". The result was "a presumably unintended zero-day attack", he said.

Vulnerability

The Simple Mail Transfer Protocol (SMTP), which is described in RFC 5321, provides a text-based protocol for submitting and exchanging email on the net. At some level, this flaw is another indication that the "robustness principle" (also known as "Postel's law") was not actually all that sound from a security standpoint; being liberal in what is accepted over-the-wire has often led to problems. In this case, the handling of line endings in conjunction with the SMTP end-of-data indication can lead to situations where email can be successfully spoofed—leading to rogue email that passes various checks for authenticity.

The SMTP DATA command is used for the actual text that will appear in an email, including headers and the like; the so-called "envelope", which describes the sender and receiver, is another set of SMTP commands (EHLO, MAIL FROM, RCPT TO) that precede the DATA. The blog post announcing the vulnerability has lots of diagrams and explanations for those who want all the gory details. The DATA command is ended with a line that is blank except for a single period ("."); the line endings for SMTP are defined to be carriage-return (CR or "\r") followed by line-feed (LF or "\n"), so end-of-data should be signaled with "\r\n.\r\n".

It turns out that some mailers will accept just a line-feed as the line terminator, but others will not; there is a difference in interpretation of "\n.\n" that can be used to smuggle an email message inside another:

    EHLO ...
    MAIL FROM: ...
    RCPT TO: ...
    DATA
    From: ...
    To: ...
    Subject: ...

    innocuous email text
    \n.\n
    MAIL FROM: <admin@...>
    RCPT TO: <victim@...>
    DATA
    From: Administrator <admin@...>
    To: J. Random Victim <victim@...>
    Subject: Beware of phishing scams

    Like this one
    \r\n.\r\n

The second set of commands is in the text of the email, if the SMTP server receiving it does not consider "\n.\n" as the termination of the DATA command. That email will be sent to another SMTP server, however, which may see things rather differently. If the SMTP server for the destination sees that line as terminating the data, it may start processing what comes next as entirely new email. It is a vulnerability that is analogous to HTTP request smuggling, which is where the name comes from.
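
The disagreement can be sketched in a few lines of Python. This is only an illustrative model, not the parsing code of any real mailer: the function name, the two terminator sets, and the example.* addresses are all assumptions made up for the demonstration. It shows the same bytes yielding one message for a strict server and a message plus leftover SMTP commands for a lenient one.

```python
# Terminator sets are assumptions: a "strict" server honors only the
# RFC 5321 end-of-data sequence, a "lenient" one also accepts bare LFs.
STRICT = (b"\r\n.\r\n",)
LENIENT = (b"\r\n.\r\n", b"\n.\n")


def split_at_end_of_data(stream: bytes, terminators: tuple) -> tuple:
    """Return (message_body, leftover), split at the earliest accepted terminator."""
    hits = [(stream.find(t), len(t)) for t in terminators]
    hits = [(pos, size) for pos, size in hits if pos != -1]
    if not hits:
        return stream, b""
    pos, size = min(hits)
    return stream[:pos], stream[pos + size:]


# Hypothetical payload modeled on the example above; addresses are made up.
payload = (
    b"From: attacker@sender.example\r\n"
    b"Subject: hello\r\n"
    b"\r\n"
    b"innocuous email text"
    b"\n.\n"
    b"MAIL FROM:<admin@sender.example>\r\n"
    b"RCPT TO:<victim@receiver.example>\r\n"
    b"DATA\r\n"
    b"From: Administrator <admin@sender.example>\r\n"
    b"Subject: Beware of phishing scams\r\n"
    b"\r\n"
    b"Like this one"
    b"\r\n.\r\n"
)

# The outbound (strict) server sees one message and nothing left over.
outbound_body, outbound_rest = split_at_end_of_data(payload, STRICT)

# The inbound (lenient) server ends the message early; the remainder is
# then read as fresh SMTP commands, which is the smuggled email.
inbound_body, inbound_rest = split_at_end_of_data(payload, LENIENT)
```

In this model the attack needs both halves: a sender that treats "\n.\n" as ordinary text and a receiver that treats it as end-of-data.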

There are also variations on the line endings (e.g. "\n.\r\n") that can be used to fool various mailers. The core of the idea is to have the outbound mail server ignore the "extra stuff" as part of the initial email message and send the mail on to a server that sees the single message as more than one—and acts accordingly. SPF checks can be made to pass and even DKIM can be spoofed by using an attacker-controlled DKIM key in the smuggled headers. DMARC can be used to thwart the smuggling, but common configurations of it are still vulnerable. In addition, because mail servers, especially those of the larger email providers, often handle multiple domains, there is an opportunity to smuggle email that purports to come from different domains inside innocuous-seeming messages to users of those services.
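
Since all of the variations depend on a line ending other than CR followed by LF, one way to picture a defense is a check for "bare" newline characters in the incoming data. The following is a minimal sketch under that assumption; the names are hypothetical and it is not how Postfix, Exim, or Sendmail actually implement their fixes.

```python
import re

# Matches an LF that is not preceded by CR, or a CR that is not followed by LF.
BARE_NEWLINE = re.compile(rb"(?<!\r)\n|\r(?!\n)")


def find_bare_newlines(data: bytes) -> list:
    """Return the byte offsets of every bare LF or bare CR in data."""
    return [match.start() for match in BARE_NEWLINE.finditer(data)]


def is_unambiguous(data: bytes) -> bool:
    """True if every line ending in data is a proper CR LF pair."""
    return not find_bare_newlines(data)
```

A receiver that refuses (or normalizes) anything for which `is_unambiguous()` is false cannot be talked into seeing an end-of-data marker that the sender did not see.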

The blog post mostly concentrates on the three vendors identified, but clearly notes that "Postfix and Sendmail [fulfill] the requirements, are affected and can be smuggled to". It provides several lists of domains that can be spoofed via GMX, Microsoft Exchange Online, or Cisco Secure Email Cloud Gateway. In fact, SEC Consult uses the Cisco product itself, so the company has changed its settings away from the default to avoid the problem, which Cisco does not acknowledge as any kind of bug.

Postfix and beyond

Meanwhile, over in open-source land, folks seemed rather astonished that the vulnerability dropped with no warning. Marcus Meissner posted about the vulnerability to the oss-security mailing list: "As if we did not have sufficient protocol vulnerability work short[ly] before Christmas break this year, here is one more". He is likely referring to the Terrapin vulnerability in the SSH protocol, which was announced on December 18—well after coordinating with many different SSH implementation projects. The SMTP Smuggling vulnerability followed a rather different path, as Stuart Henderson pointed out:

I'm a little confused by sec-consult's process here. They identify a problem affecting various pieces of software including some very widely deployed open source software, go to the trouble of doing a coordinated disclosure, but only do that with...looking at their timeline... gmx, microsoft and cisco?

Meissner noted that SUSE was not alerted to the problem via VINCE and that the Postfix timeline (in Venema's post) only started after the announcement. Erik Auerswald speculated that SEC Consult expected CERT/CC to alert other affected projects; instead it would seem that CERT/CC gave the go-ahead to release the findings. After Rodrigo Freire wondered why the problem was considered a vulnerability, Auerswald explained:

Any user of an affected outbound server can spoof email from any user of the same outbound server despite SPF and DKIM (DMARC+DKIM can prevent this in some cases, also more senders can be spoofed in specific cases, for details see the blog post). But for this to work, the inbound server must act as a confused deputy. Both outbound and inbound servers need to be differently vulnerable to enable the attack. This specific attack can be prevented unilaterally on either the outbound or the inbound server.
[...] For email server open source projects, relevant for the oss-security list, the primary vulnerability is to act as a confused deputy inbound server, because users of such email servers usually have a much smaller number of accounts than the big freemail providers. But, in general, they could also possibly act as a vulnerable outbound server, e.g., after a [legitimate] user account has been compromised.

CVEs for Sendmail, Postfix, and Exim have been assigned. Postfix has released updates with a new smtpd_forbid_bare_newline option (that defaults to off, but will default to on in the upcoming Postfix 3.9 release); those who do not want to upgrade can work around most of the problems using existing options. Exim has an advisory, bug report, and bug-fix release (4.97.1) for the problem. Sendmail has a snapshot release (8.18.0.2) with a new option to address the flaw as well. However, other than Postfix updates from SUSE and Slackware, most distributions have not yet issued updates for this problem, which leaves a lot of Linux users vulnerable.

So the major open-source mailer projects scrambled to fix the vulnerability over the holidays, but the process has obviously failed here. We have not (yet?) heard from CERT/CC for its side of the story, but either it or SEC Consult should definitely have contacted those projects in the multiple months that have gone by since the discovery. Given that SEC Consult knew about Sendmail and Postfix, though, it is a little hard to understand how a simple heads-up message to the security contacts for those projects was not sent.

It seems likely that there are smaller mailers and other tools that are still affected by the flaw, though they may only have just heard of it. A month or two of advance notice, especially for small projects, could have made a lot of difference. One can only hope that lessons are being learned here and that any coordination will be better ... coordinated (and communicated) down the road.



Smuggling email inside of email

Posted Jan 4, 2024 2:06 UTC (Thu) by iabervon (subscriber, #722) [Link]

The image at the beginning of the blog post has a useful bit of information that didn't make it to the example in the article: the first/outer email is from attacker@sender, while the second/embedded email is purportedly from admin@sender, such that receiver is not surprised by sender delivering the second email. More generally, it can be attacker@cohost1 and admin@cohost2, where the two sites are different but expected to be cohosted.

The purported sender's mailer has to consider the ambiguous sequence to be valid inside of an email, while the receiver's mailer has to consider the ambiguous sequence to be a valid terminator of an email. There are sequences mentioned for which various open-source mailers are vulnerable as receivers and big hosting providers and also mailers used by big companies are vulnerable as senders.

It looks to me like the researchers considered the sender side to be doing things particularly wrong, and also that the vulnerability was that messages purportedly from users at the sender site could be forged by other users with the sender site's participation adding credibility, and didn't think of it (also) as receivers failing to block detectably inauthentic messages. They contacted everyone who would need to implement what they thought of as the correct fix, but not other people who could have mitigated the problem differently.

Smuggling email inside of email

Posted Jan 4, 2024 2:59 UTC (Thu) by akp@akp.dk (subscriber, #73019) [Link]

Debian released fixed versions of Postfix too via their updates repositories, but apparently it wasn’t considered a security issue - only a bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059230

The confused Deputy

Posted Jan 4, 2024 9:55 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

Surely an early disclosure that certain projects are vulnerable as confused deputies if they misinterpret \r\n.\r\n could/should have been made months ago?

"The RFC standard says you *MUST*, some projects don't, and this has been identified as a security vulnerability".

That's leaking bugger-all information to an attacker - how do they use it etc etc, and giving all recipients the opportunity to protect themselves, very quickly.

Cheers,
Wol

The confused Deputy

Posted Jan 5, 2024 3:05 UTC (Fri) by kentonv (subscriber, #92073) [Link]

Ehh... I think for anyone who has spent time dealing with smuggling vulnerabilities before (e.g. in HTTP, where they abound), that would have been a big enough hint for them to figure out the problem.

Smuggling email inside of email

Posted Jan 4, 2024 10:06 UTC (Thu) by roc (subscriber, #30627) [Link] (37 responses)

Postel's Law was a blunder and we've spent decades fighting the damage caused by people following it.

Computing is hard, but well-intentioned yet clearly mistaken principles such as Postel's Law are self-inflicted wounds that have made things a lot harder than they could have been.

Smuggling email inside of email

Posted Jan 4, 2024 10:35 UTC (Thu) by MarcB (subscriber, #101804) [Link] (2 responses)

The irony is that the principle is not followed in its original scope of middle-boxes in low level network protocols - where it actually makes a lot of sense. This has led to the ossification of IP protocols.

Instead it was foolishly extended to endpoints and completely different scenarios. There it is invariably causing damage. It makes things less secure, less stable and so much more complex.

Smuggling email inside of email

Posted Jan 4, 2024 18:13 UTC (Thu) by nim-nim (subscriber, #34454) [Link] (1 responses)

> The irony is that the principle is not followed in its original scope of middle-boxes in low level network protocols - where it actually makes a lot of sense.

No it never made any sense. All those middle boxes were deployed because the fuzziness was breaking things right and left, they are designed to limit the spread of badly formatted traffic.

Of course some of this tightening went too far and led to ossification but it did not happen by mistake or because people wanted to prevent new protocol extensions, it happened because fuzziness was breaking endpoints.

And none of the war on middleboxes actually fixed things for the end user; all it achieved is concentrating decision power MAGMA-side. Today Google and Amazon and Microsoft and Apple can decide whatever breakage they want; there is nothing between their cloud and their browser, with the browser acting as a hostile enclave of their cloud on the user terminal.

Smuggling email inside of email

Posted Jan 4, 2024 21:56 UTC (Thu) by Wol (subscriber, #4433) [Link]

> No it never made any sense. All those middle boxes were deployed because the fuzziness was breaking things right and left, they are designed to limit the spread of badly formatted traffic.

The problem is they are also designed to prevent the spread of correctly formatted traffic, which is why port 80 is such a mess ...

Cheers,
Wol

Smuggling email inside of email

Posted Jan 4, 2024 13:17 UTC (Thu) by smitty_one_each (subscriber, #28989) [Link] (10 responses)

If one is too fussy, that can also become a bugaboo.

Maybe: s/liberal/skeptically moderate/ ?

Smuggling email inside of email

Posted Jan 4, 2024 14:40 UTC (Thu) by farnz (subscriber, #17727) [Link] (5 responses)

It's not fussiness per se that's the problem; it's going beyond the protocol specification in ways that make life harder. For example, if the protocol says that the size field is encoded as hexadecimal digits in UTF-8, and that an implementation must accept at least 8 digits, insisting that the other side uses a multiple of 8 digits is "too fussy" - I should be able to send 1a2 as a size given that definition, and I should be able to send 1abcdef23, but if you require 000001a2 and 00000001abcdef23, we've got a problem.

And this problem becomes insoluble if a different implementation rejects leading zeroes; you now have two implementations being "fussy" in opposite directions, both of which can be justified against the protocol spec (which doesn't talk about leading zeroes at all). The "correct" fix is to tighten up the protocol specification so that one or both implementations are wrong, but that gets into social coordination problems about what the specification "should" be, and is especially problematic if both implementations are the way they are because that makes it easier to build on existing libraries.
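
The deadlock described above can be made concrete with a small sketch. The three parsers below are hypothetical stand-ins for the implementations in the example, written only to show that no encoding of the size 0x1a2 satisfies both of the "fussy" receivers at once.

```python
import re

HEX_DIGITS = re.compile(r"[0-9a-fA-F]+")


def parse_by_the_letter(field: str) -> int:
    """Accept any non-empty run of hex digits, as the example spec describes."""
    if not HEX_DIGITS.fullmatch(field):
        raise ValueError("not a hex number")
    return int(field, 16)


def parse_padded_only(field: str) -> int:
    """Fussy receiver A: insists on a multiple of 8 digits."""
    if len(field) % 8 != 0:
        raise ValueError("length must be a multiple of 8")
    return parse_by_the_letter(field)


def parse_unpadded_only(field: str) -> int:
    """Fussy receiver B: rejects leading zeroes."""
    if len(field) > 1 and field.startswith("0"):
        raise ValueError("leading zeroes not allowed")
    return parse_by_the_letter(field)


def accepts(parser, field: str) -> bool:
    """Report whether parser takes field without raising ValueError."""
    try:
        parser(field)
    except ValueError:
        return False
    return True


# Both spellings of 0x1a2 from the example; neither pleases both receivers.
candidates = ["1a2", "000001a2"]
accepted_by_both = [
    c for c in candidates
    if accepts(parse_padded_only, c) and accepts(parse_unpadded_only, c)
]
```

A sender that has to talk to both receivers has no valid choice for this value, even though each receiver can point at the specification and claim compliance.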

Smuggling email inside of email

Posted Jan 4, 2024 22:31 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)

> The "correct" fix is to tighten up the protocol specification so that one or both implementations are wrong,

No. The correct fix is to implement the specification correctly. In your own example, the spec places no requirements on the sender. Therefore the receiver should place no requirements on the sender. Your example of the receiver demanding a multiple of 8 is a clear "gilding of the lily", overly strict, and breach of the specification. Okay, if the spec does not tell the receiver how it knows how many digits it has received - ie the spec does not contain sufficient information to implement itself correctly - then the specification needs to be corrected.

In this particular case, I'd be inclined to *clarify* the spec by saying "the sender can send as many or as few characters as it likes". That doesn't affect any implementation enforcing the current letter of the spec. Yes it breaks your receiver, but only because it's enforcing "restrictions not in the spec".

I've actually been badly bitten (by a PHB) because of this sort of problem back when email switched from ASCII to 8-bit. Of course, it was MS that contradicted the letter of the spec, but because it was *our* Microsoft Mail that was at fault, the PHB blamed the ISP and demanded *they* fixed the problem.

The SMTP spec says that invalid commands must be rejected with a 500 error, and the receiver *must* then wait for a new command. So the 8-bit RFC revision said "if you want to start an 8-bit transmission, use EHLO rather than HELO to tell the recipient to expect an 8-bit transmission". So what does MS Mail do? "550 Invalid command connection terminated".

I think the ISP ended up creating a config option telling sendmail not to use EHLO for certain (like us) customers.

Cheers,
Wol

Smuggling email inside of email

Posted Jan 4, 2024 22:34 UTC (Thu) by farnz (subscriber, #17727) [Link]

The specification does not say whether zero-padding is allowed or not - and yet one of my examples is a receiver that insists on no zero-padding because it's "obviously" not needed, while another insists on zero-padding because it's "obviously" needed to indicate length.

And the spec is currently silent on zero-padding; changing the spec to say "no zero padding allowed" is one option. Changing it to say "zero or more leading zeroes are permitted" is another.

Smuggling email inside of email

Posted Jan 5, 2024 10:49 UTC (Fri) by farnz (subscriber, #17727) [Link] (2 responses)

And, as an aside, this comment is exactly why the robustness principle leads to a mess; I showed two receivers that insisted on different things that aren't in the letter of the specification (one insisting on zero padding, the other insisting that it's not allowed).

You've focused on the one that's visibly weird (the one that insists on padding), and ignored the other one that's outside the specification; and yet both are problematic if you're trying to deal with arbitrary clients. It would be simple to fix the specification to declare what degree of zero padding is allowed, but that's currently out of scope - the spec as given permits me to send petabytes of 0 digits followed by the real number.

And your anecdote is unrelated; you have a case where your local mail server does not obey the letter of RFC 821 (which states that you must remain "in the same state" as before the command was received if you receive an invalid command while waiting for a HELO); in RFC 821 SMTP, you're only supposed to close the transmission channel if you're shutting down service, or in response to a successful QUIT command. This is different to a case where both implementations are obeying the letter of the specification, but imposing additional requirements not contained in the spec, and both implementers believe that they're obeying the spec (after all, they accept numbers formatted as hex digits).

Smuggling email inside of email

Posted Jan 5, 2024 13:58 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

I know my example was actually an explicit breach of the spec, but your example is also a breach of the spec. BOTH your receivers are refusing to accept a transmission that complies with the spec.

So imho no the spec does not need to be changed at all. At most it should be clarified as "the spec tells the sender to send a hex-formatted number, therefore the receiver must accept any valid hex-formatted number."

As I see it, my SMTP example yes it's easy to say MS Mail ignored the spec. But your example is to me exactly the same. The sender sent a valid (as per the spec) message. The receiver refused to accept and process it correctly. Sendmail sent us a validly formatted EHLO. MS Mail refused to accept and process it correctly. I would have thought "if the other party is behaving as per spec, then you are at fault unless the spec is contradictory" was just plain obvious!

Cheers,
Wol

Smuggling email inside of email

Posted Jan 5, 2024 14:26 UTC (Fri) by farnz (subscriber, #17727) [Link]

But that's just my point - one of the two examples is a plausible interpretation of the specification, because leading zeroes aren't discussed either way. The other is less plausible, but could come up if you're using a library that insists on having padded input.

And by your reasoning, the spec intends that implementations are capable of consuming petabytes of leading zeroes as part of accepting a hex-formatted number less than 10, otherwise they're in breach of the specification - but this is clearly absurd, and unlikely to be what the spec author intended. This is why you should fix the specification, not the implementations, since we've got behaviour that's not discussed but should be.

The reason your SMTP example is different is that the specification for SMTP explicitly says that a server keeps the connection open after a bad command - we're not talking about something that's omitted from the spec (are leading zeroes part of a valid hex-formatted number or not?), but about something that's in the spec but where your server deviated from the specification.

Smuggling email inside of email

Posted Jan 4, 2024 19:54 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

I remember reading about the new robustness law: "Be as strict as possible in what you accept, and be slightly evil in what you send".

For example, if you provide a list of allowed encodings during the negotiation, then randomize the order, and maybe stuff a couple of non-existing encodings.
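
As a rough illustration of that idea, here is a sketch of a "slightly evil" sender. It is not taken from any real protocol implementation; the function names and the "x-bogus-" naming scheme are assumptions invented for the example.

```python
import random


def offer_encodings(supported, rng=None):
    """Return the supported encodings in random order, plus one made-up name."""
    if rng is None:
        rng = random.Random()
    # A name that no peer can legitimately know about (hypothetical scheme).
    bogus = "x-bogus-{:08x}".format(rng.getrandbits(32))
    offered = list(supported) + [bogus]
    rng.shuffle(offered)
    return offered


def pick_encoding(offered, known):
    """A well-behaved peer ignores unknown names and does not rely on order."""
    for name in known:
        if name in offered:
            return name
    return None
```

A peer that depends on a fixed ordering, or that chokes on unknown names, fails immediately during testing rather than years later when a real new encoding is added.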

Smuggling email inside of email

Posted Jan 5, 2024 10:52 UTC (Fri) by smitty_one_each (subscriber, #28989) [Link] (1 responses)

"...and here's a little WTF-8 for my homies..."

Smuggling email inside of email

Posted Jan 5, 2024 21:09 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

That's not "slightly", that's just evil.

Smuggling email inside of email

Posted Jan 5, 2024 11:04 UTC (Fri) by farnz (subscriber, #17727) [Link]

Your "non-existing encodings" bit reminded me of Raymond Chen on a driver compatibility issue in DirectX; the OS would ask the driver "do you support this optional feature, GUID xxx?" and the driver in question always said "yes", causing issues when you used that feature.

The solution was to come up with a way to generate "impossible" features at runtime, and if your driver claimed to support such a feature, it was known to be lying about optional feature support and could be ignored.
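
The trick can be sketched in a few lines. This is not the DirectX interface; the driver callables and feature IDs below are hypothetical stand-ins, and only the idea of probing with a freshly generated identifier comes from the description above.

```python
import uuid


def driver_lies_about_features(supports_feature) -> bool:
    """Ask about a feature ID generated just now, which no driver can know.

    A driver that answers "yes" is answering "yes" to everything, so its
    other claims of optional-feature support cannot be trusted.
    """
    impossible_feature = uuid.uuid4()
    return bool(supports_feature(impossible_feature))


# Hypothetical drivers for illustration.
REAL_FEATURES = {uuid.UUID("12345678-1234-5678-1234-567812345678")}


def honest_driver(feature_id) -> bool:
    return feature_id in REAL_FEATURES


def lying_driver(feature_id) -> bool:
    return True
```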

Smuggling email inside of email

Posted Jan 4, 2024 14:42 UTC (Thu) by rrolls (subscriber, #151126) [Link] (20 responses)

Postel's Law is great for processing that is started directly by the intended user (and usually, a human being) and then completed as soon as possible (with that intended user being notified about it). I suspect it was originally written with that in mind, but because of the era, the condition was never stipulated.

For the "liberal in what you accept": if I'm trying to retrieve some value from some old file that only partially complies with a standard that's many years younger than the file, I want the program to do its utmost to retrieve that value; if the deviations from the standard are enough to prevent this from being possible, it's generally pretty obvious from the output.

Similarly, for the "conservative in what you do": if I'm collecting a whole bunch of data into one file to be sent off elsewhere, I'd want that file to be as simple and compliant as possible (in terms of its syntax/encoding), so that it's useful as input to the widest possible range of other programs.

Where Postel's Law is terrible is the case of processing that is started by some unattended, always-available server receiving something - indeed, like an email server. "Liberal in what you accept" is useful when the user is the intended user; but if the user is a malicious user instead, it's going to be just as useful to them. There's even an argument for not being "conservative in what you do" - this is the argument for "security by obscurity". (Whether this is a good argument or not is a different matter entirely.)

Smuggling email inside of email

Posted Jan 4, 2024 14:54 UTC (Thu) by farnz (subscriber, #17727) [Link]

I've found a useful summary to be "Postel's Law applies only when one side of the conversation is intended to be a human". For machine-to-machine communications (as SMTP, IMAP and others are now), Postel's Law is outright hazardous.

This also plays into other things people do; for example, the problem with binary protocols and file formats is that you can't use tools intended to show text to a human to decode them. But if the intent is to use tools intended to handle that protocol or file format exclusively, you can do just as well with an unambiguous and well-designed binary format.

Smuggling email inside of email

Posted Jan 4, 2024 15:32 UTC (Thu) by adobriyan (subscriber, #30858) [Link] (17 responses)

> Postel's Law is great

No, it's just terrible. One key distinction between computers and humans is that computers are extremely precise and humans are extremely imprecise (especially in comparison). It is very easy to lower computers to human level (just ship whatever buggy code at the moment), but it is _impossible_ to uplift humans to computer levels of precision.

Consequently, making computer second guess the human doesn't work. The best example is probably Excel guessing dates vs non-dates. It went as far as scientists renaming genes.

I was unfortunate to use a spreadsheet a few years ago and naturally used some Linux clone. All I did was paste like 10 floats into cells and it guessed the format of the data on the _first_ try _wrong_. The very first interaction with a program is a bug. Of course it was done for user-friendliness and Excel compatibility and whatever reasons but a bug is a bug is a bug. Encountering a bug immediately while doing nothing unusual suggests a very high rate of this problem across all users.

It is bad that programmers misinterpret specs and write spec violations, it's ten times as bad when managers force programmers to do so.

In a perfect world interacting with Excel will be like this:
* user pasted data (or opens csv),
* all new data are marked with red meaning "the computer has no idea what the data mean",
* user marks the data as "integer", "date", "real number" etc making them green in bulk or individually,
* no processing takes place until data are made green.

Of course this is not happening, because asking user to explain to the computer what he means is somewhere around ethnic cleansing levels of human rights violations.

AI/ML companies show the problem very well now, where single digit percent failures are celebrated because humans sometimes perform even worse.

This particular SMTP bug is more of a confusion between 2 common newline types and the "streaming" property of SMTP, where there is no LEN command unambiguously identifying the length of the message (to this day! [citation needed]).

The better Postel's law example is probably browser interpreting broken closing tags and reordering them internally.
This is the perfect "we can't break a popular website with the new update, can you fix it by Friday?" moment.

Smuggling email inside of email

Posted Jan 4, 2024 15:58 UTC (Thu) by farnz (subscriber, #17727) [Link] (14 responses)

Some computer behaviour is simply annoying, though. For example, a JSON list is technically incorrect if there's a comma after the last element of the list but before the ], even though there's only one possible way to fix the input to be valid JSON. A good system will take that input data, recognise that there's only one possible edit that makes it sensible, and ask the user if that's really what they meant - then, if it is, make that edit for them.

And that's the gap between good applications of Postel's Law and bad ones; in a good application of it, the computer is interacting with the user to get confirmation that its "liberalness" in what it's accepting is indeed correct. In a bad application, the computer makes assumptions about what was meant, and doesn't give the originator a way to correct its errors.
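
A minimal sketch of the "good application" might look like the following. It assumes a caller-supplied `confirm` callback standing in for the interactive prompt, and it uses a deliberately naive regular expression that would also touch a comma-bracket sequence inside a string value, so it is an illustration rather than a robust repair tool.

```python
import json
import re

# A comma followed only by whitespace and then a closing bracket or brace.
TRAILING_COMMA = re.compile(r",(\s*[\]}])")


def load_json_with_confirmation(text, confirm):
    """Parse strictly; on failure, propose dropping trailing commas and ask.

    confirm(candidate_text) must return True before the repaired text is used.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        original_error = err

    candidate = TRAILING_COMMA.sub(r"\1", text)
    if candidate == text:
        raise original_error  # nothing we know how to repair
    try:
        repaired = json.loads(candidate)
    except json.JSONDecodeError:
        raise original_error from None
    if not confirm(candidate):
        raise original_error  # the originator rejected the guess
    return repaired
```

The parser itself stays strict; the liberal part is confined to a separate step whose result is shown to the originator before it is accepted.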

Smuggling email inside of email

Posted Jan 4, 2024 18:03 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (7 responses)

Is there an advantage to prohibiting the trailing comma in the first place? Python allows it, and doesn't seem to suffer any problems as a result of that specific design decision.* SQL also bans trailing commas in the SELECT clause (and a couple of other places), and I have no idea why.** What software malfunction could be triggered or exacerbated by allowing a trailing comma?

* Python has been mentioned, so the thread will now be derailed with people complaining about random other features of Python that are totally irrelevant to this conversation, but at least I *tried* to steer it towards Postel's law and JSON.
** In the case of SQL, this is an anti-feature, because any code that wants to generate SQL has to go to the trouble of ensuring that every column name is followed by a comma, except for the last column.

Smuggling email inside of email

Posted Jan 4, 2024 19:22 UTC (Thu) by excors (subscriber, #95769) [Link]

I think the reason is that JSON was designed to be a subset of ECMAScript 3, so that it could be parsed in web browsers with eval(), and ES3 doesn't allow trailing commas in object literals (though ES5 does).

(Later it was realised that eval is a terrible way to parse potentially-untrusted input, and doesn't even work in general for trusted input, but the original recommendation was literally just `eval("window.val="+json_string)`: http://web.archive.org/web/20030205013518/http://crockfor...)

ES3 does allow trailing commas in array literals, but that's part of the weird 'elision' syntax where `[,].length == 1`, `[,,1,,].length == 4`, etc, and I suspect it was omitted from JSON because Douglas Crockford didn't like it, plus it would be confusingly inconsistent with the object literals. (Crockford's "JavaScript: The Good Parts" says "[JSLint] does not expect to see elided elements in array literals. Extra commas should not be used. A comma should not appear after the last element of an array literal or object literal because it can be misinterpreted by some browsers.")

And once the first version of JSON had been published, it was too late to make any syntax changes without causing major interoperability problems.

Smuggling email inside of email

Posted Jan 4, 2024 20:12 UTC (Thu) by khim (subscriber, #9252) [Link] (5 responses)

> Is there an advantage to prohibiting the trailing comma in the first place?

The problem is not in the trailing comma. The problem is in different programs treating the same input differently (precisely what we are discussing in the article, isn't it?)

It doesn't even matter about how exactly you interpret broken data. The key is that different apps would interpret them differently.

If you want to have an auto-fixer for JSON, then decouple it from the JSON parser and make sure the user may use it with all the apps that s/he wants or needs to use it with.
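The point about different programs treating the same input differently can be shown with two parsers from the Python standard library. This is only a sketch: `ast.literal_eval` parses Python literals rather than JSON, and is used here as a stand-in for a hypothetical "lenient" consumer.

```python
import ast
import json


def outcome(parser, text):
    """Report whether a parser accepts the text, and what it produced."""
    try:
        return ("accepted", parser(text))
    except (ValueError, SyntaxError) as err:
        return ("rejected", type(err).__name__)


document = "[1, 2, 3,]"

# A strict JSON consumer refuses the trailing comma...
strict_result = outcome(json.loads, document)
# ...while a consumer using a more liberal grammar accepts it.
lenient_result = outcome(ast.literal_eval, document)

print(strict_result)
print(lenient_result)
```

Neither behaviour is harmful on its own; the trouble starts when both consumers sit in the same pipeline and disagree about what the bytes mean.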

Smuggling email inside of email

Posted Jan 4, 2024 21:45 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (4 responses)

My argument, unreasonable though it may be, is essentially that Crockford should have used his magical crystal ball to predict that prohibiting trailing commas would inevitably become a nuisance. I'm not suggesting that anyone in 2023 should be going around emitting JSON that actually contains trailing commas - that way lies madness. If you want to have trailing commas and don't care about JSON compatibility, then switch to TOML (which explicitly specifies a trailing comma as legal in arrays, and otherwise does not use commas much), or some other format with the same property.

Smuggling email inside of email

Posted Jan 7, 2024 6:53 UTC (Sun) by ssmith32 (subscriber, #72404) [Link] (3 responses)

I have worked with lots and lots of JSON, in a variety of environments, and have never once found this to be any kind of problem, nuisance or otherwise.

Do tell how this is a nuisance..?

I mean, if someone continually trips up and fails to skip appending a delimiter to the last item of a list... I certainly wouldn't want them anywhere near my serde code.. it's sort of a bog-standard thing.

Smuggling email inside of email

Posted Jan 7, 2024 8:12 UTC (Sun) by mjg59 (subscriber, #23239) [Link]

When reordering elements, or moving elements between different lists, it's much easier if you can just cut and paste without then needing to edit. Requiring that the final element in a list be special makes that harder. Given that there's zero ambiguity here it's something that maybe makes life easier for people implementing a parser but makes life harder for everyone producing things consumed by that parser, and overall that feels like a poor tradeoff.

Smuggling email inside of email

Posted Jan 7, 2024 10:28 UTC (Sun) by mbunkus (subscriber, #87248) [Link]

Programs can easily create valid JSON files; that isn't the issue. The issue is that JSON is human-readable and is therefore used in many places where humans must edit/write it manually, too, e.g. for config files. However, it wasn't designed for that use case. Not allowing the list separator in the last place is one of the drawbacks, not having comments the most obvious other one. Us humans want to be able to easily edit things, switch lines around, copy-paste stuff & append it, ideally without having to worry about these minutiae.

JSON is pretty much the worst of both worlds. It can be used for human-editable scenarios, but isn't good at it. It can be used for data interchange, but it isn't good at that either (parsing requires much more time & memory than necessary; real problems with big integers in various implementations etc.). It's the typical case of being just good enough & easy enough to implement to be ubiquitous. And us developers find it easy to use 'cause it makes debugging data flow so easy. Meh…

Smuggling email inside of email

Posted Jan 7, 2024 11:00 UTC (Sun) by ianmcc (guest, #88379) [Link]

I have some python code that I used to use to collate grades. Depending on the course, different fields of some python dictionary would be relevant for the final grade, so it would be a few messy lines with some commented out, some new ones added, etc. It is convenient in python that you can have a trailing comma on every line. If that wasn't the case and you commented out the last entry (or added a new one at the end) then you'd need to fix up the commas. Admittedly not a huge burden, but also a rather pointless one. And definitely it would be a nuisance.
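A small sketch of the kind of dictionary described above, with a comma after every entry including the last. The field names and weights are made up for illustration.

```python
# Every entry ends with a comma, so any line (including the last one)
# can be commented out or reordered without touching its neighbours.
weights = {
    "assignment_1": 0.2,
    "assignment_2": 0.2,
    # "midterm": 0.3,        # not used for this course
    "final_exam": 0.6,
    # "participation": 0.1,  # commented-out last entry: still valid
}


def final_grade(marks, weights):
    """Weighted sum of the marks that are relevant for this course."""
    return sum(marks[name] * weight for name, weight in weights.items())


marks = {"assignment_1": 80, "assignment_2": 90, "final_exam": 70}
print(final_grade(marks, weights))
```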

Smuggling email inside of email

Posted Jan 4, 2024 20:06 UTC (Thu) by khim (subscriber, #9252) [Link] (5 responses)

> A good system will take that input data, recognise that there's only one possible edit that makes it sensible, and ask the user if that's really what they meant - then, if it is, make that edit for them.

No. A good system accepts what is specified and rejects what is not specified.

> Some computer behaviour is simply annoying, though. For example, a JSON list is technically incorrect if there's a comma after the last element of the list but before the ], even though there's only one possible way to fix the input to be valid JSON.

No. It may easily be a spurious comma or a forgotten `0`. Or just a large chunk of lost data.

Now, it may be a good idea to demand a trailing comma, instead.

This format may even be better than today's one, but that would be a different format.

And yes, I'm saying that as someone who very much edits JSON files by hand.

As annoying as getting the message “your JSON is invalid” is, I much prefer it to the fuzzy formats which couldn't be parsed reliably at all. Just look at vk.xml and try to imagine what kind of parser may give you an answer to the simple question: what size is CAMetalLayer (or if it even has a type).

Please don't. If you do that people would invariably submit garbage, “apply edit” without looking and then complain that program doesn't work even if they did everything correctly.

If you want, then add something like this to your text editor. Don't duplicate that code and cause the vulnerabilities that we are discussing here.

It's a perfect example of what happens if you apply such “niceties”.

Smuggling email inside of email

Posted Jan 4, 2024 20:14 UTC (Thu) by farnz (subscriber, #17727) [Link] (4 responses)

You're contradicting yourself within your comment: do you want my system, of which my text editor is one component, to be able to work with me to correct entry errors, or is that bad, because my text editor, a component of my system, should be able to work with me to correct entry errors?

And my entire point is that when there's only one correction to be made, the system as a whole should work with me to recognise where my error is, not just apply that correction blindly - chances are good that in the JSON block [1,2,3,], I meant [1,2,3], but it's also possible that I didn't mean that, and I meant [1,2,3,4,5,6], but forgot three numbers. Actually asking me if I meant [1,2,3] instead of [1,2,3,] gives me a chance to say "um, no, that's not what I meant" and fix it manually if the correction is wrong.

Smuggling email inside of email

Posted Jan 4, 2024 20:56 UTC (Thu) by khim (subscriber, #9252) [Link] (3 responses)

A text editor either should work with any JSON consumer that I may want to use (including, but not limited to, various Google APIs, phone apps and raw python scripts) or it shouldn't be used.

I would much prefer it as entirely separate tool not attached to anything else at all, but as long as integration with “your system” is optional I may stomach it.

What I don't want is to ever be in a situation where someone uses your system and then assumes that my system has to provide such facilities, too.

Because that's where the slippery slope starts: different programs shouldn't treat the same file differently.

> And my entire point is that when there's only one correction to be made, the system as a whole should work with me to recognise where my error is, not just apply that correction blindly - chances are good that in the JSON block [1,2,3,], I meant [1,2,3], but it's also possible that I didn't mean that, and I meant [1,2,3,4,5,6], but forgot three numbers. Actually asking me if I meant [1,2,3] instead of [1,2,3,] gives me a chance to say "um, no, that's not what I meant" and fix it manually if the correction is wrong.

I don't even care about what exactly “your system” does. I do care about the fact that as long as you attach that ability to “your system” I have to attach it to “my system”, too. And then we end up with these insane piles of buggy code which implement the same improvements in 10 different ways, and end up with crazy security vulnerabilities.

And please stop playing that card about how you may reject fixes. It was plausible 20 years ago. Today we know it just doesn't work. Dancing pigs would ensure that after “your system” would be adopted by masses it would work as if that question about what to do with fixes doesn't even exist.

The only way to achieve something even remotely resembling security is to ensure that all these “fixes” are not offered automatically and user explicitly and consciously opts in into them.

Smuggling email inside of email

Posted Jan 4, 2024 21:05 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

My "system" is a computer running software. Since you're being abundantly clear that your system is not a computer running software, what is it, and how does it interact with my system?

I'm going back to where Postel's Law comes from - it's a bad UX when the computer clearly knows what is wrong with your input, but does nothing to help you - this bit of humour about systems that work that way is very funny, BTW - but it's also a bad UX in the long run if the computer blindly autocorrects for you.

And taking you at face value, you seem to be claiming that compilers giving me suggestions in the output, which I can then apply if I believe the computer is correct is "Today we know it just doesn't work." - and yet clang, rustc and GCC all do exactly that, telling me what they think I intended given what I've written. I can then accept or reject that change as I see fit.

Smuggling email inside of email

Posted Jan 4, 2024 21:26 UTC (Thu) by khim (subscriber, #9252) [Link] (1 responses)

> My "system" is a computer running software.

You never told me what “your system” even is. What kind of software does it include, how does it process things, etc.

Yes, they are slowly but surely nearing the threshold that separates helpful suggestions from madness.

Which may not even matter because we have a tool that crosses it easily: Bard/ChatGPT/etc. These may give you plenty of suggestions which may ruin everything you do so thoroughly that you would be forced to go back to old backups and redo everything from scratch.

If you even have these backups.

> I can then accept or reject that change as I see fit.

No. You have to read it and think about it. And even these, pretty limited, suggestions are already causing plenty of damage: I have seen plenty of “programmers” who arrive on forums and complain that after applying these fixes their program still doesn't work because they are going in circles.

They don't even stop to think for a minute and then realise that it's their responsibility to produce sensible code, not compiler's. No, they just assume that they may create something by randomly changing code and then following the compiler advice.

And these are “developers”! Trained ones! Who passed a few interviews and got good marks!

What happens with “normal people” when they are operating our IT systems is hard to even imagine.

Smuggling email inside of email

Posted Jan 4, 2024 21:41 UTC (Thu) by farnz (subscriber, #17727) [Link]

No, but you made false assumptions about my system, and then, having made those false assumptions, chose to be rude and obnoxious about how I was wrong because you made bad assumptions.

And "as I see fit" includes "reading and thinking about it" - that is one part of forming my opinion. Sure, there's no shortage of idiots out there, but those people would do something stupid whether the computer's error message is ?, or On line 52, column 96, there is a , followed by a ] ending a list. This comma is syntactically invalid; you need to either remove the comma or add another data item.

This is, after all, where Postel's Law comes from; if the only output is either ? or correct results, you're being strict in what you accept, but it's very hard to debug the resulting system. If your error tells you what the computer dislikes about its input, you're being weakly liberal in what you accept, and that makes it easier to debug; Postel's error was generalising this to the case where there's no intelligence in the loop at all.
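To make the contrast between ? and a descriptive error concrete, here is a small Python sketch that stays strict (it never accepts the bad input) but reports where the parser gave up. It relies on the `lineno`, `colno` and `msg` attributes of `json.JSONDecodeError`; the function name `describe_error` is hypothetical, and the exact message text and column depend on the Python version.

```python
import json


def describe_error(text):
    """Return None for valid JSON, else a human-readable diagnostic.

    The input is never modified or accepted; the diagnostic only tells
    the originator where the parser stopped and why.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"On line {err.lineno}, column {err.colno}: {err.msg}"
    return None


print(describe_error("[1,2,3,]"))
print(describe_error("[1,2,3]"))
```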

Smuggling email inside of email

Posted Jan 4, 2024 18:56 UTC (Thu) by wahern (subscriber, #37304) [Link] (1 responses)

> This particular SMTP bug is more of a confusion by 2 common newline types and "streaming" property of SMTP where there is no LEN command unambiguously identifying length of the message (to this day! [citation needed]).

A BDAT command for SMTP was first specified in 1995: https://www.rfc-editor.org/rfc/rfc1830.txt Or 2000, depending on your criteria: https://datatracker.ietf.org/doc/html/rfc3030

Microsoft Exchange uses BDAT, IIRC, but otherwise it's mostly gone unused. Ironically, BDAT would make smuggling exploits even easier unless and until the old DATA command was completely retired.
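The difference between the two framings can be sketched in a few lines of Python. This is a simplified model, not an SMTP implementation: it ignores dot-stuffing, replies and pipelining, and the helper names `first_message` and `read_bdat_chunk` are made up. DATA ends at a terminator found by scanning the content, so two servers that disagree on which terminators count will split the same bytes differently; BDAT declares the length up front, so the content is never scanned.

```python
STRICT_EOD = (b"\r\n.\r\n",)
# Assumed leniency for illustration: also treating a bare-LF sequence
# as end-of-data, which is the kind of disagreement smuggling relies on.
LENIENT_EOD = (b"\r\n.\r\n", b"\n.\n")


def first_message(stream, terminators):
    """Split a DATA stream at the earliest terminator this server accepts."""
    hits = [(stream.find(t), t) for t in terminators if t in stream]
    if not hits:
        raise ValueError("no end-of-data sequence found")
    index, terminator = min(hits)
    return stream[:index], stream[index + len(terminator):]


def read_bdat_chunk(stream):
    """Read one 'BDAT <size> [LAST]' chunk; the body is taken by length."""
    header, separator, rest = stream.partition(b"\r\n")
    parts = header.split()
    if not separator or len(parts) < 2 or parts[0].upper() != b"BDAT":
        raise ValueError("not a BDAT command")
    size = int(parts[1].decode("ascii"))
    if size < 0 or len(rest) < size:
        raise ValueError("incomplete or invalid chunk")
    return rest[:size], rest[size:]


payload = (b"Subject: hi\r\n\r\nhello\n.\n"
           b"MAIL FROM:<attacker@example.com>\r\n.\r\n")

# The strict server sees one message; the lenient one sees a message
# followed by what looks like a fresh SMTP command.
strict_body, strict_rest = first_message(payload, STRICT_EOD)
lenient_body, lenient_rest = first_message(payload, LENIENT_EOD)

# With BDAT the same bytes travel as an opaque, length-delimited body.
chunk = b"BDAT %d LAST\r\n" % len(strict_body) + strict_body + b"QUIT\r\n"
bdat_body, bdat_rest = read_bdat_chunk(chunk)
```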

Smuggling email inside of email

Posted Jan 5, 2024 0:32 UTC (Fri) by auerswal (subscriber, #119876) [Link]

According to the SEC Consult blog post, use of BDAT by Microsoft Exchange Online prevented their SMTP Smuggling attempts when the inbound server supported it.

Smuggling email inside of email

Posted Jan 4, 2024 17:44 UTC (Thu) by pallas (guest, #128204) [Link]

Postel's Law is great for processing that is started directly by the intended user (and usually, a human being) and then completed as soon as possible (with that intended user being notified about it). I suspect it was originally written with that in mind, but because of the era, the condition was never stipulated.

It appears in RFC 761, a TCP spec from 1980.

Smuggling email inside of email

Posted Jan 4, 2024 21:02 UTC (Thu) by gouttegd (guest, #106484) [Link] (1 responses)

I believe that the robustness principle has been misunderstood all along.

While Postel’s initial formulation in RFC 761 [https://datatracker.ietf.org/doc/html/rfc761] was very short and could indeed be interpreted as a recommendation to accept ill-formed input, the extended formulation from RFC 1122 [https://datatracker.ietf.org/doc/html/rfc1122#section-1.2.2] (where it gains its name of “robustness principle”) doesn’t quite say that:

> Software should be written to deal with every conceivable error, no matter how unlikely; sooner or later a packet will come in with that particular combination of errors and attributes, and unless the software is prepared, chaos can ensue. In general, it is best to assume that the network is filled with malevolent entities that will send in packets designed to have the worst possible effect. This assumption will lead to suitable protective design

A robust program is _not_ a program that accepts any ill-formed input as long as the meaning is clear. It’s a program that is _ready to receive_ any ill-formed input and deal with it gracefully (instead of, say, erring into undefined behaviour territory or segfaulting, as a non-robust program would do). Whether it _accepts_ the ill-formed input once it has recognised it for what it was is another question entirely.
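A minimal Python sketch of that distinction, using a made-up wire format (a 2-byte big-endian length followed by exactly that many payload bytes). The names `MalformedPacket`, `parse_packet` and `handle` are hypothetical. The parser is prepared for any byte string, but being prepared means rejecting ill-formed input cleanly, not guessing what the sender meant.

```python
import struct


class MalformedPacket(ValueError):
    """Raised for any input that does not match the expected format."""


def parse_packet(data):
    """Parse a hypothetical length-prefixed packet, rejecting bad input."""
    if len(data) < 2:
        raise MalformedPacket("truncated header")
    (declared,) = struct.unpack("!H", data[:2])
    payload = data[2:]
    if len(payload) != declared:
        # Ready for the mismatch, but not willing to paper over it.
        raise MalformedPacket(
            f"declared {declared} bytes, received {len(payload)}")
    return payload


def handle(data):
    """Deal with every input gracefully: accept valid, reject the rest."""
    try:
        return ("ok", parse_packet(data))
    except MalformedPacket as err:
        return ("rejected", str(err))


print(handle(b"\x00\x03abc"))
print(handle(b"\x00\x05abc"))
print(handle(b"\x00"))
```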

Smuggling email inside of email

Posted Jan 4, 2024 22:14 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

The reason that nobody is willing to call that the "robustness principle" is that, by modern standards, it's completely obvious. Of *course* you can't just crash when somebody sends you an ugly packet (for whatever definition of "ugly" is appropriate). If you write software in 2024 that crashes or misbehaves on invalid network traffic, somebody will find it and get it labeled with a CVE number in (hopefully) short order.

Smuggling email inside of email

Posted Jan 4, 2024 13:48 UTC (Thu) by james (guest, #1325) [Link]

I note that for Exim, "disable chunking" by setting
chunking_advertise_hosts =
was also recommended as a workaround for CVE-2017-16943. I must admit that I never came up with a change control justification to turn it back on again.

Now I'm tempted to call that policy. Exim has had its security problems over the years, and turning off optional extras reduces the attack surface.