Igor Ippolitov wants to merge 1 commit from /u/iippolitov/roundup/ to default, 2017-10-14
Hello.
We are using roundup as an issue tracker mostly using email.
Most of our users send us emails in HTML. They either don't know how to switch to plain text or don't will to change email format to plain text.
This patch implements mostly trivial change: it parses HTML parts if there are any. Parsing relies on BS4 which becomes the only dependency of roundup.
If this is unacceptable solution - I can make this a conditional dependency.
Hi Igor, thanks for the merge request.
Can you add some discussion about the security of this change?
(Security concerns have been discussed a number of times regarding HTML attachments.)
From my point of view I think that making BS and the HTML parsing optional will make it easier to merge the request.
Best Regards,
Bernhard
Beautiful soup is throwing errors on my system. Some undefined value from token.py.
I do have a replacement html2text that works. I applied your patch and imported my html2text function.
When I run: ./run_tests test/test_mailgw.py I get 5 failing tests.
The failures are in two classes:
which is a problem I assume with my html2text replacement. If I remove the encode things seems to work (tests are still running).
in extract_content()
...
elif unpack_rfc822 and content_type == 'message/rfc822':
...
new_content, attachments = m.extract_content(m.gettype(), ig,
This is easily fixable adding a html_part to the left hand side I hope.
Is this merge request linked to the roundup issue tracker?
-- rouilj
I am following up on this work at: http://issues.roundup-tracker.org/issue2550799
Igor, I merged/fixed your patch for mailgw.py. Since this was not a direct merge and required editing and creation of tests etc I am marking it as rejected.
Also in writing tests, a text/html part followed by a text/plain part resulted in duplicate text/html attachment. To fix this I changed:
text/plain part found after html
save html as attachment
to remove the second attachments call as the html file was attached in the code path that set html_part_found.
Changesets that incorporate this patch and fixes are:
changeset: 5305:e20f472fde7d
changeset: 5306:91354bf0b683
changeset: 5307:5b4931cfc182
Bern, barring flaws in beautiful soup or html parsing libraries, I don't think there are any worse security implications to this patch. If there was a text/plain version of the file we can still attach the html version of the file to the issue which if downloaded as html could be used as an attack vector.
-- rouilj