Roundup Issue Tracker Code

Simple-to-use/-install issue-tracking system: web, REST, email, CLI

Brought to you by: anthonybaxter, ber, richard, rouilj, and 3 others

Code Merge Request #2: Implement HTML letters parsing with Beautiful Soup (rejected)

Merging...

Merged

Something went wrong. Please, merge manually

Checking if merge is possible...

Something went wrong. Please, merge manually

Igor Ippolitov wants to merge 1 commit from /u/iippolitov/roundup/ to default, 2017-10-14

Hello.

We are using roundup as an issue tracker mostly using email.
Most of our users send us emails in HTML. They either don't know how to switch to plain text or don't will to change email format to plain text.

This patch implements mostly trivial change: it parses HTML parts if there are any. Parsing relies on BS4 which becomes the only dependency of roundup.

If this is unacceptable solution - I can make this a conditional dependency.

Commit	Date
[2ee03a] (default, tip) by Ippolitov A. Igor Implement HTML letters parsing with Beautiful Soup	2017-08-07 13:18:46	Tree

Discussion

Bernhard Reiter - 2017-08-07

Hi Igor, thanks for the merge request.

Can you add some discussion about the security of this change?
(Security concerns have been discussed a number of times regarding HTML attachments.)

From my point of view I think that making BS and the HTML parsing optional will make it easier to merge the request.

Best Regards,
Bernhard

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

John Rouillard - 2017-10-09

Beautiful soup is throwing errors on my system. Some undefined value from token.py.
I do have a replacement html2text that works. I applied your patch and imported my html2text function.

When I run: ./run_tests test/test_mailgw.py I get 5 failing tests.

The failures are in two classes:

elif content_type == 'text/html':

content = html2text(self.getbody()).encode('utf-8')

E UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 7: ordinal not in range(128)

which is a problem I assume with my html2text replacement. If I remove the encode things seems to work (tests are still running).

in extract_content()
...
elif unpack_rfc822 and content_type == 'message/rfc822':
...
new_content, attachments = m.extract_content(m.gettype(), ig,

unpack_rfc822)

E ValueError: too many values to unpack

This is easily fixable adding a html_part to the left hand side I hope.

Is this merge request linked to the roundup issue tracker?

-- rouilj
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

John Rouillard - 2017-10-10

I am following up on this work at: http://issues.roundup-tracker.org/issue2550799

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

John Rouillard - 2017-10-14

Igor, I merged/fixed your patch for mailgw.py. Since this was not a direct merge and required editing and creation of tests etc I am marking it as rejected.

Also in writing tests, a text/html part followed by a text/plain part resulted in duplicate text/html attachment. To fix this I changed:

elif html_part_found:

text/plain part found after html

save html as attachment

attachments.append(cpart.as_attachment())

to remove the second attachments call as the html file was attached in the code path that set html_part_found.

Changesets that incorporate this patch and fixes are:

changeset: 5305:e20f472fde7d
changeset: 5306:91354bf0b683
changeset: 5307:5b4931cfc182

Bern, barring flaws in beautiful soup or html parsing libraries, I don't think there are any worse security implications to this patch. If there was a text/plain version of the file we can still attach the html version of the file to the issue which if downloaded as html could be used as an attack vector.

-- rouilj
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

John Rouillard - 2017-10-14

Status: open --> rejected
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Log in to post a comment.