PDF to HTML
Posted Dec 18, 2014 16:19 UTC (Thu) by anselm (subscriber, #2796)
In reply to: PDF to HTML by brouhaha
Parent article: Plug-and-play sanitization of USB thumb drives
It should be possible, and not even exceptionally difficult, to strip Javascript out of PDF files, without having to convert them into a non-PDF format. It also shouldn't be too difficult to validate that the PDF and embedded images, fonts, etc. are well-formed, and strip out any malformed constructs that could break PDF viewers that aren't coded sufficiently defensively.
That sounds like a job for Ghostscript.
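For what it's worth, here is a rough sketch of the sort of invocation I have in mind, wrapped in Python. It passes the file through Ghostscript's pdfwrite device, which interprets the input and writes out a fresh PDF. The assumption (which you would want to verify against your Ghostscript version) is that document-level JavaScript does not survive that round trip and that badly malformed input makes gs fail rather than pass through. The function names and the file names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_gs_command(src, dst, gs_binary="gs"):
    """Return the argv that re-distills `src` into `dst` via pdfwrite.

    `gs_binary` is assumed to be a Ghostscript executable on the PATH.
    """
    return [
        gs_binary,
        "-dSAFER",            # restrict file access from within the interpreter
        "-dBATCH",            # exit when done instead of dropping to a prompt
        "-dNOPAUSE",          # do not wait for a keypress between pages
        "-sDEVICE=pdfwrite",  # write a newly generated PDF
        f"-sOutputFile={Path(dst)}",
        str(Path(src)),
    ]


def rewrite_pdf(src, dst, gs_binary="gs"):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(build_gs_command(src, dst, gs_binary), check=True)
```

Calling `rewrite_pdf("in.pdf", "out.pdf")` would then leave the regenerated file in `out.pdf`. A non-zero exit from gs is treated as "reject this file", which seems like the conservative choice for a sanitizing station.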