Self-modifying code
Posted Jul 27, 2024 20:21 UTC (Sat) by khim (subscriber, #9252)
In reply to: Self-modifying code by malmedal
Parent article: May the FOLL_FORCE not be with you
> I don't remember if the 386 had an executable-only mode, but we certainly had writable memory that could be executed.
The main issue that we are discussing here revolves around the NX bit, which allows one to mark memory as non-executable!
On the 386 the only way to make memory non-executable was to play with segments and their limits. On a Unix-like OS the best you can do is split the 4GB virtual address space in two: a non-executable area and an executable area.
That means the approach that skissane talks about is simply not possible on the 386! Except if you use an extremely weird OS which doesn't use paging, but uses segments for virtual memory.
Such OSes may exist in theory, but I certainly know of none that actually did this in practice; that's why I became so excited when you said you did that on a 386.
But it looks more and more likely that you haven't done what we are talking about here at all and are describing an entirely different situation.
> They are recommending a sort of RCU like approach to avoid this:
> Note that since stores to the instruction stream are observed by the instruction fetcher in program order, one can do multiple modifications to an area of the target thread's code that is beyond reach of the thread's current control flow, followed by a final asynchronous update that alters the control flow to expose the modified code to fetching and execution.
That just happens with JITs automatically: once you have created an optimized version of a routine there is rarely a need to go back to the interpreter. But yeah, usually only one call/jmp instruction is patched.
> It's not a fragile thing to do, it is even supported by ld, see the -N option. Seems that interferes with shared libraries, so if you want that you need to use mprotect.
Again: that's different. Keeping something in write+execute mode is dangerous WRT exploits, but not fragile. Playing with permissions and flipping from read+write to read+execute and back is pretty fragile, because you need to ensure that the code you want to patch is not being executed on another core!
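A minimal sketch of that flipping on Linux, assuming mmap/mprotect on an anonymous page. The helpers wx_install and wx_patch are hypothetical names, not anyone's real API, and the sketch only moves bytes around; it never jumps into the page, so it does not depend on the CPU architecture.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std=c11 */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page size, or 0 if it cannot be determined. */
static size_t page_size(void)
{
    long n = sysconf(_SC_PAGESIZE);
    return n > 0 ? (size_t)n : 0;
}

/* Hypothetical helper: rewrite the start of an RX page under W^X.
 * The page is never writable and executable at the same time. */
int wx_patch(void *page, const void *code, size_t len)
{
    size_t size = page_size();
    assert(page != NULL);
    if (len == 0 || len > size)
        return -1;
    /* Window opens: the page is RW, so a thread on another core that
     * is executing (or jumps into) this page faults right here. */
    if (mprotect(page, size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(page, code, len);
    /* Window closes: back to RX. */
    if (mprotect(page, size, PROT_READ | PROT_EXEC) != 0)
        return -1;
    /* Needed on architectures without coherent instruction caches;
     * effectively a no-op on x86. */
    __builtin___clear_cache((char *)page, (char *)page + len);
    return 0;
}

/* Hypothetical helper: allocate one RW page, fill it, then make it RX. */
void *wx_install(const void *code, size_t len)
{
    size_t size = page_size();
    if (len == 0 || len > size)
        return NULL;
    void *page = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;
    memcpy(page, code, len);
    if (mprotect(page, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, size);
        return NULL;
    }
    __builtin___clear_cache((char *)page, (char *)page + len);
    return page;
}
```

Nothing in wx_patch stops another thread from running inside the page between the two mprotect calls. That coordination is entirely the caller's problem, which is exactly the fragile part.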
> I believe it fell out of favour because the performance advantage became much less when the 486 came with on-chip cache.
No, it fell out of favor much later, when people started caring about security and started enforcing the W^X property.
First with segment limit tricks and then, later, with the hardware NX bit.
Only Apple, and only on iOS, enforces it so radically as to make JITs simply impossible; other OSes provide ways for JITs to work, and those are what we are discussing here.
But all this discussion is happening in a W^X world!
Why do you keep bringing up W+X examples and keep saying that you can do everything easily if only you remove that restriction… of course it's possible to do, what could be simpler?
That's simply not what we are discussing here! The idea is to ensure that W^X is strictly enforced, maybe even teach the kernel not to ever provide W+X mappings at all, and yet still keep JITs working, somehow.