> I have read that self-modifying code on the x86 architecture is pretty dangero...

anfilt · on Dec 31, 2019

Having written self modifying x86 code. The most annoying thing is that instructions are not a fixed width. This means patching code requires you are able to parse every instruction, or a look up table where each instruction starts or just regenerate the entire code whole cloth. This also can cause problems if self modifying code has more than one thread since you suddenly may need to update 1 to 15 bytes atomically.

Generally, easiest to do the last or put NOPs or doing something like windows hot patch point for functions. Where hot patchable functions are preceded with a 5 bytes of nops, and the function always starts with MOV EDI, EDI which again is a pretty much a NOP, but takes two bytes.

This allows one to replace MOV EDI, EDI to a short jump to the start of those 5 bytes which is large enough to hold a long jump to any code. Windows went this route because originally multi byte NOPs where not part of the spec so if you used the one byte NOPs not only would each nop need to be execute slowing down function calls, but in multi-threaded code you would have to lock all threads to edit the code since it would be fetching on byte at time ect...

spc476 · on Jan 1, 2020

The original 8086 had a six byte instruction cache (the 8088 in the original IBM PC had a four byte instruction cache). If you modify an instruction less than six bytes away, it won't be seen by the CPU unless you issue a JMP (or CALL) instruction. It was not normally that big of an issue (just make sure you modify the instruction from far enough away). You can use the fact that the 8086 has a six byte instruction cache and the 8088 a four byte instruction cache to determine which CPU the program is executing on.

These days, I think you would need to 1) have memory pages with code with write permissions, 2) possibly flush the instruction cache and 3) hope no other thread is using said routine. With today's security concerns, 1) will not be likely, 2) possibly requires elevated privileges (I don't recall---I've only really done ring-3 level code on x86) and 3) is probably okay in a single-threaded program.

saagarjha · on Jan 1, 2020

You can get around 1 by having the page be mapped multiple times with different permissions. Modern x86 processors don’t need a cache flush.

stevekemp · on Jan 1, 2020

I remember using the instruction-cache as a way to catch debuggers single-stepping through my code. If you rewrote a near instruction to be a jump-to-self single-steppers would fall victim to it..