<html>
<head>
<title>linuxassembly.org - Using self modifying code under Linux</title>
</head>
<body bgcolor="#ffffff" text="#000000">
<h1 align="center">Using self modifying code under Linux</h1>
<p align="center"><font size=-1>by <a href="mailto:linuxassembly@unusedino.de">Karsten Scheibler</a><br>
thanks to Stefan Esser and Maciej Hrebien</font></p>
<dl><dt><b>Remark&nbsp;(a):</b></dt>
<dd>Save this page as <tt>smc.html</tt> and use the following command to extract the source code with a sample Makefile:<br>
<tt>sh&nbsp;-c&nbsp;&quot;( mkdir&nbsp;smc&nbsp;&amp;&amp; cd&nbsp;smc&nbsp;&amp;&amp; awk&nbsp;'/^&lt;!--eof/{d=0}{if(d)print&gt;f}/^&lt;!--sof/{f=\$2;d=1}' )&nbsp;&lt;&nbsp;smc.html&quot;</tt>
</dd><dt><b>Remark&nbsp;(b):</b></dt>
<dd>Some browsers try to be super smart and reformat the HTML code,
so you may have trouble extracting the files. This is the right time to learn more
about wget, a tool to download files via HTTP or FTP ;-)
</dd><dt><b>Remark&nbsp;(c):</b></dt>
<dd>Some versions of nasm are known to generate broken code.
I used nasm-0.98 for this example, which worked for me.
</dd>
</dl>
<p align="justify">I know that this is not the cleanest way of programming, but sometimes
self modifying code is faster. Furthermore, this method is used by JIT (just in time) compilers.
Transmeta also uses some sort of self modifying code to implement an x86
software emulation. They call it Code Morphing ;-) This example code shows how it
can be done under Linux.</p>
<p align="justify">The idea behind it: there is a syscall <b>sys_mprotect</b> which allows a program
to change the access flags of (nearly) every page in its address space.
A page is the smallest unit of memory handled by virtual
memory management. On the x86 architecture the page size is 4KB. But as I noticed,
we don't need this call to make a page in the <b>section .bss</b> executable,
because on classic x86 a readable page is also an executable page, and the
<b>section .bss</b> is read/write. This
behavior changes with the NX-flag in modern
processors: where it is enforced, a readable page is no longer automatically
executable, so the <b>sys_mprotect</b> call becomes mandatory.</p>
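<p align="justify">A minimal sketch of such a call on its own, assuming 32-bit Linux
(<b>int 0x80</b>) and a page size of 4KB. The names <b>buffer</b> and <b>BUFFER_SIZE</b>
and the exit status convention are made up for this illustration and are not part of the
example further down. Because <b>sys_mprotect</b> works on whole pages, the start address
is rounded down and the end address is rounded up to a page boundary:</p>
<center><table><tr><td><pre>
;sketch only, not part of smc.n: make every page touched by 'buffer' rwx
;assumptions: 32-bit Linux, page size 0x1000, all names are hypothetical
%assign SYS_EXIT 1
%assign SYS_MPROTECT 125
%assign PROT_RWX 7                      ;PROT_READ | PROT_WRITE | PROT_EXEC
%assign BUFFER_SIZE 0x300
global _start
section .bss
buffer: resb BUFFER_SIZE
section .text
_start:
        ;round the start address down to a page boundary
        mov ebx, buffer
        and ebx, 0xfffff000
        ;round the end address up to a page boundary, then
        ;subtract the start to get the length in bytes
        mov ecx, buffer + BUFFER_SIZE + 0xfff
        and ecx, 0xfffff000
        sub ecx, ebx
        mov eax, SYS_MPROTECT
        mov edx, PROT_RWX
        int 0x80
        ;exit status 0 if the call succeeded, 1 if it returned an error
        xor ebx, ebx
        test eax, eax
        jns .done
        inc ebx
.done:
        mov eax, SYS_EXIT
        int 0x80
</pre></td></tr></table></center>
<p align="justify">It can be assembled with <tt>nasm&nbsp;-f&nbsp;elf</tt> and linked with <tt>ld</tt>
(on a 64-bit host <tt>ld&nbsp;-m&nbsp;elf_i386</tt>). An exit status of 0 means the kernel accepted the new flags.</p>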
<p align="justify">The first two examples copy code snippets to
<b>section .bss</b> and execute them.
Because we have read/write/execute permissions in this
memory area, the code is allowed to modify itself.
The <a href="#example1">first</a> example copies a snippet of
code which writes text to the screen (via <b>sys_write</b>), but before
calling it we modify some values in the copied code (the start address and the
length of the text). The <a href="#example2">second</a> one does some real self modification
(see <b>code2_start</b>). The <b>rep stosb</b> instruction overwrites the first four of the eight
<b>inc ebx</b> instructions with <b>nop</b>, so that the line put on screen contains 04h
instead of the expected 08h. The <a href="#example3">third</a>
example modifies code in <b>section .text</b>.
It looks like an endless loop, but it isn't. Find out for yourself why ;-).</p>
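<p align="justify">The patching step of the first example, reduced to a sketch. It assumes
32-bit Linux and 4KB pages. All labels (<b>buffer</b>, <b>snippet_start</b>, <b>snippet_mark</b>,
<b>snippet_end</b>) and the values 7 and 42 are invented for this illustration. The trick is
the same as in the real example: a label placed right after an instruction marks the end of
its 32-bit immediate, so the immediate sits in the four bytes before that label.</p>
<center><table><tr><td><pre>
;sketch only, not part of smc.n: patch an immediate operand in a copy
;assumptions: 32-bit Linux, page size 0x1000, all names are hypothetical
%assign SYS_EXIT 1
%assign SYS_MPROTECT 125
%assign PROT_RWX 7                      ;PROT_READ | PROT_WRITE | PROT_EXEC
global _start
section .bss
alignb 4
buffer: resb 0x2000
section .text
_start:
        ;first page boundary inside the buffer
        mov ebp, buffer + 0x1000
        and ebp, 0xfffff000
        mov eax, SYS_MPROTECT
        mov ebx, ebp
        mov ecx, 0x1000
        mov edx, PROT_RWX
        int 0x80
        test eax, eax
        js .error
        ;copy the snippet, including its ret
        mov ecx, snippet_end - snippet_start
        mov esi, snippet_start
        mov edi, ebp
        cld
        rep movsb
        ;the immediate of 'mov ebx, imm32' occupies the four bytes
        ;right before snippet_mark, overwrite it in the copy only
        mov dword [ebp + (snippet_mark - snippet_start) - 4], 42
        ;run the patched copy, it returns with ebx = 42
        call ebp
        jmp short .exit
.error:
        mov ebx, 1
.exit:
        mov eax, SYS_EXIT
        int 0x80
snippet_start:
        mov ebx, 7                      ;encoded as BB 07 00 00 00
snippet_mark:
        ret
snippet_end:
</pre></td></tr></table></center>
<p align="justify">The original snippet in <b>section .text</b> stays untouched and would still
load 7. Only the copy was patched, so the program should exit with status 42
(or 1 if <b>sys_mprotect</b> failed).</p>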
<dl><dt><b>Note&nbsp;1:</b></dt>
<dd><p align="justify">The <b>call</b> instruction in the copied code must be indirect.
A direct call encodes its target as a relative address (a signed 32-bit displacement),
so once the code has been copied to another location the displacement points to the
wrong place and you will get a <b>SEGFAULT</b> when the copy runs. An indirect call
takes an absolute address, which works independently of the position
of the code.</p></dd>
<dt><b>Note&nbsp;2:</b></dt>
<dd><p align="justify">If you see 08h instead of 04h in
the output, you have met another strange effect of self modifying code.
If I remember right, it is caused by the prefetch queue.
This prefetch queue holds the next bytes after the
currently processed instruction in the processor (on my old 386SX it was
13 bytes long, examined with a DOS assembly program a long time ago ;-). If you
overwrite instructions that are already queued, it depends on the processor whether the
prefetch queue is reloaded or not. I expected
this effect on my K6-2 as well, but as Stefan Esser mentioned: <i>&quot;any Clone of the Pentium or above should behave friendly
cause with the Pentium Intel build some Prefetch Queue Modification Detection into
the CPU. If you write to the code that normaly would be inside the prefetch range
the pentium automaticly discards its queque and reloads...&quot;</i>.
If you want to make sure that the
reload happens, use a <b>jmp</b> instruction (try a
<b>jmp</b> after the <b>rep stosb</b>).</p></dd></dl>
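<p align="justify">Both notes combined in one sketch, again assuming 32-bit Linux and 4KB pages.
The labels <b>buffer</b>, <b>result</b>, <b>store_result</b> and <b>snippet_*</b> are invented
for this illustration. It is a variation of the second example, not the code from the listing:
the copied snippet calls a routine through a register (Note 1) and executes a <b>jmp</b> right
after overwriting its own instructions (Note 2).</p>
<center><table><tr><td><pre>
;sketch only, not part of smc.n: indirect call from copied code plus
;a jmp after the self modification
;assumptions: 32-bit Linux, page size 0x1000, all names are hypothetical
%assign SYS_EXIT 1
%assign SYS_MPROTECT 125
%assign PROT_RWX 7                      ;PROT_READ | PROT_WRITE | PROT_EXEC
global _start
section .bss
alignb 4
result: resd 1
buffer: resb 0x2000
section .text
_start:
        mov ebp, buffer + 0x1000
        and ebp, 0xfffff000
        mov eax, SYS_MPROTECT
        mov ebx, ebp
        mov ecx, 0x1000
        mov edx, PROT_RWX
        int 0x80
        test eax, eax
        js .error
        mov ecx, snippet_end - snippet_start
        mov esi, snippet_start
        mov edi, ebp
        cld
        rep movsb
        ;edi = address of the first 'inc ebx' inside the copy
        lea edi, [ebp + (snippet_mark - snippet_start)]
        call ebp
        mov ebx, [result]
        jmp short .exit
.error:
        mov ebx, 1
.exit:
        mov eax, SYS_EXIT
        int 0x80
;called from the copy, stores the counter
store_result:
        mov [result], ebx
        ret
snippet_start:
        mov al, 0x90                    ;opcode of nop
        xor ebx, ebx
        mov ecx, 4
        rep stosb                       ;overwrite four 'inc ebx' with nop
        jmp short snippet_mark          ;taken jump, forces a fresh fetch
snippet_mark:
        times 8 inc ebx
        ;'call store_result' would be E8 + relative displacement and
        ;would miss its target in the copy, the absolute address in a
        ;register does not depend on where this code runs
        mov edx, store_result
        call edx
        ret
snippet_end:
</pre></td></tr></table></center>
<p align="justify">The short <b>jmp</b> is relative too, but it and its target are copied
together, so the distance between them stays the same. The program should exit with status 4
(or 1 if <b>sys_mprotect</b> failed).</p>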
<center><table><tr><td><pre>
<!--sof smc.n -->
;****************************************************************************
;****************************************************************************
;*
;* USING SELF MODIFYING CODE UNDER LINUX
;*
;* written by Karsten Scheibler, 2004-AUG-09
;*
;****************************************************************************
;****************************************************************************
global smc_start
;****************************************************************************
;* some assign's
;****************************************************************************
%assign SYS_WRITE 4
%assign SYS_MPROTECT 125
%assign PROT_READ 1
%assign PROT_WRITE 2
%assign PROT_EXEC 4
;****************************************************************************
;* data
;****************************************************************************
section .bss
alignb 4
modified_code: resb 0x2000
;****************************************************************************
;* smc_start
;****************************************************************************
section .text
smc_start:
;calculate the address in section .bss, it must lie on a page
;boundary (x86: 4KB = 0x1000)
;NOTE: In this example this is not strictly necessary because each
; segment is page aligned and we use section .bss only once, so
; we know that it is aligned to a page boundary, but if you have
; more than one section .bss in your code (or link several
; objects together) you can't be sure about that
mov dword ebp, (modified_code + 0x1000)
and dword ebp, 0xfffff000
;change flags of this page to read + write + executable,
;NOTE: On classic x86 this call is redundant, because for
; section .bss PROT_READ and PROT_WRITE are already set and
; PROT_EXEC is implied by PROT_READ, which results in rwx for
; this segment, but this no longer holds where the NX-flag of
; modern processors is enforced
mov dword eax, SYS_MPROTECT
mov dword ebx, ebp
mov dword ecx, 0x1000
mov dword edx, (PROT_READ | PROT_WRITE | PROT_EXEC)
int byte 0x80
test dword eax, eax
js near smc_error
;<a name="example1"></a>execute unmodified code first
code1_start:
mov dword eax, SYS_WRITE
mov dword ebx, 1
mov dword ecx, hello_world_1
code1_mark_1:
mov dword edx, (hello_world_2 - hello_world_1)
code1_mark_2:
int byte 0x80
code1_end:
;copy code snippet from above to our page (address is still in ebp)
mov dword ecx, (code1_end - code1_start)
mov dword esi, code1_start
mov dword edi, ebp
cld
rep movsb
;append 'ret' opcode to it, so that we can do a call to it
mov byte al, [return]
stosb
;change some values in the copied code: start address of the text
;and its length
mov dword eax, hello_world_2
mov dword ebx, (code1_mark_1 - code1_start)
mov dword [ebx + ebp - 4], eax
mov dword eax, (hello_world_3 - hello_world_2)
mov dword ebx, (code1_mark_2 - code1_start)
mov dword [ebx + ebp - 4], eax
;finally call it
call dword ebp
;copy second example
mov dword ecx, (code2_end - code2_start)
mov dword esi, code2_start
mov dword edi, ebp
rep movsb
;do something real nasty: edi points right after the 'rep stosb'
;instruction, so this will really modify itself
mov dword edi, ebp
add dword edi, (code2_mark - code2_start)
call dword ebp
;<a name="example3"></a>modify code in section .text itself
endless:
;allow us to write to section .text
mov dword eax, SYS_MPROTECT
mov dword ebx, smc_start
and dword ebx, 0xfffff000
mov dword ecx, 0x2000
mov dword edx, (PROT_READ | PROT_WRITE | PROT_EXEC)
int byte 0x80
test dword eax, eax
js near smc_error
;write message to screen
mov dword eax, SYS_WRITE
mov dword ebx, 1
mov dword ecx, endless_loop
mov dword edx, (hello_world_1 - endless_loop)
int byte 0x80
;here comes the magic, which prevents endless execution
mov dword ecx, (smc_end_1 - smc_end)
mov dword esi, smc_end
mov dword edi, endless
rep movsb
;do it again
jmp short endless
;****************************************************************************
;* code2
;****************************************************************************
;this is the ret opcode we copy above
;and the nop opcode needed by code2
return:
ret
no_operation:
nop
;<a name="example2"></a>here is some real self modifying code: if it is
;copied to .bss and edi is loaded correctly, ebx should contain 0x4
;instead of 0x8
code2_start:
mov byte al, [no_operation]
xor dword ebx, ebx
mov dword ecx, 0x04
rep stosb
code2_mark:
inc dword ebx
inc dword ebx
inc dword ebx
inc dword ebx
inc dword ebx
inc dword ebx
inc dword ebx
inc dword ebx
call dword [function_pointer]
ret
code2_end:
align 4
function_pointer: dd write_hex
;****************************************************************************
;* write_hex
;****************************************************************************
write_hex:
mov byte bh, bl
shr byte bl, 4
add byte bl, 0x30
cmp byte bl, 0x3a
jb short .number_1
add byte bl, 0x07
.number_1:
mov byte [hex_number], bl
and byte bh, 0x0f
add byte bh, 0x30
cmp byte bh, 0x3a
jb short .number_2
add byte bh, 0x07
.number_2:
mov byte [hex_number + 1], bh
mov dword eax, SYS_WRITE
mov dword ebx, 1
mov dword ecx, hex_text
mov dword edx, 9
int byte 0x80
ret
section .data
hex_text: db "ebx: "
hex_number: db "00h", 10
;****************************************************************************
;* some text
;****************************************************************************
endless_loop: db "No endless loop here!", 10
hello_world_1: db "Hello World!", 10
hello_world_2: db "This code was modified!", 10
hello_world_3:
;****************************************************************************
;* smc_error
;****************************************************************************
section .text
smc_error:
xor dword eax, eax
inc dword eax
mov dword ebx, eax
int byte 0x80
;****************************************************************************
;* smc_end
;****************************************************************************
section .text
smc_end:
xor dword eax, eax
xor dword ebx, ebx
inc dword eax
int byte 0x80
smc_end_1:
;*********************************************** linuxassembly@unusedino.de *
<!--eof-->
</pre></td></tr></table></center>
<!--sof Makefile
NASM=nasm -w+orphan-labels -w+macro-params -w+number-overflow -f elf
LD=ld -s
STRIP=strip -R .note -R .comment
RM=rm -f
.PHONY: all clean
all: smc
smc: smc.n
	${NASM} -o smc.o smc.n
	${LD} -e smc_start -o smc smc.o
	${STRIP} smc
clean:
	${RM} *.bak *~ *.o smc core
<!--eof-->
</body>
</html>