
2 years, 9 months ago
This article describes self-modifying code (SMK) in detail and shows how to use it in your programs. The examples are written in C++ with inline assembler. I will also explain how to execute code on the stack, which is a major trump card when writing and running SMK.

Self-modifying code

1. Introduction


Well, here we go. This article promises to be long, because I want to write it so that you are left with no questions. There are already a million articles on SMK, but this one presents my own view of the problem, formed after hundreds of hours of writing SMK … I will try to squeeze all of my work in here. So grab some tomato juice (or whatever you prefer to drink), turn the music up and get ready to learn how to rid your application of novice crackers! Along the way, I will tell you about Windows memory and a few other things you may not even suspect.


2. A short history of self-modifying code


Until quite recently, programmers had the luxury of using self-modifying code wherever they pleased. 10-20 years ago, every more or less serious attempt at software protection used SMK (self-modifying code). Even some compilers used SMK, operating on code in memory.

Then, in the mid-nineties, something happened. That something was called Windows 95/NT. Suddenly we programmers were told that everything we had done before was garbage and that we had to master a new platform. All the old tricks could be forgotten, because we could no longer play with memory, hardware and the operating system without permission. Most people came to believe that writing SMK would no longer be possible without VxDs, for which, as is typical of Windows, there was no competent documentation. After a while it turned out that we CAN still use SMK in our programs. One method is to use the WriteProcessMemory function exported by the Kernel32 library; another is to place code on the stack and then modify it.

The rest of the article deals mainly with Microsoft Visual C++ and the 32-bit subsystem.


3. Windows memory as it is


Creating SMK in Windows is not as simple as one would like. You will run into some reefs carefully laid out by the creators of Windows. Why? Because it is Microsoft.

As you know, Windows allocates 4 gigabytes of virtual memory to each process. To address this memory, Windows uses two selectors. One is loaded into the CS segment register, and the other into the DS, SS and ES registers. All of them use the same base address (equal to 0) and have a limit of 4 gigabytes.

A program has only ONE segment containing both code and data, as well as ONE process stack. You can therefore use a NEAR call or jump to pass control to code located on the stack, and you do not need an SS prefix to access data on the stack. Although the value of the CS register does not match DS, SS and ES, the instructions MOV dest, CS:[src], MOV dest, DS:[src] and MOV dest, SS:[src] address the same memory location.

Memory regions (pages) containing data, code and the stack can have certain attributes. For example, code pages allow reading and execution, data pages allow reading and writing, and stack pages allow reading, writing and execution at the same time.

These pages can also have a number of security attributes. I will cover them a bit later, when we need them.


4. WriteProcessMemory – the new best friend


In my opinion, the simplest way to change a few bytes in a process is to use the WriteProcessMemory function (provided no protection flags are set).

The first thing to do is to get access to the process loaded in memory by calling OpenProcess with the PROCESS_VM_OPERATION and PROCESS_VM_WRITE access rights. Below is an example of the most elementary SMK, which we will discuss next. In C++ we need some built-in features of the language (inline assembler) to implement this mechanism. Of course, all of this can be done in other languages too, but we will talk about that some other time. Besides, in other languages it looks much more complicated.

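Listing 1. WriteProcessMemory in the service of SMK
#include <windows.h>
#include <stdio.h>

// Overwrite one byte at addr in the current process with the low byte of wb
int WriteMe(void *addr, int wb)
{
	HANDLE h = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE,
		TRUE, GetCurrentProcessId());
	return WriteProcessMemory(h, addr, &wb, 1, NULL);
}

int main(int argc, char* argv[])
{
	_asm {
		push 0x74		; opcode of JZ, replaces the JMP opcode
		push offset Here
		call WriteMe
		add esp, 8		; remove the two arguments from the stack
Here:		JMP short Here		; infinite loop until its first byte is patched
	}

	printf("Holy Sh^&OsIX, it worked! #JMP SHORT $-2 was changed to JZ $-2\n");
	return 0;
}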

As you can see, the program replaces an infinite loop with a simple JZ jump. This lets the program move on to the next instruction, and we see the message confirming that the replacement took place. Nice, huh? I bet you are now thinking … hmm, interesting, could I do something like that? Most likely, yes!

However, this method (using WriteProcessMemory) has a number of weaknesses. First of all, an experienced cracker WILL analyze the import table and find the suspicious function. He will most likely set a few breakpoints on this call, analyze the surrounding code and find what he needs, because WriteProcessMemory is typically used only by compilers that build code in memory and by unpackers of executable files. Still, this trick can easily baffle a novice cracker. I often use this technique in my programs.

Another drawback of WriteProcessMemory is that it cannot create new pages in memory. The trick works only on existing pages. So, although there are several ways to polish the use of this function, we will turn our attention to executing code on the stack.


5. Placing code on the stack and executing it!


Placing code on the stack is not only permissible, but sometimes even necessary. In particular, it makes life easier for compilers, allowing them to generate code on the fly. But won't such liberties with the stack threaten the security of the system? Of course, they can bring trouble down on your bum. Besides, it is not the best technique for your programs, because installing a patch that prohibits code execution on the stack will paralyze most of your creations. On the other hand, although such a patch exists, in particular for Linux and Solaris, and although it is very useful, I think it has been installed by only two people (its authors, heh).

Do you still remember the weaknesses of WriteProcessMemory mentioned above? Placing executable code on the stack gives us two nice ways to eliminate them. First, the instructions that modify the code sit in an unknown region of memory, so the cracker can hardly trace them. To analyze the protected code he would have to chop through the entire tree of our program, so his work will most likely not be crowned with success! The second argument in favor of executing code on the stack is that the program can allocate as much memory as it needs at any time and release it at any time. By default the operating system allocates 1 MB of memory for the stack. If the task at hand requires more memory, the program can request an additional quota.

However, there are a few nuances you need to know before placing code on the stack … so let's talk about them now.


6. Why relocated code can be harmful to your health


You should know that Windows 9x, Windows NT and Windows 2k place the stack at different addresses. So, for your program to work on all of these platforms, it is important to use relative addressing. This requirement is not that hard to meet; you only need to follow a few simple rules. Damn these rules!

To our great pleasure, in the 80x86 world all short jumps and near calls are relative. This means that the instruction does not contain a linear address but the difference between the target address and the address of the next instruction. Such relative addressing greatly simplifies our life, but even it has its limitations.

For example, what happens if the function void OSIXDemo() { printf("Hi from OSIX\n"); } is copied onto the stack and then called? Such a call will most likely end in an error, because the relative address of printf has changed.
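The arithmetic behind the last two paragraphs can be sketched in a few lines. This is only an illustration under my own assumptions: the function names rel_displacement and landing_after_copy are made up for this example, and addresses are modeled as plain 32-bit numbers rather than taken from a real process.

```cpp
#include <cassert>
#include <cstdint>

// Displacement stored in a relative CALL/JMP: the target address minus the
// address of the NEXT instruction (instruction address + instruction length).
std::int32_t rel_displacement(std::uint32_t insn_addr,
                              std::uint32_t insn_len,
                              std::uint32_t target)
{
    assert(insn_len > 0);  // every instruction occupies at least one byte
    return static_cast<std::int32_t>(target - (insn_addr + insn_len));
}

// Where a relative instruction lands after its bytes were copied to
// new_addr while the stored displacement stayed the same.
std::uint32_t landing_after_copy(std::uint32_t new_addr,
                                 std::uint32_t insn_len,
                                 std::int32_t disp)
{
    return new_addr + insn_len + static_cast<std::uint32_t>(disp);
}
```

For a 5-byte CALL at 0x00401000 that targets 0x00402000, the stored displacement is 0xFFB. Copy the same bytes to 0x0012F000 and the call lands at 0x00130000 instead of 0x00402000, which is exactly why the copied OSIXDemo() cannot find printf.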

In assembler, we can easily fix this problem with register addressing. A relocatable call of printf is very simple to implement, for example: LEA EAX, printf / CALL EAX. Now an ABSOLUTE linear address, not a relative one, is placed in the EAX register. Therefore it does not matter where printf is called from; it will work correctly.

To reproduce such tricks, your compiler must support inline assembly. I know that if you are not into low-level programming this totally sucks, but it definitely IS POSSIBLE to do the same using only the arsenal provided by high-level languages. Here is a simple example:

Listing 2. How to copy a function onto the stack and run it there
#include <stdio.h>

void Demo(int (*_printf) (const char *,...))
{
	_printf("Hello, OSIX!\n");
	return;
}

int main(int argc, char* argv[])
{
	char buff[1000];
	int (*_printf) (const char *,...);
	int (*_main) (int, char **);
	void (*_Demo) (int (*) (const char *,...));

	_printf=printf;
	_Demo=Demo;	// both pointers must be set before they are subtracted
	_main=main;
	// Length of Demo, assuming main follows it immediately in memory
	int func_len = (unsigned int) _main - (unsigned int) _Demo;

	for (int a=0; a<func_len; a++)
		buff[a] = ((char *) _Demo)[a];
	_Demo = (void (*) (int (*) (const char *,...))) &buff[0];
	_Demo(_printf);
	return 0;
}

So do not let anyone pull the wool over your eyes by claiming that high-level languages cannot execute code on the stack.


7. We begin optimization right now!


If you are going to write SMK or run code on the stack, you need to take the choice of compiler seriously and study how it works. Most likely your code WILL CRASH the first time it is called, especially if your compiler is set to optimize.

Why does that happen? Because in pure high-level languages such as C or Pascal, it is devilishly hard to copy a function's code onto the stack or anywhere else. The programmer can obtain a pointer to a function, but no rules standardize how it may be used. Programmers call it a "magic number" that only the compiler understands.

Fortunately, practically all compilers use similar logic when generating code. These are the unwritten rules of compilation, so the programmer can rely on them too.

Let's look at Listing 2 once again. We reasonably assume that the pointer to our Demo() function coincides with its beginning and that the function body follows immediately after it. Most compilers adhere to this "common sense of compilation", but do not count on all of them following it. The big guys (VC++, Borland, etc.) do follow this rule, though. So unless you use some unknown or brand-new compiler, do not worry about a lack of "common sense of compilation". One note regarding VC++: if you work in debug mode, the compiler inserts a certain "adapter" (a jump thunk) and places the function somewhere else. Damn Microsoft. But do not worry, just check the "Link Incrementally" option in the settings: incremental linking is what introduces these thunks, so it has to be turned off for the compiler to generate good code. If your compiler has no such option, you can either not use SMK, or use another compiler!

Another problem is determining the length of a function. There is a simple and reliable trick for this. In C++, sizeof applied to a function pointer returns the size of the pointer, not the size of the function. However, compilers as a rule allocate memory for objects in the order in which they appear in the source code. So … the size of a function is the difference between the pointer to it and the pointer to the function that follows it. Very simple! Remember this trick, it will come in handy, even though optimizing compilers do NOT have to follow these rules, in which case the method I just described will not work. Do you see now why optimizing compilers are so harmful to your health when you write SMK?!

Another thing optimizing compilers do is remove variables that, as they THINK, are not used. Going back to our example in Listing 2, we see that some value is written to the buff buffer, but nothing IS READ from it. Most compilers cannot recognize that control is transferred to the buffer, so they delete the instructions that copy the code into it. Bastards! That is why control gets transferred to an uninitialized buffer, and then … boom. Crash. If you run into this problem, clear the "Global optimization" checkbox and everything will be fine.
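If switching optimization off is not an option, a different technique (not the one used in this article) is to copy the bytes through a volatile-qualified pointer. The sketch below is my own illustration; the name copy_code_bytes is hypothetical, and it relies on the assumption, true for the mainstream compilers I know of, that stores made through a volatile lvalue are not removed.

```cpp
#include <cassert>
#include <cstddef>

// Copy n bytes through a volatile-qualified destination pointer so that the
// compiler treats every store as an observable side effect and keeps it,
// even when it cannot see that the buffer is later executed.
void copy_code_bytes(volatile unsigned char* dst,
                     const unsigned char* src,
                     std::size_t n)
{
    assert((dst != nullptr && src != nullptr) || n == 0);
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}
```

In Listing 2 this would take the place of the for loop that fills buff. It only keeps the copy alive; it does nothing about the other assumptions, such as the layout of functions in memory.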

If your program still does not work, do not give up. The likely cause is that the compiler inserts calls to stack-checking routines at the end of each function. Microsoft VC++ does exactly that: in debug builds it adds calls to the __chkesp function. Do not bother looking for a description of this function, it is not in the documentation! This call is relative, and there is no way to get rid of it. However, in release builds VC++ does not check the stack state on exit from a function, so your program will run like clockwork.


8. SMK in your own programs


So, the moment you have all been waiting for has finally come. If you have made it all the way through the article to this point, I congratulate you. (storm of applause)

Now you may wonder (or ask me): "What are the benefits of executing code (a function) on the stack?" And the answer is (drum roll, please …): the code of a function placed on the stack can be changed on the fly, for example by a decryptor. To which the crowd goes: Ahhhhhhhhhhh.

Encrypted code is a huge splinter in the bum of a cracker doing disassembly. Of course, using a debugger makes his life a little easier, but encrypted code still makes it incredibly hard.

Take, for example, the simplest encryption algorithm, which applies the exclusive OR (XOR) operation to each byte of the code in turn and restores the original code when applied a second time!
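The round trip is easy to check on a plain byte array. This is a minimal sketch, not the code used in the listings: the name xor_in_place is made up for the example, and the key 0x77 is the one the listings in this section use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// XOR every byte of the buffer with the same one-byte key. Calling the
// function twice with the same key restores the original bytes, so one
// routine serves as both encryptor and decryptor.
void xor_in_place(std::uint8_t* data, std::size_t len, std::uint8_t key)
{
    assert(data != nullptr || len == 0);
    for (std::size_t i = 0; i < len; ++i)
        data[i] = static_cast<std::uint8_t>(data[i] ^ key);
}
```

With key 0x77 the bytes 55 8B EC turn into 22 FC 9B, and a second pass over the same buffer brings back 55 8B EC.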

Here is an example that reads the contents of our Demo() function, encrypts it and writes the result to a file.

Listing 3. How to encrypt the Demo function
#include <stdio.h>

// Demo from Listing 2; it must be defined immediately before _bild
void Demo(int (*_printf) (const char *,...));

void _bild()
{
	FILE *f;
	void (*_Demo) (int (*) (const char *,...));
	void (*_Bild) ();

	_Demo=Demo;
	_Bild=_bild;
	// Length of Demo, assuming _bild follows it immediately in memory
	int func_len = (unsigned int) _Bild - (unsigned int) _Demo;
	f=fopen("Demo32.bin", "wb");
	for (int a=0; a<func_len; a++)
		fputc(((char *) _Demo)[a] ^ 0x77, f);	// read Demo's own bytes
	fclose(f);
}

The result of encryption is placed in a string variable. Now the Demo() function can be deleted from the source code. Later, when we need it, it can be decrypted, copied into a local buffer and called. A kick in the pants, huh?
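Getting from the bytes in Demo32.bin to a string variable is a mechanical step. One way to do it, sketched here with a helper of my own (to_c_literal is not part of the original article), is to print every byte as a \xNN escape:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Format raw bytes as a C string literal made only of \xNN escapes,
// ready to be pasted into the source as the initializer of a char array.
std::string to_c_literal(const std::uint8_t* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    static const char hex[] = "0123456789ABCDEF";
    std::string out = "\"";
    for (std::size_t i = 0; i < len; ++i) {
        out += "\\x";
        out += hex[data[i] >> 4];    // high nibble
        out += hex[data[i] & 0x0F];  // low nibble
    }
    out += "\"";
    return out;
}
```

Keep in mind that code which measures such a literal with strlen, as Listing 4 does, stops at the first zero byte, so the encrypted bytes must not contain 0x00.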

Here is an example implementation of this algorithm:

Listing 4. The encrypted program
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[])
{
	char buff[1000];
	int (*_printf) (const char *,...);
	void (*_Demo) (int (*) (const char *,...));
	// Machine code of Demo, XOR-ed with 0x77
	char code[]="\x22\xFC\x9B\xF4\x9B\x67\xB1\x32\x87"
		"\x3F\xB1\x32\x86\x12\xB1\x32\x85\x1B\xB1"
		"\x32\x84\x1B\xB1\x32\x83\x18\xB1\x32\x82"
		"\x5B\xB1\x32\x81\x57\xB1\x32\x80\x20\xB1"
		"\x32\x8F\x18\xB1\x32\x8E\x05\xB1\x32\x8D"
		"\x1B\xB1\x32\x8C\x13\xB1\x32\x8B\x56\xB1"
		"\x32\x8A\x7D\xB1\x32\x89\x77\xFA\x32\x87"
		"\x27\x88\x22\x7F\xF4\xB3\x73\xFC\x92\x2A"
		"\xB4";

	_printf=printf;
	// strlen/strcpy work only because the encrypted code has no zero byte
	int code_size=strlen(&code[0]);
	strcpy(&buff[0], &code[0]);
	for (int a=0; a<code_size; a++)
		buff[a] = buff[a] ^ 0x77;	// decrypt in place on the stack
	_Demo = (void (*) (int (*) (const char *,...))) &buff[0];
	_Demo(_printf);
	return 0;
}

Note that the printf() function displays a greeting. At first glance you will not notice anything unusual, but look at where the string "Hello, OSIX!" is located. It does not belong in the code segment (although Borland for some reason places strings there); if you check the data segment, you will find that it is where it should be.

Now, even if the cracker has the source code in front of him, our program will still remain a hellish puzzle for him. I use this method to hide "confidential" information (serial numbers and keys for my programs, etc.).

If you are going to use this method to check a serial number, the verification must be arranged so that the puzzle remains for the cracker even after decryption. I will show how to do that in the next listing.

Remember, when implementing SMK you need to know the EXACT location of the bytes you are going to change. Therefore you should use assembler instead of high-level languages. Come on, stay with me, we are almost done!

Using assembler to implement the method described above raises one problem. To change a byte with the MOV instruction, you have to supply its ABSOLUTE linear address (which, as you have probably guessed, is UNKNOWN at compile time). BUT … we can obtain this information while the program is running. CALL $+5 / POP REG / MOV [reg+relative_address], xx is a code sequence that enjoys wide popularity with me. It works as follows. Executing the CALL instruction pushes the return address onto the stack, that is, the absolute address of the instruction that follows the CALL. This address then serves as the base for addressing the code of the stack function.
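To see where an offset such as the -0Dh in Listing 5 comes from, you can add up the instruction lengths by hand. The sketch below does that arithmetic with ordinary numbers instead of real code. The lengths are my assumption about how a 32-bit assembler encodes these instructions, and the names popped_address and patch_address are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Assumed encoded lengths, in bytes, of the instructions that sit between
// the byte to patch and the POP in Listing 5 (32-bit code).
const std::size_t kLenXorRegReg = 2;  // xor edx, eax
const std::size_t kLenRorImm8   = 3;  // ror eax, 3
const std::size_t kLenRolImm8   = 3;  // rol edx, 5
const std::size_t kLenCallRel32 = 5;  // call $+5

// Address obtained by CALL $+5 / POP reg: the CALL pushes the address of
// the instruction right after it, which is the POP itself.
std::size_t popped_address(std::size_t call_addr)
{
    return call_addr + kLenCallRel32;
}

// Address that lies bytes_back bytes before the POP instruction.
std::size_t patch_address(std::size_t call_addr, std::size_t bytes_back)
{
    assert(bytes_back <= popped_address(call_addr));
    return popped_address(call_addr) - bytes_back;
}
```

If xor edx, eax starts at 0x1000, the CALL sits at 0x1008 and the POP at 0x100D, so subtracting 2 + 3 + 3 + 5 = 0Dh from the popped value points back at the first byte of the xor instruction, wherever the code happens to be loaded.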

And here is the serial number verification example I promised you …

Listing 5. Generating a serial number and executing on the stack
MyFunc:
push esi		; Save the ESI register on the stack
mov esi, [esp+8]	; ESI = &username[0]
push ebx		; Save the other registers on the stack
push ecx
push edx
xor eax, eax		; Zero the working registers
xor edx, edx
RepeatString:		; Loop that processes the string byte by byte

lodsb			; Read the next byte into AL
test al, al		; End of the string reached?
jz short Exit

; The value of the counter that processes 1 byte of the string must be
; chosen so that all bits are mixed, while parity (oddness) is provided
; by the transformations performed by the XOR operation

mov ecx, 21h
RepeatChar:
xor edx, eax		; Repeatedly alternating XOR and ADC
ror eax, 3
rol edx, 5
call $+5		; EBX = EIP
pop ebx ; /
xor byte ptr [ebx-0Dh], 26h;

; This instruction provides the cycle:
; the XOR instruction is replaced with ADC.

loop RepeatChar
jmp short RepeatString
Exit:
xchg eax, edx		; Result (serial number) in EAX
pop edx		; Restore the registers
pop ecx
pop ebx
pop esi
retn			; Return from the function

This code looks rather strange, because repeated calls with the same arguments return either identical or completely different results! It depends on the length of the user name. If it is odd, XOR is replaced with ADC when the function exits. Otherwise nothing of the kind happens!
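The dependence on the name length can be modeled without any assembler. This is a simplified sketch of my own (left_modified is a hypothetical name): it tracks only whether the patched instruction is in its original or modified form and ignores the arithmetic entirely.

```cpp
#include <cassert>
#include <cstddef>

// Model of the opcode toggling in Listing 5: the inner loop runs 0x21 (33)
// times per character, and every pass flips the patched instruction between
// its original and its modified form.
bool left_modified(std::size_t name_len)
{
    const std::size_t passes_per_char = 0x21;
    assert(passes_per_char % 2 == 1);  // odd count: one net flip per character
    bool modified = false;             // the function starts in its original form
    for (std::size_t c = 0; c < name_len; ++c)
        for (std::size_t p = 0; p < passes_per_char; ++p)
            modified = !modified;
    return modified;
}
```

A 5-character name leaves the instruction modified, so the next call starts from a different state and returns a different result; a 4-character name leaves it as it was.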

That's all for now. I hope this article was at least somewhat useful to you. Typing it took me a whole two hours! Feedback is always welcome.

English primary source: Giovanni Tropeano. Self modifying code//CodeBreakers Journal. Vol. 1, No. 2, 2006.

This article is a translation of the original post at habrahabr.ru/post/272619/
