Wednesday, February 27, 2008

Title changed stuff all the same

All visitors, The HCl is now k0r0pt. Well, it means nothing, but that's what I am now.
:)
A new member is about to join me soon.

Tuesday, February 26, 2008

Playing with the stack


Flaws in a connected world

-Xtreme

Introduction

Look around you. You’ll probably see a computer somewhere. No? Well, keep looking, you’ll find one. Needless to say, the world of computers is a vast one, and what I shall describe here is a drop of water in a huge ocean. Computers have improved our lives to some extent, but they have also taken away some of our privacy and security.

Here, I am about to describe a class of flaws found in computer programs. Programs are the real working components of computers. Now, without any more crap, I shall jump to the main section. Whatever I describe here is meant for GNU/Linux on a 32 bit x86 machine, so a minimal idea of GNU/Linux is required. Knowledge of C and Linux assembly will be helpful in understanding this text.

For the best article of all time on stack smashing, read Aleph One’s “Smashing the Stack for Fun and Profit”.

The computer program

The computer program is stored in what is called computer memory. In the simplest picture, the computer memory is nothing but a huge number of one bit storage cells, like flip-flops, integrated in a single small circuit. A recent memory circuit is capable of storing up to 8*(1024)^3 bits, and that’s 8589934592 1’s and 0’s. Huge indeed.

The computer program structure

A computer program, when executed, is loaded into the computer's memory, and every instruction is executed with the help of registers like PC, AC, BI, IC … oh there are so many, I can’t even remember them :)

The memory allocated to a program is mainly divided into four parts:

  1. Stack: The stack is that part of the memory of the program where all local variables are stored. The parameters of function calls are also stored in this very place of memory. It is, as a matter of fact, my personal favorite memory area. You’ll know why.

  2. Data segment: This area contains the global and static variables that the programmer has initialized. As an example, a global declared by

char *s = "Hello";

will have the pointer s stored in the data segment (the characters of "Hello" themselves usually end up in a read-only data area).

  3. BSS segment: The term BSS stands for Block Started by Symbol. This segment starts at the end of the data segment, and contains all global and static variables that are declared but not initialized. For example, a global declared by static int x; will be stored in the BSS segment.

  4. Heap: The heap starts after that, and grows towards higher addresses. It is managed by functions like malloc, calloc, realloc and free, which use the brk and sbrk system calls to adjust its size.

The complete discussion is out of the scope of this document. We shall concentrate on the stack, since this is the memory area whose exploitation we are going to study. Yes, now I am going to tell how those exploit writers write those elusive exploits and make a computer do what they want. The amount of power these exploits can give is enormous. They can give control over machines anywhere around the globe, as everything is computerized nowadays.

Function calls

When a function is called in a program, the caller first pushes all the arguments in reverse order. This means that the last argument is pushed onto the stack first, as the stack follows the LIFO technique. Then, to actually make control jump to the memory where the function is stored, the call instruction is used. The call instruction automatically pushes the return address onto the stack. So, now the structure of the memory is as given below:

_____________ <----------- %esp

|_return address_|

|__argument1___|

|__argument2___|

|__argument3___|

| . |

| . |

| . |

|__argumentn___|

The function which is called does not pop those arguments; it reads them in place, relative to %ebp. But first, it pushes the %ebp register onto the stack. That is, it saves the caller's %ebp. This is done because the calling convention used on 32 bit x86 Linux lets a function freely change some registers (such as %eax, %ecx and %edx) but requires it to preserve others, %ebp among them. The %ebp register is the Extended Base Pointer register. It is called extended because on modern machines these registers are 32 bits long, against the 16 bits of the original bp. Similarly, the accumulator is designated by %eax instead of the traditional ax. The % sign tells the assembler that the name is a register. The next instruction copies %esp to %ebp, so that %ebp now points to the place where the old %ebp is stored. This is done because %esp keeps changing its position, so it is hard to reference memory locations with it.

The stack pointer register %esp “always” points to the top of the stack. The function then allocates space for local variables, by subtracting a certain number of bytes from the current stack pointer's location. So, now, the stack looks somewhat like this:

______________ <----------- %esp

|_localvariables_|

.

.

.

______ . ______

|___old %ebp___| <----------- %ebp

|_return address_|

|__argument1___|

|__argument2___|

|__argument3___|

| . |

| . |

| . |

|__argumentn___|

Examples

Example1:

In the first example, I shall show how to access the return address, and eventually, change that to execute any other code.

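/*
 * example3.c
 */
#include <stdio.h>
void function(int a, int b, int c) {
    char buffer1[5];
    char buffer2[10];
    int *ret;
    ret = buffer1 + 13;   /* make ret point at the saved return address */
    (*ret) += 7;          /* skip the 7 byte x = 1 assignment in main */
}

int main() {
    int x;
    x = 0;
    function(1,2,3);
    x = 1;
    printf("%d\n",x);
}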

xtreme@linux-cr15:~/bufferoverflow> gcc example3.c -o example3

example3.c: In function ‘function’:

example3.c:8: warning: assignment from incompatible pointer type

In this program, all I am doing, is making the program skip the x=1; assignment statement.

xtreme@linux-cr15:~/bufferoverflow> ./example3

0

Looks like it works! Here's how I do this: the assignment statement compiles to an instruction that is 7 bytes long (don't ask how I knew that. It was a lot of pain ... to get to know that). What I am doing is making the ret pointer point to the return address. Now, why did I add 13 to buffer1? The answer follows. The stack, after the function call, would look like this in this program:

93 |_unused memory|

109 |_buffer2_______|

119 |_buffer1_______|

124 |__empty_______|

128 |__old %ebp____|

132 |_return address_|

Now, you'd ask, how did I know that? Well, I just disassembled it using the GNU Debugger.

xtreme@linux-cr15:~/bufferoverflow> gdb example3

GNU gdb 6.5

Copyright (C) 2006 Free Software Foundation, Inc.

GDB is free software, covered by the GNU General Public License, and you are

welcome to change it and/or distribute copies of it under certain conditions.

Type "show copying" to see the conditions.

There is absolutely no warranty for GDB. Type "show warranty" for details.

This GDB was configured as "i586-suse-linux"...Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) disas function

Dump of assembler code for function function:

0x080483b4 <function+0>:  push %ebp
0x080483b5 <function+1>:  mov %esp,%ebp
0x080483b7 <function+3>:  sub $0x20,%esp
0x080483ba <function+6>:  lea 0xfffffff7(%ebp),%eax
0x080483bd <function+9>:  add $0xd,%eax
0x080483c0 <function+12>: mov %eax,0xfffffffc(%ebp)
0x080483c3 <function+15>: mov 0xfffffffc(%ebp),%eax
0x080483c6 <function+18>: mov (%eax),%eax
0x080483c8 <function+20>: lea 0x7(%eax),%edx
0x080483cb <function+23>: mov 0xfffffffc(%ebp),%eax
0x080483ce <function+26>: mov %edx,(%eax)
0x080483d0 <function+28>: leave
0x080483d1 <function+29>: ret

End of assembler dump.

The lea instruction loads the address %ebp-9, which is where buffer1 starts (0xfffffff7 is -9 as a 32 bit two's complement number). That is, we load the address of buffer1 into the accumulator and then add 0xd to it. Why 0xd? Because this is where the return address is stored. The buffer1 array is 5 bytes long. The 4 byte slot between buffer1 and the old %ebp is not really empty: the disassembly stores ret itself there, at 0xfffffffc(%ebp), which is %ebp-4. Then comes the old %ebp, which is one word long (a word on a 32 bit computer is 32 bits, or 4 bytes). So, the offset of the return address from buffer1 is 5+4+4=13=0xd. We make ret point to this place. Then, we increment the value at ret by 7, so that the return address becomes the address of the instructions that set up the call to printf, where the arguments are pushed. So, x is never set to 1, and 0 is printed when we print x.

Example 2:

Now we jump to some real l33t stuff. We shall jump to raw hex codes, which are the hex representation of the binary code, the way it is all stored in the computer. My favorite form. :)

Here is where I shall introduce “thou” to the SHELLCODES, the potential threat to software security in my world. Shellcodes are pieces of machine code in raw hex form. For this, we shall do two things. First, we shall write a program in assembly that exits. The return statement in main does the same thing in C; we'll set the return code to 0, which is equivalent to return 0;. Second, we shall write a program that makes control jump to a string, that is, one that overwrites its return address with the address of a string. This string will store the code we construct in the first step. This code is what we call shellcode.

#exit.asm
.section .data
.section .text
.globl _start
_start:
    mov $1, %eax    # system call number 1 is exit
    mov $0, %ebx    # return code 0
    int $0x80       # hand over to the kernel

xtreme@linux-cr15:~/bufferoverflow> as exit.asm -o exit.o

xtreme@linux-cr15:~/bufferoverflow> ld exit.o -o exit

xtreme@linux-cr15:~/bufferoverflow> ./exit

xtreme@linux-cr15:~/bufferoverflow> echo $?

0

xtreme@linux-cr15:~/bufferoverflow>

What we do here is something like this. The # marks a comment. The .data section represents the data segment, where we normally declare global variables; we don't have any here. The real stuff starts at the _start label. We move the value 1 to %eax, which is the register that specifies which system call we want when we raise interrupt 0x80; 1 is the number of the exit call. Then we move 0 to %ebx, which is the return code here. And after that, we raise interrupt 0x80. The bla-bla after the file is what I have done in the terminal. I first assemble exit.asm to exit.o, which is the object code. Then I link exit.o to exit. Then I execute the program exit with the ./ to specify that it is the exit program in the current directory, and not the normal exit command, which exits me out of the terminal. The echo is a command built into the shell, which is used to print things. The $? is a variable that stores the return code of the program that most recently exited. We see it's 0, so we know that it is working well. Well, this is how things work in linux, and I really like it this way ;)

Now, we proceed to obtain the hex code of the executable. We use the objdump utility for this, with the -d option, which tells it that the file argument has to be disassembled.

xtreme@linux-cr15:~/bufferoverflow> objdump -d exit

exit: file format elf32-i386

Disassembly of section .text:

08048054 <_start>:

8048054: b8 01 00 00 00 mov $0x1,%eax

8048059: bb 00 00 00 00 mov $0x0,%ebx

804805e: cd 80 int $0x80

xtreme@linux-cr15:~/bufferoverflow>

Now, you'd ask, how do we represent all this in the string? The hex numbers to the left of the assembly code are the machine code for each instruction. The mov $0x1, %eax has the hex code b8 01 00 00 00. Every hex digit is a nibble long, as all must know, so b8 is a byte long, and the instruction is 5 bytes long. The whole hex code is thus b801000000bb00000000cd80. To represent the hex code in a string, there are two ways. First, find an ascii table lying around somewhere in your room, and then look up the characters for b8, 01 etc. It's a tiresome way. The other method is to write the hex code in the string itself using the \x prefix, which specifies that what follows is a hex code, to be stored in memory exactly as written. So, the string would be "\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80".

But hey, did you notice something? We have 0 bytes in the code, but a 0 byte marks the end of a string in C. A string function such as strcpy won't take anything after the first 0. It's not going to work. So, we need to find some alternative to all this, so that we can take care of all these 0's. What do we do? We develop better code.

Now, it's thinking time...

If we xor a number with itself, we get 0, because xor yields 1 if and only if the two bits being compared are different.

So, we xor %eax and %ebx with themselves to clear them, put a 1 in the lowermost byte of the accumulator register to make %eax 1, and then call the interrupt.

#exit.asm
.section .data
.section .text
.globl _start
_start:
    xorl %eax, %eax   # clear %eax without a 0 in the code
    mov $1, %al       # system call number 1 (exit) in the low byte
    xorl %ebx, %ebx   # return code 0
    int $0x80

Notice that here we have used mov and not movl. The movl instruction moves a long value, while a plain mov leaves it to the assembler to work out the size from the operands; since %al is a one byte register, mov here moves a byte. The xorl similarly means xor two long values. A long here is 4 bytes, independent of the bus width of the computer. That is, it is 32 bits on a 16 bit computer as well as on a 32 bit computer. Now, we assemble and link the program and then execute it.

xtreme@linux-cr15:~/bufferoverflow> as exit.asm -o exit.o

xtreme@linux-cr15:~/bufferoverflow> ld exit.o -o exit

xtreme@linux-cr15:~/bufferoverflow> ./exit

xtreme@linux-cr15:~/bufferoverflow> echo $?

0

xtreme@linux-cr15:~/bufferoverflow>

Okay, it works! Now, we get the hex code.

xtreme@linux-cr15:~/bufferoverflow> objdump -d exit

exit: file format elf32-i386

Disassembly of section .text:

08048054 <_start>:

8048054: 31 c0 xor %eax,%eax

8048056: b0 01 mov $0x1,%al

8048058: 31 db xor %ebx,%ebx

804805a: cd 80 int $0x80

xtreme@linux-cr15:~/bufferoverflow>

See, the 0 bytes are gone. Now, we can safely frame a string that will contain the shellcode, and then execute it. The string now becomes "\x31\xc0\xb0\x01\x31\xdb\xcd\x80". Much smaller and much better than the previous one.

Now, we move on to the second step, where we create a program that returns control to a string, which actually contains the code we just generated. But, before I go any further, I'd like to tell why I have used the trick in a function, when I could have done it in the main function as well. This is because, after getting wary of security flaws of this type, compiler developers have come up with a method to change the layout of the stack in the case of main, in such a way that the return address is somewhere totally separate from the location of the variables. It is done so that we cannot access the return address directly. But other functions are still laid out the same old way. The millions of other manuals that you'll find out there are the classic ones written in olden times. So, don't get frustrated or angry when they don't work on your computer. They aren't wrong. They're just “old”. One such is “Smashing the Stack for Fun and Profit” by Aleph One, written in 1996 for an underground magazine. So, here's the program.

/*
 * xxx.c
 */
char shellcode[] = "\x31\xc0\xb0\x01\x31\xdb\xcd\x80";

void func(){
    int *ret;
    ret = (int *)&ret + 2;      /* step over ret itself and the old %ebp */
    (*ret) = (int)shellcode;    /* overwrite the return address */
}

int main() {
    func();
    return 1;
}

Here, the main trick is being executed in the func function, where we access the return address. Let's delve deeper into this. We declare a pointer in here. Remember that the pointer is also on the stack. What I mean to say is that the pointer variable itself lives on the stack. This pointer in turn stores some other address and thus “points” to that particular address. The pointer is of integer type, so that each step of it covers one word, that is 32 bits. “Why 32 bits”, you may ask, “when I know that integers are 16 bits long?” Well, my dear friend, what you have studied is DOS based C. It's the turbo C stuff, which was used on old computers and generated 16 bit executables. That was used on 16 bit computers. Nowadays, what we use are 32 bit computers. Here, integers are 32 bits long. On 64 bit processors, such as the newer AMD Athlon or Intel Xeon chips, pointers and the machine word are 64 bits long, although int usually stays at 32 bits on Linux, so this trick as written applies to 32 bit programs only.

So, when we add 2 to the int pointer, we are actually adding 8 bytes to it. Why 8? Because this is where the return address is stored. The expression &ret is the address of the pointer variable itself. Right after the pointer, we have the old %ebp, immediately after which we have the return address. So adding 8 to the location of ret gets us to the return address. Then we store the address of the string shellcode at the address ret points to. So, when this function returns, control jumps to the string, rather than back to main.

Now, we shall compile and execute this piece of code. If it works, then we should get a return code of 0. And if not, we should get a return code 1.

xtreme@linux-cr15:~/bufferoverflow> gcc xxx.c -o xxx

xtreme@linux-cr15:~/bufferoverflow> ./xxx

xtreme@linux-cr15:~/bufferoverflow> echo $?

0

xtreme@linux-cr15:~/bufferoverflow>

So, we see, that the shellcode has worked after all! We have made it return 0.

Well, this is all of my research. Most of this, I must say, is what you will find in many articles lying out there. The only difference is what I found out on my own, the fact that main's stack is laid out differently being one such thing. This is only the beginning of the work. It is only the tip of the iceberg; the real culprit is still not visible. A lot more has already been done in the field of shellcodes. With time and research, more and more powerful shellcodes will be developed. I shan't write any longer due to the limitations of our magazine's size. Any more discussion would simply mean a whole book!