Table of Contents
Abstract
General concepts
    The UNIX way
    The Windows way
Syscall Proxying
A first glimpse into an implementation
Optimizing for size
    Fat client, thin server
    A sample implementation for Linux
    What about Windows?
The real world: applications
    The Privilege Escalation phase
    Redefining the word "shellcode"
Conclusions
Acknowledgements
Abstract
A critical stage in a typical penetration test is the "Privilege Escalation" phase. An auditor faces this stage when access to an intermediate host or application in the target system has been gained by means of a previous successful attack. Access to this intermediate target allows for staging more effective attacks against the system by taking advantage of existing webs of trust and a more privileged position in the target system's network. This "attacker profile" switch is referred to as pivoting throughout this document. Pivoting on a compromised host can often be an onerous task, sometimes involving porting tools or exploits to a different platform and deploying them. This includes installing required libraries and packages, and sometimes even a C compiler, in the target system!
Syscall Proxying is a technique aimed at simplifying the privilege escalation phase. By providing a direct interface into the target's operating system, it allows attack code and tools to be automagically in control of remote resources. Syscall Proxying transparently "proxies" a process's system calls to a remote server, effectively simulating remote execution. Basic knowledge of exploit coding techniques as well as assembler programming is required.
General concepts
A typical software process interacts, at some point in its execution, with certain resources: a file on disk, the screen, a networking card, a printer, etc. Processes can access these resources through system calls (syscalls for short). These syscalls are operating system services, usually identified with the lowest layer of communication between a user mode process and the OS kernel. The strace tool, available in Linux systems [1], traces the system calls made by a process. From man strace:
"Students, hackers and the overly-curious will find that a great deal can be learned about a system and its system calls by tracing even ordinary programs. And programmers will find that since system calls and signals are events that happen at the user/kernel interface, a close examination of this boundary is very useful for bug isolation, sanity checking and attempting to capture race conditions."

For example, this is an excerpt from running strace uname -a in a Linux system (some lines were removed for clarity):
    [max@m21 max]$ strace uname -a
    execve("/bin/uname", ["uname", "-a"], [/* 24 vars */]) = 0
    (...)
    uname({sys="Linux", node="m21.corelabs.core-sdi.com", ...}) = 0
    fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4018e000
    write(1, "Linux m21.corelabs.core-sdi.com "..., 85) = 85
    munmap(0x4018e000, 4096) = 0
    _exit(0) = ?
    Linux m21.corelabs.core-sdi.com 2.4.7-10 #1 Thu Sep 6 17:27:27 EDT 2001 i686 unknown
    [max@m21 max]$
- The strace process forks and calls the execve syscall to execute the uname program.
- The uname program calls the syscall of the same name (see man 2 uname) to obtain system information. The information is returned inside a struct provided by the caller.
- The write syscall is called to print the returned information to file descriptor 1, stdout.
- _exit signals the kernel that the process finished with status 0.

[1] Similar tools are ktrace in BSD systems and truss in Solaris. There's a strace-like tool for Windows NT, available at http://razor.bindview.com/tools/desc/strace_readme.html.

Different operating systems implement syscall services differently, sometimes depending on the processor's architecture. Looking at a reasonable set of "mainstream" operating systems and architectures (i386 Linux, i386 *BSD, i386/SPARC Solaris and i386 Windows NT/2000), two very distinct groups arise: UNIX and Windows.
Due to this "lack of definition" for its system services, the high number of functions in this category (more than 1000), and the fact that the bulk of functionality for some of them is part of large user mode dynamic libraries, we'll use "Windows syscall" to refer to any function in any dynamic library available to a user mode process. For simplicity's sake, this definition includes functions at a higher level than those defined in ntdll.dll, sometimes very far above the user/kernel boundary.
Syscall Proxying
From the point of view of a process, the resources it has access to, and the kind of access it has to them, define the "context" in which it is executed. For example, a process that reads data from a file might do so using the open, read and close syscalls (see Figure 1).
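The reader process described above can be sketched in Python, whose os.open, os.read and os.close functions are thin wrappers around the corresponding syscalls. This is only an illustration: the function name count_word and the chunk size are assumptions, not part of the original paper.

```python
import os


def count_word(path, word, chunk_size=4096):
    """Count occurrences of `word` (bytes) in a file using open/read/close."""
    fd = os.open(path, os.O_RDONLY)          # open syscall
    try:
        data = bytearray()
        while True:
            chunk = os.read(fd, chunk_size)  # read syscall
            if not chunk:                    # an empty read means end of file
                break
            data += chunk
    finally:
        os.close(fd)                         # close syscall
    return bytes(data).count(word)
```

Every interaction with the file goes through three calls, so the set of resources the process can reach (its "context") is decided entirely by whoever services those calls.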
Syscall proxying inserts two additional layers between the process and the underlying operating system. These layers are the syscall stub or syscall client layer and the syscall server layer.

The syscall client layer acts as the nexus between the running process and the underlying system services. This layer is responsible for marshaling each syscall's arguments and generating a proper request that the syscall server can understand. It is also responsible for sending this request to the syscall server and returning the results to the calling process.

The syscall server layer receives requests from the syscall client to execute specific syscalls using the underlying operating system services. This layer unmarshals the syscall arguments from the request into a form that the underlying OS can understand and calls the specific service. After the syscall finishes, its results are marshaled and sent back to the client. For example, the same reader process from Figure 1 with these two additional layers would look like Figure 2.
Now, if these two layers, the syscall client layer and the syscall server layer, are separated by a network link, the process will be reading from a file on a remote system, and it will never know the difference (see Figure 3).
Separating the syscall client from the syscall server effectively changes the "context" in which the process is executing. In fact, the process "seems" to be executing "remotely" at the host where the syscall server is running (and we'll see later that it executes with the privileges of the remote process running the syscall server). It is important to note that the original file reader program was not modified to accomplish remote execution (besides changing the calls to the operating system's syscalls into calls to the syscall client), nor did its inner logic change (for example, if the program counted occurrences of the word coconut in the file, it will still count them the same way when working with the remote server). This technique is referred to as Syscall Proxying throughout this document.
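The two layers can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the wire format (a 4-byte length prefix followed by a Python literal), the names SyscallClient, syscall_server and connect_local_server, and the local socket pair standing in for a network link are all hypothetical. There is no error handling: a failing call on the server side simply ends the sketch.

```python
import ast
import os
import socket
import struct
import threading


def send_msg(sock, obj):
    """Marshal a Python literal and send it with a 4-byte length prefix."""
    payload = repr(obj).encode()
    sock.sendall(struct.pack("<I", len(payload)) + payload)


def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError("peer closed the connection")
        buf += chunk
    return buf


def recv_msg(sock):
    (length,) = struct.unpack("<I", recv_exact(sock, 4))
    return ast.literal_eval(recv_exact(sock, length).decode())


# Syscall server layer: executes requests with the local OS services.
EXPORTED = {"open": os.open, "read": os.read, "close": os.close}


def syscall_server(sock):
    try:
        while True:
            name, args = recv_msg(sock)
            send_msg(sock, EXPORTED[name](*args))
    except EOFError:
        sock.close()


# Syscall client layer: same call shapes as os.open/os.read/os.close.
class SyscallClient:
    def __init__(self, sock):
        self.sock = sock

    def _call(self, name, *args):
        send_msg(self.sock, (name, args))
        return recv_msg(self.sock)

    def open(self, path, flags):
        return self._call("open", path, flags)

    def read(self, fd, count):
        return self._call("read", fd, count)

    def close(self, fd):
        return self._call("close", fd)


def count_word(sys_layer, path, word):
    """The reader's logic; it only talks to whatever layer it is given."""
    fd = sys_layer.open(path, os.O_RDONLY)
    data = b""
    while True:
        chunk = sys_layer.read(fd, 4096)
        if not chunk:
            break
        data += chunk
    sys_layer.close(fd)
    return data.count(word)


def connect_local_server():
    """Start a server thread on one end of a socket pair; return a client."""
    client_sock, server_sock = socket.socketpair()
    threading.Thread(target=syscall_server, args=(server_sock,), daemon=True).start()
    return SyscallClient(client_sock)
```

Calling count_word(os, path, word) runs the reader against the local system, while count_word(connect_local_server(), path, word) runs the same logic through the two proxy layers. Replacing the socket pair with a TCP connection to another host would make the file accesses happen there.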
Syscall Proxying can be trivially implemented using an RPC model. Each call to an operating system syscall is replaced by the corresponding client stub. Accordingly, a server that "exports" through the RPC mechanism all these syscalls in the remote system is needed (see Figure 4).
There are several RPC implementations that can be used, among them:
- Open Network Computing (ONC) RPC (also known as SunRPC)
- Distributed Computing Environment (DCE) RPC
- RPC specification from the International Organization for Standardization (ISO)
- Custom implementation
Check Comparing Remote Procedure Calls [http://hissa.nist.gov/rbac/5277/titlerpc.html] for a comparison of the first three implementations.

In the RPC model explained above, a lot of effort, symmetrically duplicated between the client and the server, is devoted both to converting back and forth from a common data representation format and to communicating through different calling conventions. These conversions make communication possible between a client and a server implemented on different platforms. Also, the RPC model attempts to attain generality by making it possible to perform ANY procedure call across a network.

If we can move all this logic to the client layer (making the client tightly coupled with the specific server's platform), we can drastically reduce the server's size, taking advantage of a generic mechanism for calling any syscall (this mechanism is "generic" only within a single platform).
Packing arguments in the client is trivial for integer parameters, but not so for pointers. There's no relationship at all between the memory spaces of the client and server processes. Addresses from the client probably don't make sense in the server and vice versa.
To work around this issue, the server, on accepting a new connection from the client, sends back its ESP (the stack pointer). The client then creates extra buffer space in the request and "relocates" each pointer argument to point somewhere inside the request. Knowing that the server reads requests straight into its stack, the client calculates the correct value for each pointer as it will be in the server process's stack (this is why we got the server's ESP in the first place). See Figure 6 for a sample of how open() arguments are marshaled.
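The pointer relocation step can be illustrated with a short Python sketch of the client side. Figure 6 is not reproduced here, so the request layout is an assumption: six little-endian 32-bit words (EAX, EBX, ECX, EDX, ESI, EDI) followed by buffer data, with the whole request landing immediately below the ESP value the server reported. The names marshal_open and HEADER are hypothetical.

```python
import struct

SYS_OPEN = 5                    # i386 Linux syscall number for open()
HEADER = struct.Struct("<6I")   # EAX, EBX, ECX, EDX, ESI, EDI (assumed order)


def marshal_open(server_esp, path, flags, mode):
    """Build an open() request whose path pointer is valid in the server.

    Assumes the server reserves len(request) bytes below `server_esp` and
    reads the request there, so the request starts at server_esp - len(request).
    """
    data = path + b"\0"                       # C string carried in the request
    total = HEADER.size + len(data)
    base = server_esp - total                 # address of the request's first byte
    path_ptr = (base + HEADER.size) & 0xFFFFFFFF  # relocated pointer to the string
    header = HEADER.pack(SYS_OPEN, path_ptr, flags, mode, 0, 0)
    return header + data
```

Integer arguments (flags, mode) are copied as they are. Only the path needs relocation: its bytes travel inside the request, and the value placed in EBX is the address those bytes will have once the request sits in the server's stack.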
2. Load the syscall's arguments in the EBX, ECX, EDX, ESI, EDI registers [2].
3. Call software interrupt 0x80 (int $0x80).
4. Check the returned value in the EAX register.

Take a look at Example 1 to see how the open() syscall in a Linux system looks when examined through gdb.

[2] With some exceptions, like the socket functions, where some arguments are passed through the stack.
- The breakpoint stops execution inside libc's wrapper for the open() syscall.
- Arguments passed through the stack to libc_open are copied into the respective registers.
- EAX is loaded with 5, the syscall number for open().
- The software interrupt for system services is triggered.

So, given that a generic and simple mechanism for calling any syscall is in place, and using the architecture and argument marshaling techniques explained in the section "Fat client, thin server", we can code a simple server that looks like Example 2.
- Sets up the communication channel between the client and the server. To keep things simple, we'll assume this is a socket.
- Copies the complete request to the server's stack. Remember what the request looks like (see Figure 6).
- The request block is sent back to the client. This is done to return any OUT pointer arguments back to the client application. This might be redundant in some cases, but since our intention is to keep the server simple, it won't handle these cases differently.

An excerpt from a simple implementation of a syscall server implementing this behavior can be seen in Example 3. This excerpt covers only the main read request / process / write response loop.
- Read request straight into ESP.
- Pop registers from the top of the request (the request is in the stack).
- Invoke the syscall. EAX already holds the syscall number (it was part of the request).
- Send back the request buffer as response, since buffers might contain data (for example, if calling read()).

In this way, we are able to code a simple yet very powerful syscall server in a few bytes. The complete sample implementation of the server described above, with absolutely no effort made to optimize the code, is about a hundred bytes long.
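The four steps of the main loop can be simulated in Python to make the data flow concrete. This is a simulation, not shellcode, and it rests on several assumptions that are not taken from the paper: the request layout (six 32-bit registers followed by buffer data), a made-up stack address STACK_TOP, the return value being written over the EAX slot of the response, and a small dispatch table that stands in for int $0x80. The names handle_request, invoke and c_string are hypothetical.

```python
import errno
import os
import struct

REGS = struct.Struct("<6I")   # EAX, EBX, ECX, EDX, ESI, EDI (assumed order)
STACK_TOP = 0xBFFFF000        # hypothetical ESP the server reported to the client

SYS_READ, SYS_OPEN, SYS_CLOSE = 3, 5, 6   # i386 Linux syscall numbers


def c_string(stack, base, ptr):
    """Read the NUL-terminated string that `ptr` designates inside the request."""
    start = ptr - base
    return bytes(stack[start:stack.index(0, start)])


def invoke(stack, base, eax, ebx, ecx, edx):
    """Stand-in for 'int $0x80': dispatch on the syscall number in EAX."""
    if eax == SYS_OPEN:                       # open(path, flags, mode)
        return os.open(c_string(stack, base, ebx), ecx, edx)
    if eax == SYS_READ:                       # read(fd, buf, count)
        data = os.read(ebx, edx)
        offset = ecx - base                   # buf points inside the request
        stack[offset:offset + len(data)] = data
        return len(data)
    if eax == SYS_CLOSE:                      # close(fd)
        os.close(ebx)
        return 0
    return -errno.ENOSYS


def handle_request(request):
    """Process one request and return the response bytes."""
    # Read the request straight into the stack: it ends at STACK_TOP.
    base = STACK_TOP - len(request)
    stack = bytearray(request)
    # Pop the registers from the top of the request.
    eax, ebx, ecx, edx, _esi, _edi = REGS.unpack_from(stack)
    # Invoke the syscall; failures become a negative errno, as on Linux.
    try:
        result = invoke(stack, base, eax, ebx, ecx, edx)
    except OSError as exc:
        result = -exc.errno
    # Send back the whole request buffer; the first word carries the result.
    struct.pack_into("<i", stack, 0, result)
    return bytes(stack)
```

A client would build each request with pointers relocated against STACK_TOP, call handle_request (or send the bytes over a socket to a loop wrapping it), and read OUT buffers such as the data filled in by read() from the same offsets in the response.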
"The GetProcAddress function retrieves the address of an exported function or variable from the specified dynamic-link library (DLL)."
Windows Platform SDK: DLLs, Processes, and Threads

Using these two functions (LoadLibrary and GetProcAddress) along with the capability of calling any function in the server's address space, we can fulfill the initial goal of "calling any function on any DLL" (see Example 4).
- The addresses of LoadLibrary and GetProcAddress in the server's address space are passed back to the syscall client upon initialization. The client can then use the generic "call any address" mechanism implemented by the server's main loop to call these functions and load new DLLs in memory.
- The client passes the exact address of the function it wants to call inside the request. The stack is already prepared for the call (as in Figure 6).

In this way, an equivalent server can be coded in the Windows platform. Argument marshaling on the client side is not so trivial, since we are providing a method for calling ANY function. A generic method for "packing" any kind of arguments (integers, pointers, structures, pointers to pointers to structures with pointers, etc.) is necessary.
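The "resolve an address by name, then call that address" pattern can be tried out locally with Python's ctypes module. The sketch below is an analogy that assumes a Unix-like host: ctypes.CDLL(None) and attribute lookup play the roles that LoadLibrary and GetProcAddress play on Windows, and everything happens inside one process rather than across a client and a server. The helper names resolve and call_address are hypothetical.

```python
import ctypes


def resolve(library, name):
    """Return the address of `name` in `library` (GetProcAddress analogue)."""
    return ctypes.cast(getattr(library, name), ctypes.c_void_p).value


def call_address(address, restype, argtypes, *args):
    """Call the function located at `address` (the 'call any address' step)."""
    prototype = ctypes.CFUNCTYPE(restype, *argtypes)
    return prototype(address)(*args)


# Handle to the symbols already loaded in this process (POSIX only).
libc = ctypes.CDLL(None)

strlen_addr = resolve(libc, "strlen")
length = call_address(strlen_addr, ctypes.c_size_t, [ctypes.c_char_p], b"coconut")
```

Note that call_address needs the return type and argument types spelled out for every function. This is the local counterpart of the marshaling problem described above: once any function can be called, the caller has to know how to pack each argument.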
Code execution vulnerabilities are those that allow an attacker to execute arbitrary code in the target system. Typical incarnations of code injection vulnerabilities are buffer overflows and user-supplied format strings. Attacks for these vulnerabilities usually come in 2 parts [4]:
- Injection Vector (deployment). The portion of the attack directed at exploiting the specific vulnerability and obtaining control of the instruction pointer (EIP / PC registers).
- Payload (deployed). What to execute once we are in control. Not related to the bug at all.

A common piece of code used as attack payload is the "shellcode":

"Shell Code: So now that we know that we can modify the return address and the flow of execution, what program do we want to execute? In most cases we'll simply want the program to spawn a shell. From the shell we can then issue other commands as we wish. But what if there is no such code in the program we are trying to exploit? How can we place arbitrary instruction into its address space? The answer is to place the code [we] are trying to execute in the buffer we are overflowing, and overwrite the return address so it points back into the buffer."
Aleph One, Smashing The Stack For Fun And Profit

Shell code allows the attacker to have interactive control of the target system after a successful attack.

[4] Taken from Greg Hoglund's Advanced Buffer Overflow Techniques, Black Hat Briefings USA 2000.
Conclusions
Syscall Proxying is a powerful technique when staging attacks against code injection vulnerabilities (buffer overflows, user-supplied format strings, etc.) to successfully turn the compromised host into a new attack vantage point. It can also come in handy when "shellcode" customization is needed for a certain attack (calling setuid(0), deactivating signals, etc.). Syscall Proxying can be viewed as part of a framework for developing new penetration testing tools. Developing attacks that actively use the Syscall Proxying mechanism effectively raises their value.
Acknowledgements
The term "Syscall Proxying", along with a first implementation (an RPC client-server model), was originally brought up by Oliver Friedrichs and Tim Newsham. Later on, Gerardo Richarte and Luciano Notarfrancesco from CORE ST refined the concept and created the first shellcode implementation for Linux. The CORE IMPACT team worked on a generic syscall abstraction, creating the ProxyCall client interface along with several different server implementations as shellcode for Windows, Intel *BSD and SPARC Solaris. The ProxyCall interface constitutes the basis for IMPACT's multi-platform module framework, and together they are the basic building blocks for IMPACT's agents. "Pivoting" was coined by Ivan Arce in mid 2001.