Saturday, July 19, 2014

Hooking functions with Cuckoobox's hooking engine

If you're using Cuckoobox for dynamic analysis, you might come across scenarios where a particular API call that you want to monitor is not hooked by cuckoomon out of the box. Here's how you might go about adding it.

Cuckoomon's hooking engine provides an abstraction over the dirty details of the actual hooking process, so this is going to be a pretty straightforward endeavor.

First off, you can get cuckoomon's source code from git
https://github.com/cuckoobox/cuckoomon

Now, we'll go over some of the relevant source code files for this process.

cuckoomon.c: The entry point source file of the DLL, with the DllMain routine and other generic definitions.
hooks.h: Contains declarations of the functions to be hooked, via the HOOKDEF / HOOKDEF2 macros. If you're interested, the macro definitions can be found in the header file hooking.h.
hook_(category).c: Multiple files that contain the definitions of the hooked functions, again via the HOOKDEF / HOOKDEF2 macros, along with the mechanisms for logging arguments and return values. The functions are divided between these files based on their apparent categories.
log.c / log.h: Contain definitions and declarations for the functions and macros that facilitate the logging mentioned above.
logtbl.py: A Python file that contains a table of all the hooked functions and a summary of their recorded arguments/return values, used during C to Python communication.

Okay, so now that we have gone over the files that we'll need to modify we can start some actual work! I find it easiest to understand when something is explained by example, so we'll do just that.

In the cuckoomon that ships with version 1.1 of cuckoobox, there is no hook for the API routine QueueUserAPC. QueueUserAPC is very commonly used for code injection into a remote process, so hooking it and writing a signature based on it can be very fruitful for someone running dynamic analysis on a piece of malware.

In order to hook QueueUserAPC, we'll need to look up the function's declaration and library on MSDN
http://msdn.microsoft.com/en-us/library/windows/desktop/ms684954(v=vs.85).aspx
Okay, so now we know from MSDN that QueueUserAPC is located in Kernel32.dll, and that the function is declared as

DWORD WINAPI QueueUserAPC(
  _In_  PAPCFUNC pfnAPC,
  _In_  HANDLE hThread,
  _In_  ULONG_PTR dwData
);

From here onwards, I'll break down the procedure into steps.


STEP 1


We'll add QueueUserAPC and its parent library Kernel32.dll to array g_hooks in cuckoomon.c like so

static hook_t g_hooks[] = {
.
.
.

    //
    // Process Hooks
    //
.
.
.
    HOOK(kernel32, VirtualProtectEx),
    HOOK(ntdll, NtFreeVirtualMemory),
    HOOK(kernel32, QueueUserAPC), // Our function

The HOOK macro is nothing tricky; it just simplifies the array entry for us. For those interested, it is defined near the top of cuckoomon.c
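To make the idea concrete, here is a minimal, portable sketch of how a table-building macro of this kind can work. This is not cuckoomon's actual definition: the struct demo_hook_t, the macro DEMO_HOOK, the stub DemoApi pair and the lookup helper are all hypothetical names invented for illustration, and nothing here performs real hooking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: which library, which export, where the
   replacement lives, and where the original's address gets stored. */
typedef struct {
    const char *library;
    const char *funcname;
    void (*new_func)(void);
    void (**old_func)(void);
} demo_hook_t;

/* Stand-ins for a real hook pair (illustrative names only). */
static void (*Old_DemoApi)(void);
static void New_DemoApi(void) { /* a real hook would log and forward */ }

/* '#' turns the tokens into strings, '##' pastes the New_/Old_ prefixes on. */
#define DEMO_HOOK(library, funcname) \
    { #library, #funcname, New_##funcname, &Old_##funcname }

static demo_hook_t g_demo_hooks[] = {
    DEMO_HOOK(kernel32, DemoApi),
};

/* Look up an entry by function name; returns NULL when it is absent. */
static const demo_hook_t *demo_find_hook(const char *funcname)
{
    size_t count = sizeof(g_demo_hooks) / sizeof(g_demo_hooks[0]);

    assert(funcname != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(g_demo_hooks[i].funcname, funcname) == 0)
            return &g_demo_hooks[i];
    }
    return NULL;
}
```

The point of the sketch is that a single short line in the table is enough to tie together the library name, the export name and the New_/Old_ pair, which is why adding a hook to g_hooks only takes one entry.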



STEP 2


For our next step, we'll add the declaration of the function to be hooked to the header file hooks.h. The declaration follows this generic format
extern HOOKDEF(returntype, api, funcname, arg1, arg2...)
like so

//
// Process Hooks
//
.
.
.
extern HOOKDEF(DWORD, WINAPI, QueueUserAPC,
    __in  PAPCFUNC pfnAPC,
    __in  HANDLE hThread,
    __in  ULONG_PTR dwData
);

The HOOKDEF macro is pretty straightforward; however, for the sake of completeness, I've preprocessed our little addition to show you what it actually expands to

extern DWORD (WINAPI *Old_QueueUserAPC)(__in PAPCFUNC pfnAPC, __in HANDLE hThread, __in ULONG_PTR dwData); 
DWORD WINAPI New_QueueUserAPC (__in PAPCFUNC pfnAPC, __in HANDLE hThread, __in ULONG_PTR dwData);
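If you'd like to play with this kind of expansion outside of a Windows build, the following is a rough sketch of a HOOKDEF-style macro in portable C. It is an assumption about the general shape, not a copy of hooking.h: DEMO_HOOKDEF, DEMO_API, demo_add and demo_install are hypothetical names, and the calling convention is an empty placeholder.

```c
#include <assert.h>
#include <stddef.h>

/* Empty calling-convention placeholder so the sketch builds anywhere. */
#define DEMO_API

/* Hypothetical HOOKDEF-style macro: one use yields both the pointer to the
   original function and the header of the replacement function. */
#define DEMO_HOOKDEF(return_type, convention, name, ...) \
    return_type (convention *Old_##name)(__VA_ARGS__); \
    return_type convention New_##name(__VA_ARGS__)

/* The "real" function that would normally live in a system library. */
static int demo_add(int a, int b) { return a + b; }

/* Expands to: int (*Old_Add)(int a, int b); int New_Add(int a, int b) */
DEMO_HOOKDEF(int, DEMO_API, Add, int a, int b)
{
    assert(Old_Add != NULL);   /* the engine must have filled this in */
    return Old_Add(a, b);      /* forward the call to the original */
}

/* Stand-in for the engine step that records the original's address.
   Returns 1 once the pointer is set. */
static int demo_install(void)
{
    Old_Add = demo_add;
    return Old_Add != NULL;
}
```

After demo_install() has run, calling New_Add(2, 3) goes through the Old_Add pointer and ends up in demo_add, which mirrors how a New_ function reaches the original API through its Old_ pointer.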


STEP 3


Now it's time to add the definition of our hook's replacement function (the one we've seen prefixed with 'New_' so far). Since QueueUserAPC's functionality deals with processes, we will add the definition to hook_process.c. Such a definition follows this generic format

HOOKDEF(returntype, api, funcname, arg1, arg2...)
{
    // "is_success" definition macro will come here (explained later)
    int ret = Old_funcname(arg1, arg2...); // original function is called here
    LOQ(fmt, arg1name, arg1, arg2name, arg2...); // LOQ macro (explained later)
    return ret;
}

Before explaining this, let me show you what the corresponding definition looks like for QueueUserAPC

HOOKDEF(DWORD, WINAPI, QueueUserAPC,
    __in  PAPCFUNC pfnAPC,
    __in  HANDLE hThread,
    __in  ULONG_PTR dwData
) {
    IS_SUCCESS_NONZERO();  // Determine which return values will represent function success

    int ret = Old_QueueUserAPC(pfnAPC, hThread, dwData);  // Original function called
    LOQ("ppp", "FunctionPointer", pfnAPC, "ThreadHandle", hThread, "ArgumentAddress", dwData);  // does logging of arguments/return values
    return ret;
}

Without digging too deep into the IS_SUCCESS_NONZERO and LOQ macros, it is easy to see that the replacement function simply calls the original function, records the arguments and the return value, and then returns the original function's return value to the caller.

Since we now have a general understanding, we'll look more closely into the aforementioned macros by preprocessing our code snippet


DWORD (WINAPI *Old_QueueUserAPC)(__in PAPCFUNC pfnAPC, __in HANDLE hThread, __in ULONG_PTR dwData); 
DWORD WINAPI New_QueueUserAPC(__in PAPCFUNC pfnAPC, __in HANDLE hThread, __in ULONG_PTR dwData)
{
    int is_success(int ret) 
    { 
     return ret != 0; 
    };

    int ret = Old_QueueUserAPC(pfnAPC, hThread, dwData);
    static int _index; 
    if(_index == 0) 
     _index = log_resolve_index(&__FUNCTION__[4], 0);
    loq(_index, &__FUNCTION__[4], is_success(ret), (int) ret, "ppp", "FunctionPointer", pfnAPC, "ThreadHandle", hThread, "ArgumentAddress", dwData); 
    return ret;
}

Okay, so we can see now that the IS_SUCCESS_NONZERO macro just defines a function that returns true if ret, i.e. the original function's return value, is not 0. This function is later used in the second-to-last line of the code as an argument to the loq function. So basically, the purpose of this macro is to define a function that returns true if the original function call was successful.
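One thing worth knowing is that a function defined inside another function, as in the expansion above, is a GCC extension rather than standard C. If you want to experiment with the same idea on another compiler, a rough portable sketch could have the macro declare a local function pointer instead. Everything below (DEMO_IS_SUCCESS_NONZERO, demo_result_t, demo_hook_nonzero and the stub functions) is hypothetical and only meant to illustrate the pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Success predicate for APIs that return nonzero on success. */
static int demo_nonzero(int ret) { return ret != 0; }

/* Portable stand-in for the nested-function trick: the macro declares a
   local function pointer named is_success inside the hook body. */
#define DEMO_IS_SUCCESS_NONZERO() int (*is_success)(int) = demo_nonzero

/* Hypothetical record of what a logger would receive. */
typedef struct {
    int is_success;
    int return_value;
} demo_result_t;

/* Shape of a hook body: choose the predicate, call the original, report. */
static demo_result_t demo_hook_nonzero(int (*original)(void))
{
    DEMO_IS_SUCCESS_NONZERO();

    assert(original != NULL);
    int ret = original();
    demo_result_t result = { is_success(ret), ret };
    return result;
}

/* Two fake "original" functions, one succeeding and one failing. */
static int demo_returns_one(void)  { return 1; }
static int demo_returns_zero(void) { return 0; }
```

An API with a different success convention would simply get its own predicate and its own macro, which is the role the IS_SUCCESS* family plays in cuckoomon.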

As for the loq function, we will discuss it in a little more detail later. For now we can safely conclude that almost exactly the arguments we passed to the LOQ macro are being passed on to this function, with a few additions. The addition worth highlighting is again the call to is_success, which tells loq whether the original function call was successful or not.

In the last 6 arguments we tell loq which arguments of the original function we want to log, and the labels we want to log them with. For example, here we are saying that we want to record the argument pfnAPC with the label FunctionPointer, and so on.

"ppp" is the format specifier, which will become clearer once we get an overview of the loq function. For now, it is enough to know that it tells loq the number and types of the arguments that we are setting up to be logged.

In order to understand the essentials of loq, I'll post a couple of snippets from its code in log.c


snippet 1
            //now ignore the values
            if(key == 's') {
                (void) va_arg(args, const char *);
            }
            else if(key == 'S') {
                (void) va_arg(args, int);
                (void) va_arg(args, const char *);
            }
            else if(key == 'u') {
                (void) va_arg(args, const wchar_t *);
            }
            else if(key == 'U') {
                (void) va_arg(args, int);
                (void) va_arg(args, const wchar_t *);
            }
            else if(key == 'b') {
                (void) va_arg(args, size_t);
                (void) va_arg(args, const char *);
            }
            else if(key == 'B') {
                (void) va_arg(args, size_t *);
                (void) va_arg(args, const char *);
            }
            else if(key == 'i') {
                (void) va_arg(args, int);
            }
            else if(key == 'l' || key == 'p') {
                (void) va_arg(args, long);
            }

snippet 2
        // log the value
        if(key == 's') {
            const char *s = va_arg(args, const char *);
            if(s == NULL) s = "";
            log_string(s, -1);
        }
        else if(key == 'S') {
            int len = va_arg(args, int);
            const char *s = va_arg(args, const char *);
            if(s == NULL) { s = ""; len = 0; }
            log_string(s, len);
        }
        else if(key == 'u') {
            const wchar_t *s = va_arg(args, const wchar_t *);
            if(s == NULL) s = L"";
            log_wstring(s, -1);
        }
        else if(key == 'U') {
            int len = va_arg(args, int);
            const wchar_t *s = va_arg(args, const wchar_t *);
            if(s == NULL) { s = L""; len = 0; }
            log_wstring(s, len);
        }
        else if(key == 'b') {
            size_t len = va_arg(args, size_t);
            const char *s = va_arg(args, const char *);
            log_buffer(s, len);
        }
        else if(key == 'B') {
            size_t *len = va_arg(args, size_t *);
            const char *s = va_arg(args, const char *);
            log_buffer(s, *len);
        }
        else if(key == 'i') {
            int value = va_arg(args, int);
            log_int32(value);
        }
        else if(key == 'l' || key == 'p') {
            long value = va_arg(args, long);
            log_int32(value);
        }

We'll first look at snippet #2, where certain keys are checked and their values are then logged. Each of these keys represents the datatype of an argument, which is then retrieved and logged. In our case, QueueUserAPC has 3 arguments (a function pointer, a handle and a ULONG_PTR), all of which are pointer-sized values that can be read as the datatype long in this 32-bit build. Thus all of our arguments can be represented by the key "p".

Now, if we look at snippet #1, it's almost the same, except that the arguments are only consumed and not logged explicitly. This part of the code actually interacts with mongodb. However, what we need to know is that every datatype of the arguments we want to log must be handled in this function. If a particular datatype is not handled, we'll have to add an extra else if branch to both snippet regions of loq, with a unique key to represent that exact datatype.
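To see the key-driven va_arg pattern in isolation, here is a small self-contained sketch of a logger that walks a format string the same way. It is not cuckoomon's loq: demo_loq, its "label=value;" output format and its static buffer are assumptions made purely for illustration, and it only supports the keys 's', 'i', 'l' and 'p'.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mini-logger: every key in fmt consumes one (label, value)
   pair from the variadic arguments and appends "label=value;" to a static
   buffer, which is returned. Not thread-safe; illustration only. */
static const char *demo_loq(const char *fmt, ...)
{
    static char out[256];
    size_t used = 0;
    va_list args;

    assert(fmt != NULL);
    out[0] = '\0';

    va_start(args, fmt);
    for (; *fmt != '\0' && used < sizeof(out); fmt++) {
        const char *label = va_arg(args, const char *);
        size_t room = sizeof(out) - used;
        int written;

        if (*fmt == 's') {                      /* zero-terminated string */
            const char *s = va_arg(args, const char *);
            if (s == NULL) s = "";
            written = snprintf(out + used, room, "%s=%s;", label, s);
        }
        else if (*fmt == 'i') {                 /* plain int */
            int value = va_arg(args, int);
            written = snprintf(out + used, room, "%s=%d;", label, value);
        }
        else if (*fmt == 'l' || *fmt == 'p') {  /* read as long, as in the snippets */
            long value = va_arg(args, long);
            written = snprintf(out + used, room, "%s=%ld;", label, value);
        }
        else {
            break;  /* unknown key: its argument size is unknown, so stop */
        }
        if (written < 0) break;
        used += (size_t) written;
    }
    va_end(args);
    return out;
}
```

For example, demo_loq("si", "Name", "abc", "Count", 42) produces the string Name=abc;Count=42;. Note how an unknown key forces the walk to stop: without a matching branch there is no way to know how large the next argument is, which is exactly why every datatype needs its own branch in loq.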



        else if(key == 'U') {
            int len = va_arg(args, int);
            const wchar_t *s = va_arg(args, const wchar_t *);
            if(s == NULL) { s = L""; len = 0; }
            log_wstring(s, len);
        }

Okay, so the little snippet above, repeated from snippet 2, is just to highlight how "buffers" can be logged using this scheme. Here we are essentially logging a wide character string by first retrieving the buffer length and then its address.

Now that we understand how argument logging is done, we'll go back to QueueUserAPC's format specifier, which we wrote down as "ppp". We can now conclude that the fmt (format specifier) lists the keys for the datatypes of the arguments we are logging, in the exact order in which they are passed. In our case, since all 3 arguments we are logging resolve to the same datatype, our fmt became "ppp".

However, there's another little trick here: if we have multiple consecutive arguments of the same datatype, instead of repeating the key we can precede it with a count of how many times it is repeated consecutively. For instance, "ppp" here can also be written as "3p".
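As a quick illustration of the repeat-count shorthand, the sketch below expands a format string such as "3p" into its long form. It is a hypothetical helper, not code from cuckoomon, and it assumes that a count is a single digit from 2 to 9 placed directly in front of the key it repeats.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expands single-digit repeat counts (assumed to be
   2-9) so that "3p" becomes "ppp" and "s2i" becomes "sii". Returns a
   pointer to a static buffer, or NULL if a count has no key after it or
   the expansion does not fit. */
static const char *demo_expand_fmt(const char *fmt)
{
    static char out[64];
    size_t used = 0;

    assert(fmt != NULL);

    while (*fmt != '\0') {
        int count = 1;
        char key;

        if (*fmt >= '2' && *fmt <= '9')
            count = *fmt++ - '0';      /* consume the repeat count */
        if (*fmt == '\0')
            return NULL;               /* a count with no key after it */
        key = *fmt++;

        while (count-- > 0) {
            if (used + 1 >= sizeof(out))
                return NULL;           /* keep room for the terminator */
            out[used++] = key;
        }
    }
    out[used] = '\0';
    return out;
}
```

With this helper, demo_expand_fmt("3p") and demo_expand_fmt("ppp") both yield "ppp", so the two spellings describe the same three logged arguments.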


There's just one little loose end that we need to tie up before moving on to the next step. Earlier we used the IS_SUCCESS_NONZERO macro and established its purpose. The thing to be mindful of is that cuckoomon does not necessarily handle every return type and its corresponding success values. You can look at all the available IS_SUCCESS* macros in log.h.

As it turned out, the success scenario of QueueUserAPC had not been handled by default either. QueueUserAPC returns a nonzero value when it succeeds, so we add this self-explanatory little macro to log.h as shown below.


#define IS_SUCCESS_NONZERO() int is_success(int ret) { \
    return ret != 0; }





STEP 4


Now we turn to the logtbl.py file and add a summary of our hooked function to the table. The format of an entry is something like
("funcname", "category", ("fmt", "arg1label", "arg2label", ...)),
"funcname" is the function name and "category" is the category we chose to place our function in; for example, we chose the "process" category for QueueUserAPC. "fmt" is the format specifier, and the "arg*label" values are the labels we passed to the LOQ macro for the arguments we are logging. I refer to them as argument labels rather than argument names because they can differ from the names used in the original function definition. QueueUserAPC's table entry would be something like

("QueueUserAPC", "process", ("ppp", "FunctionPointer", "ThreadHandle", "ArgumentAddress")),

Note: This step needs to be repeated for an identical file in cuckoo's own code
cuckoo/lib/cuckoo/common/logtbl.py




STEP 5


Finally, it's time for compilation. The process can vary a little from box to box; I'll be cross compiling this DLL on an Ubuntu box. Here's how you would go about it

1. Install mingw32
   sudo apt-get install mingw32
2. Edit the makefile, changing the CC value from gcc to
   /usr/bin/i586-mingw32msvc-gcc
3. After changing directory to cuckoomon's root, run make


We'll place the newly compiled DLL in cuckoo's directory structure at the following path, after backing up the original
cuckoo/analyzer/windows/dll

And after submitting an executable that calls QueueUserAPC, we can finally see the fruit of our hard work!



The behavioral analysis report of the submitted binary shows QueueUserAPC being called, and its execution information. We can now use cuckoobox's signature API to write a dynamic analysis signature based off of QueueUserAPC's calls.