First steps to Reverse Engineering an Executable

Reverse engineering has always been an obsession of mine. As a child, I used to go around garage sales looking for old electronics for the sole purpose of opening them up and messing around with the insides. There is just something gratifying about opening up a closed system to see how it works, where it cuts corners and how it could be modified to work better. Software is no different. This is the main reason I love open source: the code is available, and for someone like me, who has an affinity for finding (read: causing) bugs, available source makes it easy to find areas which need fixing.

But nothing easy is ever fun.

I have developed a strong interest in reverse engineering projects such as: wine, OpenMW, Freeablo, OpenTTD, OpenRCT2, Mono, Monogame, limetext and many others. It’s the sort of thrill that says: Anything you can do I can do better and make it free.

Personally, I’ve been interested in reverse engineering 3DO’s 1999 classic Heroes of Might and Magic 3, even though it is set for re-release by Ubisoft in 2015. The plan is to use LÖVE and take advantage of its high portability and simple 2D game API.

The first step I took was reversing the map file format. This was a tedious process as I could not find any documentation about it, unlike the abundance of documentation you will find for the Elder Scrolls series. Luckily, the game comes with a map editor, and with the help of a hex editor and a tool I wrote to document the data structures, I was able to identify many parts of the data structures used. That, however, is the subject of another post.

The subject of this post is another strategy I am working on: reverse engineering the map editor’s code in order to extract the map loading logic. This means diving into the compiled code and changing the Portable Executable (PE) structure and inserting our code before the call to the main function.

The tools used in this guide are Ollydbg, a hex editor, the mingw x86 compiler and CFF Explorer, which is not open source.

Adding a dll to the import table

The strategy I am using is exactly the one used in OpenRCT2: we add a custom dll, whose functions and source code we fully control, to the compiled executable’s import table.

First, we create a dll. Let’s call it divertedmain.dll:

__declspec(dllexport) int __cdecl DivertedMain()
{
    return 42;
}

We only want something dead simple. A function which does nothing but return 42 keeps the dll minimal, and a process exit code of 42 will let us know that the diversion has indeed worked.

To compile and link the dll:


i686-w64-mingw32-gcc -c -o divertedmain.o divertedmain.c

i686-w64-mingw32-gcc -o divertedmain.dll -s -shared divertedmain.o -Wl,--subsystem,windows

Now let’s fire up CFF Explorer and add our new dll with the import adder, making sure it’s in the same directory as the executable and importing by name the DivertedMain function.

Rebuild the import table and save the executable. To make sure it’s importing the dll’s exported function you can check the Import Directory in CFF Explorer. Another way would be to simply remove the divertedmain.dll file and there should be an error message when trying to load the executable.


$ wine executable-with-imported-dll.exe    
err:module:import_dll Library divertedmain.dll (which is needed by L"executable-with-imported-dll.exe") not found
err:module:LdrInitializeThunk Main exe initialization for L"executable-with-imported-dll.exe" failed, status c0000135

Overriding main() function

This part is a little more tricky and requires some trial and error.

We return 42 in the DivertedMain function. Once compiled, this number can be easily found. To see what the function will look like, we can compile the dll source to assembly. Note that gcc emits AT&T style assembly by default while Ollydbg displays Intel style assembly; the -masm=intel flag makes gcc emit Intel syntax:

// i686-w64-mingw32-gcc -S -masm=intel -c -o divertedmain.s divertedmain.c
    .file    "divertedmain.c"
    .intel_syntax noprefix
    .text
    .globl    _DivertedMain
    .def    _DivertedMain;    .scl    2;    .type    32;    .endef
_DivertedMain:
    push    ebp
    mov    ebp, esp
    mov    eax, 42
    pop    ebp
    ret
    .ident    "GCC: (GNU) 4.9.2"
    .section .drectve
    .ascii " -export:\"DivertedMain\""

This snippet will be important to spot in the disassembly.
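To make the snippet easier to recognise as raw bytes, here is a minimal sketch of a byte-pattern search. The byte sequence is my own hand encoding of the five instructions above and assumes the assembler picks the 89 E5 form of mov ebp, esp (8B EC is an equally valid encoding that other toolchains emit); the names DIVERTED_MAIN_CODE and find_bytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed machine code for:
 *   push ebp; mov ebp, esp; mov eax, 42; pop ebp; ret
 * The 0x2A after the B8 opcode is the constant 42. */
const unsigned char DIVERTED_MAIN_CODE[10] = {
    0x55,                          /* push ebp     */
    0x89, 0xE5,                    /* mov ebp, esp */
    0xB8, 0x2A, 0x00, 0x00, 0x00,  /* mov eax, 42  */
    0x5D,                          /* pop ebp      */
    0xC3                           /* ret          */
};

/* Return the offset of the first occurrence of pat in buf, or -1 if absent. */
long find_bytes(const unsigned char *buf, size_t len,
                const unsigned char *pat, size_t pat_len)
{
    assert(buf != NULL && pat != NULL);
    if (pat_len == 0 || pat_len > len) {
        return -1;
    }
    for (size_t i = 0; i + pat_len <= len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Running find_bytes over the contents of divertedmain.dll should point at the function body, assuming the compiler produced exactly this sequence.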

Finding exported function

Loading the modified executable in Ollydbg, we go to the divertedmain entry in the Executable modules window (Alt+E).

This section tells us that the divertedmain code is located in the 0x66B00000 - 0x66B0B000 range. Also note that the executable address space starts at 0x00400000 and that all addresses will be offset by that much. In the divertedmain range, we should find the assembly representation of our DivertedMain() function.

Indeed, in this case it is at the address 0x66B014B0:

It is easily spotted thanks to the assembly listing from earlier and the byte 2A, which is 42 in hexadecimal. Again, the two listings are only easy to compare if both use the same syntax, which is why we asked gcc for Intel style assembly.
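The 0x00400000 base mentioned above is stored in the executable itself, in the ImageBase field of the PE optional header. As a rough sketch, assuming a 32-bit (PE32) image already read into memory, it can be pulled out like this; the function name pe32_image_base and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read little-endian integers from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store the preferred load address of a PE32 image in *image_base.
 * Returns 0 on success, -1 if buf does not look like a PE32 file. */
int pe32_image_base(const uint8_t *buf, size_t len, uint32_t *image_base)
{
    assert(buf != NULL && image_base != NULL);
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z') {
        return -1;                              /* no DOS header */
    }
    uint32_t pe = read_le32(buf + 0x3C);        /* e_lfanew: offset of "PE\0\0" */
    if (pe > len || len - pe < 4 + 20 + 32) {
        return -1;                              /* headers would be truncated */
    }
    if (buf[pe] != 'P' || buf[pe + 1] != 'E' ||
        buf[pe + 2] != 0 || buf[pe + 3] != 0) {
        return -1;
    }
    const uint8_t *opt = buf + pe + 4 + 20;     /* skip signature and COFF header */
    if (read_le16(opt) != 0x010B) {
        return -1;                              /* not the PE32 magic */
    }
    *image_base = read_le32(opt + 28);          /* ImageBase field of PE32 */
    return 0;
}
```

For the map editor this should report 0x00400000, matching what Ollydbg shows.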

Finding the main() function

We’re looking for the main function, which usually would take the command line parameters as arguments; it is also the function which does most of the logic, so a call which starts the whole program execution is likely to be the main function.

Using Ollydbg, we will step through the functions one instruction at a time. Setting arguments to the call can make it easy to find the function which uses them. In File|Set new arguments…, set “Look for me” as the new arguments.

Using the Step over button (F8), we will go through each call, looking at the registers for clues.

At about 0x004E7F6A we find a call to KERNEL32.GetCommandLineA, which is important to note since this is where the argument list (i.e. argv) comes from. Indeed, at the next instruction, Ollydbg shows us the executable name and the arguments “Look for me” in EAX. This is very important since it means one of the following instructions will call main(). Some of the following calls strip the executable name from the arguments, which is the default behaviour before a call to WinMain().

Around 0x004E7FAF, we start seeing a lot of PUSH instructions and a call to KERNEL32.GetModuleHandleA, leading up to a call to executable-with-imported-dll.004FBD57. This is a clue that a large number of parameters are being passed to a function. Since our DivertedMain() function doesn’t take any parameters yet, these will be lost. This, however, should not be a problem for us yet, since the parameters can be added later.

The call to executable-with-imported-dll.004FBD57 starts the application, so we know that this is the call which needs to be edited. For me, it was at address 0x004E7FBF.

Replacing the address of the call to main()

At address 0x004E7FBF, note the current binary form for the call. This will be important to call the original function later.

At that line in Ollydbg, assemble some new code by right clicking on the line and choosing “Assemble…”, or by pressing space. Replace the call address with the address of our DivertedMain() function, which was found earlier to be 0x66B014B0. The disassembled view substitutes the address with CALL DivertedMain, which is a good sign.
CALL DivertedMain

Note the binary form: E8 EC946166
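The bytes after E8 are not the target address itself but a 32-bit displacement relative to the instruction following the call, stored in little-endian order. A small sketch of that arithmetic (the function name encode_call_rel32 is hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a near relative CALL (opcode E8) placed at call_addr that jumps
 * to target. The displacement is measured from the next instruction,
 * which starts 5 bytes after call_addr. */
void encode_call_rel32(uint32_t call_addr, uint32_t target, uint8_t out[5])
{
    assert(out != NULL);
    uint32_t rel = target - (call_addr + 5u);   /* wraps modulo 2^32 */
    out[0] = 0xE8;
    out[1] = (uint8_t)(rel & 0xFF);             /* least significant byte first */
    out[2] = (uint8_t)((rel >> 8) & 0xFF);
    out[3] = (uint8_t)((rel >> 16) & 0xFF);
    out[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

With call_addr 0x004E7FBF and target 0x66B014B0 the displacement is 0x666194EC, which gives exactly the bytes E8 EC 94 61 66 shown by Ollydbg.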

When stepping into it (F7), execution enters the DivertedMain() assembly and returns 42.

Saving the changes to the call to DivertedMain()

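We now have the address of the call to main() and the binary form of the call to DivertedMain() given by Ollydbg: 0x004E7FBF and E8 EC946166. As we noted earlier, the addresses of the executable are offset by 0x00400000; therefore, in a hex editor, the call is actually at the address 0x000E7FBF. This simple subtraction works here because the section’s offset in the file matches its offset in memory; in general the two can differ, and the section table has to be consulted.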
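For executables where the simple subtraction does not hold, the translation goes through the section table. Here is a minimal sketch under the assumption that the relevant section header fields have already been read out (for example from CFF Explorer); struct section and va_to_file_offset are hypothetical names, and the section values used in any call are placeholders rather than values from the map editor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The section header fields that matter for address translation. */
struct section {
    uint32_t virtual_address;   /* RVA at which the section is mapped */
    uint32_t virtual_size;      /* size of the section in memory      */
    uint32_t raw_offset;        /* PointerToRawData: offset in file   */
    uint32_t raw_size;          /* SizeOfRawData: bytes in the file   */
};

/* Translate a virtual address into a file offset.
 * Returns 0 on success, -1 if no section's file data covers the address. */
int va_to_file_offset(uint32_t va, uint32_t image_base,
                      const struct section *secs, size_t count,
                      uint32_t *file_offset)
{
    assert(secs != NULL && file_offset != NULL);
    if (va < image_base) {
        return -1;
    }
    uint32_t rva = va - image_base;             /* relative virtual address */
    for (size_t i = 0; i < count; i++) {
        uint32_t start = secs[i].virtual_address;
        if (rva >= start && rva - start < secs[i].raw_size) {
            *file_offset = rva - start + secs[i].raw_offset;
            return 0;
        }
    }
    return -1;
}
```

When a section has virtual_address equal to raw_offset, this reduces to subtracting the image base, which is the shortcut used in this post.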

Opening a hex editor, the byte at 0x000E7FBF is E8, which is the x86 opcode for Call Procedure and also the first byte of our new instruction. Looking at the following 4 bytes, we can confirm that it is indeed the same call we had in Ollydbg before we edited it. Using the hex editor, we replace the 5 bytes at 0x000E7FBF - 0x000E7FC3 with E8 EC946166. Be careful to replace and not insert.
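The same edit can be scripted instead of done by hand. The following is only a sketch of an in-place patch, with a hypothetical helper name patch_file; it refuses to write past the end of the file so the file size never changes, which is the programmatic version of "replace, don't insert".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Overwrite len bytes at offset in the file at path without changing the
 * file size. Returns 0 on success, -1 on any failure. */
int patch_file(const char *path, long offset,
               const unsigned char *bytes, size_t len)
{
    assert(path != NULL && bytes != NULL);
    FILE *f = fopen(path, "r+b");               /* update mode, no truncation */
    if (f == NULL) {
        return -1;
    }
    int ok = 0;
    if (offset >= 0 && fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        /* Only patch if the whole range lies inside the existing file. */
        if (size >= 0 && offset <= size &&
            len <= (unsigned long)(size - offset) &&
            fseek(f, offset, SEEK_SET) == 0 &&
            fwrite(bytes, 1, len, f) == len) {
            ok = 1;
        }
    }
    if (fclose(f) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}
```

For this post the call would be patch_file on a copy of the executable with offset 0x000E7FBF and the five bytes E8 EC 94 61 66.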

We save it as executable-with-divertedmain.exe and run it:

$ wine executable-with-divertedmain.exe
$ echo $?
42

The program didn’t start, but instead quit with the return code 42. This is perfect: it means the call to the main function was diverted into our dll, which returns 42. There is no more need to edit the executable, and we can now rely on our own code as long as DivertedMain() stays at the same address in our dll, which in practice means keeping it as the first defined function.

To demonstrate this, we can get the DivertedMain() function to print out “DIVERTED!” to the console.

#include <stdio.h>

__declspec(dllexport) int __cdecl DivertedMain()
{
    printf("DIVERTED!\n");
    return 42;
}

Recompiling the dll and running the executable should give us:

$ i686-w64-mingw32-gcc -g -c -o divertedmain.o divertedmain.c
$ i686-w64-mingw32-gcc -g -o divertedmain.dll -s -shared divertedmain.o -Wl,--subsystem,windows
$ wine executable-with-divertedmain.exe
DIVERTED!
$ echo $?
42

Calling the original main function from inside our divertedmain

Our entry point puts itself between the call to the main function and the program execution, prematurely exiting. In order to reverse engineer, we need to place ourselves between those two points without actually stopping the execution. To do that, our first step will be to call the actual main function from the diverted main.

Earlier, we spotted command line parameters being pushed in the disassembled view. These parameters are those of the WinMain function which we replaced.

Before we can call the original main function, we need to get its parameters.

Getting the command line parameters

We simply add the parameters to the function signature and to make sure it works, print out the arguments:

#include <windows.h>
#include <stdio.h>

__declspec(dllexport) int __cdecl DivertedMain(
    HINSTANCE hInstance,
    HINSTANCE hPrevInstance,
    LPSTR lpCmdLine,
    int nCmdShow)
{
    /* lpCmdLine holds the arguments without the executable name */
    printf("DIVERTED: %s\n", lpCmdLine);
    return 42;
}

Recompiling the dll and running the executable should give us:

$ i686-w64-mingw32-gcc -g -c -o divertedmain.o divertedmain.c
$ i686-w64-mingw32-gcc -g -o divertedmain.dll -s -shared divertedmain.o -Wl,--subsystem,windows
$ wine executable-with-divertedmain.exe some arguments
DIVERTED: some arguments
$ echo $?
42

The fact that the executable name was missing seemed odd to me, but that’s just the way WinMain‘s lpCmdLine parameter works.
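To illustrate what happens to the command line before WinMain sees it, here is a simplified sketch of skipping the program name. It is not the actual startup code of the map editor or of any C runtime, just an approximation that handles a quoted or unquoted first token; command_tail is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the part of a full command line that follows the
 * program name. Simplification: the program name is either enclosed in
 * double quotes or ends at the first space or tab. */
const char *command_tail(const char *cmdline)
{
    assert(cmdline != NULL);
    const char *p = cmdline;
    if (*p == '"') {
        p++;                                    /* skip opening quote */
        while (*p != '\0' && *p != '"') {
            p++;
        }
        if (*p == '"') {
            p++;                                /* skip closing quote */
        }
    } else {
        while (*p != '\0' && *p != ' ' && *p != '\t') {
            p++;
        }
    }
    p += strspn(p, " \t");                      /* skip separating blanks */
    return p;
}
```

Given the string returned by GetCommandLineA, this yields roughly what arrives in lpCmdLine.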

Fetching and calling the original main Function

Earlier, when we replaced the call to main, we overwrote the call’s target with the address of our diverted main. The original target is the address of WinMain (004FBD57 in my case, as shown by Ollydbg), and we will use it to get a pointer to the function. We then call that function.

Replace 0x00000000 with the original address:

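#include <windows.h>
#include <stdio.h>

// Address of the original WinMain (the target of the call we replaced)
#define WINMAINADDR 0x00000000

// WinMain uses the WINAPI (stdcall) calling convention and returns an int
typedef int (WINAPI *WinMainFn)(HINSTANCE, HINSTANCE, LPSTR, int);

__declspec(dllexport) int __cdecl DivertedMain(
    HINSTANCE hInstance,
    HINSTANCE hPrevInstance,
    LPSTR lpCmdLine,
    int nCmdShow)
{
    printf("Diverted: %s\n", lpCmdLine);
    // Turn the raw address into a callable function pointer
    WinMainFn OriginalWinMain = (WinMainFn)WINMAINADDR;
    OriginalWinMain(hInstance, hPrevInstance, lpCmdLine, nCmdShow);
    return 42;
}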

And there we go. The original WinMain function is called, and we can use this technique to call any other function in the original executable, provided we know its address.
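The core of the trick, turning a known address into something callable, can be tried out in isolation without Windows. The sketch below uses hypothetical names (main_like_fn, call_at_address, fake_original_main) and a stand-in function instead of the real WinMain; on the real target the typedef would also need the correct calling convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signature of the function we expect to find at the address. */
typedef int (*main_like_fn)(const char *cmdline, int show);

/* Call the function located at a raw address and return its result. */
int call_at_address(uintptr_t address, const char *cmdline, int show)
{
    assert(address != 0 && cmdline != NULL);
    main_like_fn fn = (main_like_fn)address;    /* integer to function pointer */
    return fn(cmdline, show);
}

/* Stand-in for a function inside the original executable. */
int fake_original_main(const char *cmdline, int show)
{
    return cmdline[0] + show;
}
```

Calling call_at_address((uintptr_t)&fake_original_main, "A", 1) runs the stand-in through its address, the same way the dll reaches into the executable through WINMAINADDR.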

Acknowledgments

I couldn’t have figured this out without the helpful tips of IntelOrca and the detailed articles of Ashkbiz Danehkar.
I would also like to thank my friend Sophy for proofreading.

Further Reading