Binary Translator (Total 30 points + bonus 10 points)

Assignment instruction

All homework problems are to be done individually. For all programming problems you will be required to submit source code, a README file documenting your files and code, and a test run of your programs. The README should explain any way in which your solution differs from what was assigned, and any assumptions you made. For your convenience, all programming can be developed on any machine that can run Linux. In this homework you will construct a program that can profile the execution of a function using binary patching and write a code for the basic x86 disassembly.

We have provided you with a source skeleton which you should use for this assignment. A couple of notes about the source skeleton provided. The skeleton should compile and run out-of-the-box. In the source you will find several IMPLEMENT macros. You job will be to implement these pieces.

Step 0: Patching (5 pt)

This first step will get you warmed-up and comfortable with patching code. Look at the bottom of main(). Just before main calls fib() it calls StartProfiling. StartProfiling is your hook. It gives you and opportunity to inspect and/or modify the target function, fib() in this case, before it starts executing.

Your objective in step 0 is to use StartProfiling to binary patch fib() to immediately return.

[Hint] What is ‘ret’ instruction in byte value?

cc -g -m32  assign2.c assign2.S -static -Wl,-N -o assign2
objdump -d assign2 > assign2.dis
➜  part0 ./assign2 1     # Return immediately from fib() and print random value.
134514958

Step 1: Callout (5 pt)

In this step you should accomplish the same thing as step 0 but this time using a callout that emulates a RET. The trickiness about callouts is that they need to save all the registers and EFLAGS because the code was not expecting a callout. The normal gcc calling conventions don’t work. Find the glue code in assign2.S.

You should patch the first instruction on fib() with a callout. The callout should emulate the behavior of the RET by returning not to the calling site of the callout (which is the normal behavior) but directly to the return PC on the stack.

[Hint] You will patch fib() to call glue code. What is the format for ‘call’ instruction?

➜  part1 ./assign2 3
In handleRetCallout    # this line is printed from handleRetCallout()
66   # random value

Part 2: Callout and return (5 pt)

In this step, you should patch the beginning of fib() to callout and return to the original code to resume execution.

To achieve this you need to store the original instructions before patching, and unpatch the instructions from handler function. Then the execution can return to the original location to run the fib().

[Hint] If you use ‘call’ instruction here, you need handle return IP (instruction pointer) pushed to the stack. How are you going to handle this?

➜  part2 ./assign2 10
In handleCallCout  # this line is printed from handleCallCallout() 
89   # Correct value for fib(10)

Part 3: Disassembly (15pt)

The goal of this step is to use InitProfile to decode a block of instructions. You need only to decode enough of each instruction to determine its length. By doing this you should be able to decode a block of instructions of arbitrary length. StartProfiling should print the address, opcode, and length of the first 24 instructions of fib().

Use the pseudo-code figures presented from the class slide as a guide. An ia32 opcode map has been provided in ia32DecodeTable to save having to type all this out. Use it as you see fit.

There are also many online resources you can refer to.

➜  part3 ./assign2 3
address 0x8048b31, opcode: 55, len: 1
address 0x8048b32, opcode: 89, len: 2
address 0x8048b34, opcode: 53, len: 1
address 0x8048b35, opcode: 83, len: 3
address 0x8048b38, opcode: e8, len: 5
address 0x8048b3d, opcode: 5, len: 5
address 0x8048b42, opcode: 83, len: 4
address 0x8048b46, opcode: 7f, len: 2
address 0x8048b48, opcode: b8, len: 5
address 0x8048b4d, opcode: eb, len: 2
address 0x8048b4f, opcode: 8b, len: 3
address 0x8048b52, opcode: 83, len: 3
address 0x8048b55, opcode: 83, len: 3
address 0x8048b58, opcode: 50, len: 1
address 0x8048b59, opcode: e8, len: 5
address 0x8048b5e, opcode: 83, len: 3
address 0x8048b61, opcode: 89, len: 2
address 0x8048b63, opcode: 8b, len: 3
address 0x8048b66, opcode: 83, len: 3
address 0x8048b69, opcode: 83, len: 3
address 0x8048b6c, opcode: 50, len: 1
address 0x8048b6d, opcode: e8, len: 5
address 0x8048b72, opcode: 83, len: 3
address 0x8048b75, opcode: 1, len: 2
3

Part 4: Trampoline (bonus 10 pt)

In this step, you will extend part 2 patching the beginning of fib() so as it can callout as many times as the fib() is called. This would require you to go over the process of patch, unpatch, patch again the beginning of fib() function.

➜  part4 ./assign2 3
fib(3) called
fib(2) called
fib(1) called
fib(0) called
fib(1) called
3

Resources

You may find this reference helpful for PC assembly language programming. You will need the Intel IA32 manuals for exact instruction formats and decoding rules. You can find them here: