Sunday, October 20, 2013

Tricky and powerful anti-tracing mechanisms with BTF and LBR

If you haven't already read this, you probably should. It covers the fundamentals of what will be discussed here. That way, I can assume you already know what is going on and I don't have to cover all the miniscule details in this post :)

Back already eh?

Simply setting the trap flag with an iret/popf variant has always been a common technique to thwart single-stepping. There are also API's to offer similar functionality, we wont cover them today because that isn't really the scope here.

One of the most common is something similar to this:

pushf
or word ptr [sp], 0100h
popf
xor eax, eax
xor ebx, ebx
nop

As you know,  when the boundary of xor eax, eax is reached, we will have an int 01 trap with a saved IP of whatever follows it. Again as you should hopefully know, this is common method to trick a debugger that is already single stepping this sequence into thinking that it caused the exception and to continue right along. Now any debugger worth its weight in (bytes? gold? plugins?), or a user who isn't just auto-tracing and looking manually, should catch this.

There are a few plugins already for various debuggers that check the trap flag status prior to popf/iret/syscall/ints and attempt to act accordingly, like resuming the trace operations at KiUserExceptionDispatcher.

Now lets look at this sequence again, but imagine that BTF is enabled.

pushf
or word ptr [sp], 0100h
popf
xor eax, eax
xor ebx, ebx

nop
 Now lets just assume for a minute that no debugger is attached. Execution will continue right along after popf/popfd and no trap will be recognized. This as you know is because even though TF is set, we haven't hit a taken branch. Thus no trap. We could then modify our sequence a bit into something like this:

pushf
or word ptr [sp], 0100h
popf
xor eax, eax
xor ebx, ebx

nop
jmp 02h
xor eax, eax
nop

The trap will occur after the boundary of the unconditional jump is reached. The application can then handle accordingly.

Now lets throw OllyDbg into the mix and step through this sequence. You will notice how Olly will single step normally normally over the sequence. Olly will mask off Dr7.BTF after debug event, even if it passes the event back to user code. This means the following situations could easily happen:

-A user or a plugin unaware of this during a trace could mistakenly let the application process a single step exception which followed an instruction that set EFLAGS.TF. The application would see this and act accordingly (like.. explode or something.)

-Ollydbg AND WinDbg both mask off Dr7.BTF when sending an exception back to KiUserExceptionDispatcher. This means that for the duration of the exception chain dispatching, BTF will have no effect.

So the following scenario would ensue:

The following is executed while debugger is attached.

 mov rcx, hThread //thread handle
mov rdx, context //context setting LBR and BTF
call SetThreadContext
//random crap
nop
xor ebx, ebx
mov eax, 0x1
shl rax, 0x10
//cause some kind of event
int 3
  The application must have wanted this, so pass it back. But since the debugger masked Dr7.BTF, setting the trap flag in your exception handler with popf/iret will cause a trap at following instruction boundary. Otherwise nothing would happen until you either A. reset the flag, or b, hit a taken branch. This is ample evidence that a debugger is involved.

IDA's win32 debugger and Cheat Engine do not have this problem, but don't worry, we have something up our sleeve for them. Also a quick side-note here; a year or so ago, a colleague of mine made some real fun of me for using Cheat Engine as a dynamic analysis and debugging tool. Contrary to whatever he thinks, anyone who does this as a passion loves Cheat Engine. The arsenal just isn't complete without it.

Here is how we can fool them all.

Reminder: LBR data will only be written to the ExceptionInformation structure if the trap flag is set when a #DB exception occurs. In this case we use ICEBP for our #DB. ICEBP for all intents and purposes is a #DB exception.

So if we single step OR branch step over the following magical sequence, it will easily be detectable:

//LBR and BTF already set

 inc eax
cmp eax, 0x5
je 02h
xor ebx, ebx
mov ecx, edx
popfd                             //sets trap flag
icebp
 Our first assumption is that the debugger is smart enough to detect ICEBP, whether it be by decoding the instruction stream or checking Dr6, and then passing the exception back to the application. If this isn't happening then the application already wins this round because the exception chain was never dispatched.

If no debugger is tracing this sequence, the ExceptionInformation fields rendered to our application via the EXCEPTION_RECORD structure will contain the linear address of the 'je 02h' instruction, and the second field will contain the linear address of the 'mov ecx, edx' instruction.

If a debugger were single stepping over this sequence, it's implied that it masked Dr7.BTF, and maybe even Dr7.LBR. In either case, even if it only masked one, the ExceptionInformation fields will have a null index, and no data.

Furthermore, if the debugger were branch tracing instead of single stepping over this sequence meaning it left BTF and LBR on, the ExceptionInformation data would contain the linear address of KiDebugTrapOrFault's IRET instruction, followed by the linear address of 'mov ecx, edx. If the debugger for some reason decided to mask off LBR but leave BTF enabled, ExceptionInformation index would be null and the fields would be empty.


In either of the above case, if the debugger didn't preserve LBR or BTF, the improper values would be stored in the ExceptionInformation fields, and we could assume a debugger is attached.

The BTF and LBR Dr7 backdoors exist from XP to Windows 8 in both 32 and 64 bit editions of Windows making this a highly portable anti-debug/trace technique.


Tuesday, October 15, 2013

User/kernel shared page continued...

 This is a continuation of the original post
 
Finally had some time to look this one over. As you hopefully recall in the previous installment I mentioned how I noticed data fluctuation in the same area of the page for 32 bit builds of Windows 7 (haven't checked 8 for either build yet).

As I guessed it's pretty much the same functionality (garbage stack portion) and can be used to infer /debug. This is the mode where a kernel debugger is not necessarily attached, but can be at anytime. Other indicators such as KdDebuggerEnabled at 0x2D4 or KdDebuggerNotPresent which as you know can be queried with NtQuerySystemInformation will not be of any value.

Anyways in this case, it's close to the same but not entirely. KdInitSystem parses the load options, if /debug is set, we expand our stack further than anticipated for a normal boot phase and land at DbgLoadImageSymbols which uses int 2D (debugger services, like symbols ;p) regardless of whether or not a KD is actually present, if not it's just caught by exception handlers in this case.

Now since we grew the stack quite a bit, and the stack pages were zeroed to begin with, we find ourselves at KiInitializeXStatePolicy. This function writes vendor specific extended processor feature bits into the shared page. It allocates a good 0x450 bytes, which then uncovers the garbage left behind (or is it?) from the DbgLoadImageSymbols interrupt control transfer and exception dispatch.

If the value at 0x4C0 is non-zero, this is enough to indicate. It is highly improbable that the Xsave features will extend that far, but starting at Xsave and searching at a 4 byte boundary for 0xFFFFFD34 would be a more appropriate solution. Similar to the 4 byte 'DBGP' signature for 64 bit builds.

This applies to an original deployed 32 bit copy, all the way to the most recent Windows updates.

Keep in mind this is only for 32 bit builds of Windows 7. The same deal exists in x86/64 targets but is a slightly different story.