Hello all and welcome back. I was busy wasting time at work, researching interesting topics I could write about, when I found this article. It blew my mind, especially the last section, and I knew immediately that I wanted to write about it.
Let’s go over the concept. It has to do with assembly, so make sure you’ve brushed up on your x64, as opposed to the x86 we usually use.
Look at this assembly code with me for a bit. What does it look like to you?
do_some_math PROC
push rbp
push rdi
movabs rax,0xeecc04eb28ec8348
movabs rbx,0x5eb00007fffbe41
add rax,rbx
movabs rbx,0x112207eb20e6c149
or rax,rbx
movabs rbx,0x14eb419e1780bf41
xor rax,rbx
mov rbx,QWORD PTR [rsp]
add rax,rbx
mov rbx,QWORD PTR [rsp+0x4]
add rax,rbx
movabs rbx,0x5ebd6ff41fe014d
sub rax,rbx
movabs rbx,0x778877c328c48348
imul rax,rbx
shr rax,0x4
pop rdi
pop rbp
do_some_math ENDP
You don’t need to be an ASM pro to figure out what this function is doing. Slow down and look at the key instructions. It’s taking very large numbers and performing some serious arithmetic on them. Looks harmless, right?
Well, what if I told you there is another function inside of this assembly code?
What? Where?
Yeah. That was my reaction too.
Under the right circumstances, if you point the instruction pointer at the right spot inside this function, something else entirely will happen, and it won’t be a crash.
Now, I’m not saying you’re just patching over this function with another function that you want. This is something different entirely.
Let’s zoom out a little bit and look at the opcodes that would be generated by this code.
55 push rbp
57 push rdi
48 b8 48 83 ec 28 eb movabs rax,0xeecc04eb28ec8348
04 cc ee
48 bb 41 be ff 7f 00 movabs rbx,0x5eb00007fffbe41
00 eb 05
48 03 c3 add rax,rbx
48 bb 49 c1 e6 20 eb movabs rbx,0x112207eb20e6c149
07 22 11
48 0b c3 or rax,rbx
48 bb 41 bf 80 17 9e movabs rbx,0x14eb419e1780bf41
41 eb 14
48 33 c3 xor rax,rbx
48 8b 1c 24 mov rbx,QWORD PTR [rsp]
48 03 c3 add rax,rbx
48 8b 5c 24 04 mov rbx,QWORD PTR [rsp+0x4]
48 03 c3 add rax,rbx
48 bb 4d 01 fe 41 ff movabs rbx,0x5ebd6ff41fe014d
d6 eb 05
48 2b c3 sub rax,rbx
48 bb 48 83 c4 28 c3 movabs rbx,0x778877c328c48348
77 88 77
48 0f af c3 imul rax,rbx
48 c1 e8 04 shr rax,0x4
5f pop rdi
5d pop rbp
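If the byte order looks backwards to you: x64 stores immediates little-endian, so the constant in the mnemonic reads back-to-front compared with the bytes in memory. Here is a minimal Python sketch of that relationship. It is only an illustration, not code from asmwrap, and the helper name imm64_bytes is made up.

```python
import struct


def imm64_bytes(value: int) -> bytes:
    """Return the 8 bytes a 64-bit immediate occupies in memory (little-endian)."""
    return struct.pack("<Q", value)


# The first two movabs constants from the listing above.
for constant in (0xEECC04EB28EC8348, 0x05EB00007FFFBE41):
    print(f"{constant:#018x} -> {imm64_bytes(constant).hex(' ')}")
```

The least significant byte comes first, which is why the tail end of each constant shows up right after the 48 b8 or 48 bb prefix.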
Now, what if I told you that the opcode for a short relative jump is EB? Look a little closer. I’ll put brackets around them for you.
55 push rbp
57 push rdi
48 b8 48 83 ec 28 [eb] movabs rax,0xeecc04eb28ec8348
04 cc ee
48 bb 41 be ff 7f 00 movabs rbx,0x5eb00007fffbe41
00 [eb] 05
48 03 c3 add rax,rbx
48 bb 49 c1 e6 20 [eb] movabs rbx,0x112207eb20e6c149
07 22 11
48 0b c3 or rax,rbx
48 bb 41 bf 80 17 9e movabs rbx,0x14eb419e1780bf41
41 [eb] 14
48 33 c3 xor rax,rbx
48 8b 1c 24 mov rbx,QWORD PTR [rsp]
48 03 c3 add rax,rbx
48 8b 5c 24 04 mov rbx,QWORD PTR [rsp+0x4]
48 03 c3 add rax,rbx
48 bb 4d 01 fe 41 ff movabs rbx,0x5ebd6ff41fe014d
d6 [eb] 05
48 2b c3 sub rax,rbx
48 bb 48 83 c4 28 c3 movabs rbx,0x778877c328c48348
77 88 77
48 0f af c3 imul rax,rbx
48 c1 e8 04 shr rax,0x4
5f pop rdi
5d pop rbp
Do you see them now? A short relative jump moves the instruction pointer forward by N bytes, where N is the byte right after EB and the count starts from the end of the jump instruction. (N is signed, so it could also go backward, but every jump here goes forward.) So these can be followed to find the next instruction that would be executed. Let’s look one more time, this time with brackets around every byte that ends up being executed.
55 push rbp
57 push rdi
48 b8 [48 83 ec 28 eb] movabs rax,0xeecc04eb28ec8348
[04] cc ee
48 bb [41 be ff 7f 00] movabs rbx,0x5eb00007fffbe41
[00 eb 05]
48 03 c3 add rax,rbx
48 bb [49 c1 e6 20 eb] movabs rbx,0x112207eb20e6c149
[07] 22 11
48 0b c3 or rax,rbx
48 bb [41 bf 80 17 9e] movabs rbx,0x14eb419e1780bf41
[41 eb 14]
48 33 c3 xor rax,rbx
48 8b 1c 24 mov rbx,QWORD PTR [rsp]
48 03 c3 add rax,rbx
48 8b 5c 24 04 mov rbx,QWORD PTR [rsp+0x4]
48 03 c3 add rax,rbx
48 bb [4d 01 fe 41 ff] movabs rbx,0x5ebd6ff41fe014d
[d6 eb 05]
48 2b c3 sub rax,rbx
48 bb [48 83 c4 28 c3] movabs rbx,0x778877c328c48348
77 88 77
48 0f af c3 imul rax,rbx
48 c1 e8 04 shr rax,0x4
5f pop rdi
5d pop rbp
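Look at all of those bytes. Now let’s extract them and disassemble them. You can use an online disassembler. The last byte, C3, is the opcode for ret, which ends a subroutine. One note: the jump targets below are computed against the extracted bytes alone, so they don’t match offsets in the original function.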
sub rsp,0x28
jmp 0xa
mov r14d,0x7fff
jmp 0x13
shl r14,0x20
jmp 0x1b
mov r15d,0x419e1780
jmp 0x30
add r14,r15
call r14
jmp 0x29
add rsp,0x28
ret
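To see where those jumps really land inside do_some_math, you can compute each target yourself: the address of the byte after the jump, plus the signed displacement. Here is a rough Python sketch. The byte string is copied from the listing earlier in the post, the jump offsets (8, 20, 31, 46, 74) are ones I counted by hand from that listing, and the helper name short_jump_target is made up.

```python
# Machine code of do_some_math, copied from the byte listing earlier in the post.
CODE = bytes.fromhex(
    "55 57 "
    "48b8 4883ec28eb04ccee "
    "48bb 41beff7f0000eb05 "
    "4803c3 "
    "48bb 49c1e620eb072211 "
    "480bc3 "
    "48bb 41bf80179e41eb14 "
    "4833c3 488b1c24 4803c3 488b5c2404 4803c3 "
    "48bb 4d01fe41ffd6eb05 "
    "482bc3 "
    "48bb 4883c428c3778877 "
    "480fafc3 48c1e804 5f 5d"
)

# Offsets of the EB bytes that act as jumps (counted by hand, an assumption).
JMP_OFFSETS = (8, 20, 31, 46, 74)


def short_jump_target(code: bytes, jmp_offset: int) -> int:
    """Target of a 2-byte 'jmp rel8' located at jmp_offset."""
    if code[jmp_offset] != 0xEB:
        raise ValueError("no short jump at this offset")
    rel8 = code[jmp_offset + 1]
    if rel8 >= 0x80:  # rel8 is a signed byte
        rel8 -= 0x100
    return jmp_offset + 2 + rel8


for off in JMP_OFFSETS:
    target = short_jump_target(CODE, off)
    prefix = CODE[target - 2:target].hex(" ")
    print(f"jmp at {off:#x} lands at {target:#x}, right after bytes {prefix}")
```

Each jump skips the leftover filler and the real arithmetic instruction, and lands on the first byte of the next movabs immediate.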
Let’s get rid of those jmp instructions.
sub rsp,0x28
mov r14d,0x7fff
shl r14,0x20
mov r15d,0x419e1780
add r14,r15
call r14
add rsp,0x28
ret
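If you would rather not pick the bytes out by hand, the extraction can be sketched in a few lines of Python. Big caveat: this is not a disassembler. It assumes every EB byte it meets is a short jump and every C3 byte is a ret, which happens to hold for this function when you start at offset 4 but is not true in general. The entry offset and the function name extract_hidden are my own assumptions.

```python
# Machine code of do_some_math, copied from the byte listing earlier in the post.
CODE = bytes.fromhex(
    "55 57 "
    "48b8 4883ec28eb04ccee "
    "48bb 41beff7f0000eb05 "
    "4803c3 "
    "48bb 49c1e620eb072211 "
    "480bc3 "
    "48bb 41bf80179e41eb14 "
    "4833c3 488b1c24 4803c3 488b5c2404 4803c3 "
    "48bb 4d01fe41ffd6eb05 "
    "482bc3 "
    "48bb 4883c428c3778877 "
    "480fafc3 48c1e804 5f 5d"
)


def extract_hidden(code: bytes, entry: int) -> bytes:
    """Collect the bytes reached from entry, following EB jumps, until C3."""
    out = bytearray()
    seen = set()
    pos = entry
    while 0 <= pos < len(code):
        if pos in seen:  # guard against a jump that loops back
            raise ValueError("jump loop detected")
        seen.add(pos)
        byte = code[pos]
        if byte == 0xEB:  # naive: treat as 'jmp rel8' and follow it
            rel8 = code[pos + 1]
            if rel8 >= 0x80:
                rel8 -= 0x100
            pos += 2 + rel8
        elif byte == 0xC3:  # naive: treat as 'ret' and stop
            out.append(byte)
            return bytes(out)
        else:
            out.append(byte)
            pos += 1
    raise ValueError("ran off the end without finding a ret")


hidden = extract_hidden(CODE, entry=4)
print(hidden.hex(" "))
```

The result is the jump-free byte sequence, which you can paste into a disassembler to get the listing above.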
This looks very interesting. What’s happening here is that an address is loaded into a register (R14), which is then called. Let’s look at how this is done.
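First, stack space is reserved. On Windows x64, 0x28 bytes is the usual amount before a call: 32 bytes of shadow space for the callee, plus 8 to keep the stack 16-byte aligned.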
sub rsp,0x28
Then, the lower 32 bits of R14 are set to 0x7FFF. Writing to a 32-bit register also zeroes the upper 32 bits, so R14 is now exactly 0x7FFF.
mov r14d,0x7fff
R14 is shifted left by 32 (0x20) bits. This sets its value to 0x7FFF00000000.
shl r14,0x20
The lower 32 bits of R15 are set to 0x419e1780.
mov r15d,0x419e1780
R14 and R15 are added together to equal 0x7FFF419E1780.
add r14,r15
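The arithmetic is easy to double-check. Here is a small Python sketch that mirrors the four instructions on plain integers, masking to 64 bits the way the registers would. The function name build_target is made up for the illustration.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def build_target() -> int:
    """Mirror the register arithmetic that assembles the call target."""
    r14 = 0x7FFF                  # mov r14d,0x7fff (upper 32 bits zeroed)
    r14 = (r14 << 0x20) & MASK64  # shl r14,0x20
    r15 = 0x419E1780              # mov r15d,0x419e1780
    r14 = (r14 + r15) & MASK64    # add r14,r15
    return r14


print(hex(build_target()))
```

The high half and the low half never overlap, so the add behaves like simply gluing the two 32-bit values together.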
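Now that I’m writing this, I think I could’ve just done
mov r14,value
and been done with it. Then again, a mov with a full 64-bit immediate is 10 bytes long, which won’t fit inside a single 8-byte immediate, so building the address in two halves does earn its keep. Oh well.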
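The size limit is worth spelling out. Each hidden piece lives in an 8-byte immediate and needs 2 of those bytes for the jmp to the next piece, which leaves 6. Here is a quick Python sketch of that budget check. The encodings are copied from the byte listing, the 10-byte form is the standard encoding of a mov into r14 with a 64-bit immediate, and the names BUDGET and fits are made up.

```python
import struct

IMM_SIZE = 8                 # bytes in a movabs immediate
JMP_SIZE = 2                 # EB plus a one-byte displacement
BUDGET = IMM_SIZE - JMP_SIZE  # room left for hidden instructions

# mov r14,imm64: REX prefix 49, opcode BE, then the 8 immediate bytes.
MOV_R14_IMM64 = bytes([0x49, 0xBE]) + struct.pack("<Q", 0x7FFF419E1780)

# The pieces actually used, with bytes copied from the listing.
PIECES = {
    "mov r14d,0x7fff": bytes.fromhex("41beff7f0000"),
    "shl r14,0x20": bytes.fromhex("49c1e620"),
    "mov r15d,0x419e1780": bytes.fromhex("41bf80179e41"),
    "add r14,r15": bytes.fromhex("4d01fe"),
}


def fits(piece: bytes) -> bool:
    """True if the piece leaves room for the trailing short jump."""
    return len(piece) <= BUDGET


print(f"mov r14,imm64: {len(MOV_R14_IMM64)} bytes, fits={fits(MOV_R14_IMM64)}")
for name, piece in PIECES.items():
    print(f"{name}: {len(piece)} bytes, fits={fits(piece)}")
```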
Lastly, the address that R14 contains is called.
call r14
And cleanup.
add rsp,0x28
ret
All of this hidden within another function. Not very easy for disassemblers to catch something like that, hm?
I wrote a program and a companion DLL that hide and call a wrapped function. It’s on my Tor site under programs, called “asmwrap”. Let’s look at the DLL in IDA for a little bit.
There’s a function called func_with_no_xrefs. It, of course, has no xrefs.
I had to compile the DLL under Visual Studio’s Debug configuration since a function with no xrefs would be optimized out.
There’s also a function called do_some_math. Look familiar?
This looks completely harmless. However, the large 8-byte numbers are going to be replaced with the custom call function. test.cpp does this. It’s pretty ugly, so I won’t paste it here. It loads the asmwrap DLL, patches over the instructions with the wrapped code, and calls it. It even invokes the math function before and after patching over it.
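To give a feel for the patching step without pasting test.cpp, here is a rough Python sketch of the same idea on a plain byte buffer. It is not the asmwrap code: the function names, the 0xCC filler byte, and the immediate offsets (counted by hand from the listing earlier in the post) are all my assumptions. It takes the hidden instructions in pieces, appends a short jump to each piece that points at the next immediate, and pads what is left.

```python
# Hidden instructions, split into pieces of at most 6 bytes (bytes from the listing).
CHUNKS = [
    bytes.fromhex("4883ec28"),      # sub rsp,0x28
    bytes.fromhex("41beff7f0000"),  # mov r14d,0x7fff
    bytes.fromhex("49c1e620"),      # shl r14,0x20
    bytes.fromhex("41bf80179e41"),  # mov r15d,0x419e1780
    bytes.fromhex("4d01fe41ffd6"),  # add r14,r15 ; call r14
    bytes.fromhex("4883c428c3"),    # add rsp,0x28 ; ret
]

# Offsets of the six movabs immediates inside do_some_math (hand-counted assumption).
IMM_OFFSETS = [4, 14, 27, 40, 68, 81]


def pack_chunks(chunks, imm_offsets, filler=0xCC):
    """Return {offset: 8 bytes} with each chunk chained to the next by a jmp."""
    patched = {}
    for i, (chunk, off) in enumerate(zip(chunks, imm_offsets)):
        body = bytearray(chunk)
        if i < len(chunks) - 1:  # the last chunk ends in ret, no jump needed
            jmp_end = off + len(chunk) + 2
            rel8 = imm_offsets[i + 1] - jmp_end
            if not 0 <= rel8 <= 0x7F:
                raise ValueError("next immediate is out of short-jump range")
            body += bytes([0xEB, rel8])
        if len(body) > 8:
            raise ValueError("chunk does not fit in an 8-byte immediate")
        body += bytes([filler]) * (8 - len(body))
        patched[off] = bytes(body)
    return patched


def apply_patch(code: bytearray, patched) -> None:
    """Overwrite each 8-byte immediate in a copy of the function's bytes."""
    for off, imm in patched.items():
        code[off:off + 8] = imm


for off, imm in pack_chunks(CHUNKS, IMM_OFFSETS).items():
    print(f"immediate at {off:#x}: {imm.hex(' ')}")
```

On a live process you would write these bytes into the loaded DLL’s code rather than a buffer, which also means making that memory writable first. The sketch leaves that part out.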
That’s all I wanted to share for this post. It’s pretty technical. I usually leave that for my paid posts, but this was so cool, I wanted to share it with everyone.
This is such a cool concept, and it’s scalable too! Imagine hiding an encryption key or some malicious shellcode inside of a function that’s actually useful. Very, very cool.
Have a great weekend!
Go!
-BowTiedCrawfish
Have you ever encountered this in the wild?